About Brian McGill

I am a macroecologist at the University of Maine. I study how human-caused global change (especially global warming and land cover change) affects communities, biodiversity and our global ecology.

The death knell for the “one right way” approach to statistics? and the need for Sturdy Statistics

Last week Jeremy linked to yet another study where expert researchers (social scientists in this case) were asked to analyze the same dataset. The key findings were: a) that the experts had broad disagreements about the best way to analyze the data, and b) that these differences were consequential in leading to totally different outcomes (positive, negative or no statistically significant effect). This should hardly be news; earlier studies have found this in another social science dataset and in fMRI scans in neurobiology.

You really have to pause and let that sink in. Established researchers each think a different method of analysis is best, and these different methods give completely different, even opposite, answers for the same data on the same question. Even controlling for subtly different questions or experimental designs, the answer you get depends entirely on which person you give the data to for analysis!

This should be the death of the “one right way” approach to statistics

It is hard to read these studies any other way than as a major blow to the view that there is one right way to do statistics. Or at least it should be. I assume some will continue to think that there really is “one right way” (theirs) and that the variance occurs because most of the researchers (everybody else) are just plain wrong. But that is a bit too egocentric and lacking in perspective to my mind. I know I’ve offended people over my blogging history on this point (which has not been my intent), but I just find it really impossible to accept that there is only one right way to analyze any statistical problem. Statistics are (probabilistic) models of reality, and it is impossible to have a perfect model, thus all models involve tradeoffs. And these new studies feel like a sledgehammer of evidence against the view that there is one right way (even as they make us aware that it is really inconvenient that there is not just one right way).

When you look at the differences among researchers’ approaches reported in the studies, they’re not matters where one could dismiss an approach as being grossly wrong. They’re the kinds of things people debate all the time. Log transform or square root transform or no transform (yes, sometimes log or no-log is clearly better, but there are a lot of datasets out there where neither is great and it is a matter of judgment which is better – I’ll give an example below). Or OLS vs. logistic vs. other GLM. Multivariate regression vs. principal component regression vs. regression tree. AIC vs. automated vs. researcher variable selection. Include a covariate or leave it out. And so on. There is no such thing as the “one true right way” to navigate these. And as these meta-studies show, they’re not so trivial that we can ignore these differences of opinion either – conclusions can change drastically. So, again, these results really should give us pause. Ninety percent of our published articles might have come to an opposite conclusion if somebody else had done the stats, even with the same data! (And “one person was smart and the other dumb” is not really a viable explanation.)

Are ecology and evolution different?

Or maybe the ecology literature is safe? For those of us in ecology and evolution, our time to find out is coming. A similar study is underway right now. I suppose logically there could be two outcomes. 1) Same results as the previous studies – total researcher-dependent chaos. 2) Different results – the chosen question and dataset have a strong answer and lots of different methods recover the same answer (qualitatively – small changes in effect sizes and p-values are OK). A lot of people responding to Jeremy’s question of what to do about these studies seemed to be really thinking (hoping?) that ecology was different and would come out with outcome #2.

Personally, I doubt it. I don’t think fields are that different. Different questions within a field are the important difference. All fields sometimes chase large effect sizes, which will give outcome #2 (when you can see the pattern visually in the data, methods aren’t going to change the story), and sometimes fields chase small effects, which will give outcome #1 (when the effect sizes are small and you have six control variables, it matters a lot how you analyze the data). But here’s the key: after we’ve completed our study with a single analysis path, we don’t know whether our question and results are in outcome #1 (different paths would give different answers) or #2 (different paths would give similar answers). If we knew that, we wouldn’t have done the study! Sometimes studies of weak effects come up with an estimate of a strong effect, and sometimes studies of a strong effect come up with an estimate of a weak effect. So trying to use statistics to tell us if we are in #1 or #2 is circular. This is a really key point – it might seem that the only way to tell if we are in #1 or #2 is to do some giant meta-study where we get a couple of dozen researchers to analyze the same question on the same dataset. That hardly seems practical! And the study being done on evolutionary ecology and conservation ecology questions could end up either in #1 or #2 (on my hypothesis, depending on whether they are giving researchers weak-effect or strong-effect datasets/problems), so that is not a comprehensive guide for all of ecology and evolution. What we really need is a meta-meta-study that does several dozen of these meta-studies and then analyzes how often #1 vs. #2 comes up (Jeremy has had these same thoughts). I’m willing to bet pretty heavily that ecology and evolution have publications both that are safe (scenario #2) and that are completely dependent on how the analysis was done (scenario #1).
In my own research in macroecology, I have been in scenarios where #1 is true and in scenarios where #2 is true.

Couldn’t individual authors just explore alternative analysis paths?

If we can’t afford to conduct a meta-analysis with a few dozen researchers independently analyzing each data set for each unique question (and we surely can’t!), then what alternative is there? There is an obvious alternative. An individual researcher can explore these alternatives themselves. A lot of researchers already do this. I bet every reader of this post has at one time tried with and without a log-transform or OLS vs. GLM assumptions on residuals. And, nominally, a strong majority of ecologists think such robustness checks are a good thing according to Jeremy’s poll. So it’s hardly a novel concept. In short, yes, it is clearly possible for a single author to replicate the main benefits of these meta-analyses by individually performing a bunch of alternative analysis paths.

But there is a deep aversion to doing this in practice. It is labelled with terrible names like “p-hacking” and the “garden of forking paths” (with its implicit reference to temptation in the Garden of Eden). I know in my own experience as a reviewer, I must have had a dozen cases where I thought the outcome reported was dependent on the analysis method and asked for a re-analysis using an alternative method to prove me wrong. Sometimes, the editor backs that request up (or the authors do it voluntarily). But a majority of the time they don’t. Indeed, I would say editors are much more receptive to “the stats are wrong, do them differently” than “the stats are potentially not very sturdy, try it a 2nd way and report both”.

Thus even though in a private poll we seem to think such single-author exploration of alternative analysis approaches is a good idea, in the published literature and the peer review process it remains murky, cloaked in secrecy, and met with disapproval from others.

And there are of course some strong reasons for this (some valid, some definitely not):

  1. The valid reason is that if an author tries 10 methods, picks the one with the most preferred results, and reports only that, it is really unethical (and misleading and bad for science), although in private most scientists admit it is pretty common.
  2. The invalid reason is that doing multiple analyses could take a seemingly strong result (p<0.05 is all that matters right?) and turn it into a murky result. It might be significant in some analyses and not significant in others. What happens if the author does the requested robustness check by analyzing the data a 2nd way and loses statistical significance? This is a really bad, but really human, reason to avoid multiple analyses. Ignorance is NOT bliss in science!

So how do we stay away from the unethical behavior behind reason #1 while acknowledging that motive #2 is bad for science in the long run (even if it is optimal for the individual scientist in the short run)?

Well, I think the solution is the same as for exploratory statistics: take it out of the closet, celebrate it as an approach, and brag about using it! If we’re supporting and rewarding researchers using this approach, they’re going to report it. And scenario #1 goes away. Unlike exploratory statistics, which at least had a name, this statistical approach has been so closeted it doesn’t even have a name.

Sturdy statistics – a better approach to statistics than “one true way”

So I propose the name/banner that I have always used in my head: “sturdy statistics”. (Robust statistics might be a better name, but that term has already been taken for a completely different context: using rank statistics and other methods to analyze non-normal data.) The goal of sturdy statistics is to produce an analysis that is, well, sturdy! It stands up against challenges. It weathers the elements. Like in the folk tale of the three little pigs, it is not a house/model made of straw that blows over at the first big puff of challenge (different assumptions and methods). I seek to be like pig #3, whose statistics are made of brick and don’t wobble every time a slightly different approach is used, and – important point – not only am I not afraid to have that claim tested, I WANT it tested.

A commitment to sturdy statistics involves:

  1. Running an analysis multiple different ways (an experienced researcher knows what alternative ways will be suggested to them, and we can help graduate students learn these).
  2. If the results are all qualitatively similar (and quantitatively close), then, great! Report that the analyses all converged, so the results are really, truly sturdy.
  3. If the results are different, then this is the hard part where the commitment to ethics comes in. I think there are two options:
    1. Report the contrasting results (this may make it harder to get published, but I’m not sure it should – it would be more honest than making results appear sturdy by publishing only one analysis path and shutting down reviewers who request alternative analysis paths)
    2. A more fruitful path is likely digging in to understand why the different results happened. This may not pay off, essentially leaving you at 3a. But in my experience it very often leads to deeper scientific understanding, which can then lead to a better article (the forking paths should still be reported, but they don’t have to take center stage if you really figure out what is going on). For example, it may turn out the result really depends on the skew in the data and that there is interesting biology out in that tail of the distribution.
  4. As a reviewer or editor, make or support requests for alternative analyses. If they come back the same, then you know you have a really solid result to publish. If the authors come back saying your suggestion gave a different answer and they now understand why, then judge that openly on what it adds to science. Either way, you’ve done your job as a reviewer and improved science.

Sturdy statistics – An example

I’m going to use a very contrived example on a well-known dataset. Start with the Edgar Anderson (aka Fisher) iris dataset. It has measurements of sepal length and width and petal length and width (4 continuous variables) as well as species ID for 50 individuals in each of 3 species (N=150). It is so famous it has its own Wikipedia page and peer-reviewed publications on the best way to analyze it. It is most classically used as a way to explore multivariate techniques and to compare/contrast e.g. principal component analysis vs. unsupervised clustering vs. discriminant analysis, etc. However, I’m going to keep it really simple and parallel to the more common linear model form of analysis.

So let’s say I want to model Sepal.Length as a function of Sepal.Width (another measure of sepal size), Petal.Length (another measure of overall flower size and specifically length) and species name (in R, Sepal.Length~Sepal.Width+Petal.Length+Species). As you will see, this is a pretty reasonable thing to do (r2>0.8). But there are some questions. If I plot a histogram of Sepal.Length it looks pretty good, but clearly not quite normal (a bit right-skewed and platykurtic). On the other hand, if I log-transform it, I get something else that is not terrible but platykurtic, bimodal and a bit left-skewed (by the way, Box-Cox doesn’t help a lot – any exponent from -2 to 2 is almost equally good). One might also think including species is a cheat or not, so there is a question about whether that should be a covariate. And of course we have fixed vs. random effects (for species). I can very easily come up with 6 different models to run (see R code at the bottom): simple OLS as in the formula already presented; the same but with Sepal.Length log-transformed; the same as OLS but with species removed; species treated as random (one would hope that is not too different); or a GLM with a gamma distribution, which spans normal and lognormal shapes (the default link function for Gamma is log, but maybe it should be identity, giving two more models). And tellingly, you can find the iris data analyzed most if not all of these ways by people who consider themselves expert enough to write stats tutorials. Below are the results (the coefficients for the two explanatory variables – I left out species intercepts for simplicity – and r2 and p-values where available – i.e. not for the GLMs).

Results of a sturdy analysis of the iris data predicting Sepal Length

What can we make of this? Well, Sepal Width and Petal Length both covary pretty strongly and positively with Sepal Length and combine to make a pretty predictive (r2>0.8) and highly statistically significant model. That’s true in any of the 6 analyses. Log transforming doesn’t change that story (although the coefficients are a bit different and remain so even after back-transforming, but that’s not surprising). Using Gamma-distributed residuals doesn’t really change the story either. This is a sturdy result! Really the biggest instability we observe is that the relative strength of Petal Length and Sepal Width changes when species is or isn’t included (Petal Length appears more important with species, but Sepal Width is relatively more important without species*). So the relative importance of the two variables is conditional on whether species is included – a rather classic result in multivariate regression. If we dig into this deeper, we can see that in this dataset two species (virginica and versicolor) largely overlap (shared slope and intercept, at least). Versus Sepal Width, setosa has a higher intercept but a similar slope; versus Petal Length, though, the slope for setosa also differs substantially from the other two, so slope estimates vary depending on whether you control species out (and maybe a varying-slope, varying-intercept model should be used). So that one weak instability (non-sturdiness) is actually pointing a bright red sign at an interesting piece of biology that I might have ignored if I had only run one analysis (and at additional statistical directions I am not going to pursue in an already too long blog post). This is simultaneously a sturdy result and a case where the sturdiness analysis caused me to dig a bit deeper into the data and learn something biologically interesting. Win all around!
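That species-specific slope difference is easy to check directly. Here is a minimal sketch (my own follow-up, not one of the six tabled analyses) letting the Petal.Length slope vary by species:

```r
#Does the Sepal.Length ~ Petal.Length slope really differ by species?
#Compare a shared-slope model against a varying-slope model
m_shared  <- lm(Sepal.Length ~ Petal.Length + Species, data = iris)
m_varying <- lm(Sepal.Length ~ Petal.Length * Species, data = iris)
coef(m_varying)            #interaction terms = per-species slope offsets from setosa
anova(m_shared, m_varying) #F-test: do species-specific slopes earn their keep?
```

If the interaction terms matter, the with-species and without-species models are answering subtly different questions – exactly the kind of thing a sturdiness check surfaces.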

And in a peer review context, that exercise hopefully saves time (reviewers not needing to request additional analyses), is fully transparent on the analyses done (no buried p-hacking), and draws convincing conclusions that leave science in a better place than if I had just chosen one analysis and doubled down on insisting it was right.

Conclusions

TL;DR: Sometimes the answer to a question on a dataset is sturdy against various analysis approaches. Sometimes it’s not. We can’t know a priori which scenario we are in. The logical solution to this is to actually try different analyses and prove our result is “sturdy” – hence an inference approach I call “sturdy statistics”. To avoid this turning into p-hacking it is important that we embrace sturdy statistics and encourage honest reporting of our explorations. But even if you don’t like sturdy statistics, we have to get over the notion of “one right way” to analyze the data and come up with a solution to finding out if multiple, reasonable analysis paths lead to different results or not, and what to do if they do.

What do you think? Do you like sturdy statistics? Do you already practice sturdy statistics (secretly or in the open)? Do you think the risk of sturdy statistics leading to p-hacking is too great? Or is the risk of p-hacking already high and sturdy statistics is a way to reduce its frequency? What needs to change in peer review to support sturdy statistics? Is there an alternative to sturdy statistics to address the many, many reasonable paths through analysis of one data set?

*NB: to really do this just by looking at coefficients I would need standardized independent variables, but the mean and standard deviation of the two variables are close enough and the pattern is strong enough and I am only making relative claims, so I’m going to keep it simple here.
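(For completeness, putting the two predictors on a common scale is a one-liner; this is a hypothetical extra model, not one of the six analyses above:)

```r
#Standardize the continuous predictors so their coefficients are directly
#comparable (each slope is then per standard deviation of its predictor)
mstd <- lm(Sepal.Length ~ scale(Sepal.Width) + scale(Petal.Length) + Species,
           data = iris)
coef(mstd)
```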

R Code

data(iris)
str(iris)

#simplest model
mols<-lm(Sepal.Length~Sepal.Width+Petal.Length+Species,data=iris)
# log transform sepal length?
mlogols<-lm(log10(Sepal.Length)~Sepal.Width+Petal.Length+Species,data=iris)
#role of species as a covariate?
mnosp<-lm(Sepal.Length~Sepal.Width+Petal.Length,data=iris)
#species as random instead of fixed (shouldn't really differ except d.f.)
library(lme4)
mrndsp<-lmer(Sepal.Length~Sepal.Width+Petal.Length+(1|Species),data=iris)
#Gamma residuals (a good proxy for lognormal) 
# with default log transformation on dependent variable
mgamlog<-glm(Sepal.Length~Sepal.Width+Petal.Length+Species,
             data=iris, family=Gamma(link="log"))
#No log transformation on dependent variable (identity link instead)
mgamident<-glm(Sepal.Length~Sepal.Width+Petal.Length+Species,
               data=iris, family=Gamma(link="identity"))
# Is Sepal.Length better log transformed or raw?
hist(iris$Sepal.Length)
hist(log(iris$Sepal.Length))
#hmm not so obvious either way
#do these choices matter?

#pick out relevant pieces of the result from OLS, GLM, or lmer objects
#and return them as a one-row data frame
report1asdf <- function(mobj) {
  co=coef(mobj)
  #lmer: coef() returns a per-group data frame; take fixed slopes, drop intercept
  if (!is.numeric(co)) {co=as.numeric(co$Species[1,]); co[1]=NA}
  s=summary(mobj)
  #GLM and lmer summaries have no r2 or F statistic
  if (is.null(s$r.squared)) s$r.squared=NA
  if (is.null(s$fstatistic)) s$fstatistic=c(NA,NA,NA)
  data.frame(
    #CoefInt=co[1],
    CoefSepW=co[2],
    CoefPetL=co[3],
    r2=s$r.squared,
    p=pf(s$fstatistic[1],s$fstatistic[2],s$fstatistic[3],lower.tail=FALSE)
  )
}

#assemble a table as a dataframe then print it out
res<-data.frame(CoefSepW=numeric(),CoefPetL=numeric(), r2=numeric(),p=numeric())
res<-rbind(res,report1asdf(mols))
res<-rbind(res,report1asdf(mlogols))
res<-rbind(res,report1asdf(mnosp))
res<-rbind(res,report1asdf(mrndsp))
res<-rbind(res,report1asdf(mgamlog))
res<-rbind(res,report1asdf(mgamident))
row.names(res)=c("OLS","OLS Log","OLS No Sp.","Rand Sp","GamLog","GamIdent")
print(res)

PhD and postdoc openings in McGill lab at UMaine and postdoc opening in Niles and Gotelli labs at UVM

First the ad for a PhD and Postdoc at UMaine

Two positions are open to work in Brian McGill’s lab at the University of Maine as part of a larger group of eight faculty in Maine and Vermont on a large grant: Barraccuda (Biodiversity and RuRal communities Adapting to Climate Change Using Data Analysis) (OK the acronym is a stretch and misspelled, but it gets the main ideas across!). We are a team of ecologists, social scientists and spatiotemporal data scientists. Goals include: 1) modelling the response to climate change in birds, trees, certain crops, and zoonotic hosts of certain diseases, 2) understanding how rural communities will adapt to the changing environment, 3) improving the toolset for ecologists and social scientists working with spatiotemporal data, and 4) learning how to better communicate these complex results to stakeholders (especially in agriculture). Our subteams are organized around these four themes.

  • The PhD position will have as its primary focus working on theme #1, modelling responses of organisms to climate change. Funding is for one year, with extensions possible for two more years depending on satisfactory performance and funds availability. There would also be teaching assistantships and the possibility of other grants to fill in up to 5 years of funding total. Requirements include a bachelor’s in ecology or a related field and either existing skills in data science or a strong desire to learn more. Opportunities to work and learn in other areas of the project also exist. An MSc is beneficial but not required. Stipend is $26K/year with annual cost of living increases plus coverage of tuition and health care. Start date is summer 2021.
  • The postdoc position will be more integrative and would work across all four themes of the project (ecological modelling, social sciences, stakeholder engagement and data science). This position would also work closely with the Waring lab at UMaine. Again funding would be for one year with extensions possible for two more years depending on performance and funds availability. Requirements include a PhD in a relevant field (e.g. ecology or related, social sciences related to rural communities, or data science or related) and a strong desire to learn and work in the other two areas. The ability to work both collaboratively and independently is also essential. Salary is $48-55K/year commensurate with experience and with annual cost of living increases plus a strong package of benefits according to the UMPSA agreement including healthcare and retirement plan contributions. Desired start date is summer 2021 (although earlier or slightly later is negotiable).

The University of Maine is located in Orono, ME which has a low cost of living, supports a walkable/bikable lifestyle, and has exceptional access to the outdoors ranging from a river, a lake and a trail network in town to national parks and wilderness not far away. Being part of the Bangor Metropolitan area and a university also results in good access to cultural events and services like health care, restaurants and shopping. We have an airport with direct connections to most East Coast cities and are a four hour car or bus ride away from Boston. We also have great K-12 schools if you are at a life stage where that matters. If you’re looking for clubbing until 2AM and eating in a different restaurant every night of the week, it might not be a fit, but most everybody else finds the quality of life excellent here (it’s pretty cute what they consider to be a “traffic jam” here).

The University of Maine is an equal opportunity employer and members of underrepresented minorities are encouraged to apply. To apply please submit a cover letter explaining fit to and interest in the project as well as a CV as a single PDF to mail@brianmcgill.org. Graduate student applicants should also include a transcript (GRE scores are optional but may be submitted if the student wishes and the same for TOEFL scores if not a native English speaker). Please note that if selected, the graduate student applicant will also need to apply to either the School of Biology and Ecology or the Ecology and Environmental Studies PhD program, but this can be done later. Review of applications will start February 19th and continue until the positions are filled. Please contact Brian McGill at mail@brianmcgill.org with questions.

And the ad for a postdoc at UVM (University of Vermont)

Post-Doctoral Position- Species Distribution Modeling of Biodiversity and Adaptation of Farmers and Rural Communities To Climate Change

The University of Vermont is seeking qualified applicants for a two-year post-doctorate position, with potential for renewal for another two years, to use species distribution modeling to understand how biodiversity, farmers and rural communities adapt to the challenges of climate change.  The project includes the aggregation and development of largescale datasets of biodiversity, farmer behavior and perceptions across US states, construction of mechanistic, spatially explicit models of range shifts with climate adaptation, and application of these models to farmer and rural community responses to climate change.

Background

Funded through a National Science Foundation grant, the research project with collaborators at University of Vermont (Dr. Meredith Niles, Dr. Nicholas Gotelli, Dr. Laurent Hébert-Dufresne) and University of Maine (Dr. Tim Waring, Dr. Brian McGill, Dr. Kati Corlew, Dr. Matthew Dube), seeks to understand how both rural human communities and species populations will respond to challenges posed by climate change [1]. The project will synthesize large amounts of data and develop new species distribution models to predict climate-driven shifts in species ranges as well as the responses and cultural adaptations of human communities. The project will also work with farmers and rural communities to understand their perspectives of the projected outcomes and responses. A successful applicant will work with a multidisciplinary team of biologists, social scientists and complexity researchers in Maine and Vermont.

Aims

The two main aims of this position are 1) to develop mechanistic, spatially explicit models of species range shifts, and 2) to develop a better understanding of the interaction of humans with biodiversity change and the ability of farmers and rural communities to adapt to climate change. This requires the assembly and analysis of species occurrence data (birds, trees, crops, and diseases), and datasets related to land use and farmer behavior.  Tasks include the identification of existing public datasets, the curation, aggregation, and synthesis of multiple data types, and the generation of novel species distribution models and indicators of climate adaptation and associated behaviors.  In addition, the post-doctorate will  help to integrate these data with evolutionary models of cultural adaptation to climate change and engage with agricultural and rural communities, including in presentation of results to diverse stakeholders and policy makers.

Position

The position is one of five new hires that form the core of the four-year research project funded by the National Science Foundation. The post-doctorate will be co-advised at the University of Vermont by Dr. Meredith Niles (www.meredithtniles.com) in the Food Systems Program of the Department of Nutrition and Food Sciences, and Dr. Nicholas J. Gotelli (http://www.uvm.edu/~ngotelli/homepage.html) in the Department of Biology. The Niles and Gotelli labs have a strong commitment to interdisciplinary research, biodiversity modeling, food systems science, and open access principles. Salary range will be $48,000-$52,000, depending on experience. There are a number of generous benefits associated with the position, which can be found at: https://www.uvm.edu/hrs/postdoctoral-associates-benefits-overview. The post-doctorate will also have opportunities for professional development and travel associated with the project, as relevant, as well as engagement with other professors on the project, especially Dr. Tim Waring, Dr. Laurent Hébert-Dufresne, and Dr. Kati Corlew.

Requirements

Essential

  • Successful completion of a PhD in a relevant field of biology, social science, or data science
  • Demonstrated research and academic excellence evidenced by existing publications in relevant topics
  • Experience constructing, fitting, testing, and comparing species distribution models with species occurrence data
  • Excellent data science and social science quantitative skills
  • Experience with data aggregation and curation, especially across diverse types of datasets
  • Significant experience with Python and familiarity with other languages such as R, SQL, Stata, etc.
  • Excellent communication skills and ability to work with an interdisciplinary team across multiple institutions
  • Self-directed and ability to lead projects and learn new skills
  • Mature, organized, professional and courteous

Desired

  • Experience in interdisciplinary approaches to human behavior, especially in social-ecological systems
  • Experience working with farmers or rural communities
  • Strong interest and experience in data visualizations
  • Understanding, or interest, in stakeholder engagement
  • Understanding, or interest, in qualitative methods, including focus groups
  • Enthusiasm for open data and science practices

Application:

Please address questions and completed applications electronically to Dr. Meredith Niles (mtniles@uvm.edu) and Dr. Nicholas Gotelli (ngotelli@uvm.edu). Applications should include:

  1. A cover letter detailing your interest in the position, how you meet the essential and desired requirements, and details of past research projects
  2. A CV or resume, including three references (with name, phone, email).

Review of materials will begin February 15th 2021 and continue until the position is filled.

Here we go again – the planet is practically dead

So the 2020 version of the Living Planet Report has been released to massive headlines blaring catastrophe. The central claim is that vertebrate (i.e. fish, amphibian, reptile, bird, mammal) local populations declined, on average, by 68% from 1970 to 2016 (the report is released 4 years after the end of the data). The authors of the report have done a much better job of getting out the notion that this is an average decline. I.e., they’re not claiming that there are 68% fewer vertebrate individuals on the planet, but that the average decline is 68% (but see footnote)*.

To invert their claim, the average vertebrate population in 2016 is 32% (100%-68%) of the size that it was in 1970. If we look at the 2018 report, it says that the average vertebrate population in 2014 was 40% of what it was in 1970. And the average vertebrate population in 2010 was 48% of what it was in 1970. So if a population in 1970 was of size N, then 2010=0.48N, 2014=0.40N, and 2016=0.32N. Wow! That is a 52% decline in the 40 years from 1970 to 2010, a 16.7% decline in the four years from 2010 to 2014, and a remarkable 20% decline from 2014 to 2016. The math is a little complex because it is exponential, not linear, decline, but that gives a 1.82% decline per year from 1970 to 2010, a 4.46% annual decline from 2010-2014, and a 10.6% per-year decline from 2014-2016. So not only are there huge declines, but the declines appear to be accelerating (admittedly with small samples for recent years). If we are conservative in the face of this accelerating trend and hold declines constant for the next 10 years (from 2016, so to 2026) at 10.6%/year, starting in 2016 at 32% of 1970 numbers, then we are down to 10% of the 1970 numbers by 2026. Do you believe that? Six years from now the average population would be just 10% of what it was in 1970. (To be clear, the LPI authors did not make this claim – I did, but it is just a 10-year extrapolation from their numbers.) You would think such a decline would be more obvious to the casual observer. I’m old enough to remember 1970 and have spent a lot of time in the woods in my life. If there were a 20% decline (or increase), I’m not sure my fallible memory would reliably detect the change (in fact I’m pretty sure it wouldn’t). But if there were 90% fewer birds on average than in my childhood, I would have thought I would have noticed. You would also think the world would be absolutely exploding with things vertebrates eat (e.g. insects and plants).
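The annualized rates above are simple geometric-mean calculations; here is a quick sketch of the arithmetic in R (my back-of-envelope, not code from the report):

```r
#Remaining fraction of 1970 abundance implied by successive LPI reports:
#2010 = 0.48, 2014 = 0.40, 2016 = 0.32
#Annualized decline over an interval: 1 - (end/start)^(1/years)
1 - 0.48^(1/40)           #1970-2010: ~1.8% per year
1 - (0.40/0.48)^(1/4)     #2010-2014: ~4.5% per year
1 - (0.32/0.40)^(1/2)     #2014-2016: ~10.6% per year
#Extrapolating the last rate 10 more years out from 2016:
0.32 * (0.32/0.40)^(10/2) #~0.10, i.e. ~10% of 1970 abundance by 2026
```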

If this isn’t happening, then what is going on? Well for starters, it is pretty dicey to take short term rates and extrapolate them when things grow or decline exponentially. If you do that you are liable to find out everything is extinct or at infinity pretty quickly. So lets go back to the core claim straight from the report – there has been a 68% decline in the average vertebrate population since 1970. Not quite as extreme, but you would still think I (and a lot of other people) would have noticed declines in vertebrates of this extent not to mention the boom of insects and plants as they’re freed from predation.

If you don’t trust my fond recollections of my childhood nor my extrapolation to what should have happened to insects and plants (as you definitely shouldn’t!), then how about this. The LPI core result is completely different than other studies (not cited in the Living Planet Report for what it is worth). Several, like the LPI, track thousands of populations over decades. All (like the LPI) suffer from some observer bias – scientists have more data in temperate regions and near cities and for bigger animals, but there has been no evidence to date that this fact is biasing the results of any of the three studies. First, here is a plot very similar to the LPI plot but for invertebrates in the UK by Outhwaite and colleagues in Nature Ecology and Evolution:

Now this is invertebrates, not mammals, but what we see is 3 broad groups have abundances higher than they did in 1970 (freshwater species showing a spectacular recovery possibly due to clean water laws), and one broad group that is down just a smidge. The overall balance across all 4 groups is a 10% INCREASE.

Here is a paper by Dornelas and colleagues in Ecology Letters (disclosure I am a co-author):

They (we) used a slightly different method – we calculated the slope of each timeseries and then plotted histograms of the slopes. Note that there is a lot of variability, with some real declines and real increases, but the overall trend across populations is strongly centered on (i.e. averages to) about zero (neither up nor down). In fact the title of that paper is "A balance of winners and losers in the Anthropocene", and it finds that 85% of the populations didn't show a trend significantly different from zero, 8% significantly increased, and 7% significantly decreased. A lot of churn in which species are up or down, but NOT an across-the-board catastrophic decline. Maybe this is because Outhwaite and Dornelas didn't study vertebrates? Unlikely. Dornelas et al did pull out different taxa and found that reptiles, amphibians and mammals skewed to more increases than decreases, with no real difference from zero in birds and fish (their Figure 4). Or check out Leung et al, who analyzed a subset of the LPI data (hence all vertebrates) focusing on the well-sampled North American and European regions using a different methodology, and got more groups increasing than declining. Or check out Daskalova et al, who also found winners and losers were balanced (and most species were neutral). Even the most extreme result of the studies that exclusively use longer-term data to look at this question that I am aware of (van Klink et al) shows a 35% decline over 45 years for terrestrial insects and a 60% increase over the same period in aquatic insects. I think it is an interesting and challenging question why these studies received little press (despite also being published in high-profile journals) while the LPI gets enormous coverage every time it comes out.
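To make the slope-histogram idea concrete, here is a minimal sketch of that style of analysis run on simulated, mostly trendless populations. To be clear, this is not the actual Dornelas et al pipeline – the data, noise levels, and significance cutoff are all my own illustrative assumptions.

```python
# Sketch of the slope-per-timeseries approach: fit a linear trend to each
# population's (log-)abundance series, then tally how many slopes differ
# significantly from zero. Populations here are SIMULATED with small random
# trends, purely to illustrate the bookkeeping.
import numpy as np
from scipy.stats import linregress

rng = np.random.default_rng(42)
years = np.arange(1970, 2016)

up = down = neutral = 0
for _ in range(500):
    # simulated log-abundance: a small random trend plus observation noise
    trend = rng.normal(0, 0.005)
    series = trend * (years - years[0]) + rng.normal(0, 0.3, len(years))
    fit = linregress(years, series)
    if fit.pvalue < 0.05 and fit.slope > 0:
        up += 1
    elif fit.pvalue < 0.05:
        down += 1
    else:
        neutral += 1

# Most simulated populations show no significant trend; the rest split
# between winners and losers.
print(up, down, neutral)
```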

These 5 other studies more closely match my childhood memories. There could be weaker trends (+ or – 10 or 20%). And for sure I could be seeing different species (winners replacing losers). But these 5 studies completely contradict the LPI result (all 5 find a robust mix of increases and decreases, and most find something like a balance between increases and decreases). So what is going on?

For one thing, I think the LPI bites off too much – it tries to reduce the state of vertebrates across continents and species to a single number (aka index). That has to sweep a lot of complexity under the rug! There is underlying variability in the LPI too – they just don’t emphasize it as that is not their point. And to a large extent these other papers are just unpacking that complexity by exposing the underlying high variability in trends.

But those other papers find a more neutral balance while the LPI most definitely does not. Something more has to be going on. It could be their data (but some of the aforementioned papers used the same or a subset of the data). Or it could be their methodology (but some of the aforementioned papers used similar methodologies). Personally, I think it is a complex interaction between the data they are putting in and the weaknesses of the methodology (in the sense that every methodology has weaknesses, not that their methodology is fundamentally flawed or wrong). There may be more to say about this in the future. But for now, I hope we can at least pause and think and do a sanity check.

I want to leave no doubt that I am convinced humans are hammering the planet and the vertebrates (and invertebrates and plants) that live on it. We're removing >50% of the [terrestrial] primary production each year, have removed more than 50% of the tree biomass, modified >50% of the land, use more than 50% of the freshwater, have doubled the amount of nitrogen entering the biosphere each year and nearly doubled the amount of CO2 in the atmosphere since pre-industrial times. But I also don't think it is possible for there to be a 68% decline in 46 years leading to a projection of a 90% decline over 56 years (10 years from now), nor does a 20% decline in the last two years seem possible. The consequences of 68-90% gone are just too large not to be observed anecdotally and through indirect effects. And the 68-90% decline story just doesn't align with other major, comprehensive, 1000s-of-datasets analyses of this question.

What I believe the data show is we're creating winners and losers – some really big winners and some really big losers and a lot in between, and that's bad – humans ARE massively modifying the planet in ways that all but the most biodiversity-hating people care about, and the extinctions we are causing are irreversible, so please don't cite this blog as evidence that "everything is OK". It's not. Is there room for an "in between" (bad but not catastrophe) message?

But either way, please think twice before reporting that vertebrates are disappearing from the planet at these incredible rates. Because the logical conclusion is that nothing will be left in a very short time (a decade or two), and that doesn't pass the common sense test. This is not an "all scientists agree" scenario. I personally think the balance of evidence (such as that cited above) points pretty strongly against the LPI conclusion. I worry how many more years scientists (and reporters) can report catastrophic trendlines that predict little to no life of any sort on the planet within our lifetimes and not have people notice that this isn't actually happening.

 

Note: I am indebted to many colleagues who have talked about this topic with me over the years, some of them co-authors on the paper cited here, some of them co-authors on forthcoming papers, some of them not co-authors, but I want to stress that the opinions here are controversial and my own so I am not listing them here.

 

* The report averaged rates of decline across populations, not the total decline in number of individuals (unlike this catastrophic headline). But shouldn't they be the same thing? Well, yes, if there were the same number of individuals in each population and each species: a 68% decline of 100 here (to 32) and a 68% decline of 100 there (to 32) would still result in a 68% decline overall (from 200 to 64). But we know that in fact the number of individuals varies wildly (100x-1000x) across populations and species. Even then, equal rates still match: a 68% decline of 1000 (to 320) and a 68% decline of 10 (to 3.2) takes 1010 to 323.2, which is STILL 68%. But now the fact that the 68% is an average comes in. What if the 1000 declined by 60% to 400 and the 10 declined by 76% to 2.4, taking 1010 to 402.4? That's not a 68% decline but a 60.2% decline, even though averaging the rates 60% and 76% still gives an average 68% decline. We don't know for sure whether large populations or small populations are more likely to decline, but we do know that at least in birds abundant species are declining while rare species are increasing; if that holds generally, it would mean things are actually even worse than the 68% decline in terms of the total number of vertebrate individuals, but we don't know for sure. But I don't think this is the central reason why the LPI numbers don't match my childhood memories, nor other studies. With such large data and no truly strong correlations between abundance and decline, most of this comes out in the wash. So theoretically this could be a mathematical reason the total number of individuals has decreased by less than 68% even when the average decline across all populations is 68%, but I don't think it likely. In fact I think, in a weird way, arguing this is a way of distancing the LPI from what it is really claiming/implying.
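The footnote's example can be checked directly; this snippet just reproduces its arithmetic to show the gap between an average of rates and the decline in total individuals.

```python
# The average of per-population decline rates need not equal the decline
# in the TOTAL number of individuals when population sizes differ.
# These numbers are the footnote's own example.

pops_before = [1000, 10]
declines = [0.60, 0.76]  # per-population decline rates; their mean is 0.68

pops_after = [n * (1 - d) for n, d in zip(pops_before, declines)]  # [400, 2.4]

avg_rate_decline = sum(declines) / len(declines)              # 0.68
total_indiv_decline = 1 - sum(pops_after) / sum(pops_before)  # ~0.602

print(avg_rate_decline, round(total_indiv_decline, 3))  # 0.68 0.602
```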

Ecologists discussing science of coronavirus pandemic – open thread

I don’t know about you but as an ecologist, I am not an expert in disease dynamics nor part of the inner community rapidly exchanging ideas and data. But as an ecologist I have a better handle on notions of population growth, species interactions, individual encounter rates, etc than the average population (and probably the average scientist) and I have felt in a frustrating vacuum of information.

To address this, we’re trying something new here at Dynamic Ecology – an open thread, the main purpose of which is to have a place for the community to have a conversation. Our comments sections have long been the most interesting part of the blog, so now we’re creating a direct path to comments without your having to read 1000s of words of bloviation from me!

First, a few thoughts to give some common terminology/framing to the questions. I think ecologists all know about the power of exponential growth (although this is new and still poorly grasped by most of the world). R0 is the discrete growth rate with no immunity (a naive population) and no efforts at social distancing. Best estimates I have seen for Covid-19 are about R0=2.5, which is a good bit higher than flu (and a good bit lower than measles). It seems to be becoming clearer that R0 is as high as it is because people can be infectious before they show symptoms (or even if they never show symptoms, like children). Once immunities start to build up or quarantine/social distancing measures start to be put in place, a lower growth rate Re (effective growth rate) is observed. So as far as I can tell there are three strategies.

  1. Squeeze it – extreme social distancing to reduce Re<1. This seems to be what China as well as Japan and South Korea are doing (probably not coincidentally all Asian countries that got hit most by SARS and MERS).
  2. Let it burn – do nothing to lower Re=2.5. Sadly many (all?) countries started down this road – with exponential growth the speed of reaction required seems to be faster than governments can handle.
  3. Stretch it – social distancing to get Re~1.2 (nb 1.2 is an example, not a carefully calculated number, just a wild guess proxy as it is about what influenza does) so that the case load does not exceed hospital capacity. This is what everybody is talking about as “flattening the curve”.

With the stretch it and let it burn strategies, the number of people who get sick and then have immunity rises to about 1-1/R0, or about 60% of the population (assuming getting sick once confers immunity – assumed right now, but a few counterexamples are out there). Then the effective growth rate Re drops below 1 and "herd immunity kicks in". Individuals can still get sick, but it can't become a self-sustaining epidemic. The primary difference between let it burn and stretch it is the rate at which people get sick, which is inversely correlated with how long the epidemic lasts.
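The 1-1/R0 threshold falls out of requiring Re = R0 × (fraction still susceptible) to drop below 1, and is easy to compute. The flu and measles R0 values below are rough textbook figures I'm supplying for comparison, not numbers from this post.

```python
# Herd-immunity threshold: Re = R0 * s < 1, where s is the fraction still
# susceptible, so immunity must reach 1 - 1/R0 before the epidemic cannot
# sustain itself.

def herd_immunity_threshold(r0):
    """Fraction of the population that must be immune before Re < 1."""
    return 1 - 1 / r0

for r0 in [1.3, 2.5, 15]:  # roughly: flu, Covid-19 estimate, measles
    print(f"R0 = {r0:4}: ~{herd_immunity_threshold(r0):.0%} immune needed")
```

For R0 = 2.5 this gives the ~60% figure quoted above; the much higher R0 of measles is why measles requires ~93% vaccination coverage.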

I’ve posed several questions below to get this started. I’m not an expert. So the answers to some of these may be obvious in which case, I’d love to know the answer. But I have not seen the answers to any of these despite voracious reading. If they’re not so obvious I expect we could all learn from discussing them.

If you want to respond to a question stay in the same thread (even if the nesting stops at 3 levels). If you want to pose a new question, start a new thread. This is NOT a place for politics, so anything stronger than “many governments have been incompetent at X” (e.g. naming specific individuals, blaming one party or another, or getting distracted off science) will be deleted.

Nominate somebody for International Biogeography Society Awards!

As has been pointed out on this blog before, it does matter who we recognize for society awards. And one of the strongest filters on that is who is nominated. Award committees can’t give an award to somebody who isn’t nominated. It does take a little time and effort to nominate somebody, but not a lot (comparable to writing a letter of reference).

The International Biogeography Society gives out awards at its biennial conference. There is an Alfred Russel Wallace award for lifetime achievement and the MacArthur & Wilson award, which targets "relatively early career" researchers (<12 years from PhD).

You can find details on the awards and how to nominate somebody at: https://www.biogeography.org/news/news/2019-call-for-awards/

The deadline is November 29th to nominate somebody for the awards given at the next IBS meeting in Vancouver January 2021 (put it on your calendar to attend too!).

So what are you waiting for? Nominate a deserving biogeographer.

Did North America really lose 3 billion birds? What does it mean?

The journal Science released an article entitled “Decline of the North American avifauna” by Rosenberg et al today (Sep 19, 2019), and already disaster laden headlines are appearing in major newspapers (I’m not going to bother to link to them because they’ll probably change by tomorrow but I bet you’ve already seen this in your favorite news source).


Statistical Balkanization – is it a problem?

Aside from the question about what statistical methods are appropriate to use in ecology, there is a mostly independent question about how many statistical methods is optimal for use across the field of ecology. That optimum might be driven by how many techniques we could reasonably expect people to be taught in grad school and to rigorously evaluate during peer review. Beyond that limit, the marginal benefits of a more perfect statistical technique could easily be outweighed by the fact only a very small fraction of the audience could read or critique the method. To the extent we exceed that optimum and are using too many different methods, I think it is fair to talk about statistical Balkanization. Balkanization is of course a reference to the Balkans (the region in the former Yugoslavia) and how the increasing splintering into smaller geographic, linguistic and cultural groups became unsustainable and led to multiple wars. I think there is a pretty clear case that too many statistical methods in use is bad for ecology and thus the label of that state as Balkanization is fair (I’ll make that case below). I am less sure if we are there yet or not.


How badly do authors want open access? What priorities do authors really have? Bringing data to the discussion

If you believe the press, scientists are desperate to publish open access. Is this really true? Turning our scientific method onto ourselves and our peers, let's see what kind of actual data there is. Every 3 years Ithaka SR (a consulting firm for non-profits) publishes a survey of US faculty on attitudes and behaviors that can help university libraries serve their faculty (https://sr.ithaka.org/publications/2018-us-faculty-survey/). The whole survey is well worth a read. There are interesting questions about social media, data storage, attitudes towards books, etc. But I want to home in on their Figure 31, which summarizes data about what kind of journals faculty want to publish in.


Does the media seize on cases where humans benefit nature?

The single biggest fact about human impact on nature is that it is highly variable. We’re net cutting down forests in the tropics. But we are net increasing forest cover in eastern North America. Farmland birds are in decline in the US and Europe, but that is because farmland – a fairly intense human land use – is decreasing in area in those countries. Eutrophication is harmful to many organisms, but helpful to some. Local biodiversity is trending down in some places but trending up in others. In North America beaver and turkeys, after having been completely eliminated from most of their ranges, have made amazing recoveries trending towards near pre-European levels. Regional diversity, especially in plants, is often increased due to invasive species. Island diversity in birds is often flat or down.

None of those statements contradict the fact that humans are massively changing nature, in many ways for the worse. We have half the tree biomass today compared to what existed pre-human. We appropriate half the fresh water and terrestrial NPP annually. Extinction rates are elevated significantly. We have doubled the rate nitrogen is being introduced to the biosphere. Deer are above pre-European levels in the eastern US with devastating impacts on the structure of forests. Scientists have gotten very good about communicating these negative impacts and maybe have even evolved to a symbiotic relationship with much of the press in communicating this (media loves a disaster whether environmental or human).

But what do we as ecologists do about those facts that can be seen as positive impacts listed in the first paragraph?

Bold opinion pieces, RIP?

A few weeks ago, I lamented the passing of papers like Janzen's Why mountain passes are higher in the tropics (1967) or Janzen's herbivore and richness hypothesis (1970) (the Janzen half of the Janzen & Connell hypothesis) or the Hairston, Smith & Slobodkin (HSS 1960) paper best known as "why is the world green", even though that is not really the title. These papers were highly speculative, waved a little bit of data around, but mostly put out a hypothesis that attracted researchers for decades. But you don't really see these kinds of papers any more. Hence my question of whether we should assume this category of paper has come to rest in peace (RIP) (i.e. is dead).