Listen to Jeremy talk about ecology and blogging on Utah Public Radio Wed. 9 am Mountain Time

I’m in Utah right now giving a couple of talks at Utah State University, which led to a very flattering invitation to tape an interview for Utah Public Radio. The interview will be broadcast on Wed. Sept. 16 at 9 am Mountain Time. It should be available online around the same time at the above link, under the Programs tab (either the Access Utah program or the Science Questions program; I’m not sure which).

In the interview, I talk about how I got into ecology, microcosms, spatial synchrony, my scientific role models, the biggest recent advances in ecology, and the blog. See if you can guess which questions I’d prepared an answer for in advance, and which ones caught me out and forced me to stall for time.* :-)

This was my first radio interview (indeed, my first interview for any non-ecologist audience), so it was a new experience. I tried to draw on some long ago, vaguely-remembered media training I had, and also on my limited experience explaining myself in other venues (e.g., I reminded myself of my elevator pitch). And I listened to a previous podcast from the show. My main goal was to avoid jargon and try to keep things non-technical.

I’m sure many of you have much more training and experience at this sort of thing than I have. So if you do decide to give it a listen, I’d love any feedback you have, especially what I could do better. I think it went pretty well, but I’m not the best judge, and I’m sure there’s room for improvement.

*This should not be difficult.

Detection probabilities, statistical machismo, and estimator theory

Detection probabilities are a statistical method using repeated sampling of the same site combined with hierarchical statistical models to estimate the true occupancy of a site*. See here for a detailed explanation including formulas.

Statistical machismo, as I define it in this blog, is the pushing of complex statistical methods (e.g. reviewers requiring the use of a method, authors claiming their paper is better solely because of the use of a complex method) when the gains are small or even occur at some cost. By the way, the opposite of statistical machismo is an inclusive approach that recognizes every method has trade-offs and there is no such thing as a best statistical method.

This post is a fairly technical statistical discussion. If you’re interested in detection probabilities but don’t want to follow the details, skip to the last section for my summary recommendations.


I have claimed in the past that I think there is a lot of statistical machismo around detection probabilities these days. I cited some examples from my own experience where reviewers insisted that detection probabilities be used on data sets that had high value in their spatial and temporal coverage but for which detection probabilities were not possible (and even in some cases when I wasn’t even interested in occupancy). I also discussed a paper by Welsh, Lindenmayer and Donnelly (or WLD) which used simulations to show limitations of detection probability methods in estimating occupancy (clearly driven by their own frustrations of being on the receiving end of statistical machismo for their own ecological papers).

In July the detection probability proponents fired back at WLD with a rebuttal paper by Guillera-Arroita and four coauthors (hereafter GLMWM). Several people have asked me what I think about this paper, including in some comments on my earlier blog post (I think usually in the same way one approaches a Red Sox fan and asks them about the Yankees – mostly hoping for an entertaining reaction).

The original WLD paper basically claimed that in a number of real world scenarios, just ignoring detection probabilities gave a better estimator of occupancy. Three real-world scenarios they invoked were: a) when the software had a hard time finding the best fit detection probability model, b) a scenario with moderate occupancy (Ψ=40%) and moderate detection probabilities (about p=50%), and c) a scenario where detection probabilities depend on abundance (which they obviously do). In each of these cases they showed, using Mean Squared Error (or MSE, see here for a definition), that using simple logistic regression only of occupancy ignoring detection probabilities had better behavior (lower MSE).

GLMWM basically pick different scenarios (higher occupancy Ψ=80%, lower detection p=20% and a different species abundance distribution, or SAD, for abundances) and show that detection probability models have a lower MSE. They also argue extensively that software problems finding best fits are not that big a problem**. This is not really a deeply informative debate. It is basically, “I can find a case where your method sucks.” “Oh yeah? Well, I can find a case where your method sucks.”

Trying to make sense of the opposing views

But I do think that by stepping back, thinking a little deeper, framing this debate in the appropriate technical context (the concept of estimation theory), and pulling out a really great appendix in GLMWM that unfortunately barely got addressed in their main paper, a lot of progress can be made.

First, let’s think about the two cases where each works well. Ignoring detection worked well when detection probability, p, was high (50%). It worked poorly when p was very low (20%). This is just not surprising. When detection is good you can ignore it, when it is bad you err to ignore it! Now WLD did go a little further: they didn’t just say that you can get away with ignoring detection probability at a high p – they actually showed you get a better result than if you don’t ignore it. That might at first glance seem a bit surprising – surely the more complex model should do better? Well, actually no. The big problem with the detection probability model is identifiability – separating out occupancy from detection. What one actually observes is Ψ*p (i.e. that % of sites will have an observed individual). So how do you go from observing Ψ*p to estimating Ψ (and p in the case of the detection model)? Well, ignoring p is just the same as taking Ψ*p as your estimate. I’ll return to the issues with this in a minute. But in the detection probability model you are trying to disentangle Ψ vs. p just from the observed % of sites, with very little additional information (the fact that observations are repeated on a site). Without this additional information Ψ and p are completely inseparable – you cannot do better than randomly pick some combination of Ψ and p that together multiply to give the % of sites observed (and again the non-detection model essentially does this by assuming p=1, so it will be really wrong when p=0.2 but only a bit wrong when p=0.8). The problem for the detection model is that if you only have two or three repeat observations at a site and p is high, then at most sites where the species is actually present it will show up at all two or three observations (and of course not at all when it is not present). So you will end up with observations of mostly 0/0/0 or 1/1/1 at a given site. This does not help differentiate (identify) Ψ from p at all. Thus it is actually completely predictable that detection models shine when p is low and ignoring detection shines when p is high.
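
For readers who want to see what “disentangling Ψ from p” means mechanically, here is a minimal sketch of the simplest detection model: constant Ψ, constant p, no covariates, no false positives. This is my own illustration of the standard likelihood, not the code used by WLD or GLMWM (they used much richer models and real optimizers). The function names, the input format and the data are all made up, and the crude grid search is only a stand-in for a proper optimizer.

```python
import math


def occupancy_log_likelihood(psi, p, k, sites_by_detections):
    """Log-likelihood of the constant-psi, constant-p occupancy model.

    sites_by_detections[d] is the number of sites at which the species was
    detected on exactly d of the k visits (hypothetical input format).
    """
    total = 0.0
    for d, n_sites in sites_by_detections.items():
        # Occupied site with d detections; the binomial coefficient is
        # omitted because it does not depend on psi or p.
        prob = psi * p ** d * (1 - p) ** (k - d)
        if d == 0:
            # A never-detected site is either occupied-but-missed or empty.
            prob += 1 - psi
        total += n_sites * math.log(prob)
    return total


def grid_mle(k, sites_by_detections, step=0.01):
    """Crude grid search for (psi_hat, p_hat) on the open interval (0, 1)."""
    n = round(1 / step)
    grid = [i * step for i in range(1, n)]  # excludes 0 and 1, so no log(0)
    return max(
        ((psi, p) for psi in grid for p in grid),
        key=lambda pair: occupancy_log_likelihood(
            pair[0], pair[1], k, sites_by_detections
        ),
    )


# Made-up data: 100 sites, 3 visits each; the species was never seen at 60.
counts = {0: 60, 1: 20, 2: 15, 3: 5}
psi_hat, p_hat = grid_mle(3, counts)
naive = 1 - counts[0] / sum(counts.values())  # fraction of sites with a detection
print(f"naive = {naive:.2f}, psi_hat = {psi_hat:.2f}, p_hat = {p_hat:.2f}")
```

Note that the only thing pulling Ψ and p apart is the mix of 1-, 2- and 3-detection sites. If nearly every detected site were 1/1/1, the data would say little more than “p is high”, which is the identifiability problem described above.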

Now what to make of the fact, something that GLMWM make much of, that just using Ψ*p as an estimate for Ψ is always wrong anytime p<1? Well, they are correct about it always being wrong. In fact using the observed % of sites present (Ψ*p) as an estimator for Ψ is wrong in a specific way known as bias. Ψ*p is a biased estimator of Ψ. Recall that bias is when the estimate consistently overshoots or undershoots the true answer. Here Ψ*p consistently undershoots the real answer by a very precise amount, Ψ*(1-p) (so by 0.2 when Ψ=40% and p=50%). Surely this must be a fatal flaw, to intentionally choose an approach that you know on average is always wrong? Actually, no. It is well known in statistics that sometimes biased estimators are the best estimators (by criteria like MSE).
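
The arithmetic of that bias is easy to check. The sketch below is my own illustration with hypothetical function names. The single-visit case (k=1) is exactly the Ψ*(1-p) undershoot above. The k>1 case is an assumption on my part about how the ignore-detection approach behaves when repeat visits are collapsed into “seen at least once” vs. “never seen”, with independent visits.

```python
def naive_expected_estimate(psi: float, p: float, k: int = 1) -> float:
    """Expected fraction of sites with at least one detection in k visits."""
    detected_at_least_once = 1 - (1 - p) ** k  # assumes independent visits
    return psi * detected_at_least_once


def naive_bias(psi: float, p: float, k: int = 1) -> float:
    """Expected naive estimate minus the true occupancy (negative = undershoot)."""
    return naive_expected_estimate(psi, p, k) - psi


# Single visit, psi = 40%, p = 50%: the undershoot of 0.2 from the text.
print(round(naive_bias(0.4, 0.5), 3))
# Three collapsed visits: the undershoot shrinks to psi * (1 - p) ** 3.
print(round(naive_bias(0.4, 0.5, k=3), 3))
```

Under these assumptions the bias never goes away for p<1, but it shrinks quickly as p or the number of collapsed visits goes up.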

Estimation theory

Pay attention here – this is the pivotal point – a good estimator has two properties – it’s on average close to right (low bias), and the spread of its guesses (i.e. the variance of the estimate over many different samples of the data) is small (low variance). And in most real world examples there is a tradeoff between bias and variance! More accurate on average (less bias) means more spread in the guesses (more variance)! In a few special cases you can pick an estimator that has both the lowest bias and the lowest variance. But anytime there is a trade-off you have to look at the nature of the trade-off to minimize MSE (best overall estimator by at least one criterion). (Since Mean Squared Error or MSE=Bias^2+Variance, one can actually minimize MSE if one knows the trade-off between bias and variance.) This is the bias/variance trade-off to a statistician (Jeremy has given Friday links to posts on this topic by Gelman).

Figure 1 – Bias and Variance - Here estimator A is biased (average guess is off the true value) but it has low variance. Estimator B has zero bias (average guess is exactly on the true value) but the variance is larger. In such a case Estimator B can (and in this example does) have a larger Mean Squared Error (MSE) – a metric of overall goodness of an estimator. This can happen because MSE depends on both bias and variance – specifically MSE=Bias^2+Variance.

This is exactly why the WLD ignore-detection-probabilities method (which GLMWM somewhat disparagingly call the naive method) can have a lower Mean Squared Error (MSE) than using detection probabilities despite always being biased (starting from behind if you will). Detection probability methods have zero bias and non-detection methods have bias, but in some scenarios non-detection methods have so much lower variance than detection methods that the overall MSE is lower if you ignore detection. Not so naive after all! Or in other words, being unbiased isn’t everything. Having low variance (known in statistics as an efficient estimator) is also important. Both the bias of ignoring detection probabilities (labelled “naive” by GLMWM) and the higher variances of the detection methods can easily be seen in Figures 2 and 3 of GLMWM.
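
To put numbers on MSE=Bias^2+Variance, here is a small sketch. It is my own illustration, not anything taken from WLD or GLMWM. The Figure 1 style numbers for estimators A and B are invented, and `naive_mse` assumes the ignore-detection estimate is simply the fraction of sites with at least one detection across k independent visits, so its variance is the usual binomial-proportion variance.

```python
def mse(bias: float, variance: float) -> float:
    """Mean squared error from its two components."""
    return bias ** 2 + variance


def naive_mse(psi: float, p: float, k: int, n_sites: int) -> float:
    """MSE of the 'fraction of sites with a detection' estimator of psi."""
    q = psi * (1 - (1 - p) ** k)       # chance a site shows >= 1 detection
    bias = q - psi                     # always <= 0: the estimator undershoots
    variance = q * (1 - q) / n_sites   # binomial proportion variance
    return mse(bias, variance)


# Invented Figure 1 style numbers: A is biased but tight, B unbiased but noisy.
mse_a = mse(bias=0.1, variance=0.01)
mse_b = mse(bias=0.0, variance=0.05)
print(f"A: {mse_a:.3f}  B: {mse_b:.3f}")

# Naive estimator at psi = 40%, p = 50%, one visit to 55 sites.
print(f"naive: {naive_mse(0.4, 0.5, 1, 55):.4f}")
```

With these invented numbers the biased estimator A ends up with an MSE of 0.02 against 0.05 for the unbiased B. In the same way, an unbiased detection-based estimator only beats the naive one if its variance is smaller than the naive estimator’s squared bias plus variance.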

When does ignoring detection probabilities give a lower MSE than using them?

OK – so we dove into enough estimation theory to understand that both WLD and GLMWM are correct in the scenarios they chose (and that the authors of both papers were probably smart enough to pick in advance a scenario that would make their side look good). Where does this leave the question most readers will care about most – “should I use detection probabilities or not?” Well, the appendix to GLMWM is actually exceptionally useful (although it would have been more useful if they had bothered to discuss it!) – specifically supplemental material tables S2.1 and S2.2.

Let’s start with S2.1. This shows the MSE (remember low is good) of the ignore detection model in the top half and the MSE of the use detection model in the bottom half for different sample sizes S, repeat visits K, and values of Ψ and p. They color code the cases red when ignore beats use detection, and green when detection beats ignore (and no color when they are too close to call). Many of the differences are small, but some are gigantic in either direction (e.g. for Ψ=0.2, p=0.2, ignoring detection has an MSE of 0.025 – a really accurate estimator – while using detection probabilities has an MSE of 0.536 – a really bad estimate given Ψ ranges only from 0-1, but similar discrepancies can be found in the opposite direction too). The first thing to note is that at smaller sample sizes the red, green and no color regions are all pretty equal! I.e., ignoring or using detection probabilities is a tossup! Flip a coin! But we can do better than that. When Ψ (occupancy) is <50%, ignore wins; when Ψ>50%, use detection wins; and when p (detection rate) is high, say >60%, then it doesn’t matter. In short, the contrasting results between WLD and GLMWM are general! Going a little further, we can see that when sample sizes (S but especially number of repeat visits K) creep up, then using detection probabilities starts to win much more often, which also makes sense – more complicated models always win when you have enough data, but don’t necessarily (and here don’t) win when you don’t have enough data.

Bias, Variance and Confidence Intervals

Figure 2 – Figure 1 with confidence intervals added

Now let’s look at table S2.2. This is looking at something that we haven’t talked about yet. Namely, most estimators have, for a given set of data, a guess about how much variance they have. This is basically the confidence interval in Figure 2. In Figure 2, Estimator A is a better estimator of the true value (it is biased, but the variance is low so MSE is much lower), but Estimator A is over-confident – it reports a confidence interval (estimate of variance) that is much smaller than reality. Estimator B is a worse estimator, but it is at least honest – it has really large variance and it reports a really large confidence interval. Table S2.2 in GLMWM shows that ignoring detection probabilities is often too cocky – the reported confidence intervals are too small (which has nothing to do with and in no way changes that ignoring detection probabilities is in many cases still a better or equally good estimator of the mean – the conclusion from table S2.1). But using detection probabilities is just right – not too cocky, not too pessimistic – its confidence intervals are very accurate – when there’s a lot of variance, it knows it! In short, Figure 2 is a good representation of reality over a large chunk of parameter space where method A is ignore detection (and has lower MSE on the estimate for Ψ but over-confident confidence intervals) and method B is use detection-based methods (and has worse MSE for the estimation of Ψ but has very accurate confidence intervals).
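
The “too cocky” behaviour is easy to reproduce in a toy setting. The sketch below is my own illustration, not the calculation behind table S2.2. It assumes the ignore-detection estimate is the fraction of sites with at least one detection in k independent visits, and that its reported interval is a plain Wald interval for a proportion (both assumptions, chosen for simplicity). It then computes exactly how often that interval contains the true Ψ.

```python
import math


def naive_wald_coverage(psi, p, k, n_sites, z=1.96):
    """Exact probability that the naive Wald interval contains the true psi."""
    q = psi * (1 - (1 - p) ** k)  # chance a site shows >= 1 detection
    coverage = 0.0
    for x in range(n_sites + 1):  # x = number of sites with a detection
        est = x / n_sites
        half_width = z * math.sqrt(est * (1 - est) / n_sites)
        if est - half_width <= psi <= est + half_width:
            # Add the binomial probability of observing exactly x such sites.
            coverage += math.comb(n_sites, x) * q ** x * (1 - q) ** (n_sites - x)
    return coverage


# 55 sites, 2 visits, psi = 40%: imperfect vs. perfect detection.
print(f"p = 0.5: {naive_wald_coverage(0.4, 0.5, 2, 55):.2f}")
print(f"p = 1.0: {naive_wald_coverage(0.4, 1.0, 2, 55):.2f}")
```

In this toy setting the nominal 95% interval falls well short of 95% coverage at p=0.5, because the interval is centred on a biased estimate, while at p=1 it behaves roughly as advertised.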

(As a side-note, this closely parallels the situation for ignoring vs statistically treating spatial, temporal and phylogenetic autocorrelation. In that case both estimators are unbiased. In principle the variance of the methods treating autocorrelation should be lower, although in practice they can have larger variance when bad estimates of autocorrelation occur, so they are both roughly equally good estimators of the regression coefficients. But the methods ignoring autocorrelation are always over-confident – their reported confidence intervals are too small.)

So which is better – a low MSE (metric of how good at guessing the mean) or an honest, not cocky estimator that tells you when it’s got big error bars? Well, in some regions you don’t have to choose: using detection probabilities is a better estimator of the mean by MSE and you get good confidence intervals. But in other regions – especially when Ψ and p are low – you have to pick. There is a tradeoff: more honesty gets you worse estimates of the occupancy. Ouch! That’s statistics for you. No easy obvious choice. You have to think! You have to reject statistical machismo!

Summary and recommendations

Let me summarize three facts that emerge across the WLD and GLMWM papers:

  1. Ignoring detection probabilities (sensu WLD) can give an estimate of occupancy that is better (1/3 of parameter space), as good as (1/3 of parameter space) or worse than (1/3 of parameter space) estimates using hierarchical detection probability models in terms of estimating the actual occupancy. Specifically, ignoring detection guarantees bias, but may result in sufficiently reduced variance to give an improved MSE. These results come from well-known proponents of using detection probabilities using a well-known package (unmarked in R), so they’re hard to argue with. More precisely, ignoring detection works best when Ψ is low (<50%) and p is low, using detection works best when Ψ is high (>50%) and p is low, and both work very well (and roughly equally well) when p is high (roughly when p>50% and certainly when p>80%) regardless of Ψ.
  2. Ignoring detection probabilities leads to overconfidence (reported confidence intervals that are too small) except when p is high (say >70%). This is a statement about confidence intervals. It does not affect the actual point estimate of occupancy which is described by #1 above.
  3. As data size gets very large (e.g. 4-5 repeat visits of 165 sites) detection probability models generally get noticeably better – the results in #1 mostly apply at smaller, but in my opinion more typically found, sample sizes (55 sites, 2 repeat visits).

And one thing talked about a lot which we don’t really know yet:

  4. Both WLD and GLMWM talk about whether working with detection probabilities requires larger samples than ignoring detection probabilities. Ignoring detection probabilities allows Ψ to be estimated with only single visits to a site, while hierarchical detection probabilities require a minimum of 2 and, as GLMWM show, really shine most with 3 or 4 repeat visits. To keep a level playing field both WLD and GLMWM report results where the non-detection approach uses the repeat visits too (it just makes less use of the information by collapsing all visits into either species seen at least once or never seen). Otherwise you would be comparing a model with more data to a model with less data, which isn’t fair. However, nobody has really fully evaluated the real trade-off – 50 sites visited 3 times with detection probabilities vs 150 sites visited once with no detection probabilities. And in particular nobody has really visited this in a general way across the whole parameter space for the real-world case where the interest is not in estimating Ψ, the occupancy, but the β’s or coefficients in a logistic regression of how Ψ varies with environmental covariates (like vegetation height, food abundance, predator abundance, degree of human impact, etc). My intuition tells me that with 4-5 covariates that are realistically covarying (e.g. correlations of 0.3-0.7), getting 150 independent measures of the covariates will outweigh the benefits of 3 replicates of 50 sites (again especially for accurate estimation of the β’s), but to my knowledge this has never been measured. The question of whether estimating detection probabilities requires more data (site visits) remains unanswered by WLD and GLMWM but badly needs to be answered (hint: free paper idea here).

So with these 3 facts and one fact remaining unknown, what can we say?

  1. Detection probabilities are not an uber method that strictly dominates ignoring them. As first found by WLD and now clearly shown to be general in the appendices of GLMWM, there are fairly large regions of parameter space where the primary focus – the estimate of Ψ – is more accurate if one ignores detection probabilities! This is news the detection probability machismo-ists probably don’t want you to know (which could be an explanation for why it is never discussed in GLMWM).
  2. Detection probabilities clearly give better estimates of their certainty (or in a lot of cases uncertainty) – i.e. the variance of the estimates.
  3. If you’re designing data collection (i.e. estimating # of sites vs # visits/site before you’ve taken measurements – e.g. visit 150 sites once or 50 sites 3 times), I would recommend something like the following decision tree:
    1. Do you care more about the estimate of error (confidence intervals) than the error of the estimate (accuracy of Ψ)? If yes then use detection probabilities (unless p is high).
    2. If you care more about accuracy of Ψ, do you have a pretty good guess that Ψ is much less or much greater than 50% or that p is much greater than 70%? If so then you should use detection probabilities if Ψ is much greater than 50% and p is less than or equal to 50-60%, but ignore them if Ψ is much less than 50% or p is clearly greater than 50-60%.
    3. If you care more about accuracy of Ψ and don’t have a good idea in advance of roughly what Ψ or p will be, then you have really entered a zone of judgement call where you have to weigh the benefits of more sites visited vs. more repeat visits (or hope somebody answers my question #4 above soon!).
    4. And always, always if you’re interested in abundance or species richness, don’t let somebody bully you into switching over to occupancy because of the “superiority” of detection models (which as we’ve seen is not even always superior at occupancy). Both the abundance and species richness fields have other well established methods (e.g. indices of abundance, rarefaction and extrapolation) for dealing with non-detection.
    5. Similarly, if you have a fantastic dataset (e.g. a long term monitoring dataset) set up before detection probabilities became fashionable (i.e. no repeat visits), don’t let the enormous benefits of long term (and perhaps large spatial scale) data get lost just because you can’t use detection probabilities. As we’ve seen, detection probabilities are a good method, but also a flawed method which is clearly outperformed in some cases, just like every other method in statistics. They are not so perfect that they mandate throwing away good data.

The debate over detection probabilities has generated a lot more heat and smoke than light, and there are clearly some very machismo types out there, but I feel like if you read carefully between the lines and into the appendices, we have learned some things about when to use detection probabilities and when not to. Question #4 still remains a major open question just begging for a truly balanced, even-handed assessment. What do you think? Do you use detection probabilities in your work? Do you use them because you think they’re a good idea or because you fear you can’t get your paper published without them? Has your opinion changed with this blog?


*I’m aware there are other kinds of detection probabilities (e.g. distance based) and that what I’m really talking about here are hierarchical detection probabilities – I’m just trying to keep the terminology from getting too thick.

**Although I have to say I found it very ironic that the software code GLMWM provided in an appendix, which uses the R package unmarked, arguably the dominant detection probability estimation software, apparently had enough problems finding optima that they reran each estimation problem 10 times from different starting points – a pretty sure sign that optima are not easy to find.

Friday links: falsificationist vs. confirmationist science, transgendered scientists, lizard vs. iPhone, and more

Here’s what Meg and I did this week:

Also this week: when flunking tests is good for you, don’t fight sexism by pretending nobody has kids, bad advice for graduate students, active learning the easy way, the two cultures of mathematics, a picture (of your methods) is worth a thousand words, the calculator that will never die (apparently), a devious trick to stop students from complaining about their marks, and MOAR. You really ought to take a long lunch today so you can click through on all this stuff. Our links 1, your productivity 0. :-)

From Meg:

This is a very cool map of the distribution of woody biomass in the US.

I like the main point of this post from Athene Donald: yes, it is sexist when headlines and articles point out the number of children a woman has, but the solution to this problem is not to omit the children of successful women, but to include those of successful men. As she writes:

If we sanitise all articles about people of both sexes so that parenthood is taboo, we deny our young the chance to work out how they might want to balance their own lives. Young girls want to know whether it is possible for them to aspire to be senior professionals in whatever field and have children. Young men should be thinking about this too. Maybe we should push the media to have more stories about high-powered men who are dab hands at nappy-changing and the school run rather than berate them for mentioning the fact that a senior woman has given birth at some point.

I enjoyed this post from PsychGrrrl on mentoring. It raises lots of good points. And here’s a Chronicle Vitae post on bad advice that grad students get, which includes (as an example of bad advice) “Depression is normal among doctoral students, so you should just tough it out/exercise more/throw yourself into your work/do some yoga.”

This post from David Colquhoun is a reminder that academia has certainly made a lot of progress on gender equality since the 1960s (even if it still has work to do!) The post focuses on the integration of University College London’s common room in the late 1960s, and was inspired, in part, by a tweet about Kathrine Switzer’s ground-breaking Boston Marathon run in 1967. It also includes anecdotes about women who ignored the rules and used the room anyway, which I enjoyed reading.

Here’s some excellent spoken word poetry from Emily Graslie on the need for more voices in science, and the cues that send girls away from science.

I also enjoyed this piece on the experiences of transgendered scientists, who have a unique perspective on gender bias. It includes an interview with Joan Roughgarden, who said “men are assumed to be competent until proven otherwise, whereas a woman is assumed to be incompetent until she proves otherwise”.

Finally, FemaleScienceProfessor had a piece on writing tenure letters. I know this is in my future, and found this interesting to read.

From Jeremy:

Ecologist Stephen Heard has a fun new open access paper on whimsy, jokes, and beauty in scientific writing. He addresses the various objections scientists have to writing in anything other than a bone-dry style, and argues that we could all make our papers more enjoyable to read without sacrificing (and indeed, sometimes even enhancing) clarity. Here’s a question, inspired by Stephen’s paper: how will blogs and Twitter affect the writing style scientists adopt in their papers? Blogs and tweets often are casual, whimsical, funny, and forceful. As we all become more comfortable with that sort of writing in our daily lives, will that sort of style start to creep into our papers? Maybe somewhat like how dry newspaper journalism is being replaced by alternative models written in a more entertaining style? Or alternatively, will blogs and Tweets function as an outlet for scientists’ jokey, entertaining urges, so that it becomes even less acceptable to write papers in anything other than a dry style? “Save the jokes for your tweets and the colorful analogies for your blog,” we might imagine a reviewer saying.

Andrew Gelman highlights a nice exchange he had with Deborah Mayo on falsificationist vs. confirmationist approaches to science. I think Gelman’s right that, in practice, a fair bit of null hypothesis testing in science is actually confirmationist rather than falsificationist. And that two big reasons for this are vague scientific hypotheses, and investigators “testing” those scientific hypotheses by checking predictions that are only loosely and indirectly derived from them. Gelman’s examples are from psychology and social science, but I think this is an issue in ecology as well. See for instance this guest post on the hypothesis that species interactions are stronger and more specialized in the tropics. A hypothesis that, judging from the comment thread, seems to be poorly defined (since otherwise why would people be arguing about what exactly the hypothesis is and whether piles of empirical studies actually test it?)

Speaking of poorly defined hypotheses…Andrew Hendry with a post on how the currently-popular hypothesis that adaptive plasticity promotes adaptive evolution isn’t being tested properly. Because however the data come out, people seem to interpret it as favoring the hypothesis. I’m not qualified to judge it, but I found it provocative in a good way.

Terry McGlynn with a great post on how to do active learning the easy way. The main thing you have to give up on is teaching students a great breadth of material. Your goal instead is to have them really grasp and retain a few key concepts. And a side benefit is that you actually spend less time on teaching prep.

Semi-relatedly: when flunking a test is good for you (or your students). When it’s a pretest, of course. (ht Marginal Revolution)

Why students (probably including yours) still use TI-84 calculators. (ht Economist’s View)

Joan Strassmann on what kinds of evolutionary biology questions her undergraduate students struggle with, and why. I sometimes avoid problems with students assuming they know more about the organism than they do by asking questions about silly, made-up organisms like jackalopes.

Here’s a great essay on the two cultures of mathematics: theory builders vs. problem solvers. Worth reading both from an anthropological, how-do-other-fields-work perspective, and because it applies to theoretical ecologists as well (more on that in a forthcoming post). Includes wonderful discussions of what makes a mathematical problem “interesting”, what makes for work of lasting value to future generations of mathematicians, the various ways a field can be more than just a stamp collection of special cases, and more. (ht Mike, via the comments)

I don’t usually highlight ecology papers, figuring y’all can filter those without any help from me, but this was particularly provocative: DNA barcoding as a method of species identification may be on the brink of irrelevance even before it fully emerges, thanks to advances in sequencing technology and the resistance of barcoding’s advocates to those advances. I know nothing of barcoding, so don’t have any useful comments–anyone care to chime in? (ht The Molecular Ecologist)

Acclimatrix of Tenure, She Wrote with some good advice on surviving year 1 on the tenure track. I’d add that it helped me to keep in mind that lots of other people were going through, or had gone through, the same stuff. Keeping that in mind helped me to deal with those times when I felt overwhelmed. Everybody feels overwhelmed sometimes. It doesn’t mean you’re in over your head. It just means you’re in your first year in a new job in a new place.

A commenter on Joan Strassmann’s blog has a devious trick to keep students from complaining about their marks.

The EEB and Flow passes on some good advice for how to do peer review. My own advice is here.

Theoretical ecologist John Harte is everywhere these days. Well, two places, anyway: here’s another interview with him, focusing on his MaxEnt work. (ht Not Exactly Rocket Science)

Your methods section will be much easier for readers to understand if you include a figure or two illustrating what you did.

Should universities change how they calculate grades in order to eliminate dramatic variability among professors? (ht Mathbabe)

Camera trap images of a hitchhiking genet. (ht Chris Klausmeier)

And finally, here’s a video of a lizard using an iPhone. Which will have Terry McGlynn screaming “Oh god, the carnage!” :-)

Graduate positions available in Jeremy Fox’s lab for Fall 2015

You read my blog–want to join my lab? I’m currently seeking 2-3 graduate students (M.Sc. or Ph.D.) to start in Fall 2015.

My own work addresses fundamental questions in population, community, and evolutionary ecology, through a combination of mathematical modeling and experiments. In the past, I’ve worked on metacommunity dynamics, food webs, diversity-stability, character displacement, biodiversity-ecosystem function, and other problems. I’m open to inquiries from students with a broad range of interests, but I’m particularly keen to hear from students interested in the following ongoing or planned projects:

  • Causes and consequences of spatial synchrony of population cycles. This project uses protist microcosms to conduct experiments that would be impossible to conduct in nature at the relevant spatial and temporal scales.
  • Local adaptation in space and time. This project involves using a -80 C freezer as a ‘time machine’ to reciprocally transplant lake bacteria forwards and backwards in time (as well as in space) to test for local adaptation to spatial and temporal environmental variation. One big question is whether temporal or spatial variation is more effective at generating local adaptation and so maintaining diversity. The simplest theory says that spatial variation should be much more effective, but nature may not be that simple…
  • Local adaptation, species interactions, and elevational range limits of alpine plants. This project combines reciprocal transplant experiments within and beyond species’ elevational range limits with neighbor removal experiments to address several questions about how local adaptation and species interactions shape, and are shaped by, species distributions along steep environmental gradients.

For more on my lab, please visit my homepage.

The University of Calgary is one of Canada’s top research universities. The Dept. of Biological Sciences has ~60 faculty and ~180 graduate students, of whom ~1/3 are in ecology and evolution.

Guaranteed funding of at least $21,000/year (more than that in practice) is available for 2 years (M.Sc.) or 4 years (Ph.D.) (note that Canadian graduate programs are a bit shorter than in the US). Funding is provided through a combination of TAships, RAships, and scholarships.

Calgary is a safe, vibrant city of over 1 million people, located close to the Canadian Rockies with all the opportunities for research and recreation that implies.

If you are interested, please email me an introductory note, along with a cv, transcripts (unofficial is fine), and contact details for 3 references.

-Jeremy Fox

Tractable != easy

When I give seminars about my work and when I write grant proposals, I often talk about how tractable it is – the hosts are see-through, allowing you to see internal parasites! You can maintain individual genotypes in the lab for years! Population dynamics are fast, so you can watch evolution in a natural population over the course of a single season!

That’s all true, but it ignores something else: even tractable systems are hard. I think of this especially at this time of year because we’re in the midst of field season (our field season runs from July through November) and because there are new folks in the lab who need to be trained. I feel like I talk a lot about the cool things we can do with our system when I’m recruiting folks to the lab, so then I feel a little bad when they arrive, eager to get involved in our field survey, only to find out that they can’t help with counting samples.* In order to be able to count field samples, the person needs to be able to identify seven different species of host (and Daphnia are literally a textbook example of phenotypic plasticity), plus a whole suite of parasites (there are ~8-10 that we see routinely, plus various other ones that we see less commonly). That’s a lot to learn how to identify all at once. Actually, it’s too much to learn how to identify all at once, and, as a result, grad students (and technicians) usually spend a full year in the lab before they can count any of the field samples. And, as I’ve written about before, lab experiments often have lots of failures, sometimes for inexplicable reasons. And, when piloting new experiments, sometimes things that should be really straightforward don’t work, and it can take years to figure out why. (As one example, we’ve spent the past two years troubleshooting something that seemed like a slam-dunk experiment and still haven’t figured out the problem.**)

I am not claiming that my study system is especially hard. I think everyone’s systems have advantages and disadvantages, even if it’s easy to focus more on the disadvantages of your own system and end up with system envy when you see someone’s polished talk or publication. But I sometimes wonder if I’ve oversold the tractability of my system. In the end, I have to remind myself that, unfortunately, tractable is different than easy.


*I do say this during the recruitment process, too.

** But I have a new hypothesis for something that might be causing the problem! Hope springs eternal.

Where do you look for academic jobs in ecology?

It’s job season! If you’re on the market for an academic position, where do you look?* Here are the resources I know of. Suggest others in the comments!

Ecology and Evolution-specific resources:
Ecology jobs wiki, 2014-2015 edition

More general resources:
Chronicle Vitae
Higher Ed Jobs
Science Careers

I think the ecology and evolution-specific resources (and word of mouth) are probably responsible for most people learning about positions. I know some schools think it’s important to put a job ad in Science, but I personally don’t think that matters. People I know are using wikis and listservs to find positions, and a print ad in Science isn’t likely to even be on their radar; to me, that money seems like it could be better used for something else.

But that’s just my impression. Here’s a quick poll for those of you who’ve been on the job market in the past 3-4 years:

It will be interesting to see how people are finding jobs to apply for!

Thanks to Katie Boes for the post idea!


*In the future, I’d like to do a post on where to look for non-academic positions. Leave suggestions in the comments!

Friday links: is ecology’s explanatory power really declining that much, John Harte vs. Tony Ives, and more

Also this week: the wisdom of Randall Munroe, banning students from emailing you, the benefits of active learning, and more. Oh, and buried in one of the entries is the story of how “functional groups” are a statistical artifact.

From Meg:

This piece in the NYTimes highlights a new study showing that active learning benefits all students, but especially black and first generation college students. More specifically:

Active learning raised average test scores more than 3 percentage points, and significantly reduced the number of students who failed the exams, the study found. The score increase was doubled, to more than 6 percentage points, for black students and first-generation college students.

From Jeremy:

Will Pearse reanalyzes the raw data from that new study of the declining explanatory power of ecology, and finds that while the mean R^2 value reported in ecology papers is declining over time, there’s so much scatter around the trend that the trend arguably isn’t worth worrying about. Click through for a good discussion in the comments over there. (ht downwithtime, via the comments)

“Finally, we replaced [the] data with random numbers and continued to find very large numbers of apparently statistically significant effects.” Ouch. This check actually is quite generally useful, especially for exploratory analyses or any sort of model selection–randomize the data (or replace it with random numbers), repeat the exploration or model selection process, and see if you still find any patterns. Here, this check reveals that sorting study subjects into two groups generates “significant” effects in subsequent analyses, even if the groupings are completely arbitrary. Reminiscent of the findings of Owen Petchey, and Wright et al., that standard ways of assigning plant species to “functional groups” have no more explanatory power for ecosystem function than arbitrary assignments. It’s the mere fact that you’re lumping species into groups that does all the “explaining”. (Protip for students: if you read that last sentence and went “Wait, functional groups are an artifact?!”, that’s a sign that you should click through.) (ht Not Exactly Rocket Science)
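For anyone who wants to try the randomization check described above, here is a minimal sketch in Python. It is not the analysis from the linked paper. The “exploration” step here is simply picking the strongest of many candidate predictors, and the data, the function names (`best_abs_r`, `randomization_check`), and the numbers of sites and predictors are all hypothetical, chosen only for illustration. The one thing that matters is that the whole selection process gets rerun on each shuffled dataset, not just the final test.

```python
import random
import statistics


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def best_abs_r(predictors, y):
    """The 'exploration' step: keep the strongest of many candidate predictors."""
    return max(abs(pearson_r(x, y)) for x in predictors)


def randomization_check(predictors, y, n_shuffles=200, seed=0):
    """Rerun the whole selection step on shuffled responses.

    Returns the observed best |r| and the fraction of shuffles whose
    best |r| is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = best_abs_r(predictors, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between y and the predictors
        if best_abs_r(predictors, shuffled) >= observed:
            hits += 1
    return observed, hits / n_shuffles


# Hypothetical data: 30 sites and 40 candidate predictors, all pure noise.
rng = random.Random(1)
n_sites, n_predictors = 30, 40
predictors = [[rng.gauss(0, 1) for _ in range(n_sites)] for _ in range(n_predictors)]
y = [rng.gauss(0, 1) for _ in range(n_sites)]

observed, frac = randomization_check(predictors, y)
print(f"best |r| found by exploring pure noise: {observed:.2f}")
print(f"fraction of shuffles doing at least as well: {frac:.2f}")
```

If a large fraction of the shuffled datasets produce a “best” pattern as strong as the one you found in the real data, the pattern is probably a product of the search rather than of the biology.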

I really like this quote from an interview with xkcd creator Randall Munroe:

But I’m also wary of people saying “everyone should know” some skill from their area of expertise, because people have their own stuff to deal with. It’s easy for me to imagine an abstract person and then say, “Wouldn’t it be better if that person knew how to program?” And maybe it would. But real people are complicated and busy, and don’t need me thinking of them as featureless objects and assigning them homework. Not everyone needs to know calculus, Python or how opinion polling works. Maybe more of them should, but it feels a little condescending to assume I know who those people are.

Keep this in mind the next time you want to argue that “everybody” (like “all ecologists” or “all ecology students”) should learn X, or more of X. Popular values of “X” for ecology include “natural history”, “programming”, “math”, and “statistics more advanced than GLMs”. In general, if you think that ecologists should learn more of anything, I think that you should also say what ecologists should learn less of in order to free up the time. Curriculum design is always about hard choices. I’ll probably post more on this soon.

BioDiverse Perspectives has an interview with John Harte. Here’s one choice quote:

[E]verywhere we look we see uniqueness, but being a scientist I refuse to accept that and I look for what general underlying patterns and principles govern this wealth of phenomena.

Which makes for a contrast with Tony Ives:

It’s the differences among lakes that I think are interesting. As a theoretical ecologist, you might think that I’m motivated by general laws. But I don’t find general laws very interesting. I really like solving problems.

I’ll try to do a post on this contrast in the near future.

One way to use online preprints is as a way to get feedback on an idea that might be brilliant, or might be wrong/known/silly. Blogs can be used the same way (e.g., this). Of course, whether it works depends on how many people read your preprint or blogpost and are willing to comment on it.

Would you ban undergraduate students from emailing you, except to set up a face-to-face meeting?


Start-of-the-year advice for everyone!

Over the years we’ve done lots of advice posts on all sorts of things, aimed at everyone from prospective grad students to postdocs to faculty. Here are links to some advice posts that are relevant for the start of the academic year. This is just a sample; see the “advice” category or use the search bar for more!

For prospective grad students:

Here’s a bunch of advice from Meg for prospective grad students in ecology, and their mentors. And here’s some advice from me on choosing an M.Sc. vs. a Ph.D., choosing an advisor vs. choosing a program, and the importance of contacting prospective advisors in advance of applying (and contacting them in the right way). Check out the comment threads too, they’re full of win.

For grad students:

If you’re just starting grad school, one of the first things you’ll need to do is develop a research proposal. Here’s some advice on good ideas for a research proposal (good ones do tend to run to type), and here is some advice on common mistakes to avoid. Note that the two posts are a pair and you should read both. You can’t avoid the mistakes described in the second post simply by trying to follow the advice in the first post. And here’s Brian’s advice on how to survive your comprehensive exam.

For postdocs:

Here’s Meg’s big compilation of advice for those of you who are on the academic job market. Here’s the straight dope on how faculty search committees work.

For faculty:

If you’re brand new, here’s some advice on the postdoc to PI transition. See the “teaching” category for lots of teaching tips, such as Meg’s two posts on the use of clickers in the classroom. And here’s Meg’s ongoing compilation of videos for teaching ecology.

For others:

If you’re an ecologist looking to pursue a non-academic career, we have a series of posts from ecologists who’ve gone on to do all sorts of things. Here’s the first one.

What should editors do when referees disagree?

Journal referees often disagree. Referee disagreements can be challenging for editors to handle. How should editors deal with them?

One common approach, especially among editors at selective journals, is to just reject the paper. That is, anything other than unanimous approval or near-approval of the referees is fatal. This is the path of least resistance for editors. It’s usually justified on the grounds that there are lots of good, or potentially-good, papers to choose from and so decisions have to be made somehow. Plus, editors often aren’t specialists on the topics of the papers that they handle, so when faced with a disagreement between specialists they may feel it’s safest to reject the ms, figuring that if that’s the wrong call the ms can just be published somewhere else.

Personally, while I appreciate the motivations for that approach, it’s one I tried to avoid taking during my days as an editor at Oikos. I felt like my job as an editor went beyond simply choosing the referees. I took the view that my job was to make decisions, informed but not dictated by the reviews. I also felt that it was my job to give authors clear direction as to how the ms could be improved. Just saying “address the concerns of the referees” isn’t a clear direction if the referees disagree with one another. So if the referees disagreed on an important point, or on their overall view of the ms, I saw it as my job to decide who was right, and then explain my reasoning to the author.

These two approaches are really two ends of a continuum. For instance, probably few editors literally just compile the “votes” of the referees and reject any paper that doesn’t get a unanimous “yes”. But I do think editors vary a fair bit in how willing they are to dig into the details of explicit or implicit disagreements among referees*, and how willing they are to take sides when referees disagree (particularly when doing so would involve overruling a negative review).

Of course, another approach is for editors to hedge their bets by rejecting the paper if referees disagree, but with the possibility of resubmission as a “new” ms. We’ve talked about that approach before.

Another approach, if two referees disagree, is for the editor to get a third “tiebreaker” review. I don’t know how common this practice is, although anecdotally I think it’s rare in ecology. There are various reasons why an editor might want a “tiebreaker” review. An editor might be genuinely unsure how to resolve a substantive disagreement between two referees, and so want additional input before deciding who’s right. Personally, I think that’s the best reason to want a “tiebreaker” review. At the other end of the continuum, the editor might simply want a tiebreaking vote on whether to accept the ms or not. Personally, I think that’s an abdication of editorial responsibility; I hope nobody ever does that.

But while I certainly have my own preferences and my reasons for them, there are principled and pragmatic arguments for various approaches. So as an author, what approach do you prefer editors to take when referees disagree? What about when you’re one of the referees? What about when you’re the editor? What about when you’re a reader of the journal? Why? Looking forward to your comments.

*An implicit disagreement being a case where referee A sees some problem with the ms that isn’t mentioned by referee B. This is common. Sometimes this occurs because referees have different interests and so read the same ms with different eyes (e.g., a specialist in system X might care a lot about how the results relate to previous work in system X, while another referee might not care). Sometimes this occurs because one referee spotted something another missed. Neither of those is really a disagreement. But sometimes it occurs because one referee thinks something is a problem but another referee doesn’t and so doesn’t mention it. It’s for this reason that, when I’m reviewing a paper, I try to be explicit about where I think there aren’t problems, as well as where I think there are problems. Indeed, if there’s some aspect of an ms that I think is fine, but that I anticipate that other referees might object to, I sometimes go out of my way to explain why I disagree with those objections, so as to give the author ammunition against any reviewers who might raise them.

Friday links: Lego academics vs. admin, ESA vs. other summer conferences, greatest syllabus ever, a taxonomy of papers, and more

Also this week: Contrarianism! Academia isn’t broken! It’s not actually that important for the vast majority of data to be made available and accessible in a standardized form! And also lots of things that aren’t contrarianism but are still thought-provoking! And a video interpretation of my blogging! And a picture of Dr. Evil! And more! Ok I’ll stop now!

From Brian:

Most popular programming languages across all programming (not just academic ecology). R just edges out Matlab in terms of popularity, but Python crushes both (it’s in a virtual tie with Java, C, and C++). Personally, I am midstream switching from Matlab to Python with R pulled in for teaching or specialized stats (or to conform with norms in working groups). Python is clearly a juggernaut in scientific computing in general, but I’m curious how many other ecologists are using it?

From Meg:

First, a reminder as the semester begins that I have a post that contains links to videos for teaching ecology.

Times Higher Education had a very interesting article entitled “Work less, do more, live better”, which focuses on themes similar to those of my post on how you do not need to work 80 hour weeks to succeed in academia. It includes interesting information on historic work patterns, efforts by companies to restrict how much their employees work in an attempt to prevent burnout, and the importance of exercise and sleep. It says, “This article is the opposite of a call to arms, it’s a call to leisure, a call to lay down your keyboard and take up your knitting needles, your surfboard, your pleasure reading and, especially, your walking shoes.”

And, in a similar vein, Inside Higher Ed had a piece on working 40 hours a week and succeeding in academia. It’s a great piece. Many of the points mirror ones I made in my post on not needing to work 80 hour weeks: if you track your time, you are likely to be surprised at how little work you are actually doing, and, if you do put in longer hours, you are likely to become less and less efficient/productive during those hours. The author (Trish Roberts-Miller of the University of Texas-Austin) says that she discovered as a grad student that “I was spending a lot of time in a fairly draining world of neither work nor play — not fun, and so not a world that rejuvenated me in any way, but also not really work, and so not a world in which I was getting anything useful done. I wasn’t exactly the long-suffering martyr I was imagining — in fact, I needed to work more.” She also talks about how the amount of time required for specific tasks she performs (e.g., grading) increases as she works longer hours each week. All in all, a very interesting read. ht: @scitrigrrl

Ed Yong tweeted that it would be really helpful if labs included information on their major accomplishments on their webpages, which got an impressively quick response from Emilio Bruna:

I think Emilio’s page is great. I’m interested in giving this a shot!

I really liked this post by Sarah Bisbing on the lessons she learned in her first year on the tenure track. There’s lots of great advice there. (Jeremy adds: hey, that’s my link!)

And I’m late on this, since I haven’t sent in links in a few weeks, but The Guardian featured a new twitter account, @LegoAcademics, which has used the new women scientist lego set to create scenes of academics acting out their daily lives. It doesn’t include my favorite so far, which is this one:

From Jeremy:

Terry McGlynn with his impressions of the four conferences he attended this summer, including the ESA. Touches on everything from the prevalence of students (lots more of them at ESA, apparently) to the apparent paucity of new Big Ideas:

Another thing that I noticed about the meetings that, at each one, so many talks were about the same. exact. damn. idea. For example, I’m not joking when I estimate that about every other talk at the Tropical Biology meeting was about biodiversity along elevational gradients…When there is a new Big Idea out there, it makes sense that people are working on it and presenting that work. But here’s what caught my eye about most of these Big Ideas at the meetings. These are ideas that have already played out. The biggest discoveries tied to those Big Ideas have already happened.

Terry also talks about the difference between using talks to sell your science vs. using talks to sell yourself, and says he mostly saw the latter. I confess I don’t really see the distinction Terry sees. If it’s a distinction at all, it seems to me more like a distinction between different ways of selling one’s science. Some old thoughts on showmanship and salesmanship in scientific presentations and papers here and here.

Is ecology explaining less and less? Science has a writeup of a new study I was very impressed with at the ESA meeting. Quotes Brian. The EEB and Flow also comments (in a post that also hits on the theory vs. empiricism survey that we linked to last week).

Sarah Bisbing on the lessons she learned as a first year faculty member.

Academia–publishing, hiring, reviewing, granting–isn’t broken, at least not mostly. It’s merely competitive.

Here’s a pretty cogent argument from Arjun Raj that it’s not necessarily worth it to make any and all data freely available in a standardized form without any need to contact the original authors. Before y’all start flaming me (or him), click through and read it. He actually shares a lot of his own data (and code), and makes heavy use of data collected by others. His point is simply that there are no easy, dead-obvious answers here. Though note that the data sharing requirements many ecology journals have implemented seem less onerous than the sort of thing Raj is mostly thinking of. And at least some kinds of ecological data seem less likely to quickly be outdated or superseded than the sorts of data Raj works with.

Philosopher Deborah Mayo asks whether philosophical superficiality has harmed science. I asked the same question in an ecological context a while back.

A nice piece on Robert MacArthur, from the alumni magazine of his alma mater, tiny and unconventional Marlboro College. I didn’t know that MacArthur was part of their first class of students, and that his father started the science labs there. And this is a candidate for the most MacArthur sentence ever written:

But the college possessed two things the younger MacArthur brother found particularly to his liking: an excellent mathematics teacher in the person of Alan Paine, and an abundance of meadows and forests where he could do fieldwork to his heart’s content.

(ht Small Pond Science)

The Chronicle of Higher Education is going to try to track who gets every new North American tenure track job advertised this year in 11 fields–including ecology. (ht Small Pond Science)

The role of markets and incentives in the recovery of white rhinos in South Africa. Interesting, I didn’t know this. (ht Economist’s View)

A (humorous) taxonomy of papers. I just had a “Face Plant” myself not too long ago. :-)

The 11 funniest economics papers ever. Anyone care to compile a similar list for ecology?

The REAL reasons academics use Twitter. :-)

Sociologist Kieran Healy with the greatest syllabus ever, for “Sociology 710: Social Theory Through Complaining.” I love how it’s an equal-opportunity skewering. Commenters are invited to post suggestions for an ecology equivalent. You can get a good start just by using some of Healy’s, or slight variations thereon. “Ecology 710: Ecological Theory Through Complaining”. “Week 1: This class has nothing to do with my research”. “Week 3: It’s not like we can even predict anything”. “Week 9: What is theory without data?” :-)

Marmot vs. GoPro. I spit on lick your attempt to take time lapse video! :-)

In a similar vein, here’s a video of me blogging. Well, sort of blogging. And sort of me. :-)

And finally, I was procrastinating this week and decided to make a meme for Brian’s “insidious evils of ANOVA” post:

Have a good weekend! :-)