A while back we invited you to ask us anything. Here are our answers to our next two questions, from Andy Park:

- For undergraduate courses in which students are expected to read journal articles, are you finding it harder to find current papers that are relatively easy for an undergraduate to follow? The statistics in current papers are getting more complex.
- On the same theme: how many studies are being published these days in which the statistics are a black box to many of the authors?

**Brian:** I call this statistical machismo. I’ll go you one better. It’s increasingly hard to find papers that ecology faculty or ecology graduate students can read or understand. I teach an advanced graduate stats course. And one of the things I always point out to students is that they now know more statistics than 95% of their readers and communicating with your reader is more important than slight improvements in statistics. A t-test or regression goes a long ways. I think one problem is many people feel like if a paper has fancier statistics it is somehow a better paper. But that is ridiculous (if a statistical method appears in the title it should be a serious warning sign). Your experimental (or observational) design and inference is what makes a great ecology paper. In fact you could argue, the worse the design, the fancier the statistics you need just to save it.

And to your second point. Yes I am convinced that R has made it easy enough to access complex statistics that people who have no idea what the software is doing and whether what they’re using it for is appropriate are running it and publishing it. I hope to have more on this theme this fall.

**Jeremy:** Re: your #1, why do the undergrads need to read newly-published papers? I think for most bits of ecology, study design, and statistics that you’d want to teach to undergrads, you’ll be able to find examples they can comprehend from the recent-ish literature (so, not so old as to be totally outdated). I suppose the exception might be an upper-level undergraduate seminar on “current topics in ecology”; my undergraduate college had such a course. But even there, I bet with a bit of effort you could find papers using fairly simple stats.

Also, what I sometimes do for lab exercises in biostats courses is to give students simplified versions of datasets and analyses that figured in recently-published papers. For instance, right now the undergrads in my advanced biostats course are learning GLMs using a simplified version of the dataset of Waldron et al. 2017 Nature. They’re not reading the paper itself, but I’m giving them a synopsis and just enough background information to understand the scientific question and how to address it with GLMs.

I think that Steve Heard’s recent post is highly relevant here: https://scientistseessquirrel.wordpress.com/2017/11/06/statistics-in-excel-and-when-a-results-section-is-too-short/

Yes indeed – a great post that came out just yesterday long after we wrote this. Thanks for linking to it.

Via Twitter:

How often does ecology just “give” us gnarly data, and how often do we just ignore gnarli-ness that could be removed with more thought about experimental design?

I think this in part depends on where you are, which correlates with the amount of funding you have, which, at least thinking about landscape ecology, correlates with how well you can space your sampling sites, which in turn dictates whether you’ll need to use mixed-effects models with spatial autocorrelation or not. But I agree, even with limited funding and so on these issues are better solved by carefully choosing the sampling sites and perhaps reducing sample size to ensure independence. The other option would be not doing landscape ecology studies unless you have lots of funds for it, but this doesn’t seem a good idea…

I’ve been reading more of the cell biology literature to specifically understand how science is done in this field and how they use statistics. Cells and ecological communities/systems are both very complex systems but the science is very, very different in the fields. Cell biologists have enormous control with cell cultures, genotype-specific strains (including knockouts or overexpressed genes, etc), exogenous antagonists/agonists, antibody marking and imaging, etc. etc. and as a consequence, the statistics are generally very simple t-tests. A good paper tends to be a series of many (many!) tests (some qualitative like western blots, others quantitative using t-tests) whose outcomes are predicted by the previous test. The series of tests tend to be mutually reinforcing (often different experiments to show essentially the same thing, but manipulating something else about the system) and internally consistent with the emerging result. A good example of this is Boström et al. (2012). A PGC1-α-dependent myokine that drives brown-fat-like development of white fat and thermogenesis. Nature 481, 463–468.

Very interesting remarks. Everybody knows about the amount of control cell biologists have, their ability to knock out or overexpress any gene they want, etc. But I wonder if many ecologists know about the simplicity of the stats in cell biology or the ways in which different experiments fit together.

Hmm…I can feel a post coming on.

During some micro/cell/molecular biology talks recently I pondered this interesting difference as well. Although there might be a similarity in the complexity of an ecosystem and a cell, the hypotheses tested in the talks I saw were much more focused on specifics than is typically the case in ecology. Having a dysfunctional version of protein X has consequence Y; the researcher can think of explanations A, B, and C, and simple experiments test each one, often in a yes/no way, and the result leads to another step in the path where the same process is followed. Simple stats are perfectly suited to this situation. In ecology, we often want to know, for example, the consequences of warming an entire ecosystem, which involves a great many direct and indirect pathways of effect. It would be like starving fruit flies to different degrees, and then looking at all the different cellular changes: t-tests might not be up to the job.

“It would be like starving fruit flies to different degrees, and then looking at all the different cellular changes: t-tests might not be up to the job.”

True, but cell and molecular biologists often do those sorts of experiments too. Though the stats are indeed more sophisticated for such experiments–e.g., controlling false discovery rate, using an empirical Bayesian approach to estimate changes in expression of each of thousands of genes, etc. But as far as I know (as a non-cell biologist), those sophisticated stats aren’t done to rescue messy data or substitute for lack of experimental control or lack of randomization or whatever.
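
As a rough illustration of what “controlling false discovery rate” involves, here is a minimal Python sketch of the Benjamini-Hochberg step-up procedure. The comment above doesn’t name a specific method, so picking Benjamini-Hochberg is an assumption, and the p-values are made up purely for illustration.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans: True where the test is declared a discovery.

    Benjamini-Hochberg step-up rule: sort the p-values, find the largest
    rank k such that p_(k) <= (k / m) * q, and reject the k smallest.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Largest rank whose p-value sits under its rank-specific threshold.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from ten separate tests (not from any real study).
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

naive_hits = sum(p < 0.05 for p in p_values)   # uncorrected "p < 0.05" count
fdr_hits = sum(benjamini_hochberg(p_values))   # discoveries after BH at q = 0.05
```

With these made-up numbers, the uncorrected threshold flags more tests than the corrected one does, which is the whole point of the correction when thousands of genes are tested at once.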

I’m still thinking that a post comparing study design and statistical approaches in ecology vs. cell biology would be interesting. We ecologists like to congratulate ourselves as being among the best users of quantitative tools of any field of biology (e.g., we were among the earliest adopters of statistics). And in some ways I think we are. But in other ways I’m not so sure.

One way in which I think we have traditionally been better than other fields of biology is that ecologists traditionally know what’s going on under the hood with their stats. The stats aren’t just a black box to us. Contrast cell and molecular biologists who often run their data through a “pipeline”–often jargon for “a black box that processes my data and analyzes it for me without me having to think much or understand what’s being done”. Is ecology moving slowly to a “pipeline” model?

It’s worth reading Serengeti Rules by Sean B. Carroll – a popular science book that emphasizes commonalities between different kinds of interaction systems (species in a community, enzymes and other things in cells), specifically how negative/positive feedbacks contribute to preventing or causing things to get out of control.

I do think the pipeline model arises at times in ecology, for example with respect to multivariate analyses of community data. It can only happen when the structure of data sets is very similar from study to study – sometimes the case, sometimes not at all.

The publisher sent me a free copy of Serengeti Rules but I haven’t gotten around to reading it yet. On my list.

So, what’s the _cure_ for statistical machismo? It seems clear as day to me that we can score points with reviewers and editors by using fancy stats, even when it’s not necessary. That’s a huge incentive to do so, and therefore it’s done. Describing and acknowledging the problem in blog posts (or maybe a future paper or two) is an important first step, so I’m pleased to see posts like this one. But it also just feels inevitable that we have to live with it, just like we have to live with the fact that we get more “credit” (in terms of getting a job or a promotion or a prize) if we publish a paper in a journal with a high impact factor rather than a low one (even though the paper itself is no different if the roll of the dice at journal X comes up in your favour or not). How to go about changing a culture? (I have no idea.)

I confess I’m not a fan of the growing trend for people to comment on our posts on Twitter rather than here…

I refer my learned friend to my recent blog post on the topic of Twitter v blogging…..

I agree with you. Papers are becoming more and more hard to follow. And it’s not only because of the statistics.

One thing that has been annoying me recently is that I often cannot understand the main ideas of a paper without going through the supplements. Mainly in the very top journals, because of a strong space limitation, supplementary materials often are not supplementary; they are essential. And leaving the main text to go to the supplements completely breaks my reading flow.

I am not saying that I prefer long papers. I don’t. But hiding important information in the supplements is not a good solution. Also, it takes less time for me to read a long, well-written and complete paper than a short, confusing one.

Re: online supplements, I think they’ve wrecked Science and Nature papers in particular:

https://dynamicecology.wordpress.com/2015/10/26/online-supplements-have-ruined-nature-and-science-papers/

Further thoughts on online supplements:

https://dynamicecology.wordpress.com/2014/09/24/what-belongs-in-the-appendices-vs-the-main-text-in-scientific-papers/

This reminds me of the discussion by Andrew Gelman and Steven Pinker: “…most academic writing is bad for the same reason that most writing [in general] is bad: because writing is hard.” Stats are particularly prone to this. Stats are an integral part of the story, and describe how we test our hypotheses, but many papers mush all the statistical tests into one “listicle” paragraph at the end of the methods that just rattles off tests. Statistical descriptions don’t seem to be crafted with the same level of detail and care as, say, the Introduction or Results, or treated as integrated parts of the methods. In my own writing, if a reader doesn’t understand what I did, I view that as a failure on my part as the communicator. I fail more often than not, but I am trying to get better. It takes practice. But I wonder how much of this is simply due to writing.

At the same time, this post also highlights to me that ecology is inherently a quantitative field that has long been perceived as non-quantitative, and most undergrad and grad curriculum shows it. I don’t think our students receive the necessary mathematical training to read or implement stats properly. I certainly didn’t.

“At the same time, this post also highlights to me that ecology is inherently a quantitative field that has long been perceived as non-quantitative, and most undergrad and grad curriculum shows it. I don’t think our students receive the necessary mathematical training to read or implement stats properly. I certainly didn’t.”

Nailed it.

My graduate work did go into intermediate level multivariate stats- but that was it. Nothing really in undergraduate. I think that multivariate stats I had could have been an upper level undergrad course and should have been. I graduated undergrad in 2008.

While overcomplicated statistical methods are sometimes used (or required by reviewers), I’d say it really depends on whether you have a (mostly) experimental setup or (mostly) observational setup, which relates to the fact that you can’t always have a design that solves all problems (and sometimes that design may change over time – no matter how unfortunate this is for the analyst).

Let’s say you are simply checking the differences between experimental treatments and you think about using fancy GLMMs in order to account for variation in factor x and y because the counts are discrete / there are repeats per site you don’t care about etc. There, simpler methods can in some cases work just as well if not better, e.g.

http://www.science.oregonstate.edu/~murtaugp/files/ecol2007.PDF

http://onlinelibrary.wiley.com/doi/10.1111/2041-210X.12386/abstract

(a variant of experiments being the cutting of standardized observational data into several pieces).

In a way it is just that the linear model is robust to some deviations from its core assumptions.

But now imagine – one example among many – that you want to reconstruct the population density of an elusive animal to estimate pop. size and you have three disparate datasets (e.g. capture-recapture data, genetics, something else). How can you go about this without using relatively elaborate statistics (and perhaps even a mathematical model tailored to that particular situation)?

Another case in point is analysing data with strong correlation structures (point patterns, time series, …) that one gets with any observational science. For typical observational data you are often condemned to using relatively fancy models, e.g. random fields, hidden Markov models, and there’s nothing wrong with that (or is there?).

Good point. And it got me thinking about what we teach in our biostats courses here in the Biological Sciences dept. at Calgary. We mostly teach simple frequentist stats–GLMs and special cases thereof, basically (t-tests, linear regression, ANOVA, etc.). That’s in part because you’ve got to start with the basics. But it’s also because we gear our statistical teaching towards the design and analysis of planned experiments. Which raises the question: *should* our undergraduate instruction focus primarily on the design and analysis of planned experiments? Or should we replace some of that material with statistics used for messy observational data, perhaps on the grounds that that’s where the field of ecology is going? Thinking about it a bit, I’m happy with our current emphasis on planned experiments, but I’m curious to hear counterarguments.

Of course, many places besides Calgary emphasize planned experiments when teaching undergrad biostats. Can anyone point to undergraduate programs in ecology (or biology more broadly) that emphasize analysis of observational data rather than planned experiments?

(Aside: our ecology majors do get some time series statistics in our upper level population ecology course. And they do get a bit of mixed models in our upper-level biostats course).

“Can anyone point to undergraduate programs in ecology (or biology more broadly) that emphasize analysis of observational data rather than planned experiments?” – as an undergrad, I had a Quantitative Biology course focused on basic experimental stats (t-test, ANOVA, Mann-Whitney etc) and a Numerical Ecology course focused on multivariate stats, mostly ordinations but also some hypothesis tests, with more emphasis on observational data. Federal University of São Carlos, BSc in Biology, 2008.

I totally agree with the comment from Fred Barraquand above. As a quantitative ecologist working mostly with observational data from citizen science programs or even opportunistic observational data, I find these data often call for the use of complex statistics to analyse them properly (at least I try). And it is definitely not for the pleasure of applying fancy statistical methods (sometimes hard for me to understand) or to satisfy some picky reviewer that I do so 😉

As for the question (from Jeremy) “should we replace some of that material with statistics used for messy observational data, perhaps on the grounds that’s where the field of ecology is going”:

I won’t advocate for one method or the other to be preferentially taught to students… I totally agree with the fact that students need/have to know (well!) the basics in statistics! But students also have to know that they won’t always find data that come from planned experiments out there, and they need to know that classic stat methods (the basics) often can’t be applied to these messy data for different reasons (e.g. observational biases, spatial autocorrelation, overdispersion, etc.). They need to have a sense of the recent elaborate statistics that have been developed to deal with such data. And methods are even currently being developed to deal with data combination, from different kinds of datasets (as Fred was mentioning).

It seems particularly important to me in the “Big Data / Open Data” era… People can just go, pick up some easily accessible data they don’t know sh** about (sorry about this one) and try to get response patterns/dynamics out of this, the same way they’d do with data collected from standard field sites. Although, I have the feeling that most researchers dealing with such messy data try to handle better and better the most common biases underlying these data…

Just some personal thoughts 😉

“While overcomplicated statistical methods are sometimes used (or required by reviewers)…”

I have had problems with this.

I wonder if this is also a case of being able to handle ecological analyses using, sometimes, a relatively wide variety of methods. And folks have their favored method and when they review a paper they expect their preferred method to be followed (for a variety of reasons of course).

It does seem to be biased toward more complicated methods, though. And the more complicated methods, in my experience, often do not add much power or insight. It’s just more work with no real point except to make a reviewer happy.

Agreed @Ken. Often it’s because Reviewers are into a certain type of statistical method so they push that method onto Authors even if it is not necessarily beneficial. I find this is especially the case with younger reviewers.

Great post!

I feel that knowing about the existence of complex stats, but not how they work, gives the illusion that you can get away with a sloppy design. For example, in landscape ecology, one can have sampling sites too close to one another on the grounds of using mixed models with spatial autocorrelation for the analysis – yet the more I read about such models, the less I want to use them. Sometimes having large enough distances between sites is impossible for practical reasons – but in such cases shouldn’t we think of a different design before relying on complex stats?

Ecologists are very sensitive to pseudoreplication, and data are often not independent (strictly speaking). Oftentimes, we know appropriately accounting for non-independence would not change conclusions, but it is difficult to validate the use of a simple but (arguably) wrong method by saying that the conclusions do not change.

This discussion on statistical machismo is very difficult to follow without seeing specific examples. Murtaugh’s paper is good because it gives specific examples. But I think people agreeing here will start disagreeing once we start seeing specific examples, e.g., of reviewers’ comments. If there is any cure for statistical machismo, we need to start discussing specific examples rather than ambiguously criticizing statistical machismo.

We have old posts on specific examples. For instance, estimating detection probabilities:

https://dynamicecology.wordpress.com/2013/01/11/is-using-detection-probabilities-a-case-of-statistical-machismo/

https://dynamicecology.wordpress.com/2014/09/15/detection-probabilities-statistical-machismo-and-estimator-theory/

https://dynamicecology.wordpress.com/2014/09/17/detection-probabilities-back-to-the-big-picture-and-a-poll/

https://dynamicecology.wordpress.com/2014/09/25/detection-probability-survey-results/

Another specific example:

https://dynamicecology.wordpress.com/2015/02/05/how-many-terms-in-your-model-before-statistical-machismo/

The posts are all very good, and in fact specific examples allowed us to see some disagreements. But we need to see more if statistical machismo is a general problem many people are experiencing.

I agree that statistics are getting fancier in ecology – but often better addressing the problems of pseudoreplication and p-hacking. And do students in ecology need to read recent papers? Yes, of course they do! We are training students to become the ecologists of the future: they need to understand the current best practice in statistical methods, and they also need to be prepared for where the advance of our field is taking us. They need to be well versed in the ecology of the past, but also in what is being published currently, and ideally be able to draw links between the two.

I have been discussing with some of my colleagues about how we are probably teaching statistics incorrectly. A lot of what is considered “fancy” statistics such as Bayesian approaches are just as simple/complex as the parametric approaches we generally teach to undergrads. A hierarchical model is not necessarily that much more complex than a simple linear model, if you aren’t teaching the mathematics behind it, and using a hierarchical model makes a lot more sense than trying to deal with hierarchical data in other ways. I was talking to some students about normality and rank tests the other day and the one student exclaimed, “Why don’t they tell us this from the start?” in frustration, when I was explaining that there is no one right or wrong way to deal with non-normal data. I was explaining the concept of a hierarchical model to another student a few weeks ago and he said: “You are blowing my mind, why don’t they teach us this stuff from the beginning”. I sense frustration out there in the field particularly with early career ecologists, but I think this can also be combated by changing the way we train the next generation.

I am still forming my thoughts on all of this, but I think after chatting with colleagues, probably the best place to start with the teaching of statistics is with understanding data and distributions. Then one can cover the linear model including the hierarchical model from a Bayesian approach first. And, we need to teach coding first before statistics. We also need to be very careful with how we teach about p-values – the students in the program I teach on are in peril of becoming the p-hackers of the future without getting set straight there. What you learn first is often what you consider to be most “simple”. Then you can move on to cover other things. I think we need to flip what we consider simple versus difficult to understand, and then our students will have the training that I wish I had and the confidence to do and understand ecology in the modern day.

I had a meeting with one of my PhD students yesterday, who had run some GAMMs and wanted help interpreting them. They seem to be increasingly frequent in the literature, and I can see why they’re so seductive. You fit a spline to a data series, and suddenly: your predictor is highly significant! The residuals are normally distributed! The r-squared value is high! AIC values drop massively! And to top it all off, R will produce a nice line with error bars that looks lovely, especially if you add some pretty colours. She had understandably fallen into this trap because all the indicators we’re taught to look for in model evaluation come up fine. They have the statistical equivalent of ‘truthiness’.

The trouble is that a GAM isn’t a hypothesis test (or at least any terms to which you fit splines aren’t). Many of the papers I see using them don’t appear to recognise this, which makes me wonder whether the authors understood them, or the reviewers, the editor, or indeed many of the readers. Often they tell us absolutely nothing beyond what the raw data would have shown.

For the record I have used GAMMs in a paper, and for reasons that I think are defensible (http://onlinelibrary.wiley.com/doi/10.1111/1365-2664.12902/abstract). There are circumstances in which fitting an arbitrary spline is necessary and useful. It’s just something that needs careful consideration, and I’ve seen a number of PhD students get sucked in by them. I’m sure this isn’t the only technique that has the same effect.

You make it sound as if GAMs will find some weird and wonderful nonlinear (partial) relationship between a covariate and the response that is spurious as a matter of course. That’s just not true.

I’d be interested to know why you think GAMs aren’t a hypothesis test? They’re as valid if not more so than other basis expansion-based approaches (like using polynomials of covariates in the linear predictor). I also wonder why you think splines are “arbitrary”?

Furthermore, if the partial relationships are nonlinear, approximating them with linear terms will result in a biased model. Approximating them with polynomial terms is often an exercise in ad-hocery, regularly results in poor fits and weird behaviour at the extremes of the data.

GAM(M)s are somewhat more complex than GLM(M)s and as such require checking of more parts of the model (concurvity and basis dimension size spring to mind as two things that are critical for GAM models but you’d never come across them in GLM-land).

I’ll grant that users may not understand how to use GAMs properly, but that’s their fault, not some failing of the approach in general.

Just to be clear: I agree that GAMs are useful in some circumstances, and have used them in my own work. I even linked to one recent paper. My choice of words wasn’t ideal, but the point is that you don’t predetermine the form of the response. My main point was that users, reviewers and readers often don’t know how to apply them properly, and it seems we agree on that.

So I agree in spirit with this post, but I doubt that we agree to the letter. What counts as “fancy stats”? To some folks, anything Bayesian is ipso facto “fancy stats”. But this is ridiculous! Arguably, the correct interpretation of Bayesian inference is more intuitive than classical NHST or frequentist intervals, etc. Likewise, to some folks, GLM(M)s seem like “fancy stats”, but they are really just a very very easy idea (linear modeling) applied in full generality (accommodating various error distributions, group-structure in data, etc.). By comparison, figuring out how to count up DF in an ANOVA analysis very quickly becomes an inscrutable exercise, and what does “there exists a three way interaction (p < e)” really even tell us? But to some folks, classical ANOVA always *seems* like the easiest way to go…

agree. agree. and agree.

I’ll admit to being quite frustrated by responses to this AUA, especially Brian’s trotting out of his “statistical machismo” critique again.

When I’m analysing a data set, frankly, whether or not those reading the paper we later write describing the results will follow the methods is the last thing on my mind. Explaining how the analysis was done in terms appropriate for the intended reader is something that logically comes later. When I’m analysing data I’m choosing techniques that are appropriate for the data I have available – this is often observational data in my case.

An equally valid counter argument is that people may be reticent to move to more appropriate methods because they fear the community will not accept the new methods, despite them being demonstrably better.

I am yet to come across a situation where a t-test was an appropriate analysis approach to take. That’s not to say that those situations don’t arise, it’s just that those situations aren’t the ones I’m faced with.

There’s a suggestion in Brian’s response to the AUA, and in the comments here, that ecologists are using fancy stats because they don’t know how to design experiments properly. 1) not everyone is doing experiments, 2) fancy stats won’t save crap data or design, there’s only so much you can polish a turd (as a former colleague of mine was often heard to utter), 3) this conflates those practicing stats poorly with those among the community that are doing this right and for the right reasons. The latter point there is important; there are a lot of good statistical ecologists (or ecological statisticians) around that researchers could engage with on projects and who will be able to advise on design and how to make the best use of resources and the data once collected. Throwing around terms like “statistical machismo” may leave the impression that this advice is not needed, may make researchers wary of reaching out for advice, etc. only for them to realise they actually needed this advice and that it’s too late once the field season is finished.

Despite what Tony Ives wrote, for example (because someone brought it up in the comments), I won’t be log-transforming my count data because when I’m fitting a model, invariably, I want to do a lot more with it than simply test regression coefficients. In fact, this “testing” approach is a key area that is facing considerable push back from statisticians because it focuses too much on p-values, which are often interpreted or used incorrectly. Prediction for example; if you’re fitting to log-transformed response data you’re no longer working on the scale of interest and prediction becomes more difficult to explain/communicate or requires back transformation to a scale that you didn’t fit on. Also, model fitting is more laborious; should I add a continuity correction of 1 or some other number? GLMs avoid all of this. (Which of course Ives does comment on in that paper of his.)
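
To make the continuity-correction worry concrete, here is a small Python sketch. It assumes the simplest possible “model” (just a mean, no covariates) and uses made-up counts; the function name `back_transformed_mean` is hypothetical and not taken from any package or from the Ives paper.

```python
import math

# Hypothetical counts including zeros, as with rare taxa.
counts = [0, 0, 1, 3, 10]


def back_transformed_mean(values, c):
    """Average log(y + c), then map the result back to the count scale."""
    mean_log = sum(math.log(y + c) for y in values) / len(values)
    return math.exp(mean_log) - c


arithmetic_mean = sum(counts) / len(counts)

# Two equally "reasonable" continuity corrections give two different answers.
estimate_c1 = back_transformed_mean(counts, 1.0)
estimate_c_half = back_transformed_mean(counts, 0.5)
```

In this toy case the two corrections disagree with each other, and both back-transformed values fall below the plain arithmetic mean of the counts, so the choice of constant is not a harmless detail.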

+1. As I alluded above, I think there is some slipperiness in what different researchers find intuitive or “simple”. For me, the comprehensive perspective of Bayesian inference is extremely intuitive – the extensions to hierarchical Bayes are equally so, and enable a broad comprehension of GLM(M)’s. By contrast, classical and frequentist-only approaches always seem to involve a grab-bag of tests and estimators, and various ad hoc corrections or transformations. I think most ecologists just need to learn one tool really really well: GLM(M)s. Bayesian approaches really make learning how to fit and work with GLM(M)s easy: from interpreting parameters, to generating posterior predictive distributions. I will stand by the case that this should be the “simple and intuitive” approach, but I am sure it would be classified by many as “statistical machismo”…

Agree. This post and a lot of the comments are very worrisome. Just run t-tests!

You make a good point about the psychological effects of the “statistical machismo” critique: it’s good not to throw the baby out with the bath water…

And study design is perhaps the most useful stage for the statistician to be called in! (to put it more politely than Fisher)

About GLMs and dealing with discrete numbers: clearly if you have low counts 0,1,2,3… you’re right but this is not always the case. And I don’t buy that working on the scale of interest is a good defense of GLMs given the link function (which is often a log incidentally).

Yes GLMs (and GAMs) are fine in expert hands – the problem is that any student who already struggles to understand the linear model will be advised to use GLM(M)s now, no matter his/her design. That’s my experience at least. I can’t recall how many times I’ve explained that one should not put blind trust in the computation of p-values by GLM packages because they are all approximations.

That said I completely agree (see above) with your point that observational data requires quite often more elaborate stats.

Re large counts: indeed, and I’ve used this on occasion. However, low counts are encountered routinely, especially when we consider rare taxa.

There is a significant difference between a GLM with a log link and fitting a linear model to log-transformed response. To see this try doing the latter with 0 counts. There are practical issues too, such as dealing with Jensen’s inequality.
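A minimal Python sketch of those two points, using made-up counts (the data and the `+ 1` continuity correction are illustrative assumptions, not taken from any paper discussed here). For an intercept-only model, a Poisson GLM with a log link estimates the arithmetic mean of the counts, whereas a linear model fitted to log-transformed counts back-transforms to something closer to a geometric mean, which Jensen’s inequality says cannot be larger:

```python
import math

# Hypothetical counts, including a zero (illustrative data only).
counts = [0, 1, 2, 5, 12]

# 1. A raw log transform fails as soon as a zero count appears.
try:
    [math.log(y) for y in counts]
    log_of_zero_failed = False
except ValueError:
    log_of_zero_failed = True

# 2. Intercept-only linear model on log(y + 1): the fitted intercept is the
#    mean of the logged values, so the naive back-transformed "prediction"
#    is exp(mean(log(y + 1))) - 1.
mean_log = sum(math.log(y + 1) for y in counts) / len(counts)
back_transformed = math.exp(mean_log) - 1

# 3. Intercept-only Poisson GLM with a log link: the maximum likelihood
#    estimate satisfies exp(intercept) = arithmetic mean of the counts.
glm_mean = sum(counts) / len(counts)

print(f"log(0) raised an error: {log_of_zero_failed}")
print(f"back-transformed mean from log(y + 1) model: {back_transformed:.2f}")
print(f"mean on the count scale (Poisson GLM):       {glm_mean:.2f}")
```

The gap between the last two numbers is the back-transformation bias; with covariates in the model the same effect shows up in every prediction, not just the overall mean.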

Finally, the issue of poor or lacking training of our students is not the fault of fancy stats. The solution is not to throw up our hands in despair or trot out tropes. The solution is better training that is accessible to all.

I don’t see anyone saying that “blind trust” in anything is a good thing, especially in statistical practice.

I’m 100% with you here, Gavin.

The debate about whether certain statistical approaches are inherently more complex or less intuitive than others, vs. whether complexity/intuitiveness is subjective (and/or relative to one’s background training), kind of gets away from the original question. So does the debate over whether ecologists are committing statistical machismo, vs. merely using the statistics appropriate to their (often observational) data. Which is totally fine if that’s the direction the conversation flows, but I don’t want to totally abandon the original question.

The original question was whether it’s now more common than it used to be for readers (esp. student readers) and authors to not understand the statistics in newly-published papers. I think the question about readers could be rephrased as: is there now a greater gap in statistical expertise between typical undergrads and typical paper authors than there was, say, 20 years ago? And the question about authors could be rephrased as: is there now more variance in statistical expertise among authors of ecology papers than there was, say, 20 years ago?

I think the answers are “yes” and “yes”, but I’m going on gut feeling and would be interested to hear other views. And I think the answers are “yes” and “yes” because of increasing specialization in all areas of science.

I think you are correct here. Overall, I believe ecologists are generally trying to do more different things than in the past, and correspondingly, analytical methodologies have also diversified and complexified a bit to keep up. For instance, a colleague and I are working on a reforestation model parameterized with remote sensing data, where it is now logistically feasible to acquire almost arbitrarily long time series of pre-processed data courtesy of Google Earth Engine. But in order to properly use that data, you need to do a fair amount of modeling to account for various sources of measurement error, biasing due to phenology, etc. As Andrew Gelman likes to say, ‘big data requires big model’.

FWIW, much of the discussion above was, I think, prompted by Brian’s response, which opens with the statistical machismo line. So, the post moved from description to prescription, at least by inference, fairly quickly. IOW, he started it 🙂 Of course, that is not to say that what he wrote was *incorrect*, just likely coming from a different perspective than I have, and apparently what some other folks have as well.

My own perspective is like Brian’s in that I often am asked for statistical advice by grad students who’ve run pretty simple experiments or have modest amounts of observational data they collected themselves (rather than from massive remote sensing networks or whatever). It’s more common for those students to want to run an overly-complicated analysis than to run an overly-simple one (though I’ve seen both). They also tend to be vague about their goals and want to confound exploratory and hypothesis-testing analyses. This old post of Brian’s captures my admittedly-anecdotal experience as well: https://dynamicecology.wordpress.com/2015/02/05/how-many-terms-in-your-model-before-statistical-machismo/

“They also tend to be vague about their goals and want to confound exploratory and hypothesis-testing analyses.” I think this hits the nail on the head!

I agree with the first gut feeling, that it can be difficult for undergraduate readers to be sufficiently trained to be able to follow many of the statistical methods that have become quite common in the ecological literature.

On the second point, if we think of statistical methods in general then I’d say quite a lot of that practice in ecology is black box, whether the authors fitted a linear model or Bayesian hierarchical model, or some form of GAN. It is quite clear that scientists misuse or misinterpret p-values for example, which betrays a lack of understanding that one might lump under “black-box” thinking. I don’t see this as being something restricted to the fancy-ness or otherwise of the technique. Though, yes, there are more ways to shoot oneself in the foot with some methods or approaches than others.

My one take home from all this is that as a field, we must be generally poor at demonstrating that understanding modern statistical methods is *as* important as good lab or field skills, good communication skills, etc. Lacking in any one of these limits what you can do and the scope of what you might aspire to do. Furthermore, we aren’t doing a good enough job in training our students, or to put it more politely that our teaching objectives aren’t sufficiently aligned with the needs of students, if they aren’t able to follow the general principles or ideas of some of the more standard statistical approaches used in ecology.

We’ve tried to document this lack (worldwide) with colleagues https://peerj.com/articles/285/ and others have done that more precisely for the US http://onlinelibrary.wiley.com/doi/10.1002/ecs2.1394/full, looking at what is actually taught vs. what is published (they make a rather clear case!).

Everybody seems to agree there is a mismatch, but as Jeremy and Brian suggest below there’s no quick fix…

Via Twitter:

I have to start by apologising because I’m neither an ecologist nor a statistician, but a simple forester. I also was one of those commenting on Twitter rather than directly on your blog. Anyway, this part:

“…the worse the design, the fancier the statistics you need just to save it” reminded me of an old physicist colleague of mine that often repeated Rutherford’s “If your experiment needs statistics, you ought to have done a better experiment”.

The interesting part is that pretty much all of my work relies on designed experiments that run for many years, often costing hundreds of thousands of dollars. Despite that or, better, because of that, I will use the best statistical methodologies that I can afford to extract as much signal as possible from my experiments, which are affected by all sorts of environmental noise. Hierarchical models? Check. Spatial residual trends? Check. Bayesian approaches to generalised models? Check. Simultaneous modelling of multivariate responses? Check. The point is that our projects deserve getting the most out of our field work. Spending time and money in the field and not squeezing the most out of the data seems a disservice to one’s research.

A different point is that one has to be able to both understand the techniques and properly explain them to the readers. Perhaps this is pointing towards the need for interdisciplinary research and/or getting a good statistician as collaborator.

There is, of course, a point of diminishing returns and there are probably many problems for which simple methodologies are good enough to make the point. I rarely have that luxury.

Via Twitter:

Brian: “In fact you could argue, the worse the design, the fancier the statistics you need just to save it.”

Me: Can’t agree more.

Great blog, awesome debate.

Gavin, Chris and others

I think one of the issues (raised by the original question) is that it is important for scientists to have a shared language with a shared conventional wisdom about what is appropriate and what isn’t appropriate practice.

We have that shared understanding around linear models. At this point we have it around AIC. We have it around principal component analysis.

I’m sure if we added one more thing to the toolkit we could build that shared understanding too. The problem is that we’re not trying to add one more thing to the toolkit. We’re trying to add 30 more things.

There is no way we as a community have a shared language and understanding around good practices in 30 new directions. Thus using one of them is tantamount to saying “trust me” because I know you have no clue what I’m doing.

And I guarantee you too that a lot of people using those 30 new things don’t have much of a clue what they’re doing either. Yes, most people may not know the numerical algorithms for solving an OLS, but they sure know a lot about diagnostics, have intuitions backed by detailed studies about how much of a deviation from normal is OK, what looks like a strong or weak signal, etc. To call an OLS linear model a black box in the same way that hierarchical Bayesian or GAMM is a black box to most practitioners is missing something important.

If every single time somebody went “off the road” so to speak with one of those 30 new things it was only because they felt absolutely convinced they were fixing a fatal flaw, I might understand better. And that may well be the main reason you go “off the road”. But I can tell you from a lot of experience a lot of people are doing it because: a) they feel it will impress other people and make their paper do better, b) because reviewers find it easier to pick at stats than to address the core logic of a paper, or c) they are afraid of those reviewers. Those are lousy reasons to be trashing our shared scientific understanding.

No, I have no wish to “trash our shared understanding”. I like general comprehensive frameworks; GLM(M)s are such a thing and, as I said above, a) are what most ecologists need and solve most of their problems with analysis and b) are easier to learn and work with in a Bayesian setting (although honestly this is a bit less important). I agree there are tons of new things, new packages, and lots of misinformed users out there. This is why we need to train in how to recognize when models are not fitting well, overfitting, failing to recover parameters from well posed simulations, etc. There is a general framework for how to experiment safely and unfortunately it does take an investment to learn, even for those of us with good math backgrounds… But that’s just life.

@ Chris Wilson:

I agree with this and we struggle with it in our intro biostats course. Personally, I’d rather teach general linear models right up front rather than doing what we currently do, which is teach t-tests, regression, correlation, chi-squared tests, and univariate ANOVA as separate things. And then only teach ecology majors (who have to take a second, more advanced biostats course) that they’re all just special cases of a more general framework.

We have cogent reasons for doing what we do–basically, that students need to know those special cases to be able to read the literature, and because that’s the way that the textbook we want to use (Whitlock & Schluter) teaches things. But if someone could point me to a textbook that has all of Whitlock & Schluter’s strengths but teaches GLMs up front, I’d seriously consider it. Grafen’s GLM textbook is too advanced for our purposes. Haven’t had a look at Andy Hector’s new textbook yet…

Thanks Chris. So how, concretely, would you train today’s graduate students up to a full expertise in Bayesian GLMM? How many semesters do you think that takes?

And are you prepared to leave out all the existing faculty (and government scientists) who realistically don’t have time to retrain at that level (many don’t even have time to learn R as much as they might wish to)?

And what do you say to people who disagree with you and say they don’t really need BGLMM? They need to learn RDA/CCA and survivorship analysis and meta-analysis techniques and zero-inflated models and point process models and Markov chain estimation and spatial statistics and time series analysis (the latter two of which admittedly can overlap with BGLMM but require significant training in their own right). And discipline specific models be it mark-recapture & detection probabilities or rarefaction models for richness or … How much training does it take to spin people up to be fully competent practitioners in all those areas?

@Brian: How long should I spend on taxonomy, or natural history, or protein extraction, or writing & communication skills, or on data management, or computing? I would hope that an appropriate amount of time would be spent on each and yet it all too often seems that just about everything else gets prioritised.

When I’m teaching grad students in our 1 semester course, I do 1 lecture on regression (what Jeremy is calling the general linear model) as a recap. We launch into GLMs then mixed models and blend frequentist and Bayesian ideas and approaches as we go. The course is relatively new and we’re still tweaking the scope of the material covered in light of experiences running it a couple of times. After 1 semester do the students have “full expertise” in Bayesian methods? No, but they are far better equipped to handle the literature once they finish the course, and they are better able to decide how to approach new statistical problems and data sets.

Another big hurdle is getting them sufficient computing skills so that this is not a serious rate limiting step in future endeavours.

Expecting grad students to have full expertise in *anything* seems overly ambitious to me and unattainable. I aim for producing well-rounded scientists with some emphasis on their specialism, because learning doesn’t stop once they graduate.

@ Jeremy. In terms of text-books, I think that the re-vamped Gelman and Hill book is going to be the bee’s knees in this regard. Of course, we will have to wait and see. I like McElreath’s book as well, although I do not own it. I definitely think there is a need for an ecologist-friendly/centric textbook or set of teaching vignettes or something.

@ Brian. That’s easy. I do not think everybody should try to learn everything! I personally stick to various applications of hierarchical Bayes, whether GLM(M)s or various phenomenological-mechanistic spectrum models. As a consequence, I can readily interpret and use things like OLS, t-tests, and so on as ‘special cases’, or point estimate approximations of a fully Bayesian inference. And there indeed the conventional wisdom about diagnostics can be useful.

Recently, I’ve been getting into spatial statistics, but am treading fairly carefully because I know there is a huge field there, and I cannot possibly master everything. But having a general modeling toolkit is invaluable, in a ‘learning how to learn’ sort of way. That and collaboration of course 🙂

I see great value in general linear models but I struggle with understanding how one can teach them without some understanding of a simple t-test first. When one obtains parameter estimates in a general linear model (not generalized linear models) the parameter estimates are generally evaluated with t-tests or confidence intervals are generated for the parameter estimates and these confidence intervals are based on t-scores whenever we assume that we are dealing with a sample from some statistical population.

I think an understanding of t-tests and the logic behind t-tests is incredibly valuable for understanding more advanced modeling approaches including general linear models. At the very least it teaches students that if a null hypothesis is true (i.e., there really is no difference in mean estimates between two groups because the two samples were drawn from the same statistical population) we almost never expect to obtain identical parameter estimates for the two groups. Two random samples drawn from the same common statistical population will almost always differ due to random chance. This also gets students out of the mindset that all null hypotheses are wrong because one never obtains two mean estimates that are exactly equivalent to each other. We expect mean estimates to differ even when the null is true but the question is whether these differences are large in comparison to random chance for a given sample size.

The same is true if we use a one sample t-test; by random chance it is unlikely that we will ever get a parameter estimate from a sample that is exactly equivalent to that of the population from which it is drawn. I think the process behind the t-test helps in understanding the basis of what a p-value actually represents in frequentist statistics and this understanding is important before getting into more complicated modeling approaches that involve the possibility of more than one hypothesis test (e.g., tests of y-intercepts and slope estimates).
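The logic described above can be sketched as a small simulation in Python. Everything specific here is an assumption chosen for illustration: a normal population with mean 50 and standard deviation 10, ten observations per group, and the tabulated two-sided 5% critical value of t for 18 degrees of freedom (2.101). Both samples are always drawn from the same population, so the null hypothesis is true by construction:

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

N_PER_GROUP = 10
N_SIMULATIONS = 10_000
T_CRITICAL = 2.101  # two-sided 5% critical value of t with 18 degrees of freedom

differences = []
n_rejected = 0
for _ in range(N_SIMULATIONS):
    # Both samples come from the SAME population, so the null is true.
    a = [random.gauss(50, 10) for _ in range(N_PER_GROUP)]
    b = [random.gauss(50, 10) for _ in range(N_PER_GROUP)]
    diff = statistics.mean(b) - statistics.mean(a)
    differences.append(diff)

    # Pooled-variance two-sample t statistic (equal group sizes).
    pooled_var = (statistics.variance(a) + statistics.variance(b)) / 2
    t_stat = diff / math.sqrt(pooled_var * 2 / N_PER_GROUP)
    if abs(t_stat) > T_CRITICAL:
        n_rejected += 1

n_exactly_zero = sum(1 for d in differences if d == 0)
false_positive_rate = n_rejected / N_SIMULATIONS

print(f"pairs with identical means: {n_exactly_zero}")
print(f"typical size of difference: {statistics.stdev(differences):.2f}")
print(f"share with |t| > {T_CRITICAL}: {false_positive_rate:.3f}")
```

The sample means essentially never coincide, yet only about one pair in twenty produces a t statistic beyond the critical value, which is what the 5% threshold is meant to deliver when the null is true.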

@ David:

I think a hybrid pedagogical approach could be to teach single-sample and two-sample t-tests first (and paired t-test, which can be taught as just a single sample t-test in disguise), and then teach general linear models.

@David & @Jeremy: I’d be more inclined to fit the two-sample t-test as a linear model, dissecting the statistical output as we went along the lectures/practicals etc. Given that the t-tests for regression coefficients are conceptually the same as Wald-tests for GLMs, I don’t see the point in starting with the specifics of t-tests, then doing linear models/ANOVA, then doing GLMs (sensu McCullagh & Nelder). The underlying way of thinking, the idea of how frequentist inference is applied, is the same, it’s just the detail that’s different.
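To make the “t-test as a linear model” idea concrete, here is a rough Python sketch with invented data; the function names `pooled_t_statistic` and `ols_slope_t_statistic` are hypothetical helpers written for this illustration, not part of any library. It computes the classic pooled-variance two-sample t statistic, then regresses the response on a 0/1 group indicator and computes the t statistic for the slope:

```python
import math

# Hypothetical measurements for two groups (illustrative data only).
group_a = [4.1, 5.0, 3.8, 4.6, 5.2, 4.4]
group_b = [5.9, 6.3, 5.1, 6.8, 5.6, 6.1]


def mean(values):
    return sum(values) / len(values)


def pooled_t_statistic(a, b):
    """Classic two-sample t statistic with pooled variance."""
    ss_a = sum((v - mean(a)) ** 2 for v in a)
    ss_b = sum((v - mean(b)) ** 2 for v in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(b) - mean(a)) / se


def ols_slope_t_statistic(x, y):
    """t statistic for the slope of a simple least-squares regression."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    residual_var = rss / (len(x) - 2)
    return slope / math.sqrt(residual_var / sxx)


# Code group membership as a 0/1 dummy variable and regress on it.
x = [0] * len(group_a) + [1] * len(group_b)
y = group_a + group_b

t_from_t_test = pooled_t_statistic(group_a, group_b)
t_from_linear_model = ols_slope_t_statistic(x, y)

print(f"t from two-sample t-test: {t_from_t_test:.6f}")
print(f"t from linear model:      {t_from_linear_model:.6f}")
```

The two statistics agree up to floating-point error: the slope is the difference in group means and the residual variance is the pooled variance, so the equal-variance t-test is a special case of the regression.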

I think this is kind of a strange argument. I mean, we don’t train graduate students to have a strong understanding of stable isotopes, and eDNA analysis, and field sampling, and taxonomy, etc. for all possible techniques. No one ecologist can fully understand all of the methods in the field. That’s not a good argument for not introducing methods; instead, it means a given paper has to be reviewed by someone who understands the limitations of a given method. I don’t honestly see any difference here between statistical methods or experimental techniques. I don’t have any shared understanding of how hydroacoustics works, but that doesn’t mean I think people shouldn’t use it, or that the people who do use it are only doing it to impress reviewers. I think there are only three questions that need to be answered when using any method:

1. What is the ecological question you’re trying to answer?

2. Can this question actually be answered using this method?

3. Under what conditions does this method not give accurate answers, and have you controlled for or tested for those conditions?

Not everyone needs to understand every possible model. We need reviewers who can understand a given approach (and the pitfalls with it) so they can evaluate a given model; I hope this will get better over time as people who work in a given sub-field of ecology learn which statistical approaches work for their kinds of data, and familiarize themselves with both the experimental and field methods in their system, and the statistical methods appropriate for those methods.

Honestly, after reviewing for a few years now, I think the most common issues I see statistically are not with overly complicated methods the authors don’t understand, but with people using simpler methods (lm, ANOVA, etc.) and stretching what they can interpret from those models past the breaking point. Errors like saying that a non-significant term means that there’s no effect, or that a small AIC means the model is true, or the complete absence of any tests for possible misspecification of the model. These are the kind of issues that I think do affect the “core logic of the paper”, and don’t seem to be more or less common when working with “complex” or “simple” stats.

Eric I get your point. And I generally agree with you far more often than not.

But I don’t think isotope ratios are quite the same thing as statistics. Measurement methods tend to be correlated with specific ecological questions. Statistics not as much. Let’s say as an editor I get a paper on a specific topic – the effects of global change on biodiversity just to pick my own research field. What you are saying means that if the authors chose to use advanced statistical method X I should then prefer (or at least need) to send the paper to somebody who has used advanced statistical method X rather than somebody who knows a lot about the effects of GC on biodiversity who may know six methods but not method X. Or alternatively send it to that GC on biodiversity expert and just trust the author on their stats. It’s great to say get both types of reviewers but in this day and age of sometimes having to ask 20 people just to get two to say yes, that is often not possible or realistic.

I also note that my argument could be used to prevent methodological innovation and advance so for sure it’s a balance.

But the incremental benefit from using obscure or cutting edge statistics that nobody else who understands the ecological question can understand is usually not worth it in my opinion.

Brian, I think you should aim to get a reviewer for the stats in the same way that you would clearly want a reviewer with expertise in global change & biodiversity.

If we can’t find those reviewers, well, we have a different problem (like let’s not treat the peer-review stamp of approval as being that meaningful). This could be solved by preprints or just publishing stuff and letting scientists use what results they trust. I mean if stats are important to the results being claimed and the stats aren’t reviewed, isn’t this basically the same as just throwing a preprint up on arXiv?

I don’t want to move this off into a debate on publishing. I just find this inability to find reviewers thing an odd argument in favour of not using appropriate stats because they happen to be fancier than the stats knowledge of your GC reviewer?

“just publishing stuff and letting scientists use what results they trust.”

Oh god no. This is getting off into a different topic, but just for the record, I think it would be a bad idea for both individual scientists and for science as a whole to just publish everything and then say to everybody “filter it and evaluate it for yourselves”.

Brian:

I agree with you that there’s a cost to using complicated statistics, in terms of understanding. But there’s also a cost to focusing on simple models, in that I have often seen people use a simple method that they think is answering their question, but is completely incapable of actually informing them about what they want to measure. For instance, let’s say someone is trying to estimate the probability of a lake getting invaded next year, based off a history of invasions of lakes in the surrounding area. The “simple” way to do this would be to use linear or logistic regression to regress current invasion status on lake traits, but this is actually answering the different question “what is the probability of currently being invaded”, and really won’t tell you how likely a given lake is to be invaded next year. Survival analysis is a bit more complex to fit and to explain, but it at least has a hope of actually answering the question you want.

As a graduate student who helped a lot of people with stats, I’ve seen quite a few situations like the latter one, where the simple approach can’t answer the question the student wants answered, but they were hesitant to try an approach that might actually work since their advisor was asking “why can’t you just do a t-test?”. I’ve also seen this type of error crop up in a bunch of papers I recently reviewed, where the authors used a simple approach that could not possibly answer the question they asked.

I think the answer needs to be that we get better at teaching data analysis paired with data collection. Too often, they’re taught as completely separate things, where you need specialized tools to do field or experimental work but that stats are a one-size-fits-all approach that you learn in “a stats class”.

For your question about choosing reviewers: honestly, I don’t think there’s any single right answer there. I think we just have to accept that with 2-3 reviewers to a paper, there’s just no chance that you’ll always have all the knowledge needed to fully critically evaluate a given paper. I do think that any reviewer should be able to evaluate the questions “did the authors clearly describe their question, and how their [field/experimental/theoretical/statistical] approaches answered that question?”, and “did the authors check for possible errors with their [f/e/t/s] approaches?”. I also think reviewers should clearly state what parts of the paper they’re comfortable evaluating, and identify parts of the paper they think need an outside expert to test. I just recently reviewed for a journal where one of the questions they asked was “does this paper need expert statistical review?”, which I thought was a great idea.

@ Eric (and Gavin, and others):

I just wanted to say that it’s hilarious that the man who wrote this: https://dynamicecology.wordpress.com/2012/11/27/ecologists-need-to-do-a-better-job-of-prediction-part-i-the-insidious-evils-of-anova/

is now being viewed as an ANOVA apologist.

Absolutely not accusing Brian of inconsistency or changing his mind, or commenters of not being aware of that old post when they should’ve been! And not criticizing the conversation so far, which I think has been productive and interesting. I’m just amused. 🙂

I think it’s a reflection of how difficult it is for people to convey their full views on a complex, intertwined set of topics on which there’s scope for reasonable disagreement.

@Jeremy: I too was amused by this, having read that and other DE posts on prediction & papers on this area. It’s also why I have mentioned prediction in passing here a couple of times. You aren’t going to get the same results if you predict or simulate from i) a model fitted to log-transformed count response (say), and ii) a Poisson GLM fitted to the count response. You might get similar results from the two models, might be able to do testing of point estimates via model i) without too much trouble, but predicting and simulating data from those models is where things deviate and why I would prefer the more complex GLM.

Jeremy:

I definitely agree on this, that Gavin, Brian, and I likely agree on more than we disagree with regards to statistical approaches. I think this discussion really comes down to relatively subtle disagreements about the relative frequency and cost of mistakes due to using too complex statistics, versus trying to pry an answer out of common methods that aren’t designed for a given task.

I do have to confess that I’ve started to get a bit sensitive to the term “statistical machismo”, as it can be a hammer used to hit anyone working on method development (although I definitely know that wasn’t the original intent). We should at least come up with an equivalent term to apply to people whose approach to stats is “if it isn’t in Zar, why do it?”. Maybe “statistical fuddyduddyism”? “Everything looks like a nail-ism”?

@ Eric:

Usually, when people complain (understandably) about rhetoric like “statistical machismo” (or “zombie idea” or “p-hacking” or etc.) on grounds like yours, they suggest some more neutral, less-provocative term lacking any connotations of personal criticism. But personally, I’d rather be accused of statistical machismo than statistical fuddyduddyism! 🙂

Eric – statistical method development is important and useful. I would be dismayed if my coinage was being used against that.

Maybe it is helpful to go back to the very beginning of what led me to write the statistical machismo post.

Anecdote 1) reviewing a paper for a high profile journal that put the word Bayesian in the title and spent large amounts of space gushing that the major innovation was using Bayesian statistics (when in fact traditional statistics would have worked just as well and gotten an identical answer) and through a couple of rounds of review couldn’t get them to emphasize their biological finding more than their “major innovation” of using Bayesian statistics to do a simple analysis.

Anecdote 2) having reviewers of several papers I co-authored using BBS data tell me that I absolutely had to use detection probability models in complete ignorance of reality. The worst was one who said that I had to abandon abundance so that I could look at presence/absence with detection probability. But several more ignored the fact that I was comparing an index of abundance within one species across space and time and that the spatial scale of my questions (geographic ranges) had no existing data that could model detection probabilities and that obtaining such data would be logistically prohibitive and that I had done several things to address the most severe detection issues (e.g. eliminated night birds and water birds that are known to not be well detected by BBS issues, used a 5 year average, etc). Was it an imperfect analysis? Of course. But it was a very reasonable approach given the data available and question and it had very large effect sizes that were not going to change with methods.

That, to me, is what I call statistical machismo. And by the way I rarely see those issues with statistically sophisticated users like yourself. It’s the converts who don’t really know what they are doing who cause the problems.

Brian,

I agree, and I can honestly see the motivation for the original statistical machismo post. I’m mostly being a grump. 🙂 As a methods person, I will confess that I get too easily excited about interesting new statistical techniques, and quite possibly have erred on the side of using something more complicated than needed (or recommended something more complicated than needed). I also think methods papers often do a poor job of describing the pitfalls or limitations of a given approach, explaining what sorts of questions a method is good for answering, or in acknowledging that there’s other methods out there that can do the same thing.

I did say in the machismo post that, as a reviewer, I try hard to recommend tests for possible errors with the method the author actually used, rather than tell them to “use method X instead”. Even when I think that a given method can’t actually answer the question the authors are asking, I generally recommend possible approaches that could, without saying “you have to do this”.

Eric – I also think a lot of this comes down to the difference between you getting excited about a new method and then either writing a methods paper or using it in an ecological results paper where you explain it well and don’t assume your ecology is better because of the stats (both of which I think are great) vs someone like you but not as smart or nice as you telling me I HAVE to use this method because it’s new and shiny and exists.

One is embracing diversity of methods, one is shutting down diversity of methods.

I’m sure you’re not suggesting this Brian, but it comes across as if we shouldn’t bother with newer (or “fancier”) methods because everyone knows what the older/established methods do and we shouldn’t expect people to change?

I have and do choose to fit linear models using Bayesian approaches. I could get effectively the same result using OLS. Am I a statistical masochist? I don’t classify myself that way because I see a real advantage to framing my analyses from a Bayesian perspective, in part to avoid the tortuous mental agility required to do hypothesis testing with frequentist p-values. I use GAM(M)s because in my world view nothing is linear, and if the truth is close enough to being linear I know that the GAM(M) will tell me that, because statisticians have done the work to show that with modern smoothness selection methods they do the right thing the appropriate amount of time.

The argument you are making with the ‘machismo’ thing is conflating a number of issues and I worry the result will not be ecologists going away with one or both of these ideas: i) understand a technique before you use it, or ii) don’t use technique X for the wrong reasons. Instead I worry they’ll go away with the idea that fancy stats are bad or not worth the time to learn.

I disagree that we collectively understand AIC. I would agree with you if you’d said “as a field we collectively *don’t* understand AIC”. How many times do we see data dredging (a popular R tool that crops up all the time in the literature is named `dredge()`) justified from a misunderstanding of IT approaches? How many ecologists would know that AIC places model selection very much on a prediction footing rather than other, equally valid, approaches to selection? And given the number of times I’ve seen linear models used where fits *within the range of the observed data* give negative expected counts or interpret confidence intervals that exceed 0, or have issues with heteroscedasticity, I’d wager that many practitioners in ecology don’t have the first clue what they’re doing when it comes to linear models. They may think they do, but clearly they don’t.

To focus on specific techniques and what amounts to rote learning is, I believe, *the* fundamental failing of statistical teaching in applied settings. All too often statistics is portrayed as a bag of procedures or tests that we can distill down to a flow diagram to be followed at all costs. Stats is more of an art form than this implies, where experience counts for a lot. I want my students to gain a conceptual understanding of some general statistical principles, not learn a bag of tricks that fails them the first time that, inevitably, their data isn’t as well behaved as the classroom examples (which is all the time), or they come across a new approach and have no way to appreciate it.

Limiting ourselves to a core suite of tools only helps us communicate if all our problems look like nails. Most problems don’t look like nails, so communicating as if they were nails seems utterly pointless to me.
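To make the point about negative expected counts concrete, here is a minimal sketch in Python. The counts are invented purely for illustration, and the hand-rolled least-squares fit is a stand-in for whatever software would actually be used; it only shows how a straight line fitted to count data can predict negative counts inside the range of the observed predictor.

```python
# Invented count data (illustrative only): abundance drops quickly with distance.
distance = [0, 1, 2, 3, 4, 5, 6, 7, 8]
count = [40, 22, 12, 6, 3, 2, 1, 0, 0]


def ols_line(x, y):
    """Ordinary least-squares intercept and slope for a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = ols_line(distance, count)

# Fitted values at the observed distances only (no extrapolation).
fitted = [intercept + slope * d for d in distance]
negative_at = [d for d, f in zip(distance, fitted) if f < 0]

print("slope:", round(slope, 2))
print("distances with a negative fitted count:", negative_at)
```

A model for counts (for example a Poisson GLM with a log link) cannot produce negative expected values, which is one reason the choice of model matters even for simple-looking data.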

Gavin – I agree with much of what you say from overemphasis on p-values to the failings of AIC (https://dynamicecology.wordpress.com/2015/05/21/why-aic-appeals-to-ecologists-lowest-instincts/) to wanting innovation to using GAM myself (not so much GAMM – mixed models are unthinkingly way overused in my opinion).

I think we just disagree on more contextual issues:

a) how much there is pressure in ecology to use fancier techniques for non-good reasons (see Mark Brewer’s comment below and he is a statistician who works in many environmental and social science fields)

b) how much training we can reasonably expect 80% of the ecologists to have

c) how much (a) and (b) combined lead to people doing really poor analyses that nobody else can catch (which they probably could in a simple shared language world be it OLS or Bayesian linear models)

d) the relative importance of reaching 80% of the field instead of 10-20%

@Brian

I’m not disagreeing with you on a) — I’m just not as familiar with that mentality as you and others here suggest you are. Anything that is being done to in some sense game the system is anathema to me.

Re b) It’s not so much how much training but what training. We should be requiring training in Bayesian methods and GLMs (the real GLM allowing non-Gaussian responses too), and computing in R or similar environment. I don’t think this is unreasonable. Can we achieve this next semester? No. Should we aim for that? Yes.

Re c) I don’t doubt this, but restricting ourselves to non-fancy stats doesn’t solve the problem because people are still messing up the non-fancy stats and reviewers etc. are still missing it. Collaboration should have a greater emphasis and we as a community should value those that do collaborative science that is often seen as less worthy. We should bring on board a statistician or well-trained statistical ecologist just as we’d go out and collaborate with taxonomic expert X or natural history expert Y or fancy-high-throughput-microbiological method Z.

On d) I don’t set out to exclude 80% of the field when I do my work. I’m pretty sure well over 80% of palaeoecologists have no idea what some of my GAM-based analyses are doing for example. But the alternative, in that sub-field at least, is to often not even do stats, or if they do stats, inappropriately claim findings when they fitted a straight line through 15-20 heterogeneous and autocorrelated data points. So, I see it as doing my absolute best to not be wrong (realising that I may well get the wrong answer), then trying my absolute best to communicate how I came to those results, what the constraints, caveats and limitations of my methods are, and being very clear about uncertainties.

Lots of comment on training here. I’m going to take a devil’s advocate position on what is appropriate training in an undergraduate biology major. What percentage of bio majors get biology-related jobs? Of these, what fields really require models more complex than a simple t-test? Environmental sciences, fisheries, marine sciences, wildlife biology, etc. have their own majors. Biology majors at big research universities tend to go into medicine. Training in glm and gam and multilevel models is far less important than a basic class that rigorously trains pre-meds in quantitative thinking, because our brains just aren’t very good at this (dichotomous thinking, base rate fallacy, what the hell a p-value is, etc. etc.). Of those that go into biology, most are probably biotech related. A substantial fraction of those with only a bachelor’s are doing tech work: pipetting or cleaning zebrafish tanks. These students also do not need training in glm and gam and multilevel models. Of those doing any kind of analysis at the bachelor’s level, the bulk of it will be t-tests using standardized protocols in the lab/unit. These students also do not need training in glm and gam and multilevel models. Yes, ecology PhDs are not trained well for being a modern ecologist at the undergraduate level, but this doesn’t mean we need to require glm, gam, and multilevel model training at the undergrad level. Ideally, the coursework could be made available in a sequence of classes but this would be difficult to implement in many universities because the enrollments would be tiny. That said, I would love for the one stats class that undergrad majors take to replace the “use this flow diagram to determine the appropriate test and report your p-value” approach with just teaching the linear model and reporting coefficients, contrasts and SEs or CIs (sure, a p-value is fine as one tool among many others).

Good points.

I totally agree with your last point.

I would just add, to all of those who say lack of expertise in GAMM and Bayesian Hierarchical and etc is a training deficit, what you are really suggesting is that every ecology graduate student should take about 3 semesters of statistics. That is more courses than some graduate students take total. And I say that as somebody who teaches an advanced graduate stats course (basically a 2nd semester grad stats course) that tries to cover most of this material, but realizes just how superficially I have covered it.

@ Jeff:

Here at Calgary, we’re fortunate that as a big uni we have a sufficient number of ecology majors to offer them a required upper-level biostats course that covers general linear models (and a bit on generalized linear models and PCA). But at the intro level, our intro biostats course is big (144 students/semester, in 1 lecture section plus a bunch of lab sections), and it’s majority premed students who are there because many med schools now require a stats course. So yes, the considerations you raise absolutely do constrain what sorts of stats we teach at the undergrad level.

I’m still waiting to hear from folks like Gavin or Chris what statistical topics they’d like to see ecology majors (or fish & wildlife, or envi. sci., or ecologically-oriented biology majors) cover as undergrads. In the context of a N. American undergrad curriculum in which students are obliged to take many courses in areas outside their major. And whether it’s a matter of students being taught the stats in dedicated biostats courses, or in the course of the labs in their ecology courses. If you want to de-emphasize stats associated with fairly simple experimental designs at the undergrad level, what do you teach instead? How advanced a level do you think the typical undergrad in those areas should be able to get to by the time they graduate? And if you think substantially more class/lab time should be devoted to statistical topics than is currently typically the case, what do you think we should axe from the curriculum in order to free up time? All honest questions, not rhetorical ones. It sounds like you (and other commenters) want to reimagine undergraduate statistical training in ecology and allied fields. What do you imagine?

Yep, these are the questions that are haunting me now, which I have some time to think about because I don’t run a big research program. My only correction to your questions is that I don’t want to de-emphasize simple experimental designs – but simply analyze them emphasizing a linear model and its coefficients instead of ANOVA and F-values.

All good questions, and I do not think the answers are simple. I basically agree with the ideas you and Jeffrey Walker are hashing out though. If I had to sketch out a learning flow: I would start rather conventionally with probability, but introduced in a way that leads to statistical inference more readily. Kind of like how you can either take *mathematically rigorous* linear algebra (which happened to me by accident:)), in which you will sink or swim depending on your abstract algebra aptitude, or computational linear algebra, which teaches you what you need for modeling. Then introduce likelihood and Bayes as frameworks for inference, but proceed with Bayes because of its easier generalizability (yes I understand this is to some extent taking a partisan stance in a foundational dispute. But it undeniably works.). Here the key idea is that you need to perform high-dimensional integrals computationally, and you give a heuristic over-view of MCMC as a tool for doing that. You then move from estimating parameters of distributions given data, to estimating functions of parameters of distributions given data, the simplest example of which is linear modeling. At that stage, you connect to some classical ideas like linear regression, ANOVA, t-tests, but move along rapidly to modeling parameters. There you use complete, no or partial pooling to teach what “random effects” actually means mathematically. At this point, you have imparted the essence of the entire field of GLM(M)s. You then work through a ton of examples and applications to try and teach as many of the nuances and details that matter the most for the kinds of applications students are likely to encounter. So you work on univariate and multivariate models, interactions, model checking, information theory basics etc. I don’t know if this is one or two semesters.
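As a rough illustration of the complete, no, and partial pooling idea mentioned above, here is a small Python sketch. The site data are invented, and the within-group variance `sigma2` and between-group variance `tau2` are simply assumed to be known; in a real hierarchical model they would be estimated along with everything else, and the overall mean would not just be the raw grand mean used here as a stand-in.

```python
# Invented measurements at three sites (illustrative only).
groups = {
    "site_a": [10.0, 12.0, 11.0, 13.0, 14.0, 12.0],
    "site_b": [20.0],  # a single observation
    "site_c": [15.0, 17.0, 16.0],
}

sigma2 = 4.0  # ASSUMED within-site variance (would normally be estimated)
tau2 = 4.0    # ASSUMED between-site variance (would normally be estimated)

# Complete pooling: ignore sites and use one overall mean.
all_values = [v for values in groups.values() for v in values]
complete_pooling = sum(all_values) / len(all_values)

# No pooling: every site gets its own mean, estimated in isolation.
no_pooling = {g: sum(v) / len(v) for g, v in groups.items()}


def partial_pool(group_mean, n, overall_mean, sigma2, tau2):
    """Precision-weighted compromise between a site mean and the overall mean."""
    weight = (n / sigma2) / (n / sigma2 + 1.0 / tau2)
    return weight * group_mean + (1.0 - weight) * overall_mean


# Partial pooling: sites with little data are pulled harder toward the overall mean.
partial_pooling = {
    g: partial_pool(no_pooling[g], len(v), complete_pooling, sigma2, tau2)
    for g, v in groups.items()
}

for g in groups:
    print(g, round(no_pooling[g], 2), "->", round(partial_pooling[g], 2))
```

The site with a single observation moves furthest toward the overall mean, while the well-sampled site barely moves. That is the behaviour a "random effect" for site encodes.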

@Chris FYI, your class idea sounds almost exactly like the layout of Hobbs and Hooten’s Bayesian Models: A Statistical Primer for Ecologists. https://press.princeton.edu/titles/10523.html

Which I highly recommend to any Ecologists still interested in Bayesian approaches after this comment thread :p

hahaha, I should definitely have a look at that Hobbs and Hooten book! I have seen it around, but never dipped into it.

@Jeremy: Do I want to reimagine undergraduate statistical training in biology (or, my true academic home, in geography)? You bet I do 🙂

Intro Stats for Biologists would start with data and plots to get a feel for working with numbers. I’d start at the general linear model — I really don’t see any reason to deny my students the benefit of a synthesis in stats that happened 60+ years ago now — but I’d do it in a way that allowed me to link back to the classical tests (one sample t via intercept only LM, two sample t via LM, etc) where appropriate. I would teach them about statistical inference, and I’d stick with frequentist approaches here, Bayes can wait a little longer before he rears his head in earnest. We’d talk about interpreting the model outputs, simulating from models to get a handle on uncertainty & how well it approximates the observed data. We’d cover diagnosing problems with our model. We’d talk about effect sizes, power, confidence intervals and p values. We’d cover appropriate graphical techniques and how to report the outputs. We’d use lots of examples from across biology. I’d emphasise the practical aspects over the theoretical.
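The link between the classical tests and the linear model can be shown in a few lines. The following Python sketch uses invented data and hand-written formulas (not any particular stats package) to check that the pooled-variance two-sample t statistic equals the t statistic for the slope of a linear model with a 0/1 group indicator.

```python
import math

# Invented measurements for two groups (illustrative only).
control = [4.1, 5.0, 4.6, 5.3, 4.8]
treated = [5.9, 6.4, 5.2, 6.8, 6.1]


def pooled_t(a, b):
    """Classical two-sample t statistic assuming equal variances."""
    na, nb = len(a), len(b)
    mean_a, mean_b = sum(a) / na, sum(b) / nb
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((x - mean_b) ** 2 for x in b)
    pooled_var = (ss_a + ss_b) / (na + nb - 2)
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean_b - mean_a) / se


def lm_slope_t(x, y):
    """Slope and its t statistic from a least-squares fit of y on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_slope = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / se_slope


# Code group membership as a 0/1 indicator and fit the linear model.
indicator = [0] * len(control) + [1] * len(treated)
response = control + treated

slope, t_lm = lm_slope_t(indicator, response)
t_classic = pooled_t(control, treated)

print("difference in means (LM slope):", round(slope, 3))
print("t from linear model:", round(t_lm, 4))
print("t from two-sample test:", round(t_classic, 4))
```

The slope is the difference in group means and the two t statistics agree, which is the sense in which the two-sample t-test is a special case of the linear model.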

I’m assuming they’re going to get more probability somewhere else, and that we can cover the computing requirements needed appropriately.

If we had two semesters for this we could really double down on the examples and computing. But we could do this in one semester with a little trimming if needed given numbers of students in your programme etc. I also think we might need some refresher materials to cover things like the mean, standard deviation etc. that they’ll have gone through before but may need revising. These would be available as extra course materials; I wouldn’t see it as important to give up class time for those.

Sampling and experimental design could come later, in a second course, ideally reinforced by embedded elements in other lab classes being taken.

Multivariate methods and GLMs and (G)LMMs & Bayesian topics would be in later courses at 300 and 400 level. At these levels, especially 400, I anticipate this as giving a flavour of these methods in applied settings, in smaller classes. I’d make a 300-level GLM course mandatory, but multivariate methods, (G)LMMs, Bayesian etc. would be optional, perhaps with multivariate methods being required if you’re doing ecology, say, and you had a strong community ecology theme for example. (I know it may seem like heresy given my background and activities in the R community, but I’ve emphasised GLM(M)s in my teaching since coming to Regina and haven’t done any multivariate teaching.)

@ Gavin,

Thanks for sharing your views on this. I don’t agree with all of them, particularly putting sampling and experimental/study design off until later. But I agree that what you’re suggesting for your proposed first semester could be covered in 1 semester. And given that Calgary requires two terms of biostats courses of its ecology majors, by the end of which they’d get through generalized linear models, obviously I think it’s feasible to require undergrads in some biology majors to get up to that level.

Putting off sampling and experimental design probably reflects my background where observational data has been the norm. Areas like experimental design and sampling logically go hand-in-hand with lab or field classes, so whenever they get taught, having that linkage is critical. Personally I’m not totally convinced you need to know design and sampling to cover the topics I covered. Yes the math behind the methods relies on aspects of design or unbiased sampling etc, but can’t we take that as given for now(?), and then in a later course spend the time needed to explain how to design a good experiment, and how to do sampling well?

@ Gavin:

Even if you’re not planning to emphasize experiments, you don’t want to teach a bit about study design right up front? Especially importance of random sampling! It’s so easy to inadvertently sample non-randomly from the population of interest if you’re being casual about your sampling, and those errors are *so* consequential (way more consequential than, say, assuming normality when actually the residuals have some other distribution).

I’d bring up how we sample data briefly in the intro lecture. Then I’d be clear that we’re assuming the sampling was done correctly (random sampling) when we went through the material I outlined above. I’d rather focus on getting a good grounding established on the basic statistical ideas without having to also juggle sampling schemes or experimental design. Once we’ve covered these basics, I feel it’s easier to then talk about how we collect data.

@ Gavin,

Ok, let’s agree to disagree on this. Not that we’re teaching every possible aspect of study design up front in intro biostats. We don’t cover, say, hierarchical experimental designs. But we definitely cover more than just “here’s what random sampling is, assume for purposes of this course that it’s always done”. And it doesn’t take *that* long, honestly.

Gavin – the outline of the entry level course that you give *is* my applied biostats course, which is actually a 2nd course for the students but I envision it as a first course (they don’t get much out of the first course) – this was summarized by the last line in my initial comment in this thread where I wrote “I would love for the one stats class that undergrad majors take to replace the “use this flow diagram to determine the appropriate test and report your p-value” approach to just teaching the linear model and reporting coefficients, contrasts and SEs or CIs (sure a p-value is fine as one tool among many others).” We plot, we linear model, we model check with plots and fake data, and we interpret the biological consequences of effects including both boundaries of the CI. We talk about NP vs. Fisher, the usefulness or not of alpha and the phrase “statistical significance” and the fallacy of dichotomous thinking. We use the phrases “multiple regression” “anova” and “ancova” to link certain aspects of the model to classical ways of thinking. We introduce the bootstrap and cross-validation. If there is time, we end with short intros into GLMs for logistic or poisson type outcomes and LMMs for simple blocked and repeated measures designs. And we do this in R (not R-commander or any of the other entry level software because these just aren’t sufficient).

I should add following up on Chris’ post above that replacing the “flow diagram to which test” strategy with a “teach the LM” strategy in the one undergrad stats class gives all students a much better foundation for more advanced classes in GLM, multi-level, and even Bayesian, etc. because, as Chris emphasizes, it is the start of a general, comprehensive framework.

Ok, thanks, there’s one answer to my question of what to teach undergrads. Teach general linear models in intro biostats, instead of teaching t-tests, regression, ANOVA, etc. as separate things. I think that’s doable (challenging, but doable), and that on balance it might well be an improvement on current practice at Calgary (which is current practice at many places). Though you’d at least have to mention in passing that, for historical reasons, this special case is called ANOVA, this other special case is called linear regression, etc.

Totally agree; challenging but do-able. The flow-chart strategy is just so easy to teach and implement. I also don’t think at the present we can just switch to teaching using a general framework (lm) without making connections with classical tests regression and ANOVA. Our whole language describing designs uses these terms. An argument that I see is that we have to teach the flow-chart strategy because this is what 99% of applied biologists use. I disagree, as I think the only way to break the cycle is to start at the beginning. One could also argue that nothing is broken…science is progressing fine. This is a more interesting argument. Regardless, I’d like to see NSF/NIH pump some money into thinking about how to re-imagine statistical education.

Jeff, here’s a new drug that you need to treat condition X. It works, honest, because we grabbed a bunch of people and did a t-test, here look at this p value. Now tell me you’ll take that drug!?

Technicians often are expected to do preliminary analyses or work up data; they often aspire to progress up the career ladder and often end up using more of those skills. Also, what happened to wanting to train a better workforce? Why should having biology majors graduate knowing lots of detail on evolution and microbiology, or the minutiae of certain ecological principles, be more desirable than having them be a bit smarter than the previous generation when it comes to data analysis and stats?

Actually I’d feel great taking the drug based on a simple t-test if the experimental setup was strong (double blind doctor/patient, controlled placebo, good sample size, good completion rates, unbiased participant selection) and effect sizes/variance explained measures were reported as well as p-values (Jeff has also emphasized this last piece as well). Might have a preference for a simple regression that included control variables like age and gender as is standard in the medical field. But that is hardly fancy statistics.

The confounding was one area I was thinking of where we need more than a t-test. My general point is that, no, t-tests are not sufficient even in simple cases like this. And yes, in the grand scheme of things this is not fancy stats, but it does require more than basic statistical understanding.

Hi Gavin – I don’t think anything I have written advocated even approximately teaching students to think like your drug story. I specifically emphasized teaching the fallacy of dichotomous thinking and to start statistics education using the general linear model as a framework for future expansion.

1. My point was that GLMM and Bayesian modeling would only be of service to a tiny fraction of biology majors because most biology majors graduating from American universities do nothing approximating research or analysis after they graduate.

2. Even for those going into research, GLMM and Bayesian modeling would only be of service to a tiny fraction of researchers. Most biological research is in cell physiology. The field has made ginormous progress using no statistics at all, but now often supplemented with little more than t-tests combined with remarkable control of cell systems. I absolutely agree that the field could use more statistical and especially mechanistic modeling but this is mere icing. As an aside, I would argue that the related fields that have made the least progress (genomics, transcriptomics, microbiomics, GWAS, Encode, etc.) are the ones relying the most on sophisticated statistical models in place of good experiments at the bench.

– why statistical modeling? In many areas of biology, training in mechanistic modeling would be much more fruitful. So this would require classes in linear algebra and differential equations.

– of course we would need to add classes on programming

– Given 1 & 2- how would we justify increasing the added course load to other department faculty, most of whom are perfectly fine with teaching t-tests because if you need more than this (or even this), you need to search for a better model system or think of a better experiment?

3. I’m not sure that biology students are taught much detail about microbiology or evolutionary biology, at least at most American universities. We have very delayed learning here. I don’t really think any learning at all goes on until college! Take a look at physiology textbooks that are used for upper level general physiology courses at most American universities. These are painfully elementary, barely (barely) above the level of an intro bio textbook and while there are a few equations there is nothing approaching the concept of modeling.

4. At most American universities, incoming biology students are woefully underprepared for math. I suspect most Biology PhDs had calculus in high school. At many universities, incoming biology majors cannot even start calculus because they cannot pass the pre-requisite pre-calc test. Many cannot even start with pre-calc because they cannot pass the “college algebra” test.

5. If we required a series of calculus-based modeling classes for biology majors, our number of majors would drop to that of physics departments.

* Most American universities are not MIT or Caltech or Williams.

@ Jeff:

1. this gets back to Brian’s old post on what math ecologists should teach: https://dynamicecology.wordpress.com/2014/10/20/what-math-should-ecologists-teach/

2. I do think that some of the comments on this thread about what we can teach undergrad biology majors in 1 or 2 required courses at N. American colleges and universities may be based on unfamiliarity with the background mathematical preparation of the bulk of N. American biology majors. Or at least on optimistic assumptions about the mathematical preparation of the bulk of N. American biology majors. But things may well be different in Europe and elsewhere, where it’s my understanding that undergrads specialize more in their major subject.

3. Go Ephs. 🙂

@Jeff Apologies, I was being flippant. I don’t think all Biology majors should do GLMMs and Bayesian methods. If I said that or it came across that way I’m sorry, some of the comments have covered intro stats classes and grad classes so I may have mixed up my thoughts.

I do think that biology majors should be getting GLMs (sensu McCullagh [spelled it right this time] and Nelder) as required contents.

I’m not advocating teaching a bag of tests or models. I’m advocating for teaching ideas and concepts. I don’t feel that you can do that by teaching the menagerie of ANOVAs, regression, t-tests etc as separate topics/tests.

Emphasis on math is, I feel, misplaced. I’m not teaching statisticians. Having good math background might help, but it’s certainly not a prereq for doing applied stats.

Whilst I haven’t worked in the US, I am at a small Canadian school now having studied and worked at one of the UK’s top research-intensive universities, so I have seen quite a range of abilities from incoming students. My experiences in this regard are what colour my views expressed through this comment thread.

Via Twitter. Too short for me to fully grasp, but I suspect Brian and this commenter will have to agree to disagree about statistical machismo…

I did find it a bit amusing/ironic that the two most recent Dynamic Ecology posts are:

“What new technology will soon change ecology the most?”

and

“are statistics in ecology papers becoming too difficult?”

As balance, you might need to write posts on: “what new statistical approaches will change ecology?” and “Is technology in ecology becoming too complicated to understand?”. Which actually would be pretty interesting posts as well…

(note: this isn’t meant as a critique of your topic choices! I just found the juxtaposition amusing).

An interesting* exercise would be to go back through all our posts looking for cases in which we said one thing at time t and another at time t+x without explaining ourselves.

*For “interesting” read “crazily obsessive”

Via Twitter, Mark Brewer reports that ecologists reviewing his papers think his statistical methods are too simple (thereby pushing him towards what he’d view as statistical machismo), but reviewers from other fields think his methods are too complicated:

Via Twitter. Ironically.

People’s experiences with reviewers pushing for complex vs simple statistics clearly varies. I’m gonna poll on this next week.

Via Twitter:

Our answer to that question is in this old post: https://dynamicecology.wordpress.com/2013/12/19/ask-us-anything-what-statistical-techniques-does-every-ecologist-need-to-know/

And in reply, Gavin suggests that I (and probably many other ecology faculty) should retire: 🙂

Ok, I’m kidding, of course. Mostly.

Applying for my free bus pass now…… 😦

Sorry, I was one of the Twitter commenters but I feel that I should probably elaborate my position here somewhat.

I must admit that I am amazed to hear the responses on this blog post from people that say that they feel the need to use complex methods in order to get published/funded. Admittedly my sample size is small but my experience has been exactly the opposite: I deal primarily with observational data rather than stuff that comes from nice planned experiments. As a result some of the statistical models I need to use in order to deal with uncertainties surrounding those data can get reasonably complex (although I make no claim to be a stats wiz). In the few cases where I’ve had negative reviewer comments, I have had complaints that I should be using a simpler method. In one notable grant application I was rejected because one reviewer took great exception to us proposing to use Bayesian analytical methods (*not* because he/she didn’t like our model but they didn’t like our inference framework!).

Now regardless of whether you guys agree that I should be using the more complex models or not, my impression has been that the default position of reviewers is to prefer simpler analytical methods, often to a fault. Yes, you should always justify your model complexity, but my feeling is the community is very comfortable with things like AIC/WAIC/regularization that already have these penalties built in.

When I was an undergrad I remember my close-to-retirement statistics professor complaining that the availability of SPSS and Minitab meant that students were too commonly using these fancy new two-way ANOVAs with interaction terms and that good experimental design should’ve meant that such analysis shouldn’t be needed. Fast forward to 2017 and we’re seeing the same arguments here except swapping SPSS with R and two-way ANOVAs for GAM(M)s/GLM(M)s etc.

Incidentally, saying “a t-test gives the same result as fancy method ‘X’ so you should use a t-test” is not a good argument. We could also reject a null hypothesis on the basis of a coin flip but nobody would argue that we should be using the much simpler coin flip model of statistical inference. The statistical model you use should be as complex as you need to accurately model the fundamentals of your system and we already have good methods for avoiding spurious over-specification.

Clearly we need to do a poll on how often people are pushed by reviewers to simplify vs. complexify their statistics!

Wow; I’m stoked to see that my humble questions sparked off such a robust debate. I hope I’m not too late to the party to comment. I’ve found all the viewpoints expressed super-informative, but Nate’s comments about listicles and bad writing in statistics methods sections, and Gavin Simpson’s comments about “reimagining” undergrad statistics really resonated.

That said, let me fill you in a little on where my questions came from:

[1] – I teach a fourth year undergraduate methods course called ecological methods, with an enrollment of 15 – 20.

[2] – I adopt basically the graduated method that Gavin Simpson talked about above, emphasizing statistics as the set of methods we use to cope with the inevitable variability of nature. So students do a lot of exploratory analysis before getting into actual tests, which are focused on frequentist methods (simple linear regression, t-tests, ANOVA, ANCOVA, some non-parametric tests, and a personal analysis project at the end of the semester). I also introduce the idea of simple ecological models and nonlinear regressions. I usually find that students’ brains start to implode around about the time we talk about ANCOVA, and previous attempts to teach multivariate statistics were less than successful (I have come to believe such stats need their own dedicated course).

[3] – I use real data sets throughout that are in the public domain, from my own research, or which have been donated to me. IF ANYONE KNOWS OF ANY SUCH DATA, I AM ALWAYS BUILDING MY COLLECTION

[4] – I teach using R, in part because that is what I use, and in part because SPSS has been progressively wrecked and changed over the years since it was taken over by IBM.

[5] – I also inject a component of statistical and study design philosophy into the course, and most of the papers I include in the course reader deal with ecological issues that are controversial or under debate. American Scientist is a great source of such papers. I also include classics like Platt’s paper on strong inference, and compare it to the more “holistic” approaches typical of ecology. I also have a recommended textbook, Gotelli and Ellison’s A Primer of Ecological Statistics, which takes a solidly frequentist approach.

[6] – I think that it is important to know who our students are. In my case they are supposed to have done at least one course in statistics before coming to me, but that doesn’t mean that they have retained much of what they supposedly learned. This is revealed in a little informal test I give them on the first day of class. About half do not know how to define or interpret a p-value, and a distressing number talk about “proving” hypotheses (I know, it’s horrific). In consequence I have found it necessary to devote at least 20% of the course to remedial review of material that they supposedly learned already.

[7] – I have also noted that while they may have learned some statistics and then forgotten them, they may never have received adequate tuition in why we do data analysis in the first place. So they have had a lot of how but hardly any why. They have also received little training in how to handle and archive different types of data. In short, their understanding of how and why to analyze data sets is rudimentary, to put it mildly. Maybe your students are different, but that is where mine are at.

So now you know where my questions were coming from. If anyone is interested to see my course outline and the reading list of papers I use, feel free to email me at a.park@uwinnipeg.ca.

One final note. As someone who reads a fair few papers, I have noted that the sophistication of statistics does not appear to be helping people explain more of the variability in their data. There seems to be a tendency to emphasize the 4 – 5% of variance that is explained, usually hammering the point home with a confident declarative article title. But then when you look at the graphs, you see an explosion of points that makes it clear that really, the majority of variability in the data is unexplained (stochastic, or missing variables). And when I look at such graphs, I find myself saying, right, the real story is one of individual variation…
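The pattern described above, a confident "significant" effect sitting on top of a cloud of points, is easy to reproduce with simulated data. The sketch below is illustrative only: the sample size (2000), the slope (0.23), and the unit noise are assumptions chosen so that the predictor explains roughly 5% of the variance, and the data are simulated, not taken from any paper.

```python
import math
import random

rng = random.Random(42)
n = 2000

# Simulated predictor and response: a weak linear signal buried in noise.
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [0.23 * xi + rng.gauss(0.0, 1.0) for xi in x]

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

# Pearson correlation, variance explained, and the t statistic for the slope.
r = sxy / math.sqrt(sxx * syy)
r_squared = r * r
t_stat = r * math.sqrt((n - 2) / (1.0 - r_squared))

print(f"variance explained: {100 * r_squared:.1f}%")
print(f"variance unexplained: {100 * (1 - r_squared):.1f}%")
print(f"t statistic for the slope: {t_stat:.1f}")
```

With a sample this large, the t statistic should fall far beyond any conventional significance cutoff even though the great majority of the variance remains unexplained, so the p-value alone says nothing about how much of the story the predictor tells.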

>>>>I use real data sets throughout that are in the public domain, from my own research, or which have been donated to me. IF ANYONE KNOWS OF ANY SUCH DATA, I AM ALWAYS BUILDING MY COLLECTION

Teaching Issues and Experiments in Ecology has a Figure Sets and Data Sets section for this exact reason (http://tiee.esa.org/vol/toc_all.html) – to build collections of data that have been curated for teaching. Using data in teaching has become much easier with the accessibility of supplements and Dryad, but it still takes a lot of work to clean it up for students to work with and identify the learning outcomes. The more we can share these resources, the better!

This is one of my new favorites: https://ourworldindata.org/

The visualizations are stunning, and each figure has the data available right there.

I will shamelessly promote a QUBES group that is working on this issue (increasing the sharing of data-centric teaching materials: https://qubeshub.org/groups/data_incubator). We had a session at ESA with some fantastic speakers talking about resources for teaching and all of their presentations are posted.

For making literature more accessible to students, particularly intro students, one of my favorite resources is Science in the Classroom: http://www.scienceintheclassroom.org/

Thanks for these links. I mine the Dryad repository, but many/most of the data sets are from complex studies (not just complex analyses), so I also have to work to simplify the data for the kinds of analyses in applied biostats, and sometimes I just create fake data modeled on that in a Dryad data set.

As an agronomist who has tried to follow general trends in ecology, I have definitely noticed this and agree with Brian: “It’s increasingly hard to find papers that ecology faculty or ecology graduate students can read or understand.” I end up wondering whether it is really that complicated, or whether the author is trying to hide something.

This.

https://www.nature.com/magazine-assets/d41586-017-07522-z/d41586-017-07522-z.pdf

I broadly agree with most of those recommendations. But I also have to note that you can find pieces like that even within ecology for the last 40 years. And yet nothing has changed.

There is an extreme cultural conservatism around basic inferential methods.

In general, within the topic of this post, a lot of statistical practice is derived from culture rather than a deep understanding of statistics. I’m not aware of an academic who really studies that.

It’d be interesting to take a comparative approach to rates of change in different aspects of scientific practice, or even academic culture more broadly. What’s changing fast and what’s changing slowly? Are there any commonalities that explain why some things change fast and others slowly?

Would somehow need to put rates of change in different variables on a comparable scale.

You’d think there’d be a literature on this. There’s a literature on everything!

Opening bids for things that change slowly (compared to how fast other things change): basic inferential methods.* Journal hierarchies within fields. Others?

*But see that recent text-mining exercise showing that ecology had peak ANOVA back in 2002 and that AIC is taking off rapidly.

Brian, I absolutely agree that an understanding of statistics, including how to use and interpret analyses, is deficient in most fields. It just isn’t taught within that framework anymore. Neither is the concept that sloppy data results in sloppy science, no matter how it is analyzed.