Here are the stage-setting readings and videos I use to kick off my intro biostats course. I want to give the students an engaging overview of what statistics is all about. I also want to get them thinking about how statistics connects up with their other biology courses, and with their lives. These readings and videos are an optional-but-strongly-encouraged supplement to my intro lecture.

David Vaux: Know when your numbers are significant. Overview of basic concepts of descriptive statistics and null hypothesis testing, aimed at biologists.

Amelia McNamara: Do you know nothing when you see it? Covers the same broad territory as Vaux’s piece, but in the form of a video. Also goes beyond Vaux’s piece to cover bootstrapping.

Mona Chalabi: Three ways to spot a bad statistic. TED talk by a data journalist. I like this for several reasons. It’s by someone who’s not an academic, and it’s about the use (and abuse) of statistics in people’s everyday lives. I don’t want students thinking that statistics is just for biology research. It engages seriously with, and rebuts, the claim that statistics–all of it–is inherently elitist and misleading, just a way for the powerful to bamboozle and control the powerless. Also engages seriously with the claim that all statistics are misleading because, by design, they fail to capture the uniqueness of individual lived experiences.

Kieran Healy: The kitchen counter observatory. Covers some of the same territory as the Chalabi video. It’s about how data can bring you closer to, rather than distance you from, the reality of individual human lives. Especially during a pandemic. Also a good piece for letting students know just how much data is available online, for free. And it’s lovely writing.

Andrew Gelman: A world without statistics. Here’s a statistician questioning whether statistics is actually all that important in the grand scheme of things. My hope is to surprise students a little, and so get them thinking, by giving them a contrarian piece from an unexpected source.

Joel Cohen: Mathematics is biology’s next microscope, only better; biology is mathematics’ next physics, only better. Not just about statistics, though statistics figures into it. Good contrast with the Gelman piece. I like giving the students readings expressing a range of different (and sometimes conflicting) views.

Ben Bolker: Other people’s data. This one probably resonates the least with undergrad students in intro biostats, because it’s aimed at Ben’s fellow quantitative ecologists. But I think it’s a good complement to the Cohen piece, because it’s about the human side of doing statistics (and other sorts of math) as a biologist.

Anyway, those are the pieces I use at the moment. But there’s a whole world of material out there, most of which I’m unaware of. So what readings, videos, or other materials do you use to kick off intro biostats? Looking forward to your comments.

I truly enjoyed the readings and videos you posted, Dr. Fox. As a non-academic statistician, I appreciate your plug for my ilk. I was also impressed by your proper use of the term “statistics” as a singular noun, as a great many academics have incorrectly criticised my use of the term in this singular context.

Thank you Steph, glad you liked the linked readings and videos.

Re: “statistics” as a singular noun, it depends on the context, right? The scholarly field “statistics” is singular–there’s just one scholarly field, and that’s its name. But if I take a sample and calculate the mean and variance, I’ve calculated two sample statistics, plural. Right? I’m double-checking this because I hadn’t actually thought consciously about it until just now. So now I’m slightly worried that I only got it right by dumb luck in the post, and that I’ll get it wrong next time if I don’t stop and think about it!

Commenters may now commence debating whether “data” is singular or plural. Here’s what I’ll be doing during the debate: https://tenor.com/view/theitcrowd-mauricemoss-popcorn-ready-watching-gif-12592609 🙂

I’m putting together a new course for this fall and I’ve been trying to compile such a list, so thanks! I have in the past assigned short videos by Hans Rosling, simply because he was just so excited about statistics. His videos are a bit old now so the visual quality I’ve been able to find isn’t great. I have found a lot of nice short articles in “Significance” magazine that I may use. I don’t have the references with me right now but some articles about Galton and eugenics can jump start conversations. Anything accessible about the Sally Clark or other “cot death” cases makes a great example of “the prosecutor’s fallacy” (and they bring in medicine, which helps engage the pre-health students). The case of Harold Shipman, a medical doctor who was a serial killer, is a fascinating look at how statistical analyses might (or might not) detect bad or sociopathic doctors. I’ve tried to find something pretty accessible about “the stats wars” (p-values, significance thresholds, etc) but so far nothing I’ve found has been pitched at the right level or it’s too long, so for now I’m just using a short document I wrote myself for the class. But I’ve found that students are intrigued if you tell them that a very large proportion of what they’ve been told about those issues is either wrong or still being vigorously debated. It dispels the myth that “doing statistics” is just like following a cookbook recipe.

I talk about Sally Clark and the prosecutor’s fallacy in my intro course too. And I talk about the “stats wars” at the end of the course, though I’m not sure how much it resonates.
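For readers who haven’t met the prosecutor’s fallacy before, here is a minimal sketch of the arithmetic behind it. The fallacy is treating P(evidence | innocent) as if it were P(innocent | evidence). All the numbers below are made up for illustration and are not the figures from the Sally Clark case or any other real case; the function name `posterior_innocent` is likewise just a label chosen for this sketch.

```python
def posterior_innocent(p_evidence_given_innocent, p_evidence_given_guilty, prior_guilty):
    """Bayes' rule: probability of innocence given the evidence."""
    prior_innocent = 1.0 - prior_guilty
    # Weight each explanation of the evidence by how common it is beforehand.
    innocent_path = p_evidence_given_innocent * prior_innocent
    guilty_path = p_evidence_given_guilty * prior_guilty
    return innocent_path / (innocent_path + guilty_path)


# Hypothetical inputs (assumptions for illustration only):
p_e_innocent = 1e-6   # the evidence arises by chance in 1 in a million innocent people
p_e_guilty = 1.0      # the evidence is certain to arise if the person is guilty
prior_guilty = 1e-7   # before seeing the evidence, guilt is rarer still: 1 in ten million

p = posterior_innocent(p_e_innocent, p_e_guilty, prior_guilty)
print(f"P(evidence | innocent) = {p_e_innocent:.7f}")
print(f"P(innocent | evidence) = {p:.3f}")
```

With these made-up inputs, the chance of the evidence arising innocently is one in a million, yet the probability of innocence given the evidence comes out at roughly 0.91, because the guilty explanation is even rarer to begin with. Confusing those two numbers is the whole fallacy.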

This case (and others) was recently discussed in the BBC 4 programme Sideways. You might find it interesting.

https://www.bbc.co.uk/sounds/play/m000s2ks

Thanks for the pointer.

And thanks for the reminder about Hans Rosling, you’re the second person who’s told me his videos are good. I need to check them out.

His Gapminder website and “Factfulness” book could be good sources for data and questions that might resonate with students too. Spiegelhalter’s book “The Art of Statistics” is a good book for neat stats stories and the data and code are available. My biggest challenge so far is that I am rapidly recognizing that it would take me 10 courses to present all the “fun stuff” I’d hoped to originally! I suspect I’m not alone…

Yeah, I have many more fun examples I’d like to cover than I have time to cover.

Most popular link so far is the Ben Bolker piece, I assume because he’s a well-established ecologist.

Everyone who’s not clicking on the Kieran Healy piece is really missing out.

At the beginning of the biostats course I teach, I have the students watch “Everything wrong with statistics (and how to fix it)” by Kristin Lennox (https://www.youtube.com/watch?v=be2wuOaglFY&t=2701s). I enjoyed watching it and the students seemed to get a lot out of it too. It is a convincing critique of “cookbook statistics”, the idea that there is an objectively right test for any given data set, and that learning statistics is essentially memorizing a look-up table of which test to do when. I also like her definition of statistics: mathematical modeling under uncertainty. The idea that learning statistics could be more interesting than just memorizing a lot of formulas, tests, and assumptions was motivating for a lot of students in the class.

Thanks for sharing that. Will check it out.

Funny thing about that Gelman post. I actually commented on it when he posted it. More importantly, it was one of the motivations for a recent paper I published with Kathi Irvine designed to provide an introduction to explanatory statistical models. Not sure if it hits your target audience, but would be interested to know if you think it provides a contrast to Gelman’s post.

https://www.researchgate.net/publication/338163927_Scientist's_guide_to_developing_explanatory_statistical_models_using_causal_analysis_principles#fullTextFileContent

Thanks Jim. You’re right that this isn’t really what I’m looking for to kick off intro biostats, but I’ll check it out.

I’m not going to post the readings here as they are mostly in Portuguese, but these are some videos (TED talks) in English I use when teaching statistics for biology:

What to trust in a “post-truth” world, by Alex Edmans – https://www.ted.com/talks/alex_edmans_what_to_trust_in_a_post_truth_world

3 ways to spot a bad statistic, by Mona Chalabi: https://www.ted.com/talks/mona_chalabi_3_ways_to_spot_a_bad_statistic

How juries are fooled by statistics, by Peter Donnelly: https://www.ted.com/talks/peter_donnelly_how_juries_are_fooled_by_statistics – I like how this TED talk shows non-intuitive probabilities and how a lack of statistical knowledge may lead to wrong conclusions with serious consequences

Meet the inventor of the electronic spreadsheet, by Dan Bricklin https://www.ted.com/talks/dan_bricklin_meet_the_inventor_of_the_electronic_spreadsheet – I found it curious, and it shows how something we take for granted now wasn’t so obvious when it began

The linguistic genius of babies, by Patricia Kuhl https://www.ted.com/talks/patricia_kuhl_the_linguistic_genius_of_babies – I think it’s a nice example of probability distribution functions and very interesting as a whole.

As a sidenote, one activity I give at the end of my statistics courses is asking my students to read Brian’s posts on statistical machismo and write one or two pages about them. There are always some texts which are quite interesting!

Thanks Pavel!

I am not remotely qualified to teach biostatistics, so here is a perspective somewhere between that of an instructor and that of a student. I had one undergrad statistics course, taught by a superb instructor (Dick De Veaux), and I got an A-plus. AND YET… Somehow the fundamental difference between descriptive statistics and inferential statistics never sank in for me until 2017, when I was hired to write some statistics education songs, including a song about that distinction (https://www.causeweb.org/smiles/songs/inferential_descriptive_statistics/playback?listenOnly=1). I sure wish that I had appreciated the distinction as a student. For example, a typical student question — one that I myself had — was, “Should my error bars report standard deviations or SEM’s?” And the reply I got was, “Well, usually we use SEM’s because they’re smaller.” At the time, I did not realize that my choice of statistic should reflect the scientific point that I was trying to make!

“use SEMs because they are smaller”?! Dear lord, that’s a terrible answer to that question.
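To make the distinction concrete, here is a rough sketch using simulated data. The population mean of 50 and SD of 10 are arbitrary assumptions, and `sd_and_sem` is just a helper name invented for this example. The standard deviation describes the spread of the individual observations (descriptive), while the SEM describes how precisely the sample estimates the population mean (inferential).

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible


def sd_and_sem(sample):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(sample)        # uses the n - 1 denominator
    sem = sd / math.sqrt(len(sample))    # SEM shrinks as the sample size grows
    return sd, sem


# Hypothetical population: normally distributed, mean 50, SD 10.
for n in (10, 100, 1000):
    sample = [random.gauss(50, 10) for _ in range(n)]
    sd, sem = sd_and_sem(sample)
    print(f"n={n:5d}  SD={sd:6.2f}  SEM={sem:6.2f}")
```

As n grows, the SD should settle near the population value while the SEM keeps shrinking. So error bars showing SD answer “how variable are the individuals?”, and error bars showing SEM answer “how well do we know the mean?”. Which one to plot depends on which question the figure is meant to address, not on which bar is shorter.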

Speaking of terrible answers to stats questions:

Gonna start using this, from the Atlantic’s covid data project: https://www.theatlantic.com/science/archive/2021/03/americas-coronavirus-catastrophe-began-with-data/618287/

Here’s another candidate: https://www.newyorker.com/magazine/2021/03/29/what-data-cant-do