In a recent issue of Limnology and Oceanography Bulletin, Stuart Hurlbert reviews (UPDATE: link fixed, venue corrected) the new (4th, 2012) edition of Sokal & Rohlf’s classic biostatistical text, Biometry (HT Carl Boettiger). The first sentence of the review gives you the flavor:
Reader be forewarned: were it allowed the title of this review would be “A readable but overblown, incomplete and error-ridden cookbook”.
Tell us how you really feel, Stuart! And to think that sometimes I worry if I’m too tough on other people’s work…
You should click through and read the whole thing. But if you’re not so inclined, here’s a brief summary of Hurlbert’s beefs with Sokal & Rohlf (the book, not the people; I’ll refer to the book as Sokal & Rohlf because that’s what everyone does). Hurlbert says his beefs apply to all editions, not just the most recent one:
- No coverage of experimental design, or sampling design of observational studies. Relatedly, and worse, incorrect or confusing implications about experimental design and sampling design. For instance, there are no formal definitions of key terms like “experiment”, “experimental unit”, “block”, “repeated measures”, etc. Worse, observational studies often are described using experimental terms like “treatment”, “control”, and “randomized block design”. This leads to serious confusion, even about matters as basic as what an experiment is.
- Too much emphasis on “statistical gimmickry” of little or no practical use, such as standardized effect sizes.
- Superficial, cookbook-type treatment of many procedures, with no conceptual framework for understanding why one might want to use those procedures.
- Incorrect, incomplete, and confusing coverage of other matters, from when it’s appropriate to use a one-tailed test, to whether to correct for multiple comparisons (Hurlbert apparently believes you should never do so, and so slams Sokal & Rohlf for insisting on this), and many more.
- Rigid adherence to Neyman-Pearson null hypothesis testing, at the expense of estimation and more refined, quantitative assessment of the evidence for or against any given hypothesis.*
The only value Hurlbert sees in Sokal & Rohlf is as a reference manual for the “recipes” for how to calculate various statistical procedures. He concludes by blaming the popularity of Sokal & Rohlf for what he sees as decades of poor statistical practice in biology. He also laments that no current biostatistical textbook teaches an appropriately-modern philosophy of statistics, in a clear way with a focus on principles, with no errors.
What do you think of all this? I have to say I found it kind of surprising, but not because I revere Sokal & Rohlf. I’ve mostly used it as a reference manual myself. I’d certainly never try to teach from it at any level, if for no other reason than it’s way too voluminous. I guess I always assumed, without really thinking about it, that it was always intended, and mostly used, as a reference manual. Was I wrong to assume that? And while I find Sokal & Rohlf old-fashioned in some ways (e.g., randomization, bootstrapping, and generalized linear models render classical non-parametric tests and data transformations largely irrelevant), that never really bothered me. The first edition came out in 1969; of course it’s going to be old-fashioned. And I don’t know that it’s fair to pick on Sokal & Rohlf and blame it for the purportedly terrible statistical practices of modern biologists, even though the book certainly is popular. Insofar as our statistical practices are terrible (and I don’t know if they are or not), there’s surely plenty of blame to go ’round. And can’t you also give Sokal & Rohlf credit for helping to encourage more biologists to use statistics in the first place? But I’ve never really thought about Sokal & Rohlf all that much, and I actually haven’t cracked it open in years, so I’m sort of a curious bystander here.
As an aside, I found it interesting that such vociferous criticism of Sokal & Rohlf came from someone from basically the same school of statistical thought. Hurlbert isn’t a Bayesian of any stripe, nor is he advocating for computationally-intensive methods, for instance. His criticisms of Sokal & Rohlf mostly aren’t criticisms of what the book sets out to do, they’re mostly criticisms of the book’s execution.
What do you think? Does Sokal & Rohlf deserve the criticism Hurlbert heaps on it? More broadly, what do you see as the biggest problems with how modern biologists teach and use statistics? And what textbook(s) should we be using in our courses in order to fix those problems? (Again, Hurlbert says there’s no biostatistics textbook that’s readable, strong on general principles, and error-free!)
My interest in this isn’t purely academic. I’m not just looking to grab some popcorn and watch proponents and detractors of Sokal & Rohlf argue. 😉 As I noted in a previous post, this fall I’m taking over teaching the introductory undergrad biostats course in my department. So for the first time, I need to think seriously and in great detail about exactly what introductory biostatistical material to teach and how to teach it. I’ve settled on a textbook (Whitlock & Schluter), and I have a tentative list of lectures and the major changes I want to make to the existing labs. But nothing beyond that. And even getting that far has required a lot of thought, in particular about precisely the issues Hurlbert raises. How much emphasis to place on general, unifying principles vs. coverage of specific tests. How much emphasis to place on black-and-white rules of good statistical practice vs. equipping students to make informed judgment calls. Etc.
It occurs to me that teaching biostatistics is something like teaching children good behavior. You start out by teaching kids black-and-white rules, like “don’t lie” and “don’t hit your sister.” And it’s only later that kids learn that good behavior often isn’t black-and-white. Sometimes it’s not only ok to lie (or to hit your sister!), it’s positively a good idea, morally. Heck, there are lots of tricky moral situations that you aren’t even taught about at all until you’re older. And that’s without even getting into competing, mutually-incompatible philosophies as to what good behavior consists of, and what makes it good! So you tell me–what should we be teaching our “kids” about biostatistics if we want to start them down the road towards responsible “adulthood”? (“Don’t ~~hit your sister~~ fail to correct for multiple comparisons!”)
*Hurlbert actually thinks Sokal & Rohlf should’ve based their book on what Hurlbert calls the “neoFisherian” approach. I confess I’d never heard the term “neoFisherian”, which is Hurlbert’s own recent coinage. Hurlbert has a 2009 paper if you want to find out what he means by “neoFisherian” and why he thinks Neyman-Pearson hypothesis testing is so outdated that it should no longer be taught (UPDATE: link fixed). As far as I can tell, what Hurlbert means by “neoFisherian” doesn’t sound too far from Deborah Mayo’s notion of “error statistics” (which itself is actually not all that far from Neyman-Pearson, or even from some forms of Bayesianism). But it’s a little hard to tell because much of Hurlbert’s paper focuses on what seem to me to be rather nit-picky details of current practice (like conventions for reporting P values). Anyway, I think it would’ve been helpful for Hurlbert to briefly elaborate his own philosophy in his review, rather than just refer to it using a term of his own recent coinage.
I recently had a conversation about the declining status of book reviews, and how nobody ever writes critical reviews. So well done Hurlbert for sticking his neck out.
I’ve not looked at S&R for a while, and I won’t comment on whether the new version is error-prone. But I used to like it for the fact that it *was* old-fashioned. There’s a huge set of really excellent books on modern biostatistics, and IMO it would be wrong for S&R to compete with them.
Re: the declining status of book reviews, do you mean in science, or more generally?
I know that some folks who review books online basically only review books they like, so if they don’t like a book they just don’t review it. I am not one of those people, as my planned review of Arditi & Ginzburg’s new book is likely to reveal (I haven’t read it yet, and I’m going to read it with an open mind, but judging by what I’ve read about it, and my views on Arditi & Ginzburg’s previous work, I’m not likely to like it…)
Declaration of interest: I’m a book reviews editor, so I would contest whether book reviews have declined in quality. There are certainly some fiercely outspoken reviews, both in the media at large (see http://hatchetjoboftheyear.com/ for some outstanding examples) and in the scientific literature. I’ve certainly published some critical reviews, though they undergo more careful scrutiny to make sure that they are fair, accurate and not unnecessarily inflammatory.
More of a problem is the lack of journals willing to devote page space to proper reviews these days, or of authors to devote the necessary time. A book review requires the same commitment and attention to detail as any other manuscript. It’s not the same as reviewing a paper, but this inappropriate tone does sometimes penetrate reviews. For example, my personal opinion is that the reviews in Ecology can be a bit too nit-picking (there is a typo on p.232 etc.), rather than concentrating on the content and contribution that a book makes. Beyond that, it’s down to the journal and the editor. Some only publish a smattering of reviews and therefore only tend to feature selected books of great importance.
Want to see some good reviews? Head over to Frontiers of Biogeography http://escholarship.org/uc/fb. They’re all open access too :o)
Thanks very much for the perspective, Markus. Personally, I’m agnostic on whether book reviews in the scientific literature have changed in quality over time, and if so, for what reasons.
I agree that the book reviews that Ecology publishes tend not to be very useful. They tend to include too much summary of the book’s content, and as you say any critical commentary tends to be limited to nit-picky matters. That’s not always true (John Fryxell’s recent review of Arditi & Ginzburg, in its understated way, lets the reader know that John has serious misgivings about the book).
You’re certainly right that doing a book review well is really time consuming. For starters, one has to read the book. On this blog, I get around this by doing short, not-especially-detailed reviews (the sort that I doubt would pass muster at most journals), and by reviewing books I’ve already read (so I’ve already made a good chunk of the required time investment). This comes at the cost of my reviews not being very timely. But it’s my hope that timeliness isn’t essential for the sort of books I’m reviewing. Hopefully, the books I’m reviewing are just as worth reading now as when they were first published.
The discussion about book reviews was specific to science. An author complained to me that reviews on his recent book were positive but uncritical, and that he’d rather have a bad review than a bland one.
I totally agree with you: “You start out by teaching kids black-and-white rules, like “don’t lie” and “don’t hit your sister.” And it’s only later that kids learn that good behavior often isn’t black-and-white.” That’s why I prefer to teach introductory Biostatistics courses based on books such as Zar’s and Sokal & Rohlf’s. Are there more modern approaches, more complex tests, alternative schools of thought, and so on? Sure! But I think the classics lay a solid ground for beginners, who later can choose their own path (conservative or “modern”), after practicing hard and getting some real-world experience. I consider Hurlbert’s critique a bit exaggerated.
Not sure if I would agree that the classical approach is necessarily the best way to teach statistics. I learned much more (both in terms of practical use as well as in foundational understanding of statistics in general) in a course that taught me about bootstrapping than I did in the intro to stats class that tried to tell me about t-tests. … of course that could be the reason why I still throw some form of bootstrapping at any statistical problem that comes my way 😉
One of the changes we’ve decided to make to our intro stats course is to dump classical nonparametric tests in favor of teaching students about randomization tests, which of course are closely connected to bootstrapping. I’ll probably even mention bootstrapping briefly. Conceptually, the basic idea isn’t at all difficult, so even intro level students can grasp it. And it’s a really clever idea. My hope is that students will find this one of the more fun topics covered in the course.
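To make the idea concrete, here is a minimal sketch of a two-group randomization test in plain Python. The function name, the choice of the difference in means as the test statistic, and the number of shuffles are all illustrative assumptions, not anything taken from the course described above.

```python
import random


def randomization_test(group_a, group_b, n_shuffles=2000, seed=0):
    """Two-sided randomization test for a difference in group means.

    Returns the observed difference (mean of a minus mean of b) and an
    approximate p-value based on random relabelling of the observations.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(group_a)
    pooled = list(group_a) + list(group_b)

    def mean_diff(values):
        first, second = values[:n_a], values[n_a:]
        return sum(first) / len(first) - sum(second) / len(second)

    observed = mean_diff(pooled)

    # Shuffle the group labels many times and count how often the shuffled
    # difference is at least as extreme as the observed one.
    as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)
        if abs(mean_diff(pooled)) >= abs(observed):
            as_extreme += 1

    # The +1 counts the observed arrangement itself among the possibilities.
    p_value = (as_extreme + 1) / (n_shuffles + 1)
    return observed, p_value


if __name__ == "__main__":
    treated = [10, 11, 12, 13, 14]  # made-up data
    control = [1, 2, 3, 4, 5]       # made-up data
    diff, p = randomization_test(treated, control)
    print(f"observed difference = {diff}, p = {p:.4f}")
```

The only assumption the test leans on is that, under the null hypothesis, the group labels are exchangeable, which is exactly what shuffling them mimics.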
Exactly. I teach bootstrapping before teaching parametric estimates of standard deviation and standard error. It is taught as a pedagogical tool to understand what a standard error is – it’s what you would get if you sampled N items an infinite number of times from the distribution. A parametric bootstrap is just a short version of this (not sampling an infinite number of times).
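A rough sketch of that pedagogical use, in plain Python. Note the assumption: this is the ordinary nonparametric bootstrap (resampling the observed data with replacement), which may not be exactly the variant the commenter teaches. The function names and the number of resamples are illustrative.

```python
import random
import statistics


def bootstrap_se_of_mean(sample, n_resamples=5000, seed=0):
    """Estimate the standard error of the mean by resampling the data."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    resampled_means = []
    for _ in range(n_resamples):
        # Draw n values with replacement from the observed sample.
        resample = rng.choices(sample, k=n)
        resampled_means.append(sum(resample) / n)
    # The spread of the resampled means is the bootstrap standard error.
    return statistics.stdev(resampled_means)


def formula_se_of_mean(sample):
    """Textbook standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / len(sample) ** 0.5


if __name__ == "__main__":
    data_rng = random.Random(1)
    data = [data_rng.gauss(10, 2) for _ in range(30)]  # made-up data
    print("bootstrap SE:", bootstrap_se_of_mean(data))
    print("formula SE:  ", formula_se_of_mean(data))
```

With a reasonably sized sample the two numbers should land close together, which is the point of the exercise: the formula is a shortcut for the spread you would see if you could repeat the sampling many times.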
On the topic of resampling and randomization, there is a free book by Julian Simon (famous for his wager against Paul Ehrlich) about resampling stats:
I haven’t read everything, but the initial chapters are very good at introducing the idea, and the examples given in his own computer program can be easily adapted to R.
In the 1990’s I taught statistics from Sokal and Rohlf. My approach to teaching stats (4th year undergrad and 1st year grad) is to watch what works with students, and teach from there. Student ratings of S&R began dropping around the turn of the century. So I evolved to a new course, centered on the statistical model. At the same time I used the standard models (as covered in S&R). And I used S&R as a reference text (classics as solid ground for beginners).
S. Hurlbert gave a new name (pseudoreplication) to a long-known problem (confounding). When a student shows up in your office, paper rejected as ‘pseudoreplication,’ having done a Latin square design, one can only curse the seeds of confusion sown by Hurlbert.
If you believe that you should test for normality before obtaining residuals and fits, think again.
If you believe that you should use Neyman-Pearson Decision Theoretic Hypothesis testing to check assumptions for Gaussian ANOVA and regression, think again.
If you believe that the only solution to confounding is replication, think again.
I don’t have kids and I don’t teach first year students biostatistics but what I found lacking in my statistical education was the philosophical foundation of why, when and how you do statistics. I got treated to detailed calculations involved in a t-test, looked up values in tables by hand etc etc. The application of this to solving scientific questions was tenuous at best and all this just bored me. I had to learn it later by myself and so I bought Sokal & Rohlf (which was mentioned as comprehensive and state-of-the-art in the late 90’s!) and Crawley’s R book at the same time. After the first look, I never re-opened Sokal & Rohlf again because of the way Crawley introduced me to what a statistical model is, what it does, why you have it and how to use it. And his book is not even exceptionally good at this, as I discovered later. But finally I saw the connection between ecological questions, experimental/observational design, data and conclusions/interpretations supported by statistics.
For me a perfect statistics course gives the model and everything it entails the central place right from the start and talks a lot about the philosophy of not only statistical testing/modelling but how to do science in general. This way the links between the different steps of going from question to conclusion are best taught, I feel. I think students are not really kids, and should be treated as being able to understand shades of grey. And while a t-test or ANOVA often enough is a reasonable model, it is also equally often not enough and the modelling has to be much more biological. Moreover, the link is never made in any text book I ever saw when the classics are introduced. So I still use Crawley from time to time, especially given that he, too, covers the classical tests and see Sokal & Rohlf mainly as a reference I don’t use.
“I think students are not really kids, and should be treated as being able to understand shades of grey.”
In revamping this course, I’m actually trying to foreground unifying concepts in just the way you suggest. I’ll let you know if it works–I haven’t done it yet!
Part of the challenge here is that “students” isn’t really a single category. It’s a big class I’ll be teaching, with a wide range of students in terms of their motivation, background preparation, interest in the material, and future goals. Probably a fair number of them are capable of dealing with shades of grey in principle, but in practice may actually prefer to be told black and white rules. So that they can regurgitate them on the exam, get the mark they want, and move on (a decent fraction of the class is premeds). I’m not saying that as an instructor I plan to cater to that attitude–but I can’t ignore it either. One of the big challenges for me is going to be finding a way to make fundamental, abstract concepts compelling and interesting to the full range of students in the class.
“actually prefer to be told black and white rules. So that they can regurgitate them on the exam”. I know I am hopelessly naive and a terrible hypocrite but shouldn’t we overcome that attitude? It led to me not understanding statistics for a long time and feeling frustrated about it. And I honestly believe I am only partially to blame for that. But as I said, I never taught this stuff to first years, only to advanced, kind of hand picked students. I just wish I had a statistics teacher like you.
That is a wise approach Jeremy. I think it is also important to keep in mind that first and second year university students have a heavy course load with labs and tutorials. In my experience, students want the black and white pedagogical approach at this stage in their program to minimize the guess work in understanding the material and the course expectations. I wish this wasn’t the case, but I think there has to be some consideration of a student’s “learning curve” as they progress through their undergraduate program. What you are proposing sounds like a nice balance Jeremy. I look forward to hearing how your first crack at the course goes. Maybe write a post-mortem for Dynamic Ecology?
Yes, I’m sure I’ll post at least once about how things go.
One of the hard things about teaching any intro course is teaching it in a way that’s interesting, engaging, and challenging to a wide range of students, in terms of their background preparation and interests. As an instructor, one always has to keep in mind that, as a student, you were almost certainly an extreme outlier in terms of both background preparation and interest in the material.
While I agree with most of the points Hurlbert makes that you listed above, in terms of how to introduce undergrads to statistics (and I taught a Foundation level course earlier this year), I don’t agree that problems with these are the fault of S&R’s book. I don’t even think introducing students to the concept of N-P hypothesis testing is a bad thing, as long as it’s part of an introduction that will later be developed into a more mature, comprehensive approach to statistical model comparison in a Frequentist (and/or Likelihood) framework. I’m also happy to let more advanced students know about Bayes, I’m just not really in a position to teach them about it.
But I’m not convinced you can (or should) cut straight to the chase when teaching UG stats courses. It’s absolutely worth starting with the basic stuff, that will allow students to understand and interpret an awful lot of previous literature, before developing their own model building skills with more advanced stats. They don’t need to use outdated methods when they come to design their own experimental work, but they probably do need to understand why others have made the choices they made.
Perhaps the point that Hurlbert should have made was that releasing new editions of S&R without updating it to reflect current best practice is poor publishing practice. Strangely, he does point out that many useful techniques have been added to the book since the 1st edition, but complains that this increases the length of the book.
I do have a gripe about Hurlbert’s dismissal of correcting for multiple comparison and 1-tailed testing though. The couple of recent articles he’s written about this just sent me round in circles when I tried to follow citations to get a deeper understanding of when and why we should or can ignore these issues. I still don’t understand why he thinks we shouldn’t correct for multiple comparisons, beyond the inconvenience of not having one’s hypotheses supported* so easily because biology is messy.
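For readers who want the arithmetic behind the usual case for correcting, here is a small sketch in plain Python. It only illustrates the standard family-wise error rate argument and two common corrections (Bonferroni and Holm); it takes no side on whether Hurlbert is right that such corrections are unnecessary. The function names and p-values are made up, and the error-rate formula assumes the tests are independent.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across n independent tests,
    each run at level alpha, when every null hypothesis is true."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_reject(p_values, alpha=0.05):
    """Reject each null whose p-value is at most alpha / (number of tests)."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm's step-down procedure: test the smallest p-value against
    alpha/m, the next against alpha/(m-1), and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, index in enumerate(order):
        if p_values[index] <= alpha / (m - rank):
            reject[index] = True
        else:
            break  # every larger p-value is also retained
    return reject


if __name__ == "__main__":
    print(familywise_error_rate(0.05, 20))
    p_values = [0.01, 0.04, 0.02, 0.005]  # made-up p-values
    print(bonferroni_reject(p_values))
    print(holm_reject(p_values))
```

In the made-up example Holm rejects more hypotheses than Bonferroni while controlling the same family-wise error rate, which is why it is often suggested as the less costly of the two.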
As for alternatives, Mick Crawley’s various ‘R’ books (Arne mentioned them as well) are excellent resources for teaching principles and coding, but perhaps only to those of us who were taught the statistical basics as undergrads. Hurlbert & Mangel’s ‘Ecological Detective’ is another great resource, but I wouldn’t have a hope of following it if not for my previous introduction to the basics.
Hurlbert’s correct: S & R is a cook book, to dip into when required. But it’s not necessarily a great introduction to statistics for those without sufficient background training. And maybe it was never intended as such. What an author intends and what a publisher publishes are not always the same thing. And quotes from dust jackets are not always reliable guides to what’s actually going on the inside. I’m sure there’s a snappier way to say that…
*I know, I know, that’s not how it works.
Re: Mick’s books, I only have his older S-plus book, which is very much a coding cookbook that mostly assumes the reader understands statistics and just needs to be told how to conduct the chosen analysis in S-plus. Which is exactly what I wanted when I got the book. I wouldn’t use it to teach intro biostats to undergrads.
I must say, it worked great for me. Together with friends we worked through it page by page. Nowadays I mainly use it as a cook book, that’s true.
It’s Hilborn and Mangel, not Hurlbert and Mangel.
One has to consider the source. Stuart Hurlbert’s most famous papers are all rants. I would love to see him produce a stats textbook.
That said, I too have ranted against the cookbook mentality of teaching stats and I think Sokal & Rohlf (and the other classic undergrad book by Zar) epitomize this. While I agree with your analogy and you don’t necessarily start undergrads off in the deep end, teaching the cookbook method only to have to unteach it at the graduate level is not the right way either. I also think a more modelling-oriented approach (y=f(X)+ε) is more appropriate and translates more easily into AIC and more advanced techniques. Finally, I think focus on calculation details is misplaced in this day and age. Even if not using R, the R formula mentality is the right one to teach these days.
The one place I have to majorly agree with Hurlbert is that it is wrong to emphasize statistics so much without covering experimental design/sampling theory. The two are intimately linked and if students are only going to get one course, they should get both.
In my opinion a modern stats course should stop teaching t-test, ANOVA, regression and their non-parametric alternatives as separate techniques, lump them all together as GLM/modelling approach (with the calculation details left to the computer) and use the extra time to cover experimental design.
We considered going straight to generalized linear models for our intro stats course, but decided it was a bridge too far at that level. But I certainly do want to go some way down that road. That’s why I want to try foregrounding general concepts, and then teaching specific statistical techniques as examples of those concepts.
GLM was ambiguous. I meant general (no “-ized”) linear model – i.e. regression/ANOVA/t-test treated as one topic/analysis. Drives me crazy that we still teach these 3 as separate concepts. This is mostly rooted in the history of how we calculate these. Conceptually they fit very nicely together in a modelling framework – just that sometimes the explanatory variables are binary, categorical or continuous. But a modelling formula of dependent variable as a linear function of independent variables plus an error term, with an F test to get overall significance (related by sums of squares to r² for predictive power), t-tests for individual term significance, and residual analysis for assumption checking, provides a common framework to look at all of these.
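The claim that these are one analysis can be checked numerically. Below is a minimal sketch in plain Python, using made-up data and hypothetical function names: it computes a pooled-variance two-sample t statistic, a one-way ANOVA F for the same two groups, and a simple regression of the response on a 0/1 group indicator.

```python
from statistics import mean


def pooled_t(a, b):
    """Classical two-sample t statistic with pooled variance."""
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    pooled_var = ss_within / (len(a) + len(b) - 2)
    return (mean(b) - mean(a)) / (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5


def anova_f(a, b):
    """One-way ANOVA F statistic for two groups (1 and n-2 degrees of freedom)."""
    grand = mean(list(a) + list(b))
    ss_between = len(a) * (mean(a) - grand) ** 2 + len(b) * (mean(b) - grand) ** 2
    ss_within = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    return (ss_between / 1) / (ss_within / (len(a) + len(b) - 2))


def dummy_regression(a, b):
    """Least-squares regression of y on a 0/1 indicator of group membership.

    Returns the slope, the t statistic for the slope, and r-squared.
    """
    x = [0] * len(a) + [1] * len(b)
    y = list(a) + list(b)
    n = len(y)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_resid = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    se_slope = (ss_resid / (n - 2) / sxx) ** 0.5
    return slope, slope / se_slope, 1 - ss_resid / ss_total


if __name__ == "__main__":
    control = [4.1, 5.0, 4.6, 5.3, 4.8]       # made-up data
    treated = [5.9, 6.4, 5.6, 6.8, 6.1, 6.3]  # made-up data
    t = pooled_t(control, treated)
    f = anova_f(control, treated)
    slope, t_slope, r2 = dummy_regression(control, treated)
    n = len(control) + len(treated)
    print("t squared vs F:      ", t ** 2, f)
    print("slope vs mean diff:  ", slope, mean(treated) - mean(control))
    print("regression t vs t:   ", t_slope, t)
    print("F recovered from r2: ", r2 / (1 - r2) * (n - 2))
```

Up to floating-point rounding, the squared t statistic equals the ANOVA F, the regression slope equals the difference in group means, and F can be recovered from r². The three "separate" techniques are one model with a binary explanatory variable.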
I agree that generalized linear model (logistic, poisson, gamma regression etc) is too far (squeezing in logistic is done in some undergrad courses, but a stretch in my opinion).
In the past, we’ve taught regression and ANOVA as two separate topics and then gone back and explained the link between them (without using the term “general linear model”). So in the past we haven’t been *too* far off what you’re suggesting, though as I say I do want to move the course even more in the direction you suggest here.
Re: considering the source, I admit I’m not familiar with Hurlbert’s work, beyond his famous pseudoreplication stuff. When you say “I would love to see him produce a stats textbook,” I think you’re probably getting at a big reason why (as another commenter noted) one so rarely sees negative book reviews in science. It’s very difficult to write a book, especially a textbook, it’s a tremendous amount of work, and most people never even try to do it. So as a reviewer you really want to be cautious about slamming somebody’s book, especially somebody’s textbook, I think. I’m going to really try to keep this in mind when I get around to reviewing Arditi & Ginzburg, since as I noted in another comment, I’m probably predisposed not to like it.
D’oh! This reminds me that it wasn’t Stuart Hurlbert, but Ray Hilborn who co-wrote The Ecological Detective. Apologies to both.
Re: a focus on calculation, I’m of two minds. You certainly want to know what calculations the software is doing, and why it’s doing them. This matters a lot down the road, especially when you get into models that have to be fit via numerical optimization techniques. But on the other hand, yeah, in this day and age there’s no point in teaching rote calculation for calculation’s sake, or teaching students how to read F-tables, or etc.
(Apologies for coming to this one late) I struggle with this; part of me agrees that you want the students, at some level, to understand the calculations going on behind the scenes. However, the calculations you might show are very rarely done that way on a computer, because computers don’t do maths exactly. So then you’re in the position of showing how the calculations are really done, usually with some code to illustrate that you get the same results, but you are teaching the students something that is never done and is potentially dangerous. Then what? You can’t possibly start teaching them about statistical computing and the various matrix decompositions and operations that really underlie the software they might be clicking buttons in (or typing code in, hopefully) in an intro or applied stats course.
I tend to avoid going into the calculations for those reasons. I hope to teach the students enough to be critical of the models they fit, how those models work, what their underlying assumptions are etc. rather than the calculations.
I think in “showing the calculations”, probably most ecologists aren’t thinking of the sorts of numerical techniques to which you refer. They’re thinking of things like, e.g., formulae for the decomposition of total SS in an ANOVA, as in Sokal and Rohlf. Or perhaps just giving students the gist of what a computer’s numerical procedures are doing, without going into technical details. For instance, I explain numerical maximization of likelihood functions to my ecology students with a verbal analogy to walking around a landscape, while making clear I’m glossing over many technical details. I use this to give them a bit of intuition about things like when numerical maximization procedures are likely to fail, how to recognize when they’ve failed, etc.
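The landscape analogy can be turned into a toy example. The sketch below, in plain Python, is a deliberately crude "walk uphill" search for the maximum-likelihood rate of a Poisson sample; it is an illustration of the analogy only, not the algorithm any real statistics package uses. The function names, the made-up counts, and the step-halving rule are all assumptions.

```python
import math


def poisson_log_likelihood(rate, counts):
    """Poisson log-likelihood, dropping the term that does not depend on rate."""
    return sum(k * math.log(rate) - rate for k in counts)


def walk_uphill(log_likelihood, start, step=1.0, tolerance=1e-8):
    """Crude one-dimensional hill climber.

    Try a step to the right and to the left; move if the log-likelihood
    improves, otherwise halve the step. Stop once the step is tiny.
    """
    position = start
    while step > tolerance:
        moved = False
        for candidate in (position + step, position - step):
            # The rate must stay positive for the log to be defined.
            if candidate > 0 and log_likelihood(candidate) > log_likelihood(position):
                position = candidate
                moved = True
                break
        if not moved:
            step /= 2  # no uphill direction at this stride, so shorten it
    return position


if __name__ == "__main__":
    counts = [2, 3, 1, 4, 0, 5, 3, 2]  # made-up count data
    estimate = walk_uphill(lambda r: poisson_log_likelihood(r, counts), start=1.0)
    print("numerical estimate:", estimate)
    print("sample mean:       ", sum(counts) / len(counts))
```

For a Poisson sample the maximum-likelihood rate is known to be the sample mean, so the two printed numbers should agree closely. This surface has a single peak; on a surface with several peaks a climber like this can stop on a local one, which is one way to see how numerical maximization can fail.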
Then I think we have a similar approach – those are the things I do spend some time on, if just to show that these aren’t random values plucked from the ether by the statistical Gods.
I agree with comments above re: the analogy with kids is not good and it takes at least twice as much work to de-train someone as to train someone. Yes, one needs to know many classic statistics to get a job or to read the literature but if this is taught first, it will be the LAST stats class taken by most biologists so they will think classical statistical thinking is the way things ought to be and there is no opportunity to unlearn, unless they do it on their own. It’s like your zombie problem – anti NHST rants have been flooding the journals for 15 years but no one is paying attention or at least movement away is glacially or even continental driftally slow. The only way to make progress as a field is to not teach poor statistical thinking in the first place.
Yes, intro courses can run into problems if they teach students to mindlessly follow a set of rules or recipes without teaching them the *point* of those rules and recipes. And obviously intro courses are problematic if they teach the wrong rules! And no, my analogy with teaching kids good behavior isn’t perfect, and you and Arne are right to question it. But I don’t think it’s a terrible analogy either. I do think you have to teach beginners *some* specific techniques and specific examples of how to apply general principles. And you also have to teach them some more or less clear-cut rules of good practice. Otherwise the material remains too abstract, and too open-ended, for most undergrads.
Re: anti-NHST rants, I don’t subscribe to them and certainly don’t think we should do away with teaching null hypothesis tests. See the link in the post to “error statistics” for a summary of my views on this.
Yes I think that we’re all struggling with how to best teach intro statistics. I think many students and even faculty (at least outside of E&E) see stats as a one and done. Learn t-tests, anova, and chi-squared and I’m good! I’m not anti-hypothesis testing and p-values at all. But the over-reliance on these leads to poor statistical thinking. For example, when I see a paper with the title “No selection on horn length in unicorns”, inferred from a non-significant p-value in a multiple regression on observational data, being published in the top journals in ecology and evolution, something is wrong with the way statistical thinking is taught. This is what Hurlbert was talking about when we are taught to accept the null if p > 0.05. I really like Andrew Gelman’s quote “I’ve never had a type I error in my life”
I couldn’t agree more with Brian’s comments regarding the emphasis on modelling. We were exposed to the cookbook style approach and it was dull and difficult to comprehend (and it took a long while to shake off). I remember when I finally came to understand that a t-test is just a special case of an ANOVA which is a special case of a general linear model etc. It was an epiphany! But it should have been introduced at step 1.
I don’t understand why anyone would think such a disjointed approach is preferable for teaching purposes.
So following up on Brian’s comment – to explicitly teach t, anova, regression as specific parameterizations of the general linear model, would you do anova before regression or regression before anova? And why? Most intro books do regression first. S&R do anova first (and teach it first – yes I was a Sokal AND Rohlf student). Regression first seems backwards to me, maybe because of how I learned it, but it could seem odd to learn about ANOVA tables with regression but not learn about ANOVA until after regression (maybe that is my fault for limiting my definition of ANOVA to an among-group analysis instead of any analysis where the variance is partitioned into a model component and a residual component). One problem with regression is, it is many things to many people: 1) exploratory, descriptive tool (in which case the link with the general linear model isn’t necessary), 2) a means of estimating causal effect (hypothesis testing) and 3) a means of estimating E(Y) or predicting Y. Econometrics textbooks tend to focus solely on #3 while bio textbooks focus on #1 and somewhat on #2 depending on the book.
I agree with a lot of what’s being said here with regard to what would be beneficial to teach in an intro stats course. I like the idea of teaching ANOVA/regression/t-test as one type of modelling approach. I was taught the traditional intro stats course as a Masters student and spent A LOT of time learning how to calculate an ANOVA. While I don’t think it was beneficial to spend as much time as we did on this, I do believe that doing those calculations helped me truly understand the different pieces (MSE, degrees of freedom, etc.) and also showed me how those tests are related. There is a benefit to doing the math (which we spend so much time learning but never practically use as students in our major classes) and understanding how it fits into the different pieces used to explain our results. Without being able to take something apart, it’s hard to fully understand it and be creative later on.
I’m with Brian on this one, as my post in the previous thread about Kaplan’s book would probably indicate. In my view, the main thing missing from the conventional teaching of statistics is adequate emphasis on model formulation as part of the scientific enterprise, rather than starting with the model as given (conditional on answers to a few basic questions like whether the explanatory variables are categorical, etc.). Fitting the model to data in this context is about parameter estimation (true regardless of whether you are a Bayesian, frequentist, whatever, despite technical differences in what you understand an estimate to be and how you produce one). N-P hypothesis testing is then one common (albeit ad hoc) approach to drawing provisional conclusions about whether to consider an estimate as larger or smaller than some reference value (usually zero). This is, however, just one of many things you can do with estimates. At least for relatively simple analyses, I don’t think this approach is any more difficult (and for many students it is probably much more intuitive) than the recipe-based approach. Moreover, it gets students to recognize from the outset that choosing a model for the data is a scientific judgement call, and that different judgements that are comparably well supported by model fit checking, etc., sometimes give different answers. This might obviate the need for students to unlearn (if they ever do) an excessively black-and-white attitude towards the output of statistical software packages.
My only reservation with this is with the presentation of more complex linear models where the conventional model-based presentation involves design matrices. Without any matrix algebra, this could be a little tricky. I don’t have a good solution to this because I’ve been involved in teaching “pre-statistics” (if you like) to first years, and then more advanced stuff to graduate students, but I don’t teach our second year biometrics subject.
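For what it’s worth, the idea of a design matrix can be shown without much matrix algebra. The sketch below (standard-library Python, made-up data, hypothetical names such as `design_row`) builds the design matrix for a one-factor model with treatment coding, where the first level is assumed to be the reference. For this coding the least-squares coefficients are the reference-group mean and each other group’s difference from it, so the sketch plugs those in directly instead of solving the normal equations.

```python
from statistics import mean

# Made-up observations: (group label, response). Illustration only.
data = [("A", 4.0), ("A", 5.0), ("B", 7.0), ("B", 8.0), ("C", 1.0), ("C", 3.0)]
levels = ["A", "B", "C"]  # assumption: the first level is the reference


def design_row(label):
    """Intercept column plus one 0/1 indicator per non-reference level."""
    return [1.0] + [1.0 if label == lev else 0.0 for lev in levels[1:]]


# Design matrix X (one row per observation) and response vector y.
X = [design_row(label) for label, _ in data]
y = [value for _, value in data]

# Under treatment coding, the least-squares coefficients are the reference
# mean followed by each remaining group's difference from the reference.
group_means = {lev: mean(v for lab, v in data if lab == lev) for lev in levels}
beta = [group_means[levels[0]]] + [
    group_means[lev] - group_means[levels[0]] for lev in levels[1:]
]

# Fitted values are the row-by-row products X * beta: each observation
# gets its own group's mean.
fitted = [sum(b * xij for b, xij in zip(beta, row)) for row in X]

for row, fit in zip(X, fitted):
    print(row, fit)
```

Each row of `X` simply records which group an observation belongs to, and multiplying by `beta` returns that group’s mean, which is the one-way ANOVA written as a linear model.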
Jeremy, Brian, Sean and others: have any of you taught from Grafen and Hails as an introductory undergraduate bio textbook? It teaches from the GLM worldview explicitly. Chapter 1 is a short intro to ANOVA presented traditionally. Chapter 2 is a short intro to Regression presented traditionally. Chapter 3 then re-casts ANOVA and regression as a GLM and the rest of the book follows. It seems that this is the book that Brian is looking for.
Grafen, A., and Hails, R. (2002). Modern Statistics for the Life Sciences. Oxford: Oxford University Press.
I had a look at Grafen. It’s too advanced for the class I’ll be teaching. That’s not a criticism of the book at all–it’s just not the right fit for my needs.
There are lots of great ways to teach introductory statistics, but I think the three strongest pieces of advice that I have ever heard are:
1. Probability, probability, probability. Cookbook approaches may be to blame for this, but few introductory non-majors statistics courses (used to?) cover probability in enough depth to make sure that students got the point. Undergrads were primarily using stats to go fishing in their field of choice, and cared very little about what that p-value actually meant, or how it changed with different sampling designs or analytical frameworks.
2. Use real data from the field of interest. This almost goes without saying in 2013, but in 2003, lots of stats courses were taught with data from older books that had simply been copied into newer editions. My then statistical mentor, John P. Holcomb (Cleveland State University), was using data from clinical trials and demography, and it showed me the power of statistical methods to answer real and important questions.
3. Decouple some of the programming from the statistical theory. It’s almost a given that a biostats course will use R in 2013, but too often I see students struggling with the fact that R can’t coerce an atomic vector or that an object can’t be found in trying to check off assignments, rather than interpreting what the data, methods and results mean. It’s nice for them to learn programming or stats and not have to do both right off the bat.
I would also use lots of clicker questions and provide lots of the widely available open source documentation that facilitates self-teaching, troubleshooting and exploration. Speaking of free but under-the-radar resources, the copyright is up on Oehlert’s text on experimental design and it’s now freely available as a PDF (fcdae.pdf).
Well, I’m a fan of statistical models and parametric statistics. So, I think everyone who uses such things should know how to construct them and, for factorial models that are unbalanced, users ought to know how to determine the correct error term to use in testing effects. Anyway, I think we should share out the blame and include Steel and Torrie and Snedecor and Cochran in the circle of shame. Seems to me someone is trying to increase his credit in the market by bashing the big fish. Besides, Sokal is dead and Rohlf is probably trying to fish more and not have to be bothered by detractors. Sure, it’s confusing at times, the end pages are a bit of a head scratcher, and all those summation signs make me dizzy and are a blast from the past. They would have been better off using linear algebra to explain how SS work and then getting on to how Fisher made his brilliant analysis of variance components (his original text is still the best explanation of variance ever). Have fun with the new course; it must be interesting trying to make stats attractive to undergrads. Rob Wright Ph.D. Saskatchewan Parks Service Plant Ecologist.