Flipping our big intro biostats class didn’t work (for me) (yet) (UPDATED)

A couple of months ago, I told you why and how we flipped the big intro biostats course here at Calgary, and that it led to a big improvement in student performance the first time my ace colleague Kyla Flanagan taught it back in the fall. And I said that so far, it seemed to be working well for me this term.

I spoke too soon.

I wrote those posts right before the midterm. Which didn’t go well. On average, the class as a whole performed substantially worse than Kyla’s fall cohort, and worse than past cohorts I’d taught under the old, lecture-based format. Fortunately, they performed much better on the final exam, roughly on a par with students to whom I’ve lectured in the past, but still substantially worse than the cohort Kyla taught in the fall.

N=1, obviously. You don’t want to read much into an anecdote, which is what one term of any class is. On the other hand, the fall term was an anecdote too. And so were the past terms I taught as lectures. Teaching (and life) is a series of unreplicated, uncontrolled experiments, on the basis of which decisions must be made.

I have some ideas about how to improve things next time I teach the course. Some of the information I have to go on comes from the students themselves. On the excellent suggestion of Kyla and another colleague, I gave the students the chance to do a “midterm reflection” in exchange for a couple of bonus marks. After the midterm, they completed a 1-page form with questions on how they thought they did, how they studied, how they planned to study differently in future, and what I could do to help them succeed.

Obviously, I’m going to tweak the course content to focus more on the material the students struggled with most.

I’ll also follow Kyla’s lead and incorporate more clicker questions and maybe some short team activities into the remaining lectures. I didn’t do much of that this term–indeed, less than I used to–because I felt like the students were doing a lot of active learning in the other class sessions and I needed to cover a fair bit of ground in the remaining lectures. That may have been an error on my part. And I may go back to making a small part of the course mark dependent on answering clicker questions asked in lecture, as I’ve done in the past, because lecture attendance was poor later in the term.

And while I’m not sure the course schedule will allow it, I might try to squeeze in a couple of class sessions devoted solely to exam-style practice questions done individually. Under the old course format I had 3-4 of these sessions scattered throughout the term, and students found them very helpful. Under the new format, they do many quizzes and assignments involving exam-style questions, but mostly in teams. I think talking with their teammates helps them learn, but I’m not sure it does a great job of alerting them to what they as individuals need to work on. Now, I do make many practice questions available, but this term most students either didn’t do them, or more commonly crammed on them in the days leading up to the midterm. I wonder a little if a side effect of the flipped format we use is to make students less likely to do optional practice questions. The flipped format forces them to spend more out of class time on the course than they otherwise would, making them (quite rightly) reluctant to spend even more time. I’m happy with their overall level of effort, shaky lecture attendance later in the term aside. So rather than expecting them to do lots of practice questions on their own and so put in even more effort, it’s probably more reasonable and effective to devote some class time to practice questions.

Finally, I think I need to provide more really difficult practice questions. Make the practice questions as a whole a bit more demanding than the exams. If only to ensure that the practice questions don’t lull students into a false sense of security.

Will these tweaks work? We’ll see–and of course it’ll be hard to tell for sure. Every cohort of students is different. My goal remains to get the class as a whole to a level beyond what I was ever able to achieve via lecturing. I know I can get the class as a whole up to an adequate level of mastery by lecturing, because I’ve done it many times before; I’d like to bring them up above that. I’ll keep trying for a while, though not indefinitely. At some point, if I can’t get this new format to work any better for me and my students than lecturing did, I might go back to lecturing just because it requires less work on my part.

In the comments, please share your own experiences with tweaking your classes.

UPDATE: Just got my student course evaluations for this class. As I expected, the overall average is solidly below our departmental average for courses at this level, and below the average for other courses I teach, though not terrible in an absolute sense. And there’s a fair bit of variance around the average, though again only a small minority hated the course. You also have to keep in mind that student eval scores for this course always run low, because few biology students like statistics. Unfortunately, the scores on the various granular questions my university asks turned out to be a little hard to interpret. I mention all this just for completeness. I don’t find student course eval scores very helpful to improving my teaching, and I see no reason to be concerned about them unless they’re disastrous. The students also fill out quite open-ended comment forms, which I have yet to look at. When I do look at them, I’ll only be looking for evidence of widespread issues. It’s a bad idea, pedagogically and emotionally, to worry about the inevitable few students who hate the class. A class of any size is almost sure to contain at least a few students who hate it.

19 thoughts on “Flipping our big intro biostats class didn’t work (for me) (yet) (UPDATED)”

  1. I really like this sentence from your post: “Teaching (and life) is a series of unreplicated, uncontrolled experiments, on the basis of which decisions must be made.” this is when priors become really important, i think…

    • “I really like this sentence from your post: … this is when priors become really important, i think…”

      Hmm…yes and no. It’s in part because of my “priors”–in the broad, loose sense of “lines of evidence besides my own anecdotal experience”–that I flipped this class. But if I can’t get the flipped class to work better than the old lecturing format, I might go back to lecturing. That is, I might let a few terms’ worth of anecdotal experience with a flipped class trump my priors.

  2. So many conditionals in trying to evaluate the efficacy of the flip!

    “I might try to squeeze in a couple of class sessions devoted solely to exam-style practice questions done individually”

    Might this simply make students good at answering exam-type questions? One thing we’ve learned from the physics pedagogy work is that physics students are very good at memorizing patterns of exam questions and answering these on tests but still don’t really understand the physics.

    Just looking at how many colleagues do statistics doesn’t suggest to me that teaching to a test is very successful. What if the goals were less on answering exam style questions and more on why and how to continually self-learn and the many ways one can do statistics badly?

    • Yes, I don’t want to teach to the test. And raising performance on exams isn’t itself the goal. The goal is to have the students master the material, and exam performance is an indication of mastery.

      I confess my own sense of what constitutes “teaching to the test” has grown more, not less, confused as I’ve gained more experience. Certainly, both the current and past version of the course have included numerous practice exam questions very similar to the questions that will be on the actual exam. For instance, multiple choice questions that name a variable and then ask “how would you expect this variable to be distributed?” occur in both the practice questions and on the exam itself. I guess those sorts of practice questions could be considered “teaching to the test” in a sense?

      But the practice questions, and the exams, also include questions that ask them to think through novel scenarios and work out the implications of what they’ve been taught. I’m not too concerned that asking such practice questions constitutes “teaching to the test”, because the practice questions and the exam questions present such different scenarios. I don’t think these sorts of thought questions could be easily answered by memorizing patterns.

      One non-exam-related measure of how successful the new flipped course is will be if students are noticeably better prepared in upper-level courses for which this course is a prereq. But even that’s not a perfect measure, since the majority of students in the course won’t go on to any courses for which intro biostats is a prereq.

      Sometimes teaching is an unreplicated, uncontrolled experiment which provides little or no data on the response variables of ultimate interest. 🙂

      “What if the goals were less on answering exam style questions and more on why and how to continually self-learn and the many ways one can do statistics badly?”

      I do talk some, and ask exam questions about, common statistical mistakes. I even sometimes pull examples from Andrew Gelman’s blog and our Friday linkfests. I agree that’s a very useful thing for a class like this to do.

      I certainly hope that the class gives them a foundation on which they can build, either in future classes or (if they wish) on their own.

      • I realized after my comment that I implemented a similar strategy in Intro BIO this year. That is, if students are going to be able to do quantitative problems, they need practice with feedback so yeh, this means test-like practice questions with discussion.

      • As you indicated, it depends on what is meant by “teaching to the test”, but I think doing so is fine. You write the test, presumably emphasizing the concepts and skills that you think are important. Given that, it makes sense that you would teach in a way that would help students do well on the test, right?

  3. Thanks for writing this! This is a really important follow up.

    I think my personality makes it so that I will always be tweaking my courses; some of those tweaks will work, while others won’t. It’s hard when they don’t, but it seems the alternative is to feel like my courses are stale. I am considering more major changes to Intro Bio, but haven’t decided yet on what I want to do. I won’t be teaching it this fall, so I have more than a year to decide.

    • “I am considering more major changes to Intro Bio”

      You and Kyla are cut from the same cloth. 🙂 It’s my impression that she’s never satisfied with her courses. Even to the point of being willing to make big changes after having recently made big changes.

    • Will the new intro bio instructors be teaching from your prep? If not (and even if so!), I’d be curious if instructors in upper-level courses will be able to tell the difference between students who’ve been through intro bio with you vs. some other instructor.

      I do hear from other biology instructors that, in upper level courses for which intro biostats is not a prereq, they can tell the difference between students who’ve had intro biostats and those who haven’t. The students who’ve been through intro biostats are better at reading papers from the literature, for instance–they can grasp experimental design, they can interpret graphs, etc. But of course, the differences between students who’ve taken the same course with different instructors presumably will be smaller than the differences between students who’ve had a course and those who haven’t.

      • One thing I’ve been thinking about a lot is how to balance making sure students all get the same general content and skills while also making sure instructors can play to their strengths. I think I could easily expand the virus & infectious disease section while still covering topics that are important, and that the students would like that. But I haven’t figured out what else could be dropped that wouldn’t potentially cause problems down the line.

        Another big change I’m mulling is to have almost no text on slides, but just have figures that I annotate. That would definitely be a risky change — students might absolutely hate it. A post I have planned for when I’m back to blogging regularly relates to how students perceive teaching and how that impacts how I consider changing courses. UMich is now releasing teaching evaluation scores to students, and I wonder if this will make people more hesitant to try new things in their courses. (GT released scores when I first got there and I had no problem with that. But, with more experience, I have more concerns about the practice.)

      • Re: changing topic coverage, yes, that can be tough to do in intro courses since everything depends on them. You can’t stop covering something other profs expect you to have covered. We accidentally caused some awkwardness in our upper level ecology courses when we stopped teaching non-parametric tests in intro biostats in favor of bootstrapping and randomization tests.

        Re: no text on slides, just figures that you annotate, that’s what Kyla does. She has an electronic pad and stylus for this purpose. I tried it in intro biostats but it didn’t work for me, I just found it awkward. Which is a little weird, since in other classes I do nothing but write on the board. But I do think it’s a good idea, it forces students to take notes, and forces you not to go any faster than they can write. But yeah, I wouldn’t be surprised if some of them didn’t like it and just wanted you to make text-heavy slides available to them.

        Re: releasing teaching evaluation scores to students, Calgary doesn’t do that. My first reaction is that it’s a bad idea, but I haven’t thought about it much.

  4. Thank you for sharing your experience in flipping your class! I have also implemented a post-exam reflection (after the first exam) for students in a 100-level microbiology course. It’s a great way to help students recognize whether or not they have achieved an understanding and allows the instructor to gauge where problems are occurring (are students underprepared, did they misunderstand the question, did they not understand the material) and address them accordingly.

    I have also found it useful to include questions prompting the students to reflect on their understanding of material after certain lectures or in-class activities. For example, have students write down and turn in a card answering what they learned, what they are unclear on, and why is the material useful to them. This provides insights that are useful to the students and instructor.

    I also really like your idea of including clicker questions and short activities–keeps things interesting for the students while providing feedback to you on whether or not the students “get it”. Making adjustments in the classroom is tough, but it sounds like you’re on the right track! Keep up the good work!

    • “I have also found it useful to include questions prompting the students to reflect on their understanding of material after certain lectures or in-class activities. For example, have students write down and turn in a card answering what they learned, what they are unclear on, and why is the material useful to them.”

      Yes, back when I was first starting out I sometimes had students do “minute papers” (because they took 1 minute) at the end of class on what aspect of the material they were finding most difficult. I’ve gotten away from it, in part because I now know from experience what they’re going to find difficult, and in part out of laziness to be perfectly honest.

  5. Jeremy,

    This reads to me very similar to the rhetoric I heard from a professor whose semi-flipped classroom failed. You talk about what you did and what you’d change, and only minimally about what actually worked and why and what actually failed and why. I’d encourage you to lay out a next version of this post the way you would an NSF grant, clearly laying out each issue, your proposed approach to the issue, and why you think it will work.

    In the flipped class I took that did work, I think there were a few key formal innovations:

    1) Daily, in-class, 4-question multiple-guess quizzes, often trivia, with small grade consequence, as an accountability mechanism for doing reading and lectures at home.

    2) Teacher assigned groups, which made for turnover in pairings, and often strategic pairings that attached more advanced students, talented moderators, or the better prepared to those who needed more support.

    3) A clear notion of what was different about in-class time. Classes were practica: we used software packages to analyze unpublished genomics data, or had paper discussions, or did group problem sets where we used the tools and rules discussed in the reading and lectures. This was a sharp contrast to the failed class, where the main difference between in- and out-of-class assignments was whether they were group assignments or individual, and whether there was time to finish them in class.

    Do you think performance on the exams (as opposed to say, sentiments about the subject, enrollment in future classes in your department, or retention of the content or tools in the more distant future) is the right metric for success? Why do you think that attendance dropped? Did you do an actual midterm evaluation?

    I hope that as flipped classrooms are tested and tweaked, instructors use the best thinking and tools to make improvements.

    • “You talk about what you did and what you’d change, and only minimally about what actually worked and why and what actually failed and why.”

      Fair point. That’s for four reasons. (i) I didn’t think many readers would want to read such a lengthy, granular post. (ii) I don’t have the time or interest to write such a lengthy, granular post. (iii) I don’t know what bits worked and what bits didn’t with the level of detail and certainty you seem to want. (iv) As I wrote in an old post, the structure of the class is like a house of cards, the various bits are mutually supporting. You can’t easily change one bit without bringing the whole thing crashing down. That’s why the planned tweaks I described in the post are just that–tweaks. There’s no scope for big changes in the course structure.

      “Do you think performance on the exams (as opposed to say, sentiments about the subject, enrollment in future classes in your department, or retention of the content or tools in the more distant future) is the right metric for success? ”

      Yes, I think performance on exams is the right metric of success. It’s my job to help students master the material, and that’s what exams and other assignments (like labs) are meant to assess.

      I’ll certainly be curious to hear from other profs if they find that students who’ve been through the new version of the course are better prepared than students who’ve been through the old version, and I’d be glad if that was the case. But I don’t know any good way to measure that, and trying to measure it would be a lot of work. Plus, I doubt there will be a massive effect, because I think the only material students retain long term is the stuff they use regularly in the long term. I forgot most of my anthropology and organic chemistry classes shortly after taking them, even though they were well-taught, because I didn’t have any occasion to use what they taught me. Which isn’t to say it was a waste of time for me to take those classes–it wasn’t!–but that’s another topic…

      I don’t think enrollment in future classes is an appropriate measure of success, for various reasons that I’d be happy to list if you’re interested. FWIW, there are at least a couple of students who, having gone through the class, are now thinking about switching their major to ecology or who are now more excited than they were before about the upper-level biostats course we require ecology majors to take. But that’s just an anecdote about a couple of students, plus it’s definitely not just due to the flipped structure of the class. It’s also down to me being an ecologist, me occasionally plugging the ecology major before class, etc.

      “Did you do an actual midterm evaluation? ”

      Not sure what you mean. I described the “midterm reflection” exercise in the post–I take it that’s not what you’re suggesting?

  6. I’m curious. How did you conduct your team exercises? I have taught several calculus courses and have found that (for me) the students liked starting the problems individually and then getting together as a team after thinking about them on their own for a few minutes first.

    When teams immediately started discussing the problems, the fastest thinking and most vocal students tended to dominate the discussions with many other students withdrawing. I found that the time alone allowed weaker students to think about what they didn’t understand, and ask questions (often leading to a very illuminating discussion for all). Flipping a classroom has to be done with a lot of care so as to not alienate weaker students, quieter students, and slow/methodical students who need time to process questions/activities. Plus individual time followed by group time tends to promote a greater diversity of ideas, since the group doesn’t just go with the first reasonable one proposed, usually by the fastest or most vocal student.

    • Good questions.

      On the RATs, students take the quiz individually first, then get together with their teams to retake the same quiz.

      For the team activities, they work on them as a team from the get-go. We try to pose questions for the team activities that are well-suited for teams to work on together, but I’m not sure how well we manage to pull that off. It’s difficult to write good team activity questions.

      RE: not alienating weaker or quieter students, or not letting quick, vocal students dominate the group dynamics…Yes, we worry about this. There’s no perfect solution. We tell the students at the beginning, and remind them throughout the term, that everybody on the team needs to contribute, and that teams are supposed to come up with their answers by discussion leading to consensus (as opposed to, say, just taking a vote, which some teams were doing early in the term). The teams write their own team contracts to formalize this. A small part of the course mark depends on teammates’ evaluations of each other’s (and their own) contributions to the team. In other words, quiet students can’t just hide. FWIW, the feedback we’ve gotten so far is that quiet students appreciate the push to contribute. And I circulate around the room as the teams are working and from what I saw the team dynamics were mostly good. Based on what I saw in the room, and teammates’ evaluations of one another, we rarely ran into issues with everybody just going along with whatever the best/quickest/most vocal student came up with, or quiet students not contributing or feeling they didn’t have a chance to contribute.

      None of which is to say your way of doing things couldn’t also work; I’m sure it does. And it might be that different ways of structuring team work vary in effectiveness depending on the students, the nature of the material, or other factors.

      • Thanks Jeremy. That team contract is a really interesting idea that I never would have thought of. Student evaluations of each other is also a good idea. I took a class that had those back in the day and it was really eye-opening to read evaluations of myself as a team player (both the positives and negatives).

      • @Matthew Holden:

        I wouldn’t say team contracts are a panacea. I had a few teams run into problems this term with chronic absenteeism/lack of effort on the part of a team member, but those were students who just gave up on the class entirely (skipped exams, etc.). Last term I believe my colleague Kyla had a team rendered dysfunctional by personality conflicts. And in theory, if everyone on a team was happy to just leave the team’s work to the best student on the team, and all give one another positive scores on the peer evaluations, I couldn’t prevent that. But overall, my experience in my first term of trying this was that everyone pulled their weight, save a few students who just gave up on the class (which happens in every big first- or second-year course whether the class is flipped or not, in my experience).
