It’s hard to systematically improve K-12 student outcomes by improving teaching. Are there implications for universities?

The Gates Foundation recently spent 6 years and a lot of money trying to improve student outcomes in 3 big US public school districts and 4 US charter school networks by improving teaching. Metrics of teaching effectiveness were developed, tracked, and incorporated into hiring, retention, promotion, and salary decisions. Teachers received frequent, structured observations of their teaching, and received training in how to teach better. And of course, the total investment in all this was much higher than the Gates Foundation’s expenditures, once you account for all the teacher, administrator, and support staff time involved. In monetary equivalent terms, the total cost was on the order of several thousand dollars per pupil per year. All of which made basically no difference for any measure of student achievement, not even if you restrict attention to initially low-achieving students.

I freely admit I’m not an expert on this stuff. Just thinking out loud here. Here’s my question, to which I don’t know the answer: what would you find if you did something similar to try to improve university faculty teaching? On the one hand, university faculty typically have little pedagogical training, so maybe there’s more room for improvement in their teaching than there is among K-12 teachers. I’m sure my own teaching has room for improvement, and I doubt I’m alone in that! On the other hand, just like in K-12 education, a lot of things that have big effects on university student achievement can’t be addressed by anything professors do in their classrooms.

To be clear, my question is not “Is there any meaningful variation among university faculty in how well they teach?” (I’m sure there is!), or “Can individual faculty ever improve their teaching?” (I’m sure many can!) My question is, “Can you meaningfully improve university student achievement, compared to the status quo, with institution-level initiatives that aim to train, hire, and reward good teachers?” I don’t know the answer. But extrapolating (over-extrapolating?) from the linked report, it seems like “no” might be a possible answer.

I guess the question behind my question is, if a university wants to improve institution-wide student learning outcomes, what sort of initiatives work? I’m sure there must be some research on this, of which I confess to almost complete ignorance. Looking forward to learning from your comments.

11 thoughts on “It’s hard to systematically improve K-12 student outcomes by improving teaching. Are there implications for universities?”

  1. I’m not very familiar with the Gates Foundation project, but it doesn’t seem to be about improving teaching, but rather about increasing teacher assessments and setting higher stakes for these assessments. But that might be just me being Finnish and being quite averse to this teaching efficiency language in general…

    We do know there’s a wide range of different interventions which do improve student learning outcomes (for example, Hattie’s list), and teacher performance evaluation isn’t high on that list.

    • My understanding from my quick skim is that it was about both improving teaching by hiring better teachers, firing bad teachers, giving incentives/rewards for good teachers (pay, promotion), *and* giving teachers feedback on their teaching and training on how to teach better.

      I’m inclined to agree that teacher performance evaluation (and financial/promotion incentives for good teaching) isn’t going to improve student outcomes. The outcome of the Gates Foundation study doesn’t surprise me. For me, it’s sobering, but not surprising.

  2. There’s a lot of non-intuitive stuff in pedagogy research these days. I don’t know how much I trust specific studies/results, but it definitely makes me feel like I need to be more vigilant about paying attention to teaching in general – both in terms of professional development and actually being conscious of student needs/success during classes and things.

    Below is a blog post which makes a somewhat orthogonal point to this study, but it does suggest strong heterogeneity in the dependence of student performance on instructor ability. As you have said, this doesn’t necessarily tell us whether institutional reform will make a difference to student outcomes. But perhaps these apparently contradictory effects (different instructors lead to different outcomes, and institutional reform seemingly doesn’t) could be explored further.

    • See my longer comment below. I think you are exactly right. There IS strong (and repeatable) variation in teacher quality. But reforms are not going to change the mean by very much.

  3. Very interesting results. Confirms a sneaking suspicion I had about impediments to progress in learning. But to your question, I too do not know the answer. I do know that I am encouraged to improve how I teach by students demanding clarity and seeking understanding at a level beyond what I currently provide. It is much too rare that they do. The flip side is that no matter how much work I put into the quality of the experience in my classroom, students who are not engaged will not learn, and there are too many of these. So the easy conclusion is that it is the students. I will note though that the amount of money and time talking heads spend analyzing the social dynamics, philosophy, and minute details of the performance of my institution’s sports teams exceeds by many orders of magnitude the discussions by the public of what kind of education is occurring here. Until that changes, I’m not sure I as an educator can do more except make small improvements when the rare student demands it.

  4. I recently learned about this great resource put together by folks at the journal CBE-Life Sciences Education. It summarizes research on teaching practices and provides some suggestions for best practices based on current evidence in the field.

    It’s more specific than the broad question of what can universities do to improve teaching but provides some nice examples of the kinds of things discipline-based education researchers are studying and working on at the university level!

  5. This idea – that good teaching does not necessarily improve performance – is one that keeps me up at night, as I work with students who already face barriers to success (my program serves low-income, northern residents, Indigenous and newcomer students), so the quality of my teaching is one of the primary ways I try to help them succeed! However, I think programs like mine that are holistic recognize the other important factors affecting student success and we try to work on those as well. For example, I work with a team that includes: academic advisor, personal counsellors, Indigenous elder, Arts/English education specialist, and we all serve the same group of students. It’s definitely not a perfect system but I have to say I think this more holistic approach is on the right track…

    • “so the quality of my teaching is one of the primary ways I try to help them succeed!”

      I teach a very different student population than you, but I feel the same way. I suspect that’s a common feeling in many contexts (e.g., climate change). People want to feel like they have agency. They want to feel like they have control over their lives and the lives of the people they’re trying to help. They want to feel like they’re making a difference. But in fact, it’s often the case that you, personally, have at most a *very* limited ability to make a difference on your own (here, by improving your own teaching).

      As you say, one response to that is to try to make a difference by playing a role as part of a much larger group of people. That group of people can make a difference. At least if they focus on the right things, or on a wide enough range of things (as opposed to, say, everybody focusing narrowly on improving everybody’s in-class teaching).

  6. I think a lot of this parallels the K-12 (primary and secondary school) world. One commenter already linked to Hattie, who has done massive compilations of thousands of research papers, basically a book full of meta-analyses on diverse topics. He covers everything, including things like having teachers or not and having textbooks or not, right down to particular forms of pedagogy.

    One broad conclusion from Hattie about effect sizes:
    student/home > teacher > having any teacher > school-level effects ≈ individual development w/o teacher or formal education > 0

    There are some interventions that get up to the same magnitude as student/home or larger, but these are kind of foundational and already done (e.g. having special education for students with learning disabilities, having gifted and talented programs, having a positive, constructive, education-oriented discipline program, giving feedback and not just a score). And there are some on the same magnitude as teacher effects, but most pedagogical interventions are pretty small in effect size. The effect size of who is teaching you is one of the largest that educators have control over. Teacher effects show up not just in short-term studies of test scores, but can even be measured all the way out to higher incomes in adulthood. So realistically, if you want a big bang for your dollar, getting better teachers is a high priority (probably a higher priority than any curriculum or pedagogy interventions except for the obvious things already done).

    That said, how you get better teachers is not an easy nut to crack (in K-12 or university). The correlates of good teachers are things like being passionate, connecting with people, and explaining things clearly. Things like training, having a master’s degree in teaching, professional development time, and the socioeconomic background of the teacher have little correlation. Years of experience matter for the first few years but not at all after that.

    So the Gates study tried two things: more teacher training, especially for those performing badly (as measured by their students’ results, controlling for who their students are), and evaluating and supposedly firing bad teachers. From the last paragraph you might expect that more training was not (and never will be) very effective. The second, in my mind, was the big failure/learning point of the Gates study. Principals were unable/unwilling to give bad performance evaluations, and extraordinarily few teachers were fired even though there were basically financial incentives to do so (and Gates did not tackle the hiring side of the teacher pool).

    My take home is if you want to improve educational outcomes student/home effects are out of your control (as an educator, but governors etc should be homing in on that). And there is not really a lot of evidence for training teachers to be better. So the real lever is to hire great teachers (and secondarily to accurately evaluate and replace bad teachers).

    Of course, easier said than done. And there are real problems with scaling up. If you have a great principal (or dean) who finds great teachers, how do you leverage that to other schools? Given how low pay is, how do you attract enough great teachers?

    But to put it very directly (and harshly) in an evidence based world, you don’t tweak curriculum and pedagogy or train better teachers (all worth doing, but they are small order effects), you make sure you have a staff of great teachers.

    And that is a conversation that is not very popular in K-12 education or in universities (because of course these are real people’s lives). But if you want to improve the educational experience in universities, you have to start putting teaching outcomes to the fore of hiring and tenure decisions. And pay enough that there is a large pool to choose from. It’s that simple. And that hard.

    To briefly bring this back to your questions about universities:
    – Is there meaningful variation in quality among professors? Almost certainly a fairly large and meaningful variation
    – Is there anything we can do to improve an individual already hired teacher’s ability? Let them teach their first five years to get some experience, but not much else
    – Is there anything we can do to improve teaching level across a university? Yes. Hire and fire based on teaching. The Gates study might have intended to study the 2nd half of this solution but did not actually.

    You can see why nobody wants to follow this train of logic. But I think a full and unblinkered read of the evidence leads here.

    • From reading Hattie’s work, do you think differences in test scores [and other outcomes for that matter] are way too noisy to evaluate individual teachers for employment decisions?

      This is my impression from the work by Cathy O’Neil.

      So while it is clear that teachers have a “relatively” big effect size on outcomes, given massive amounts of data, my impression is that models are pretty useless for identifying/predicting who a good teacher is/will be. My guess is that this is to some degree the biggest problem. Operationalising the hiring and firing of good and bad teachers is a lot harder than confirming that teacher quality is important.

      • I think teachers should be evaluated like everybody else – by a multidimensional assessment. Test scores should certainly be used (with care, controlling for the nature of the students in their classes and acknowledging that I don’t think statisticians have yet nailed a truly “fair” purely statistical assessment). But other factors should be included as well, such as principal and peer evaluations. One of the things that came out of the Gates study is that student evaluations were a better predictor of test scores than principal evaluations. That doesn’t mean everything, because I just said test scores aren’t the endpoint, and principals pick up on other things like inclusiveness that might not show up in average test scores. But we are missing the boat by not using student evaluations (and there’s little data, but I would suggest parent evaluations too, especially for younger kids) as part of a multifaceted evaluation. This freaks K-12 teachers out, but of course university teachers have lived with this for a long time (where again they should be used as part of a multifaceted evaluation, rather than as simplistic numerical tests that ignore issues of class type, implicit biases, etc).

        In the end it’s hard to devise quantitative methods (including surveys) that get at teacher quality. But it is not really a subtle thing. Everybody on a college campus knows which of their peers are good teachers and which are not. And every parent (and, in older grades, every student) knows who they hope to get as a teacher. And although there is a little variability (different teachers reach different kinds of students), the reproducibility of these net evaluations is higher than the reproducibility of most measures in ecology. People assess the same teachers as the best year after year after year. This is where really good educational leadership comes in.
