Guest post: judgment fatigue

Note from Jeremy: this is a guest post from Mark Vellend.


I feel tired.  It’s mentally fatiguing to judge others and to be judged, and it seems that we scientists spend an inordinate proportion of our time doing one or the other.  Do we do more than people in other lines of work?  Do we do the right amount given our overarching goals?  Or do we do too little, or too much?  Could we judge in ways that are less exhausting for everyone?  My tired mind thinks we do too much, and that there must be better ways of doing it.  But that’s just from the gut, and I have no specific ideas to offer.  I’m hoping that when you finish reading this you will have some thoughts to share in the comments.

(1) Judging.

  • As an Associate Editor, I’ve got one manuscript awaiting decision, and probably a couple more “out there” in review (I’m not totally sure). I will make a judgement of each, offering my opinion to an Editor about whether a paper should be published in the American Naturalist. There’s at least one major judgement per month.  As a reviewer for many different journals, I’ll do the same, one step back in the process, also roughly once per month.  The decisions are consequential for the authors.
  • Upon submitting my recommendation to the Am Nat Editor, I will also be asked to rate the reviewers of the paper – in other words, to judge the judgements. I’m not sure, but one can imagine that at some level my judgement of the judgements will be judged by other judges.  I sure hope someone judges that judgement. Not.
  • I’ve currently got an application package for promotion to Full Professor to read, after which I will communicate my judgement as to whether this person deserves promotion. Maybe 2-4 things like this happen in a given year.  The outcomes matter for people’s lives.
  • The season of NSERC Discovery Grants is upon us, and as a member of the Evolution and Ecology evaluation panel, December-February will involve evaluating some 40-45 full proposals. February will include an intense week of coming up with final judgements as to whether each grant is insufficient, moderate, strong, very strong, outstanding, or exceptional, in each of three different ways: excellence of researcher, merit of the proposal, and training of highly qualified personnel.  Over my three-year term, this committee will determine the funding level for more than half of Canada’s ecologists and evolutionary biologists funded through this program.
  • For the dozen-plus grad student committees I’m on, each year will involve a handful of oral comprehensive exams, a handful of defenses, and a larger bunch of progress reports. On top of that will be a few turns as external examiner for Ph.D. exams or defenses elsewhere.  The committees will decide whether students can continue their degrees, or get their degrees, in addition to judging all the details of their research.
  • And also…job applications and interviews, awards committees (I’ve been on two major ones for scientific societies during the past 3 years), discussion groups critiquing papers, grad student seminar evaluations, undergrad papers and exams…and I’m probably forgetting things.

(2) Being judged.

  • Every judgement involves both the judge and the judged, so this section can be short (switch roles from the last section). In sum, for every judgement I render, there is a judgement received, on a manuscript, grant proposal, or promotion application of my own.  And while no one has an evaluation sheet in hand when I give a seminar, there are most certainly judgements made, sometimes evident during question period (or by someone falling asleep), sometimes not.  After a paper is published, sometimes you receive positive feedback, sometimes rather harshly worded e-mails. The positive judgements feel good, but probably not as good as the negative ones feel bad. They wear you out.

Important point 1: This is likely a fairly typical judgement load for someone at my mid-career stage (with the exception of the NSERC panel, which is not a constant throughout a career). In other words, I’m not claiming to do any more of this stuff than anyone else, but rather that we all spend a tremendous amount of time as judge or judgee.  Little stuff happens daily, and formal judgements are probably given or received several times per week, averaged over the year. It’s exhausting.

Important point 2: The cumulative mental fatigue is not just from doing the work, as it is for an activity like preparing a lecture.  It also stems from the knowledge that each decision matters for someone else’s future – some in only a small way (a mark on an oral presentation) but some in a major way (e.g., funding, promotion).  The fact that essentially all final judgements involve multiple judges eases the stress somewhat, but it’s still there.  On the other side, being judged is always stressful.

Important point 3: The work of judgement is extremely important and very often gratifying.  The importance hardly requires explanation: ideas turn into data and data into knowledge via judgements.  The activities described above involve helping many people to get better at what they do, to think more clearly and critically, and to improve communication.  The judge also learns a lot, and all that is quite gratifying.  In short, my intention is not to suggest that we’re just running around like chickens with our heads cut off, following the rules of the system blindly.  The work is important and rewarding.  But…

  • Do some people steer clear of scientific careers once they see how often they’ll need to judge and be judged? If so, are we losing out from the absence of those people?  (Note that I’m not asking about the fairness of judgements – a very important issue – just the fact that so much judgement happens.)
  • Would science be better off with more doing and less judging?
  • To return to my initial questions: Are we doing the right/wrong amount? Could it be done in a less exhausting way (for both judges and judged)? To be sure, some strategies are already aimed at reducing our collective judgement load, such as journal editorial decisions to “reject without review”, or reducing the number of times one can apply for a given fellowship/grant.  These strategies also come with costs, so solutions will not be a simple matter.

Anyone else feeling judgement fatigue?  I look forward to hearing what you think…

78 thoughts on “Guest post: judgment fatigue”

  1. Very nice post! Regarding your rhetorical question “Do we do more than people in other lines of work?”, probably only people in the justice system do more judging than us. 😁

  2. Thanks Mark,

    To friends and relatives that don’t have a clue about what my job is about, I often conclude by saying: “Don’t worry, I don’t know many other professions that are being judged as often as we are”. It is indeed tiresome because, just as you said, we work on both sides of the judging system.

    A large part of being judged is about understanding the judging system. How many times as an early-career scientist did I fail to figure out the “unwritten rules” of the judging system? These rules will differ if I try to get funding from NSERC discovery, foundations, some partnership programs, or my institution. They differ in the range of expertise on the review panel, the evaluation criteria, the level of competition, and even the “seriousness” of the judging system itself. In turn, these considerations are a strong incentive to step up and judge others in different judging systems, because this is how you learn how it works. It is therefore not just about judging or being judged, it is about learning how the science system works.

    The number of journals, with different aims and scopes, has exploded, but so has the number of funding opportunities and the number of achievement awards, all supported by different judging systems with their own unwritten rules. The wheel is spinning and we do get bored. I don’t know what the proper solution to this is. What I do see are researchers who specialize in getting one or two types of funding and are happy with it. The science judging system is compartmentalizing.

    • “In turn, these considerations are a strong incentive to step up and judge others in different judging systems, because this is how you learn how it works.”

      Very interesting remark. Suggests that the existence of “unwritten rules” is a good thing in that it gives people an incentive to volunteer as judges!

      • Yes, but the problem I see is that the number of different judging systems keeps expanding: foundations, industries, local partners, municipalities, NGOs, federal agencies of all kinds, not to mention scientific journals. I cannot, and probably don’t want to, figure them all out, but I would still be curious to see how they operate.

      • Great point about the unwritten rules, and how judging helps a lot when putting forward things to be judged. But to respond to Jeremy’s conclusion, surely it can’t be a good thing that people who haven’t been on the judging side (e.g., of an NSERC panel) are at a relative disadvantage.

        I think the role of mentoring here is huge. As an early career scientist I benefited immensely from insights more seasoned researchers could provide. Part of that was good fortune to have generous people offering advice, but part of it is under our control – seeking out advice rather than hoping that just reading the instructions is enough.

    • I think unwritten rules are the main source of stress related to academic judging. That’s why I try to write a lot about them on my blog, so students and postdocs can navigate the system better. We need to discuss openly even the issues that seem trivial.

  3. As I experience it, your last two points actually relate to two distinct parts of the process in research and education. The review part, and the qualitative feedback (as opposed to quantitative, or at least ordinal, feedback) that can be given as a result, is the gratifying part both for the “judge” and the “judgee”. But it is not that judgmental, in that this part per se does not involve any decision. The mental strain rather comes from the final decision, the actual judgement, one has to make, often reflected in a quantitative/ordinal measure of success and/or an accept/reject decision. As an example, critically reviewing a research paper for a colleague and giving my feedback is a pleasure, even when I am very critical of the work. Doing the same as a reviewer for a journal, and in particular suggesting a decision and/or grading aspects of the paper, is the straining part. So I am not sure I would see anything really positive in judgements: they are just here because there are choices to be made and someone has to make them. In this case, sharing the burden and discussing decisions is probably the best thing to do for one’s mental health. But in any situation where judgement is dispensable, I do without it and favour constructive and detailed comments, especially in my teaching activities (e.g. no scores, even ordinal ones, in students’ peer evaluations).

    • Great point about feedback vs. final judgement. Indeed I love discussion groups and small meetings in which we look for logical holes in arguments, potentially better methods or analyses or ways of explaining things, etc. I totally agree that it’s the final quantitative/ordinal judgement that is stressful. At some point in the next couple years I’ll probably end a 10+ year run as editor at one or more journals (every submission involves a final thumbs up/down decision), and then do more work as a reviewer (same constructive feedback but less of a sense that a final decision is on my shoulders). Looking forward to it!

  4. Thanks for this post, Mark. Many of us are on reading week, and when we slow down just a little, we realize how tired we really are. I look around, and I see that most of us are as tired as me, and as you. The majority of us are really working at this same level, except for some who have decided to step back from one or more roles. This can happen at any time, due to necessity (health, family, other roles we need to take on) or choice. We do have a choice, and for myself, as time goes on I take on fewer manuscript reviews, since to do a good job I take quite a bit of time, and I want to learn each time, so I only review the manuscripts that really interest me. I do enjoy the research evaluation committees; I learn an enormous amount each time. It is hugely stimulating and lets us see where our science is going more generally, and this is fascinating.
    But I also note around me that our health often eventually suffers from this overabundance of activity and judging. The scientists who are my “models” (not that many) do less, but do it very well when they take on a task. There is always a choice, and I would recommend saying no more often, at any career stage; placing an emphasis on quality, not quantity. Easy to say. Thanks again, we need to think about this more critically.

    • Thanks for this, Alison. This relates to the tweet above from Droo. For some things we don’t have much of a choice (e.g., if you’re a teacher, you need to evaluate the students), but otherwise we can make choices, and obviously our mental/physical health needs to be a priority. However, saying ‘no’ more often doesn’t reduce the judgement load on the scientific community, it just means someone else will have to say ‘yes’.

      But you have made me think that if we all _submitted_ less stuff for evaluation (slower science) that really would reduce the collective judgement load. The pressure for numbers (papers, grant dollars) is definitely an issue here.

  5. Perhaps we formalise judgement-making more than many other professions, but that seems to me to be a positive feature of academia. Everyone makes judgements constantly, often subconsciously, and these are not necessarily inconsequential. Is it the fact that we link them directly to processes and outcomes that makes ours more emotionally taxing?

    One of the paradoxes of academic judgements in my experience arises in the unusual context of the review panel. I have yet to attend a panel meeting in which the participants did not conclude by congratulating one another on a very fair and rigorous decision-making process, even while lamenting that they could not have given out more awards. And yet the aggregate outcomes of these same panels so often display massive biases. This is perhaps why organisations are so resistant to criticism about their records, for example if men appear to win an unreasonable proportion of awards. After all, everyone tried very hard to be fair! How can it be that individual decisions appear defensible while aggregate outcomes are so often not?

    • “And yet the aggregate outcomes of these same panels so often display massive biases. ”

      What sort of biases are you thinking of? The data that come immediately to mind for the granting bodies with which I’m most familiar indicate quite fair decision-making processes. But of course, I’m only familiar with a small number of granting bodies.

    • Interesting. My experience is that the averaging process of aggregate judging by exclusion favors good-but-average profiles. Reviewers each make a list of their top 5 awardees. One candidate appears on two lists, but not the third. Most often, this candidate will be excluded, because the panel will prefer to focus on what they have in common. By this process, it is easier to exclude than to include. I’m pretty sure there are ways of avoiding this bias.

    • Thanks, Markus. I agree that there is a positive aspect to constant judging. Related to Charlotte’s point above, indeed seems likely that it is the accept vs. reject or winning vs. not winning an award that makes the stress mount as a judge.

      I am hopeful that many of the biases that seem evident in things like past lists of award winners, for example, are being actively tackled at the aggregate level at present, such that the future on that front looks bright. NSERC, for example, is very actively trying to deal with (un)conscious biases of various sorts, and hiring/award panels I’ve been on have actively discussed issues of equity, diversity, and inclusion. I believe these efforts have affected aggregate outcomes, and Jeremy’s recent post on hiring might be evidence of that.

      • As with Jeremy above, I think the panels I’ve been on have been very fair, and increasingly aware of their responsibility to support diversity within the field. Unconscious bias training might be helping. Certainly at BES we now check the portfolio of award winners after the panel stage, which is effectively ‘reviewing the reviewers’, and sometimes decisions are revisited. Historically, however, persistent biases have been apparent in most academic bodies, and I’m sure the participants in those decision-making processes didn’t consider themselves to be anything other than fair either. Perhaps the long-term trajectory is towards improvement; there are positive signs but it’s too soon to tell. Our perception as white, male, mid-career tenured academics isn’t necessarily the most accurate either. You, me and Jeremy aren’t exactly an advertisement for diversity…

  6. I’ve been thinking about how we would reduce the amount of judging we do. Here are some ideas, and I’d be curious to hear what folks think of them. I suspect that most folks, even those who feel judgment fatigue, wouldn’t want to see most of these ideas adopted (I sure wouldn’t…). But I could be wrong!

    -Have selective journals do even more desk rejections. That is, put more of the collective burden of judging papers onto the shoulders of editors, to relieve the reviewers of it. (Aside: years ago, my own motivation for proposing the “PubCred” system–basically, obliging authors to review in appropriate proportion to how many mss they submit–was to *reduce* the number of desk rejections and so *increase* the amount of judging we all have to do…)

    -only get 1 review per paper rather than 2-3.

    -do away with pre-publication peer review. Just put preprints online and leave readers to filter and evaluate them…somehow (e.g., via Google Scholar’s recommendation algorithm, via the recommendations of people they follow on social media…). Actually, we already know from poll data that this is a very unpopular option:

    -have all journals review only for technical soundness, with readers left to filter and further judge those papers for interest/novelty/importance on their own. Poll data shows that’s unpopular too:

    -give out research grants of equal size via a lottery to all eligible candidates, perhaps after a preliminary quick-and-easy judging step to weed out bad proposals. Can a case be made for this in Canada for NSERC Discovery Grants, for which the success rate is ~70% and for which there’s not *that* much variation in grant size (a few high fliers aside)? This would be a much more radical reform in most other countries, where success rates are currently much lower.

    -stop giving out merit-based awards. Quit giving out merit-based scholarships and fellowships, no more student awards for best poster or talk at conferences, no more scientific society awards for the best paper or best young researcher, no more Nobel Prize or Fields Medal, no more university teaching awards, etc. Instead, spend whatever money is involved in other ways, such as on scholarships, fellowships, and student travel awards given out on basis of financial need or via a lottery. (Note that this may not actually be an option in some cases, if the award is funded by an endowment or bequest that can’t legally be spent on other purposes.)

    -give out some merit-based awards based on some formula. For instance, give out graduate student scholarships based on some combination of GPA, number of conference presentations, number of papers, etc. (Aside: note that there are actually-existing versions of this, for instance in countries in which college admissions, scholarships, or civil service jobs depend heavily on results of a high-stakes exam.)

    -in countries where student work gets a second grader (e.g., the UK), cut back to one. This one actually seems quite feasible to me, since I’m in a country where the norm is one grader.

    Semi-related: my old post asking what ecologists should learn *less* of, in order to free up time to learn more about all the things that somebody thinks we all need to learn more about. Asking “how can we have less judging” strikes me as a similar sort of situation. Probably lots of people like the idea of “less judging”, just as they like the idea of all ecologists learning more coding, or more natural history, or more [insert your favorite subject here]. But I bet hardly anybody actually likes any of my concrete suggestions for how to have less judging. Everybody wants to go to heaven, but nobody wants to die.
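To make the formula suggestion in the list above concrete, here is a minimal sketch of what such a scoring scheme could look like. The criteria, weights, and numbers are invented purely for illustration; no agency is claimed to use anything like them.

```python
# Hypothetical formula-based scholarship scoring (illustrative only).
# The weights and criteria below are invented for this sketch.

def scholarship_score(gpa, n_talks, n_papers,
                      w_gpa=10.0, w_talks=1.0, w_papers=2.0):
    """Combine a few countable metrics into a single score."""
    return w_gpa * gpa + w_talks * n_talks + w_papers * n_papers

applicants = {
    "A": scholarship_score(gpa=3.9, n_talks=2, n_papers=1),  # 43.0
    "B": scholarship_score(gpa=3.6, n_talks=5, n_papers=3),  # 47.0
}
# Rank applicants by score, highest first
ranking = sorted(applicants, key=applicants.get, reverse=True)
```

The worry raised elsewhere in this thread applies directly: once the weights are public, applicants can optimize for the formula rather than for the underlying work.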

    • I wish I had more to say here, but for every action that seems like it might help, one can too easily think of downsides. To some degree I subscribe to the Freakonomics view that people will (almost) always act in accordance with the various incentives, so if you create an algorithm for determining winners, people will go about their science in ways that maximize their score. (There is evidence of this if you compare countries with different incentives and how scientists subsequently behave.) If your next grant is uninfluenced by whether you just stay active or put extra effort into doing something a bit risky and super cool, many (but certainly not all) people will be less likely to really dedicate themselves to deep thought and reflection. Of course we have competing incentives (maybe we’d like a particular recognition that requires lots of papers, but we also don’t want to be seen as publishing lots of insubstantial papers), and people weight incentives in different ways, but many of us operate in the same basic “system”.

      I wonder how one might incentivize submitting less stuff? That obviously reduces the collective judgement load. If three Ph.D. projects relate to a common theme, you can be pretty sure you’ll get at least 3*3 = 9 papers, when really there could be far fewer. I see no way in the current system to do it any other way without compromising career trajectories of the students. Ideas anyone?

    • I love many of these ideas, Jeremy. Particularly the most radical ones. I would love to see NSERC grant awards automated. It wouldn’t have to be equal amounts for everybody, but automated in the way that you suggested for student scholarships. Do the same for student scholarships. I think there are a bunch of awards that we could get rid of – not the biggies like the Nobel Prize or the Fields Medal – but a lot of the university-level awards are prizes that professors decided we should give out to professors. Most teaching awards weren’t instigated by students clamoring to recognize the best teachers – it was professors deciding we should give professors awards for teaching. And these awards often seem to get passed around – if university-level teaching and research awards actually identified the best teachers and researchers, the same people would have win streaks. If Professor Smith is the best teacher or researcher this year, it seems logical to me that they would often be the best teacher or researcher several years in a row. How often have you seen repeat winners? Am I picking all the ones you wouldn’t support? Jeff

      • I would not love to see NSERC grants automated according to a formula. That thought horrifies me. Especially because the formula would have to be made public and so would be gamed.

        I do think it’s a nice morale boost for university staff to have university level awards. And at Calgary, there are some teaching awards given out by the student’s union.

        I could probably be talked into some automated system for giving out some student merit scholarships. For instance, at our uni, I think you could probably give out NSERC USRAs automatically on the basis of GPA without changing the actual outcomes much if at all. Though since I now sit on the Faculty of Science awards committee maybe I’ll discover I’m wrong about that.

    • I think the lottery system should at least be tried (at least for the gray list of proposals that are good but really apples and oranges), not only to solve the judgment fatigue, but also to create a fairer distribution of research money (by fair I mean not influenced by reviewer or committee member noise caused by low sample size and by a potentially biased list of people from which they are sampled). In Germany there is a grant scheme called EXPERIMENT where half of the grants are chosen based on judgment and the other half on lottery. They are going to evaluate which system provides the best science. That of course requires some judgment again…
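The two-stage “triage then lottery” scheme discussed in the comments above can be sketched in a few lines. The scoring scale, threshold, proposal names, and seed are all invented for illustration; this is not a description of any real program.

```python
import random

# Hypothetical two-stage grant lottery (illustrative only):
# a quick triage weeds out proposals below a minimal-quality bar,
# then awards are drawn at random from the survivors.

def lottery_awards(proposals, n_awards, min_score=3, seed=None):
    """proposals: dict mapping proposal name -> triage score (e.g., 1-5)."""
    eligible = [name for name, score in proposals.items() if score >= min_score]
    rng = random.Random(seed)  # seeding makes the draw auditable/reproducible
    return rng.sample(eligible, min(n_awards, len(eligible)))

proposals = {"P1": 5, "P2": 2, "P3": 4, "P4": 3}
winners = lottery_awards(proposals, n_awards=2, seed=42)
# P2 is triaged out; the two winners are drawn from P1, P3, P4
```

One design note: publishing the seed (or deriving it from a public source) would let applicants verify that the draw was genuinely random, which addresses some of the fairness concerns raised in the thread.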

  7. Two of the suggestions caught my attention:

    “Give out research grants of equal size via a lottery to all eligible candidates, perhaps after a preliminary quick-and-easy judging step to weed out bad proposals”
    -I like to think that a researcher who remains active – by supervising some students, publishing in good peer-reviewed journals, attending conferences, and having ideas and research questions – should have access to basic funding. Demonstrating the above is fairly easy; the ranking is the difficult part, as always.

    “Give out some merit-based awards based on some formula”
    -Why not? I would not be shocked. But don’t tell the formula!

    • “But don’t tell the formula!”

      I suspect that if it became known that there was a formula, the demand to reveal it would be overwhelming. And I don’t know how one could possibly keep even the *existence* of a formula secret.

      • Daniel Kahneman’s book Thinking, Fast and Slow provides many good examples showing that in many decision systems a simple formula does as good a job as, or a better job than, expert assessment. He goes on to explain that we believe decisions made by experts are less biased, which according to him is fundamentally wrong.

  8. Interesting post Mark. It prompted some self-reflection, because I don’t feel judgment fatigue even though you and I have similar judging loads (save that I don’t sit on the NSERC panel, but as you say that’s a temporary obligation). Here are some hypotheses for why I don’t feel the same judgment fatigue you do:

    -Maybe the rewards of certain sorts of judging loom larger for me than they do for you? I *really* liked serving on the ASN YIA committee. And I like editing for Am Nat because it forces me to read more than I otherwise would, and the papers I’m being obliged to read are mostly good.

    -I’ve rarely had a sense that my judgments are very high stakes for those being judged. For instance, when I served on the ASN YIA panel, I didn’t feel like we were going to make or break anyone’s career with our decisions. And I don’t worry that, as an Am Nat editor, if I reject a student’s paper or a pre-tenure prof’s paper, that will make or break that student’s or prof’s academic career. I think people’s career outcomes mostly reflect lots of judgments by lots of people; they also reflect lots of things unrelated to anyone’s judgments. So there aren’t that many scientific judgments that are really high stakes for those being judged (there are some, of course).

    -I like to think of myself as a good judge of science. It’s part of my self-image as an ecologist: that I’m a good judge of what makes for good ecology. That’s part of what gives me the confidence to blog about the topics I do (“zombie ideas” etc.), and to enjoy blogging about them. I don’t *think* I’m unique in this. In The Silwood Circle, Hannah Gay talks about how two shared traits of many successful scientists are “heightened yet selective awareness of what others are thinking [about science]” and “greater than usual willingness to pass evaluative judgment on the scientific work of others”. But of course, I’m sure one could argue that even if those traits have been characteristic of successful scientists in the past, that doesn’t mean that they *should* be. Perhaps science as a whole would function just as well, or even better, if it were arranged in such a way that those traits were no longer particularly useful or valued. I’m not sure; I’m curious to hear what others think.

    • Just for the record, I also quite like editing for Am. Nat. (and previously for Oikos, Ecology Letters, and Proc B). At least in a general sense. For papers that are technically and logically sound, the stress starts to build as I form a judgement about whether the paper fits the bill for Am. Nat. (change how people think about a topic, conceptual synthesis…[people can go read their criteria]) or Ecology Letters (“exceptional novelty”). From experience on the other side, two different equally qualified people can give confident thumbs-up or thumbs-down judgements, and so I’m always aware that while I might be as qualified as the next person to make the judgement, it can easily enough go a different way.

      It’s true enough that any one decision is not make or break for someone’s career, but it wasn’t so long ago that I was a student for whom it sure _felt_ like a massive deal to have a paper accepted vs. rejected at journal of choice or to win or not win an award. I’m in that person’s head as I think about decisions…

      • “but it wasn’t so long ago that I was a student for whom it sure _felt_ like a massive deal to have a paper accepted vs. rejected at journal of choice or to win or not win an award. ”

        Sure. Meghan has a good post on this:

        So one question is, if it is indeed true that no one judging decision will make or break someone’s career, how much should judges worry if some of those being judged *feel* (incorrectly!) that a single judgment will make or break their careers? Obviously, judges should sympathize with those being judged who are anxious about the perceived stakes. But arguably, doesn’t it go a step too far for judges to themselves start feeling (again, by hypothesis *incorrectly*) that the stakes are as high as those being judged perceive them to be?

        My thinking about this is probably shaped by my experience compiling extensive data on recently-hired N. American TT ecology faculty. Those data reveal that many ecology faculty job seekers have seriously incorrect beliefs about many aspects of the ecology faculty job market – for instance, incorrectly believing that various very minor things have huge effects on search outcomes. Those incorrect beliefs lead to a lot of understandable anxiety and frustration on the part of ecology faculty job seekers, with which I sympathize. But I don’t feel the same anxiety and frustration myself (and didn’t even back when I was a faculty job seeker). And so I write posts about the data I’ve compiled to try to relieve anxiety and frustration based on incorrect beliefs.

        As an undergraduate instructor, I often have the experience (which I’m sure every undergraduate instructor has) of helping anxious undergraduates realize that one bad grade on one exam is not going to determine their fate in the course, or in their degree program, or in their lives. Perhaps those of us who act as mentors to, and judges of, trainees need to do more to help them understand the same thing with regard to their papers, their fellowship and award applications, etc.

    • My rational mind agrees with you – don’t take on someone’s stress if they’ve overestimated the consequences. My emotional mind overrides the rational one, and so it’s stressful!

      • The part about editing that I dislike the most (by far) is the part where I have to make a recommendation. I like thinking of different possible reviewers, I like thinking about how to improve a paper, but I agonize way more than I should over what the recommendation should be for many manuscripts. I agree with Mark that it feels like a big deal, especially if the first author is an early career researcher.

  9. Can I just say how happy it makes me that we immediately got such a good comment thread going? Rather than people just tweeting to say they liked the post? I feel like it’s 2013 again around here, and that’s an increasingly rare feeling.

  10. This post made me think about how all this constant judgement interacts with imposter syndrome and career stage. One of the exhausting parts of judging can be constantly convincing yourself that you are qualified to judge others’ work, and being judged is exhausting because it leaves so many opportunities to feel inadequate. (On the other hand, I have found that when there is more interaction in these situations, such as conversations between a reviewer and author, it can actually be helpful in alleviating this feeling on both sides – making me feel like my comments are actually helpful, and that someone respects my work enough to give useful feedback in addition to the judgement.)

    • Thanks, Claire. What do you mean by “conversations between a reviewer and author”? Some back-and-forth with an editor (for example) can be appropriate, but there is also, I think, a need for a bit of “distance”, so that one can remain as objective and fair as possible (e.g., not favouring authors with whom you had a great exchange vs. authors whom you didn’t get to know).

      • That’s a good point. The examples I was thinking of were still in the context of a standard single-blind review process (so, not a real “personal” interaction); for instance, getting a second set of revisions back that says something along the lines of “the changes you made also made me think about this” or “that’s what I was getting at and the point is clear now.” As frustrating as it can be to go back and forth a bunch of times with reviewers, it can start to feel like the judgement is accompanied by being helpful. The same goes for interactions with an editor. But I can also see how this is a bit of an aside to your original point, since you could have a similar interaction (e.g., with a colleague’s “friendly review”) without the formal “judgement”.

    • Yes, I totally agree about the interaction with imposter syndrome! A blog post I’ve been meaning to write relates to this. As a reviewer, I used to feel like a total fraud if the other reviewer noticed a problem I missed. And, as an Associate Editor at AmNat, sometimes the “big editor” notices something I missed, or gives a really amazing suggestion that I didn’t think of, or something else that makes me feel imposter-y. But I’ve come to realize (if not to fully internalize!) that part of why this happens is because different people have different perspectives, so it’s actually *good* if the reviewers don’t all notice the same thing, because it indicates that the reviewers were coming at the manuscript from different angles. As an AE, I now realize that I often choose reviewers to be different intentionally (e.g., one who is an empiricist and one who is a theoretician; or one who works in that study system and one who works on the topic but in a different system). Given that, it’s to be expected that the reviews will pick up on different things. But I know that, for lots of people, missing something another reviewer noticed triggers imposter syndrome.

      • “As an AE, I now realize that I often choose reviewers to be different intentionally (e.g., one who is an empiricist and one who is a theoretician; or one who works in that study system and one who works on the topic but in a different system). ”

        Yup, as an editor I, too, often deliberately choose a mix of reviewers who will read the ms from different perspectives. I think many good editors do this, but I don’t really know. I haven’t talked to many other editors about how they choose reviewers.

        “But I know that, for lots of people, missing something another reviewer noticed triggers imposter syndrome.”

        If it’s any comfort, I can confirm that *nobody* catches *every* technical mistake they encounter. Not even leading senior people who are conscientious reviewers/editors and technical experts on the subject or methods of the ms. So if you miss something as a reviewer, well, it happens to *everyone* at some point. It’s ok. That’s one reason why journals ask for multiple reviews.

        I wonder if experience playing team sports (as a teenager or adult, not as a little kid) helps with this? I played baseball up through high school. Good teammates are always picking each other up after a mistake or failure. Telling each other “It’s ok, we’ve got your back” or “Don’t worry, we’ll get those runs back” or whatever. For me, that experience helped me get used to the ideas that (i) everybody fails or makes mistakes sometimes, and (ii) when you fail or mess up, there are usually other people who can help undo, fix, or cover for your mistakes.

      • I played tons of team sports (and even coached basketball during summers in college!), so I’m not sure that’s it.

      • As AE, I don’t really feel like I have a ton of control over the expertise of the 2-3 reviewers. I recommend 7-10 potential reviewers, and they represent different perspectives, but which of them actually agree to do the review is a crapshoot (the “final” two might have a very similar perspective).

      • I often tell the journal office to get me one reviewer from this list of 4 (or however many), one reviewer from the list of 4 other people with different expertise, etc.

      • I often end up sending the AmNat office decision trees along the lines of:
        1. Start with X. If they say no, move on to Y. But don’t ask both X and Y.
        2. Start with A. If they say no, move on to B. But don’t ask both A and B.
        3. If still not at two reviewers, try P then Q.

        I do that to try to end up with people with different perspectives (though it doesn’t always work out!).

        Or, for a paper I’m handling now for E&E (for their Academic Practice section), I wanted one reviewer who was earlier in their career and one who was senior. I was a little surprised that it ended up being much easier to get a senior person to accept the review!

      • I should probably be doing much more of that multiple lists / decision tree thing – great idea. Although it does increase the workload of the judgement process even further…

      • I find giving the journal office a decision tree doesn’t appreciably increase the amount of time I spend identifying candidate reviewers. Maybe just because I’m unusually fast at that? How long does it take you to come up with, say, 6-8 potential reviewers?

  17. Mark, I felt very close to your post when I read it. I am 55, so I have been through a lot of judgement, to a level that became painful. My main fatigue was related to contracts and positions because, as you say, the future of colleagues is at stake. In Spain, where I work, budget cuts that have now lasted 10 years have left many people still struggling for a position, judged every year only to end up back in the queue for the next round of chances… That means people 45 years old and even older. And it means that many of them have better CVs (better in metric terms, which is the least harmful way to evaluate an intellectual career) than many researchers who got a position before the economic crisis arrived (as I said, it arrived 10 years ago). It was so painful for me that I decided to give it up: I moved to another research branch, gave up the research group that I had built with great effort over the years, and became a solitary researcher who does not accept any further involvement in judgements, except for reviewing papers and some small proposals. That’s my story.

    (Note from Jeremy: comment edited to correct a typo)

  12. Mark asked “Do some people steer clear of scientific careers once they see how often they’ll need to judge and be judged?” I seriously considered not going to grad school when I heard about qualifying exams. The idea terrified me.

    • One small factor for me in choosing a grad school was choosing one that didn’t have a written qualifying exam. There are several schools that weed out half their PhD students during these exams and give them a master’s instead (rather than just doing an oral exam – “talk about your research” plus a quiz on semi-related topics). I think I do fairly well on high-pressure timed exams relative to my abilities, but they cause me too much stress to be worth it. I also viewed this exam style as a cue that the program, in general, wasn’t the right fit for my personality.

      • I actually thought this point would generate more discussion. Having known someone smart who failed their comps (when I was an M.Sc. student), I was pretty freaked out about the idea of a long oral exam about everything. But aside from that particular exam, I do have the sense that a decent number of people shy away from the academic science route because of the constant judgement. Maybe that’s just the way it is (e.g., don’t become a fire fighter if you don’t want to rush into burning buildings), or maybe it doesn’t need to be such a strong disincentive.

      • My own admittedly-anecdotal sense is that a decent fraction of grad students are in grad school at least in part because they like being in school and they’re good at it. Ok, they don’t *like* taking high stakes exams. But on balance, they like taking exams (and doing the other school-like stuff that goes along with that) more than they like the alternative of not being in school. So I dunno, I find it hard to guess even roughly what fraction of prospective grad students shy away because they don’t want to be judged so often. And as you say, even if that fraction is high, is it a fraction we should aim to reduce, or is that just the way it is?

        For some forms of judgment, I do think that’s just the way it is. In particular, there’s not much support in our polls for the idea of doing away with peer review. So if you really, really don’t like the idea that your science will be evaluated by other scientists on grounds of interest/novelty/importance/etc., well, sorry, you either need to get used to it or you should do something else with your life (which of course might include “a job as an academic scientist at a teaching-focused institution, so I don’t have to write peer-reviewed papers”).

      • “My own admittedly-anecdotal sense is that a decent fraction of grad students are in grad school at least in part because they like being in school and they’re good at it. Ok, they don’t *like* taking high stakes exams. But on balance, they like taking exams (and doing the other school-like stuff that goes along with that) more than they like the alternative of not being in school. So I dunno, I find it hard to guess even roughly what fraction of prospective grad students shy away because they don’t want to be judged so often. And as you say, even if that fraction is high, is it a fraction we should aim to reduce, or is that just the way it is?”

        Well, once I’m past undergrad, I question whether a written 2-hour math exam on all my undergrad and grad coursework is a useful indicator of my ability to do good research. I chose to seriously doubt the cultural fit of any program that had such an exam, which could block PhD completion. To be clear, this exam is different from, and in addition to, the traditional oral exam about research progress/proposal and breadth of knowledge on topics broadly pertinent to the general research area. I think such an oral exam is valuable. However, taking 3 months of my life to study for the former type of written exam, when I could be using that time to do research, doesn’t fit my view of what the PhD is all about. PhD programs are products, and I “judged” the programs that needlessly judged their students.

        I looked favorably on the programs that valued creative research and a student’s ability to do that research, rather than ones that tested general “math chops.” In pure math, the written exam might be more valuable, and my guess is that in applied math programs, it is somewhat of a relic of the pure math legacy.

      • Interesting. I wasn’t aware math PhD programs had a big math test in addition to what life sci programs would call the candidacy or comprehensive exam.

    • “So if you really, really don’t like the idea that your science will be evaluated by other scientists on grounds of interest/novelty/importance/etc., well, sorry, you either need to get used to it or you should do something else with your life (which of course might include “a job as an academic scientist at a teaching-focused institution, so I don’t have to write peer-reviewed papers”).”


      I’m going to push back here. I work in a super contentious part of the world, ecologically and socially. Judgement happens often, and aggressively. On our staff, we have a trained facilitator/moderator who has spent a significant portion of her time training people to give better, more useful, and less personally driven feedback. I didn’t realize how awful many scientists, both inside and outside of academia, were at giving feedback until I started working with her. I had accepted it as normal. Moreover, I was shocked by how many scientists were downright terrible at receiving perfectly respectable and justified feedback. Most saw any negative feedback as an attack on them personally.

      The moral of my story is that not all “judgement” is the same. It can be helpful, motivating, constructive, and non-personal. It can also be demeaning, personal, depressing, and useless. We not only need to learn how to give it, we must also learn how to take it. As a profession, I feel that we have quite a ways to go on this front, and I think it adds to our judgement fatigue.

      Finally – just a quick story. A student I know received a review this year that said she was an “embarrassment to her field.” It was her second paper ever submitted, and to a reputable journal at that. That’s wildly inappropriate, and it’s stunning that it made it all the way back to the authors. I’m now a bit embarrassed to say that, at the time, I told her to “shake it off” and that “it happens to everyone once in a while.” We can do so much better than that.

      • Oh, I agree completely that cases like the one you describe at the end are wildly inappropriate. Nobody should ever have to get used to crap like that to have a career in science! As I perhaps should’ve made clearer in my comments, I was suggesting that, when it comes to peer review, students do need to get used to receiving *professional* (but critical) peer reviews of their work. Nobody should have to get used to, or grow a thick skin about, personal attacks.

        As your story illustrates, peer review in science has room to improve. I do think reviews are mostly professional, as indicated for instance by survey data showing that most scientists report mostly or entirely positive experiences with the peer review system. But the fact that most peer reviews are professional obviously doesn’t somehow make up for the unprofessional ones.

        I have an old post on this in the context of so-called “post publication review”. My anecdotal sense is that post-publication reviews are much more often unprofessional than are pre-publication reviews, and that advocates for post-publication review are far too quick to sweep this unprofessionalism under the rug.

        I also have an old post on how to write a peer review.

        The British Ecological Society publishes a very good guide with more detailed advice.

      • Jade raises what I think is an important distinction. We don’t want to fall into the lazy conclusion that everyone just needs a “thick skin”, which can seem like giving a free pass to aggression and personal attacks. More formal training and insistence on polite, constructive feedback also seems like a good path to easing the fatigue: it feels better to both give and receive such judgements, and so should be less mentally taxing on all.

      • “it feels better to both give and receive such judgements, and so should be less mentally taxing on all.”

        Well, less taxing relative to receiving “judgments” in the form of personal attacks, certainly. But still taxing in an absolute sense, presumably? I mean, you both give and (I assume) receive mostly polite, constructive judgments. And yet you’re still fatigued.

        I agree completely that nobody should have to grow a “thick skin” to have a career in academic science. I also stand by the claim that everybody in academic science has to be comfortable with giving and receiving a lot of constructive (and often critical) feedback, and be ok with frequent negative judgments such as getting papers rejected, for the simple reason that, as you point out in another comment, more people want a career in academic science than can have one.

      • Still taxing, for sure, and I totally agree that giving and receiving criticism is an essential part of the job. But I guess even a small proportion of less constructive, more aggressive judgements can have an outsized impact on fatigue. And even giving our reviews a final proofread to see if they might be a little less snippy can help. Am. Nat. is good about fostering a culture like this, and I think people on both sides of the judgement probably come out feeling a bit less fatigued for it.

    • It’s interesting that the qualifiers terrified you. I made a major mistake in my choice of PhD program and one thing that I really disliked about the program I chose was that the qualifying exams were almost silly. My comps lasted an hour. All the questions were simple. I thought it was a joke. The Dept Chair’s final question was apparently an attempt at some sort of reverse psychology trap. It was so easy I had to take an extra minute to think it through and make absolutely sure it was as easy as it appeared to be. When I decided it was, I spoke the one-word answer and he clapped his hands together and said “Excellent!” And that was the end of it.

  13. Wow. I had no idea that reviewers were judged!

    I assume the main dimensions are (1) whether the review was on time (2) whether the reviewer offered constructive and reasonably thorough feedback and (3) whether the reviewer was polite, yet honest. Do I have that right?

    • This post actually generated some discussion from the Am. Nat. editorial office about what exactly this is for (kudos to them for promoting transparency). At present it’s just a 5-point scale on the value of the review, but in the future there might be a verbal reminder about judging – as you say – things like how thorough and constructive the review was. Timeliness is not part of it, since that’s recorded in the system automatically, simply because everything has a time stamp. The main purpose is to help other/future AEs decide on reviewers and to help Editors decide on nominations of future AEs. Also, it’s optional, so not all reviews are judged (not sure what the % is).

  14. Via Twitter (short thread, click through for the whole thing):

  15. If there were less competition and more collaboration, perhaps there would be less judgment. In my teaching, I am increasingly dissatisfied with the fact that I have to assess students. I want to teach them something, not judge them, and ideally they should assess themselves…

    • I like that as a Utopian vision, and the idea extends to much earlier in life, as I am reminded frequently via children in primary/secondary school studying “for the test” instead of in order to learn. But in a world where more people would like to do certain things than is possible (e.g., admission to a given study program, getting a certain job), some degree of competition is unavoidable, as best I can tell. Judgements (not done by oneself) are thus hard to avoid.

    • We’d all like to not have to judge anyone and have everyone be our friends. But there’s a reason we don’t operate that way, and it’s a good one: advancing people with the most merit produces the most benefit for society as a whole. Judging people is an extremely important part of every job. If you take pains to do it well, there’s no need for regret. It has to be done.

      And academics are hardly alone in judging people. Every person in every job has to judge other people. Everyone. When a supervisor hires someone, they judge the 10 or 50 or 500 other applicants less worthy. When a manager promotes someone, they judge the other people who wanted that promotion less worthy. When the time for layoffs comes, those judged less worthy go, and those judged more worthy stay.

      • I was wondering if someone would bring this up. How often are workers in other fields judged, and how often do they have to judge others?

        I suppose the answer varies hugely by field (and even individual firm?) and by seniority (if you’re on the bottom of the org chart, maybe you don’t have to judge anybody else).

      • I would say judging and being judged happens just as much in other fields. It just feels different because it is hierarchical. Typically there are one or two people judging you, and you (if a manager) are judging 5-10 people. Same people year in and year out usually.

        So yeah, everybody is judged. But I do think the peer governance/assessment model we have in academia feels different. And the formal occasions (e.g., where you have a deadline and have to write something) are much more frequent in academia.

        How being judged feels in a hierarchical business model depends a lot on who your boss is – it can feel very comfortable or terrible. We don’t have that kind of variance in academia.

      • Good points. I agree with the importance of “advancing people with the most merit”, it’s just that I don’t always find it easy to judge who has the most merit. Sometimes people/papers/grants really stand out to almost any judge as particularly meritorious (and vice versa), but very often the distinctions are subtle, and yet (as others have pointed out) we need to make yes-no or at least categorical decisions. I wouldn’t say I often have “regrets”, just doubts and fatigue, knowing that things could easily have come out differently. (As an aside, psychologists have shown that we are really good at tricking our minds into believing – post hoc – that whatever we decided was obviously the right decision [think about some major purchase you’ve made]. Maybe I should stop reading such things!)

      • Well, are you sure that “advancing people with the most merit produces the most benefit for society as a whole”? From all the studies I know of, societies with the best well-being and health indices are those with less economic and social inequality. And social mobility research mostly shows that meritocracy tends to replicate or even increase inequalities. Even if the goal is only to have a productive research output, meritocracy and concentration of resources do not seem to do the trick.

      • “meritocracy and resources concentration”

        I think it’s useful to keep those two concepts distinct in the context of scientific grant-giving. As an example, both the NSERC Discovery Grant system in Canada and the NSF grant system in the US are “meritocratic” in the sense that they try to figure out which proposals are better than which others. In the NSF system, only the very best X% of proposals get funded, where X is some fairly low number for many panels. But in the NSERC Discovery Grant system, something like 70% of proposals get some non-zero level of funding. And while the funding level of funded proposals is tied to their merit, most funded proposals end up in a fairly narrow range of roughly $25K-45K per year (there’s a long tail of “high fliers”, the rare outstanding grants from outstanding people that get significantly more money). So both systems try to be meritocratic, but in the NSERC system resources are less concentrated: the money-merit relationship isn’t as steep or nonlinear as in the NSF system.

        Personally, I prefer the NSERC system of not putting all the funding eggs in a few baskets. I think it makes more sense to spread money more widely because it’s hard to predict where the next big advances will come from, and because empirical evidence shows that the productivity of investigators with lots of money becomes increasingly limited by other factors (as your link points out). Those aren’t decisive arguments, though. And the cross-country comparative data that I’ve seen don’t suggest any obvious clear-cut answer as to whether an NSERC-style system is better or worse for overall scientific productivity than an NSF-style system. National scientific productivity depends on lots of factors besides whether national funding bodies give smaller grants to many investigators or larger grants to fewer investigators.

  16. Pingback: Reviewing my reviewing hours | Trees In Space

  17. Pingback: Friday links: planning as an act of hope, academic robes vs. robes, and more | Dynamic Ecology
