Does ecology need more criticism of the literature? If so how?

This post has evolved substantially over its writing. It started from a good post over on EEB and Flow by Marc Cadotte arguing that ecology needed a more robust culture of critique to weed out bad papers, and arguing that comments/critiques sent to the journals that published the original papers were an important way to do this. Despite strongly agreeing with the first part, I instinctively disagreed with the latter part. (And I have been thinking about critique letters a lot lately in my role as Editor-in-Chief at an ecology journal, just as Marc has.)* But unpacking why I don’t like critique letters has led to a lot of musings on how ecology works, how the human mind works, and my own answer to the specific question of how best to steer the field away if you see a bad paper. And just maybe along the way I stumbled on a strategy or two for killing zombie ideas!

Let’s start with the premise that there are bad papers that get through peer review and are published. This is undeniably true. And I am clearly on record as thinking ecology is not advancing as quickly as it could because of our inability or unwillingness to declare ideas decisively rejected and move on (e.g. “Is ecology idea free”, many posts on the need for prediction, not to mention Jeremy’s zombie ideas posts, most of which I agree with). There was even an “Ask us Anything” question on how to respond to bad papers that we once responded to.

On the more data-driven side, there is actually good (but depressing) bibliometric research on this. Banobi et al. identified 7 “prominent” examples where an original paper was wrong (in their case all making management-relevant claims) and was decisively rebutted. The original papers continued to be cited at very high rates after the rebuttals were published: 17x as often as the rebuttals. And 95% of the citations to the original papers remained positive, rarely acknowledging the rebuttal.

My favorite anecdotal example of how hard it is to kill off an idea is the broken stick model of species abundances, which Robert MacArthur published in 1957. In 1966 MacArthur pleaded “Let us hope these comments do not draw additional attention to what is now an obsolete approach to community ecology, which should be allowed to die a natural death.” In short, he pretty vehemently trashed his own theory as being obsolete. Yet 50 years later, you can still find new papers being published using the broken stick model (and the answer remains that it doesn’t work very well).

There are a lot of places to go with this devastating problem in ecology, ranging from questioning the whole culture, to pushing philosophy of science, to critiquing wishy-washy statistical inference methods, to arguing for a more predictive approach (you could argue Jeremy and I do little more than natter on about this problem!). Here I want to step away from the big picture and tackle this at a very small-grain, tractable level. If you see a “bad” paper, how best to respond?

First, I think you have to unpack what it means to be a “bad” paper. I can think of dozens of ways in which a paper can be bad (which I’ll define here as needing overturning).  A few are:

  1. Outright undeniably wrong
    1. Fraud
    2. Honest mistake (usually a data error, coding error or math error)
  2. Not outright wrong, but not good
    1. Of limited generality (e.g. worked in one organism or site but not repeated in most other studies done afterwards)
    2. Assumptions that turn out not to be true (but weren’t known untrue when the paper was written)
    3. Statistics are “flawed”
    4. Paper doesn’t fully achieve goals
    5. Paper overinterprets results
    6. Assumptions too simplistic
    7. Vague, hand-wavy models
    8. etc., etc.

So, what should happen if a paper is bad? I would argue pretty strongly that it depends on which category of bad. If it is in group 1 (fraud or honest mistake), then a retraction or correction should be published. Period. Highlighting the error in a critique letter and then (usually) having a rebuttal by the author does not serve science well. It also does not follow the COPE (Committee on Publication Ethics) guidelines that most ecology journals subscribe to and are bound to follow. Something stronger is needed (preferably with the authors’ own agreement). All journals have corrections or corrigenda. They also all have retractions. There is also increasingly a category known as “statement of concern”, published by the editors when the author hasn’t come along. All of these are stronger than a critique letter in a very important way: somebody who, a year or two later, cites the original paper without acknowledging a corrigendum, a retraction or a statement of concern is committing malpractice and will likely be criticized by reviewers and required to change. You can’t say the same thing if the paper just has a critique letter about it. So I think category 1 bad papers need something stronger than a critique letter. That said, we should never lose the distinction between a 1A paper (fraud), which deserves all the reprobation it gets, and a 1B paper (honest mistake), for which authors should be praised for correcting.

Category 2 papers are much harder. And the vast majority of “bad” papers are in this category. What exactly should be the goal if there is a category 2 “bad” paper? Such papers are kind-of-sort-of wrong but not black-and-white wrong. They’re in a grey zone. A person is not crazy to keep citing the original paper (but nor is a person crazy for disagreeing with it). A lot depends on what you took away from the paper and how you cite it (because some parts of category 2 papers are valid and some parts are wrong). Sousa’s classic paper on the IDH (intermediate disturbance hypothesis) clearly got the basic empirical existence of the pattern (and its attribution to a form of disturbance) in his system right, but got his conclusions about mechanism (“delaying competitive exclusion”) wrong. If a paper is not wrong enough to be corrected or retracted, you are inherently in the zone of opinion, where people can legitimately read the situation differently. Like most of the zombie ideas Jeremy talks about, the zombie idea of the intermediate disturbance hypothesis can be decisively rejected in one specific form (the mechanism of “delaying competitive exclusion”) but not in some other forms.

So what should you do if you are bugged by a Category 2 bad paper? What is the best route for science to advance? Well, obviously you can just ignore it and let somebody else take it on. That can be a very valid choice, but for science to advance, somebody has to take it on, and let’s assume for the moment you feel called to take on paper X. What then?

I can think of three approaches**:

  1. Write the editor to prove it is so wrong it belongs in category 1 and should be retracted/corrected
  2. Write a “letter to the editor” style critique letter
  3. Identify the root issues (i.e. 2A vs 2B vs 2C etc in the list above) and address them in a full length opinion or research paper

#1 is a very high bar and should be. To retract a paper you need absolute evidence. But go for it if you think you can do it. By definition, though, you can’t do that for the much more common case of a category 2 paper. So really it comes down to #2 vs #3. I’m going to argue that #3 (stand-alone paper) is a way more constructive option than #2 (critique letter).

Can you name a critique letter that changed your mind on a paper? I can’t. The debates on null models and on Diamond’s checkerboard island patterns were amusing and somewhat enlightening to read, but their length (stretched over years, and still debated to this day) pretty much proves the point that they never resolved the debate.

I think there are good reasons why critique letters often aren’t impactful and decisive:

  1. Critique letters and their rebuttals have a real tendency to turn into two trains passing in the night that never talk to each other.
  2. By definition critique letters are very short and not a place to bring to bear extensive new data or ideas.
  3. They must be written entirely in reaction to the original paper, not heading off into a new synthesis or direction, nor usually even responding to a suite of papers.
  4. They typically only critique a subset of the paper, which means most people will walk away remembering that the original paper is partly right and partly wrong.
  5. The bar for novelty and importance is often lower than the bar for regular papers.

All of these reasons may in part explain my experience, as both an editor and an author of papers with critique letters submitted about them, that a fairly high fraction (75%?) of all critique letters are rather weak. Many raise rather debatable points of analysis (e.g. data is maybe not normal enough so a generalized linear model is needed which only slightly changes the conclusions, but data doesn’t fit the proposed alternative distribution very well either – and yes this is a recent example for a paper I am a coauthor on) or concerns that the discussion overstates the conclusions compared to what is justified by the data (something the reader can form their own opinion on without a critique letter). Moreover, a good fraction of the critique letters submitted to a high-profile journal like Science or Nature strike me as largely opportunistic (i.e. based on the belief, mistaken in my experience, that it is easier to get a critique letter into Nature or Science than an original research article). So, in short, I haven’t seen a lot of really important, decisive critique letters.

On the other hand, I think full-length stand-alone papers may not stop the original paper from getting cited (as Banobi et al showed pretty clearly) which is a cultural flaw in ecology. But I do think they can very much influence the debate and overall direction of the field. I think this is true for a number of reasons:

  1. The bar on importance and novelty is higher for a stand-alone paper which I think encourages more effort and filters for truly compelling work
  2. The higher word counts allow one to more carefully build a case, bring in simulations and models, etc.
  3. One can make a general point, not just react to a single paper and generality is good
  4. Most importantly, one can frame things in a positive, constructive, field-advancing way rather than a reactive, negative, tearing-down way.

I think that last point is vital. Advancing science, not reversing science, is more likely to be appreciated, remembered and influential for lots of reasons rooted deep in human psychology. Erasing an idea is impossible. If I instruct you not to think about clowns with balloons, it is not going to work very well. Moving people to a new idea works well. If I give you a vivid image of elephants on the Serengeti, you are going to forget about clowns with balloons even if you were intensely thinking about them a minute ago. And much as we might like to think scientists dwell in a more rationalistic domain, we don’t. People need a new idea to move on to, not a suggestion that an old idea should be thrown away. And this fits the full-length paper better than the critique letter.

Similarly, putting a new cohesive idea in the literature is much more likely to be remembered than pointing out that somebody else’s paper is partly wrong (which paper isn’t partly wrong …?). Critiquing another paper is effectively carving in bas relief – gouging away a few bits, but most of the paper will remain standing in people’s minds. Although the scientific method may be framed around falsification, the human mind works differently (for good reason – we should not eliminate an estimate of risk because it failed to materialize once). So if your goal is to overturn a paper, you should assume up front you will only overturn a piece of the paper.

There are plenty of examples where stand-alone papers had impact when critiques did not. Neutral theory received multiple criticisms of type 2B and 2F above (invalid assumptions) that had zero impact. Papers by myself, Dornelas et al, and Gilbert and Lechowicz, and several other early papers that made a methodological point that we needed to compare neutral theory predictions to alternatives (e.g. lognormal or environment vs dispersal), significantly tempered the way people thought about neutral theory. And, although I don’t like it, neutral theory hung around for several years despite these rejections until the idea of a niche-neutral continuum, expressed in several papers, came along to provide a viable alternative. Gilbert & Bennett’s paper on using variance partitioning between environment and space has been far more successful as a stand-alone comprehensive analysis than it would ever have been as a 1000 word critique of a single paper using this method (as Jeremy has pointed out). I hope that the advanced analysis of Grace et al 2016 will help us move on from the simplistic productivity-diversity idea. Most of these examples show that advancing our methodological understanding is useful, and that a new idea can displace an old one better than a critique can. But they also illustrate that it is hard to really kill something off until a clear (equally simple) alternative explanation is available.

Returning to Banobi’s work, two things struck me as I spent some time looking at their examples in detail. First, the rebuttals were always poorly cited in comparison to the exciting (but wrong) claims. But the stand-alone rebuttals that put a new, different answer out there were cited about 10x as often as the critique letters that were written directly in response to the original paper (many of the critiques were only cited 10 times when the main paper was cited 1000, but many of the stand-alone rebuttals were cited over 100 times). Second, although the rebuttals fared poorly in the metric of citations and in the metric of killing off the original paper, there is evidence they DID have an impact on the conversation. A detailed followup was done by Branch in 2013 of one case study in the Banobi paper (Worm et al’s 2006 paper on fisheries collapse by 2048). Banobi showed that many of the citations of the rejected claim (projected dire state of the fisheries in 2048) were either rather naive or the paper was being cited in a rather generic way (e.g. fisheries are declining so my research question is important). Specific citations of the specific 2048 claim were in fact largely prevented by the rebuttals, and the more sophisticated the authors, the less likely they were to repeat the specific claim that had been rebutted. So these paper-length independent rebuttals did influence the discussion materially. And the citations of the original paper were very often to parts of the paper that hadn’t been rebutted.

Is there any place for critique letters? Sure. But when they exist, the one thing I feel very strongly about is that stringing out the dialogue through a series of correspondences separated by months or years and hundreds of pages in the journals is not helping the reader very much. So it is incumbent on editors to get the conversation all published in one place, back-to-back. And I personally believe correspondence needs to rise to the same editorial standards of novelty and importance as original papers (which it rarely does).

To summarize, truly wrong papers need something stronger than a critique letter (i.e. a correction or retraction). Critique letters often seem to me to be indecisive (partly because the flaws in the original paper are rarely truly fatal to the entire paper and partly due to the limits of critique letters). Really strong critiques that are going to have a chance of changing the field are worth standing in their own right as a paper. And if you really want to kill off an idea, pay attention to human psychology and: 1) give the reader (and the field) a replacement, and 2) recognize critiques only invalidate parts of papers, not whole papers, in people’s minds. A stand-alone paper that presents a positive, coherent alternative is much more effective based on psychological principles.

What do you think? Does ecology need more rejections of published literature? How should this best be done? Do you read critique letters? Do you find stand-alone articles to be more decisive rebuttals?

* although they are in no way responsible for all the opinions in this blog post, I worked closely with the Deputy EiCs at GEB, Maria Dornelas and Richard Field, in developing our journal’s approach and my thinking benefited greatly from their thoughtfulness on the topic.

**or go blog about it. I am not touching blogs here, but I think blogs can actually be a rather effective tool for starting a conversation about the merits of a paper or idea in a format that is lightweight but rapid and with a permanent record.


20 thoughts on “Does ecology need more criticism of the literature? If so how?”

  1. Good points, but let’s not create a false dichotomy between approaches 2 (critique letter) and 3 (independent research addressing the fundamental issues). You can do both in the same paper; I currently have one paper under review doing just that, tapping several classic papers for evidence contrary to the recent authors’ claims.

    • True. From a journal editorial policy there is a black and white distinction. But as an author you’re absolutely right there is a continuous gradient between the two and it is probably important to be intentional and thoughtful about where on that gradient you are trying to be.

  2. Very interesting ideas here. The near-total emphasis on P-values in many papers creates a gray zone between Category 1 and Category 2, for me at least. I am thinking specifically of two papers, one published in Nature and one in PNAS, in which the entire paper is written around the existence of an effect evidenced by P < 0.05 but the p-values were computed from grossly inappropriate (and not debatable) models. So the published estimates are not wrong (actually in the Nature paper, the estimates were not even reported, only asterisks or N.S. for the effects, which were correlations) but the inference tests were meaningless. Is this Category 1 needing "correction" (these are not due to coding or math error) or Category 2? Sorry to not be more explicit on which papers…neither are ecology.

    • You and I both agree that this is probably highlighting the invalidity of hanging everything on p-values. But I think if you could decisively show the wrong model was used (not just a matter of judgement where different people can think different things) and it caused it to not be p0.05.

  3. Interesting thoughts.

    Re: critique letters and authors’ replies often talking past one another, that’s because many editors allow it, taking the view that each side in a debate ought to be allowed to make its case however it wishes, as in a courtroom. Is that the best model? Should editors oblige each side to address the other point by point?

    Re: no critique letter ever causing lots of people to totally reject a paper, wasn’t there a paper in Nature a number of years ago with Alan Grafen as one of the authors, showing that a prominent claim of a life history invariant in fish was a mathematical artifact and pretty much convincing everyone? Ok, it was a standalone paper rather than a critique letter, but as I recall the paper’s claims were purely negative. The only point was to show that the purported invariant did not exist–the paper didn’t show a better way to identify and study invariants. But perhaps I’m misremembering. (EDIT: ok, went ahead and looked it up. It’s Nee et al. 2005 Science: Cited 77 times according to Google Scholar. Drew a reply from Savage et al., cited 21 times in Google Scholar: Can’t really say any more, not knowing this literature, so nobody should trust my earlier “recollection” that Nee et al. had convinced everyone. Ric Charnov is a co-author on the Savage et al. reply and sometimes comments here, so perhaps he’ll chime in. As an aside, I’ll just note that I find this disagreement very interesting because both sides include people whose other work I know and think very highly of, and whom I think of as all being on basically the same wavelength scientifically.)

    In any case, I agree that it’s rare for a pure critique to change many minds. Much as some of us might wish it were otherwise! ( We’ve discussed various examples on the blog in the past (e.g.:

    I note with interest that you seem to agree with Marc that effective criticism of individual papers is something we need more of. I’m not so sure about that. I agree that we need more criticism of ideas in ecology, but I think the ideas that need criticizing are those that transcend individual papers. The IDH and the hump-backed model, for instance. Individual papers rarely make much of a difference to the direction of a field. And even though some rare individual papers make mistakes that end up getting widely propagated, you usually can’t identify which papers those will be until after the mistakes are widely propagated. And for the reasons you lay out, I don’t think it’s beneficial to the field on balance to publish many more criticisms of many more papers in the (futile) hope of preventing any mistakes from ever getting widely propagated.

    • Your last paragraph strikes me as important. I think I broadly agree with you that the grain size criticized should be a line of thought, not a paper. Which was part of why I argued that standalone papers are better than critique letters. But you put it into words much better than I did.

  4. Great post Brian and I think you raise some important issues here. Three points to make:

    1. We all know from reviewers’ comments that differentiating between a “good” and a “bad” paper is hugely subjective and very often down to personal opinions and biases. Any successful critique has got to go beyond this so it’s incumbent upon the critiquer to try to be as objective as possible (however difficult that can be).

    2. That said, your Category 2A of “bad” papers [“Of limited generality (e.g. worked in one organism or site but not repeated in most other studies done afterwards)”] are often the raw material for later meta-analyses or syntheses of current knowledge, so I wouldn’t refer to them as “bad” unless they were flawed in other ways.

    3. If you want an example of a critique paper that tried to give a positive spin to a disagreement, and move the field forward, and was backed up by a blog post, check out: As I say in the post, the journal in which the original paper was published refused to publish our critique, for reasons known only to itself.

    • I have a sneaking suspicion (anecdotal only) that top journals are more prone to let through and publish glancing criticisms of the papers they publish than truly effective criticisms

      And I agree – category 2A might not be quite the same severity as some of the other category 2 groups.

      • “I have a sneaking suspicion (anecdotal only) that top journals are more prone to let through and publish glancing criticisms of the papers they publish than truly effective criticisms”

        Really? How come?

      • Mostly having had several papers where I saw the critiques the journal chose to send to me to rebut vs critiques sent to me by their authors that I thought were stronger but never got sent to me by the journal. The one caveat is that the stronger critiques were usually later, so an alternative hypothesis is that journals let the first one through and then move on regardless of strength.

        When you get right down to it, there is a conflict of interest in a journal evaluating the critiques they’re going to publish of the papers already published in their own journal. I think journals handle this with high integrity and place the science first. But there is an innate COI.

      • “When you get right down to it, there is a conflict of interest in a journal evaluating the critiques they’re going to publish of the papers already published in their own journal. I think journals handle this with high integrity and place the science first.”

        Hmm…yeah, I see what you mean. But I think the key point there is that, in practice, journals really do handle this with high integrity, at least in ecology and evolution. Authors of papers and critiques may often disagree with journals’ decisions to publish or reject this or that critique. But I’ve never heard even a whisper that those decisions are dictated by the journal’s desire to protect its own reputation and defend papers they’ve already published.

  5. Then there’s the related issue of when, if ever, it’s ok for an author to ignore or gloss over previously-published critiques of his or her approach: Tough call.

    As an aside, it’s since come to my attention that the specific example of Smith et al. used in that post is not a good example of the point I was trying to make, for reasons I won’t get into. I think the general issue is still worth thinking about, though.

  6. Ok, just thought of a critique that really did shoot down a paper to pretty much everybody’s satisfaction (as far as I know!): the critiques of Bill Shipley’s Science paper trying to use MaxEnt to relate species’ abundances to their traits. The critiques showed that the only reason MaxEnt worked so well was because Shipley used so much trait information as “constraints” that basically the only distribution of abundance data consistent with those constraints was the original distribution. So he was basically just showing that the original data was identical to itself. And crucially, the analysis would’ve come out that way even if the trait data were random numbers having nothing to do with the determinants of species’ abundances. As I recall (and I may well not, it’s not really my field), Shipley later proposed a bootstrapping method in Ecology to try to fix this problem with his original Science paper, but I don’t know if that method actually works or if it’s been widely taken up.

  7. Good one Brian.

    You laid out several important considerations and your viewpoints on the issue very well. I consider the lack of criticism to be the single most important topic in all of science. There are all kinds of issues involved in this, even beyond those raised here, e.g. signal to noise ratio, efficiency, public trust and etc. But I’m not going to respond to your piece, but rather pro-actively write my own stand-alone take on it.🙂

  8. Life History Invariants, dimensionless life histories as a general approach, are alive and well, with the decade-old exchange of views in Science mentioned by Jeremy playing zero role in the trajectory. Having watched controversy in ecology as a participant [and from the bleachers] for 40 yrs [plant sexual selection, optimal foraging, dimensionless life histories, metabolic ecology, … to mention 4 of my historical interests], I conclude that criticism per se is vastly overrated [except for outright error correction … math errors, stat errors]. If anything alters the trajectory of a field, it’s showing a ‘better pathway to the truth’, with criticism of existing work a small add-on; creation, not criticism, is the way forward. So is data designed to actually test a theory; sometimes it takes a while to know just what a theory predicts, so early tests are not definitive.
    Of course highly political science, like conservation biology, probably obeys different rules, as shown by Greenpeace’s recent silly attack on Ray Hilborn.

  9. Late to the party (is it on purpose you’re not on EcoBloggers any more?), but still a few thoughts:

    1. I agree that we do need more discourse in ecology – as it is, ecological studies tend to stand quite isolated compared to other sciences, with comparatively few replications / direct follow-ups / etc. that would lead to a convergence of opinions in the community (I’m overgeneralizing of course, this does happen, but I would say less so than in physics or cell biology).

    2. A productive discourse, however, should go beyond criticism. Criticism of the works of others in the form of follow-up papers or comments is certainly valuable, but equally important are responses that add new ideas / perspectives to a paper, to get a discussion going. So, what we need are formats / forums that facilitate such a constructive discourse.

    3. I want to defend the comment / LTE format for that purpose. Yes, many comments are weak, tangential, or essentially self-advertising. However, that need not be so. Statistics journals have the nice tradition of inviting responses (often on the order of 5-10 people) to a paper that is to be published, with a rejoinder by the authors. I really enjoy this format, and it is often very insightful to get different opinions by different people on a subject. So, maybe the issue is more how we currently use comments (Damn it, this guy got a paper on my topic in Nature, now I want to say something as well), rather than the format itself.

    4. Apart from that, there is another important function of comments that becomes more important as methods increase in complexity – pointing out that methods in a paper are borderline wrong. Unfortunately, journals / authors will typically only retract or correct papers if things are black and white wrong, i.e. typos in the code, errors in the data, etc. If the analysis is formally correct, but arguably inappropriate (wrong method, wrong selection of variables), many people refuse to take action, with reference to the fact that the methods are clearly described, so everyone can make up their own mind about them. For example, we lately pointed out that the results of a paper hinged on using an inappropriate method to fit the models. I would say the conclusions of the original paper were massively questioned by that finding, but the authors didn’t correct the paper, so the comment was essential. I think it would be useful if people had the opportunity, also long after publication, to point out technical problems in papers. PeerJ has this system where you can add comments to a published paper. It would be nice to have this for all publications; this way you could also add links to newer papers that show a different result, or question a particular step. (@Jeremy: yes, I’m arguing for post-publication peer review😉)

    So, in conclusion: I’d be glad to have more full papers that engage in a productive discourse, but I also find comments / links in published papers essential, and actually, I would like to see much more of them, both the technical and more discursive type. And we don’t have to have these comments always at the same time a paper is published. We have the technology now that could overlay new comments / links on old papers, and I would find it very helpful to use that.
