A little while back I polled y’all on your views on who should get to be an author of a scientific paper, and on author contribution statements as a way of apportioning credit and responsibility for scientific papers. Here are the results on authorship, which I find quite interesting. There’s a fair bit of disagreement! I’ll post results on author contribution statements in a follow-up post.
We got 246 respondents. Not a random sample of any well-defined population, obviously, except maybe the population of “regular readers of this blog”. But a sufficiently large and diverse sample to be worth talking about, I think. Certainly a bigger and more representative sample than any one person’s own anecdotal experiences. Which is what I find interesting about these polls–contrasting the results with my own anecdotal experiences.
Respondents comprised a good mix of career stages: 25% grad students, 36% postdocs, 32% faculty, 6% professional non-academic scientists. Just a couple of respondents in other categories. That’s low on grad students compared to old reader surveys–maybe the students are all out in the field?
Respondents mostly do ecology (60%), or a mix of ecology and evolution (26%). They’re from around the world, and their distribution roughly matches where our pageviews come from, except that we were a bit low on Canadian respondents: 53% USA, 23% non-UK Europe, 8% UK, 7% Canada, the rest elsewhere. They mostly do basic research (39%), or a mix of basic and applied (48%). They come from a mix of departments: Biology (33%), EEB (22%), Nat Res/Wildlife/Forestry/Fisheries (18%), or other departments (21%) and non-academic employment (5%).
Main results: what contributions merit authorship?
One view of authorship is that you need to make an independent intellectual contribution to a paper to be considered an author. See, for instance, Ecology Letters’ authorship policy, which reads in part:
All authors are expected to make a substantial intellectual contribution to a manuscript. Honorary authorships are discouraged: merely providing facilities, obtaining funding or heading a research group that houses authors are not sufficient reasons for authorship, nor is provision of a small amount of data collected primarily for purposes other than those in the manuscript.
As another example, the Ecological Society of America journals oblige authors to adhere to the ESA Code of Ethics, which says that authors must make a “substantial contribution” to the paper, and says authorship may legitimately be claimed if researchers
- conceived the ideas or experimental design;
- participated actively in execution of the study;
- analyzed and interpreted the data; or
- wrote the manuscript.
Which of course rules out authorship based solely on provision of funding, facilities, or a small amount of data originally collected for another purpose. It also rules out authorship solely by virtue of being the PI of the lab in which the other authors worked.
Hence my first poll question, which asked readers whether various contributions–including those listed in the Ecology Letters authorship policy–are sufficient on their own for authorship. Here are the results (note that bar labels only summarize the contributions as they were described in the original poll):
Only 22% of respondents thought that none of the contributions I listed are sufficient for authorship. But the remaining respondents disagreed a lot on which of the listed contributions are sufficient on their own for authorship. Only one–providing a large amount of data originally collected for another purpose–got a majority, and not a massive one. I’m stunned that decent-sized minorities thought that collecting data as directed by others, providing a small amount of data originally collected for another purpose, and providing facilities or other resources were sufficient on their own for authorship.
Interestingly, responses were not nested. For instance, respondents who said making extensive comments on a draft ms is sufficient for authorship were not a strict subset of those who said providing lots of data originally collected for another purpose is sufficient for authorship.
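For anyone who wants to check nestedness on their own poll data, here's a minimal sketch of what "nested" means here. It assumes each poll option is stored as the set of respondent IDs who ticked it. The function name `is_nested`, the option labels, and the respondent IDs are all made up for illustration; this is not the actual poll data or the analysis behind this post.

```python
from itertools import combinations


def is_nested(endorsers):
    """Return True if, for every pair of options, one option's set of
    endorsers is a subset of the other's (i.e. responses are fully nested)."""
    return all(a <= b or b <= a for a, b in combinations(endorsers.values(), 2))


# Hypothetical toy data: respondent IDs who said each contribution is sufficient.
toy_not_nested = {
    "large_dataset": {1, 2, 3, 4},
    "extensive_comments": {3, 4, 5},  # respondent 5 breaks the nesting
}
toy_nested = {
    "large_dataset": {1, 2, 3, 4},
    "extensive_comments": {3, 4},
}

print(is_nested(toy_not_nested))  # False
print(is_nested(toy_nested))      # True
```

In the first toy example, respondent 5 thinks extensive comments are sufficient but a large dataset is not, which is the kind of pattern that makes responses non-nested.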
One particularly striking feature of these results: a sizable majority of respondents think that, if you provide lots of data originally collected for another purpose, you should be an author. Whereas only a minority–about 24%–think you should be an author if all you do is collect the data as directed by others. Which seems really weird to me! So if you collect lots of data originally for some other purpose, that probably makes you an author, but if you collect lots of data originally for the purpose of the paper, that probably doesn’t make you an author? How does that make sense? Anyone else puzzled and bothered by this aspect of the results? Am I reading too much into them?
I’m also surprised that only 29% of respondents said that it was sufficient for authorship to be part of the working group that conducted the work. I had thought it was standard practice for working groups to agree up front that all participants will be co-authors on all of the group’s papers?
Looking at the crosstabs, there weren’t any strong predictors of respondents’ views on which, if any, of the listed contributions merit authorship. Faculty and students had similar mixes of views. So did Americans and non-Americans. Etc.
Only a minority of people give a crap about journal authorship policies
The results above imply that a majority of respondents disagree at least in part with the authorship policies of journals like Ecology Letters and the ESA journals. But of course, not all journals have authorship policies, and those that have them have different policies. For instance, Nature “does not prescribe the kinds of contributions that warrant authorship”, and BES journals have no authorship policy I could find. Further, journals don’t have any means of enforcing their authorship policies, because their only source of information as to who should be listed as an author is the authors themselves. So it’s perhaps unsurprising that respondents varied a lot in their awareness of and attitude towards journal authorship policies. Only 17% said they take account of journal policies when deciding authorship on their papers; 23% said they don’t. 28% said they ignore journal policies because those policies just codify what they’d do anyway. 24% were unaware if the journals to which they submit have policies, or were unaware of what the policies are. The remainder gave other responses.
As with the previous question, there were no strong predictors of respondents’ views on journal authorship policies. Surprisingly, students weren’t much more likely than anyone else to indicate ignorance of journal authorship policies.
Comparison with a previous survey of ecologists
Weltzin et al. 2006 Front Ecol Environ informally surveyed 57 attendees at the 2004 ESA meeting as to who should be an author in each of six hypothetical scenarios. They got widely varying responses to many of their scenarios. For instance, 25% of their respondents thought that, if persons X and Y develop and conduct a project building on ongoing work of colleague Z, to which Z contributes only by attending a few meetings at which analytical methods are discussed, Z should be an author.
I confess, it’s a little harder for me to get too worked up over disagreement about authorship order conventions now that I know there’s so much disagreement over what constitutes “authorship” in the first place!
I used to think that authorship contribution statements were an obvious if imperfect solution to this (since they don’t state the contributions of non-authors). But for more on that you’ll have to wait for part two of the poll results…
Looking forward to your comments, as always.
Perhaps part of the reason for an (I agree) odd set of results is that the realities of research mean that some of these criteria will be closely correlated rather than being stand-alone. For example, is it really likely that someone will have received funding for a project to which they later don’t provide advice and/or extensive comments on the manuscript? Likewise, heading a research group and obtaining funding means that funding agencies would expect to see outputs bearing the name of the person to whom the funding was awarded.
I’d have liked to have seen the category “provided taxonomic expertise” included within the list: my policy has always been to include specialist taxonomists as co-authors when they have identified/verified large numbers of specimens.
Yes, in retrospect “provided taxonomic expertise” would’ve been a good one to include.
Not sure if people not actually answering the question asked–is contribution X sufficient *on its own* for authorship–explains why the results are odd.
Thinking more about why most people think “providing lots of data originally collected for another purpose” merits authorship, whereas “collecting data as directed by a supervisor” does not…I think what’s going on is that we think of data as “really” having been collected by whoever supervised the data collection. So someone collecting the data as directed by others isn’t “really” the one collecting the data, at least for purposes of attributing authorship. Which makes sense, I think. Authorship is about responsibility and credit. If you tell others what data to collect and how, any errors in data collection are your fault (assuming others did as you told them). And you get the credit, because the data wouldn’t have been collected unless you’d told others to collect it.
Yeah. Here’s how I interpreted those two things:
1) I collected a ton of data for Purpose X and wrote a paper on it. Researcher A read that paper and thought, wow, the data behind this paper would be great to answer Hypothesis Y. Researcher A contacts me and asks to use my data; I agree, on condition of authorship, and hand it over. This data forms the basis of a paper written by Researcher A. (This policy is the (sorta) explicit one of the LTER network and others. It’s definitely the cultural norm in ecology. I’ve written (and you’ve read) about how this intersects with Open data http://ecologybits.com/index.php/2016/06/15/open-data-authorship-and-the-early-career-scientist/ )
2) I am a field tech hired by a PI to collect data. I do my job and get paid. There are a dozen of us. We don’t expect authorship on resulting papers, nor is it offered to us.
Yes, I imagine that’s more or less how many people interpreted those two options. And yes, it is the cultural norm. What I’m wondering is whether the cultural norm can be given a principled justification. If authorship really means “made intellectual contribution to the paper in which the data were used”, then neither the person who provides data originally collected for another purpose nor the field tech should be an author. Conversely, if authorship means “made a substantial contribution of some sort or other to the work reported in the paper”, then both the person who provides data originally collected for another purpose and the field tech should be authors. It’s not so easy to give a principled justification for making the former, but not the latter, an author. As I said in another comment, the best I can do is to suggest the principle that data “owners” are always authors, whether or not the data were originally collected for some other purpose.
Of course, if that’s indeed the principle, it still needs some justification. I’m not sure that “you get to be an author only if you haven’t already been compensated in some other way for providing the data, such as by being paid a salary and getting some field experience” is a great justification for the principle. But probably there’s some other justification for the principle that “data owners get to be authors”.
I should emphasize that I’m trying to understand and justify my own intuitions here. Probably should have said that in the post. Until I stopped to think about it, my own first instinct was to say that obviously someone who provides a lot of data originally collected for some other purpose should be an author, and that obviously someone who collects data as directed by others should not be. But it’s not so obvious…
Yup, agree entirely. I really struggle with authorship a lot. I am a data owner, and I am not really sure how to make a principled justification as to which papers I warrant authorship and which I don’t. (Just exchanged an email with a collaborator about this today. It comes up a lot. Great data set!)
I completely agree with Margaret on this comparison, this is exactly how I interpreted it. However, I am still struggling with the “on its own” part. There is no question for me that the data owner of a data set I use will be granted co-authorship (especially if it’s the only data used), but I would still expect the data owner to read and comment on the manuscript before submission. I would also want to read a manuscript which used my data before it is submitted. How else do you make sure that all co-authors agree with the content of the manuscript (which you have to confirm during the submission process at any of the journals I have submitted to)?
When does data “ownership” end? My understanding has always been you publish on data once, and then they belong to everyone. This is necessary for replication and meta-analysis/synthesis. Also, in the US, funding agencies pay for data collection most of the time. I believe there are stipulations that data must be provided if asked for (but I may be inaccurately remembering these policies).
Yes, de facto, data created on a U.S. federal grant must be given away freely. But that’s not what happens in real life. In real life, there’s a cultural expectation that data owners have some attachment to that data and should be properly compensated. That compensation is rarely just a citation to an existing paper. And major data networks (e.g. LTER) have policies that explicitly say that use of the data comes with the expectation of contacting the data owner and discussing the possibility of co-authorship.
Okay. So data ownership never ends?
Nope, not the way things are done these days.
I think there might have also been some confusion as to the amount of data collected. Does the person collecting the data under directions collect just a small part of the data (say, up to one fifth) or nearly all of it (say over 75%)? I’d say the first case does not warrant authorship but the second does.
Interesting. The view I “grew up with” is that, if you’re collecting data as directed by someone else, that’s a non-intellectual contribution and so doesn’t merit authorship, no matter what fraction of the data you collected. But yes, in retrospect I could have given two poll options on this, as with data collected by others.
This is the view I had seen the most as well, but I never quite agreed with it… First because there are decisions that have to be made by the person collecting the data, and measurement errors etc will be his/her responsibility, as well as the quality of data collected. For example, many animals may remain undetected without local field assistants, who may greatly improve the data collected. And second because I think that statistical analyses are often just as technical as data collection, but they often are deemed worthy of coauthorship.
But I’m not quite sure of either of my opinions. 🙂 And my academic experience is quite limited anyway.
Aside: that almost half of respondents think “providing extensive comments on a draft ms” is sufficient on its own to merit co-authorship is a measure of the amazing service that peer reviewers provide. As a reviewer, you provide extensive comments on draft mss–but without any expectation of co-authorship.
Likewise – as a member of MS and PhD committees for other students in my Department, I perhaps provide more extensive comments than when peer reviewing for a journal, or as a co-author with colleagues, yet we are rarely offered authorship. Sometimes the time I spend on graduate students’ work from other faculty is substantial, and borderline in my mind, in terms of whether authorship is warranted. Just a thought…
Or that the respondents (like myself) interpreted “extensive” as relative to what they would give as a reviewer. I used my lengthiest/most substantive review ever as my baseline as to what I considered extensive. So for example my average review of 2 pgs of single spaced comments, I considered “not extensive” for the purpose of that question. I’m curious how others interpreted this word. I bet there is huge variation in what people consider extensive comments. I actually didn’t answer that question affirmatively but I could see the rationale for it depending on what “extensive” means. Also if you write down equations and do some pen and paper analysis in your comments, is it still considered just “comments”? There is a lot left up to interpretation in some of these questions. This one is probably the least clear to me. Just my opinion.
“There is a lot left up to interpretation in some of these questions.”
Sure. That’s an argument for giving people scenarios instead, like Weltzin et al. 2006. The trade-off there is that covering as many bases as I did in this poll would require a *lot* of scenarios, and you’re not going to get many respondents if you ask people to read a lot of scenarios. Personally I think having more respondents makes for a more effective conversation starter, which is all that a blog poll like this is really for.
Sorry, didn’t mean that as a criticism of the poll. I’m aware of such trade-offs and absolutely agree with you that fewer options is better in this case. I was just pointing out that some would not consider reviewer comments as “extensive” compared to the comments given for papers they think they deserve to be added to.
No worries, no apologies needed, didn’t mean to sound like I was annoyed with you. Just taking the opportunity to explain my thinking in writing the poll as I did.
I wonder if the community standards have changed over time. We often see a system that rewards maximizing the number of published papers. Hiring, tenure, promotion may provide incentives to maximize the number of authors, and also lead to a quid pro quo. If I get to be an author on your cross site study, perhaps I will be more likely to expand the author list on my papers. That might explain, in part, the reluctance to include directed data collectors. They are less likely to write papers of their own.
Yes, I’m sure that one source of disagreement here is changing community standards. Community standards change in an uneven way, as some people leave the community and are replaced by others with different standards, and as different individuals change their standards (or not). So at any moment in time there’s always heterogeneity of standards within a changing community.
The results on provide vs. collect data make some sense to me when considering intellectual effort. If a scientist contributes a lot of data to a paper, I presume she was the brains behind getting it, even if it was originally for another purpose. That’s why she has the authority to share the data. In contrast, someone who collected a ton of data but didn’t contribute much intellectually would be considered a pair of hands, not an author. I didn’t see how the poll questions were phrased, but maybe the respondents were thinking along similar lines.
Yes, I was just thinking along the same lines re: providing vs. collecting data.
Though thinking about it further: who’s the “brains behind getting it” when the data were originally collected for another purpose? Isn’t it the person who recognizes that the data can be put to a new purpose, not the person who was the brains behind the data for the original purpose? I wonder if this line of thought explains why 35% of respondents *don’t* think that you should be an author just by virtue of providing lots of data originally collected for another purpose.
Or what about: I’m going to collect data on X for my own purposes, but it’s only marginally more work (or no extra work!) to ALSO collect data on Y and Z at the same time. And I know this data would be valuable to other researchers. It would be a lot of work/time/money to collect data on Y and Z separately, so it’s worth it to *science* (and possibly future collaborations, grant proposals, etc.) for me to collect this extra data even though I don’t have an immediate use for it.
“Or what about: I’m going to collect data on X for my own purposes, but it’s only marginally more work (or no extra work!) to ALSO collect data on Y and Z at the same time.”
Hmm, dunno. This is one of those things that probably lots of ecologists have experience with (?), but not weirdo microcosmologists like me. 🙂 Hopefully others will chime in.
I also responded that I would be inclusive regarding authorship towards scientists that collected lots of data. Rather than quid pro quo, which may be a little part of it, I guess it’s also just a bit of recognition and gratitude towards the fact that very useful data was collected. There’s a chance it wouldn’t have been or would diminish in the future if it isn’t valued enough, for instance with authorship. To me it’s exactly the same with a large amount of taxonomic advice. This is especially true for meta-data on a large research collaboration, that everyone needs but is hard to put in a paper in its own right.
“There’s a chance it wouldn’t have been or would diminish in the future if it isn’t valued enough”
Really? I hear lots of people make this argument, but I’ve never understood it.
A lot of the data that ends up in meta-analyses is data that was already published in a paper (perhaps a paper on another topic). So its value has *already* been recognized with a publication.
And has anyone in history ever collected data only in hopes of getting to provide those data to a future meta-analyst in exchange for co-authorship of the resulting meta-analysis paper? Put another way, it’s not as if the existence of meta-analyses on which data collectors might not be authors provides a *dis*incentive to collecting data.
I just don’t see how the existence of meta-analyses affects people’s incentives as to what data to collect or not collect *at all*, no matter who does or doesn’t get to co-author those meta-analytic papers.
To be clear, my comments here take no stance on whether collecting data that later gets used in a meta-analysis should make you an author of that meta-analysis.
Sorry, I didn’t mean for purpose of meta-analysis, in that case I would agree with you. I mean multi-scientist/group multi-year field experiments, where everyone has his/her own specific question. I am a soil ecologist, and then the bread-and-butter meta-data is things like soil chemical composition, soil-type, vegetation inventory, soil microbial biomass, you name it. These variables greatly benefit every single project but may not warrant a paper in their own right. For the sake of the commons it is very nice if they are measured, and authorship greatly lubricates this investment. I am not saying this should be automatic, but I do think without relatively lenient authorship morale it would be harder to toss around the hot potato of “who will measure pH?”. It’s perfectly alright if you disagree of course 🙂 But my suspicion is this is a very common discussion and solution.
Ah, sorry for the misunderstanding, apologies for going off on a tangent.
Now that I’m on the same page, I think you raise a very interesting issue. Never having worked on this sort of multi-scientist, multi-group project, I have no feel for their group dynamics. Looking forward to others hopefully chiming in on this.
As someone very new to this, I am curious about how it would be handled if journal authorship policies were more followed and authorship was discussed beforehand when papers are shopped about from one journal to another as different journals have different policies. Perhaps it is the number of rejections that has made it so that author policies are less important in view of other concerns?
“Perhaps it is the number of rejections that has made it so that author policies are less important in view of other concerns?”
No, I don’t think that authors have ever paid much attention to journal authorship policies. So I don’t think they’ve stopped paying attention to them because it’s inconvenient and awkward to have to reassess the author list every time a paper gets rejected and has to be resubmitted elsewhere. But you’re right that this is one obstacle (among many others) to getting authors to take journal authorship policies seriously.
RE: providing a large data set
Handing over data does not justify authorship in my view. That said, I doubt that many people actually just hand over data. My guess is that most folks want to know what their data will be used for and they will provide extensive feedback on a wide range of topics including biases in their data, appropriate methods of analysis, etc… And the original researcher may be handing over data that has been extensively proofed, formatted, standardized…I know that I would want to be an active part of the team deciding what to do with the data/questions asked before putting my name to it. And if the project wasn’t up to snuff, I would likely hand over the data and refuse authorship. I have done this once before.
I did this because I don’t feel that data I collect is mine. Public money paid for my education, pays my salary, and paid for the data collection (I have also collected data on my time and my dime and that is different). Public data sets are a growing trend (that’s my assumption anyway) for a reason.
I should also add that in most circumstances I would only hand over data after I had concluded the initial planned project and (hopefully) published.
But I know that many people feel differently about this: data they collect is their property (and some even extend that to the system they work on).
I interpreted it as worded but … as I commented to the original poll I think a confusion arises when we fail to distinguish what is “sufficient” to be included as an author from what accepting that invitation entails. Accepting to be an author is, or should be (?), also accepting an obligation to be involved in the framing, interpretation, writing and revisions etc. (and other aspects within reason and as called upon), so authors accept wider obligations.
Well, that’s certainly an ideal, though an ideal that’s not equally practical for every sort of project. In my own experience, adherence to it has been mixed. And I suspect it’s inevitable that adherence to that ideal will decay as authorship standards loosen and as collaborations become more extensive and involve more specialists each focusing on “their” bit of the project.
I think another interesting thing about the poll results is there’s a disconnect between what many of us know ‘should’ be worthy of authorship and what happens in practice. I’m a postdoc and most of the papers I’ve written as a student or postdoc had at least one author who provided funding, was a PI on the project that collected the data but didn’t contribute the specific ideas in the manuscript, provided use of facilities, etc. I couldn’t even bring myself to fill out the poll since it seemed absurd to fill it out in a way that contradicted my own practice. The reality is that authorship is free (aside from some mild dilution of credit), and it just doesn’t seem worth being stingy with authorship for people that serve as references or could provide future funding.
“there’s a disconnect between what many of us know ‘should’ be worthy of authorship and what happens in practice.”
Yeah, I bet that’s a fair bit of what’s going on. A mix of people voting as things actually work, and people voting as things should work.
One thing really surprised me, Jeremy. You wrote that “I’m also surprised that only 29% of respondents said that it was sufficient for authorship to be part of the working group that conducted the work. I had thought it was standard practice for working groups to agree up front that all participants will be co-authors on all of the group’s papers?”
Maybe you mean something very different with the term “working group” than I do, but for me “working group” is equal to all the people who work under one professor. Depending on the size of the “working group”, this includes a number of different projects which can be fairly unrelated to each other. So for me, it would be really strange to be a co-author on a paper which a fellow post-doc wrote within a different project than mine.
If you however define a “working group” as a group of people who work together on a project, i.e. plan the data sampling, analyze the data and discuss the results together, then authorship should be appropriate.
I imagine that the -in your view surprising- results are just due to a misunderstanding of language 🙂 In German, we use the word “Arbeitsgruppe” for all postdocs/PhDs/staff etc. who work for one professor, which is a direct translation of “working group”. Maybe you should check if there is a difference in answers between researchers from the US/UK and others, could be that people just use different terms for a lab/chair/working group in different countries 😉
I did indeed mean something very different by working group. I meant the sort of collaborative group supported and popularized by NCEAS.
Via Twitter, Elizabeth Borer registers her surprise:
All authors should, to my mind, be authors – i.e. writers. It is surprising how frequently people with only one or a few of your polled contribution types are coauthors without having read or contributed to writing a manuscript whatsoever. Perhaps this is more about author responsibility, than what warrants coauthorship, but it peeves me when coauthors don’t provide any comments on my manuscripts, and it peeves me when a paper’s coauthor cannot discuss the results in even minimal detail – which would result, for example, if coauthors contributed only data.
I don’t always provide extensive comments on drafts by co-authors if I’m happy with the draft (though if I’m not, I of course make edits). But I definitely read the draft very carefully. I need to be ok with every word of the ms if it’s going to have my name on it.