About Brian McGill

I am a macroecologist at the University of Maine. I study how human-caused global change (especially global warming and land cover change) affects communities, biodiversity and our global ecology.

25 years of ecology – what’s changed?

I am giving/gave a talk this morning at a Festschrift celebrating the 25th anniversary of the Graduate Ecology program at the Universidade Federal de Minas Gerais (UFMG), the large federal university in one of the larger states/cities in Brazil. So first, congratulations to the program and many thanks to the organizers (especially Marco Mello, Adriano Paglia and Geraldo Fernandes) for inviting and hosting me.

I was invited to give the talk based on my blogging, which is sort of a new trendy thing in ecology. So I foolishly offered to give a perspective on the past 25 years of ecology and what the next 25 years of ecology will contain, because I like to think about such things. But as I prepared my slides I got increasingly nervous, because these are topics no one person should claim expertise on!

However, I did come up with a couple of data-driven graphics that I thought readers might find interesting.

Publication trends

First I did some statistics on rates of publishing by country (using Web of Science, so biased toward English-language journals). I picked out the US, several European countries, and Brazil and China. What would you guess the trends are? First, the total # of papers published per decade is increasing at a phenomenal rate, so everybody is publishing more. But as a percent of published papers, most European countries are holding steady (although some countries like Germany started to publish in English later than others like Sweden, so they show a big increase in the 1980s or 1990s), the US is slowly declining, and China and Brazil are increasing rapidly.
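For anyone who wants to try this kind of tabulation themselves, here is a minimal sketch in Python/pandas. It is not the exact script behind the figures: it assumes a simplified, hypothetical export with one row per paper and "Year" and "Country" columns, whereas a real Web of Science export needs more parsing (e.g. splitting multi-author address fields).

```python
# Not the exact script behind the figures - a minimal sketch assuming a simplified,
# hypothetical export with one row per paper and "Year" and "Country" columns.
import pandas as pd

papers = pd.read_csv("wos_ecology_export.csv")   # hypothetical file name
papers["Decade"] = (papers["Year"] // 10) * 10   # e.g. 1987 -> 1980

totals = papers.groupby("Decade").size()                             # papers per decade
counts = papers.groupby(["Decade", "Country"]).size().unstack(fill_value=0)
shares = counts.div(totals, axis=0) * 100                            # % of that decade's papers

print(totals)
print(shares[["USA", "Germany", "Sweden", "Brazil", "China"]].round(1))
```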

Total ecology papers published per decade

 

According to Web of Science which is English journal-biased. RoW is rest of world.


 

Research topic trends

Second, and more interesting to me, I made a Wordle from the titles of the top 200 cited papers in 1989 and the top 200 cited papers in 2012 (yes, it is 2014, but I found I had to go back to 2012 for citation counts that had settled down enough to identify the truly top papers rather than just the ones published in January).
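If you want to make your own, something similar can be generated outside the Wordle website too. Here is a rough sketch using the Python "wordcloud" package as a stand-in; the input file of titles (one per line for a year's top 200 cited papers) is a hypothetical export, and Wordle's exact weighting and layout will of course differ.

```python
# A rough stand-in for the Wordle website: build a cloud from paper titles with the
# Python "wordcloud" package. The input file is a hypothetical export of titles.
from wordcloud import WordCloud, STOPWORDS

with open("top200_titles_1989.txt") as f:        # hypothetical file name
    text = f.read().lower()

stopwords = STOPWORDS | {"effects", "using"}     # drop filler words as desired
cloud = WordCloud(width=1200, height=800, background_color="white",
                  stopwords=stopwords).generate(text)
cloud.to_file("wordcloud_1989.png")
```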

Here are the two Wordles. First, for 1989:

 

Word cloud for titles of top 200 cited papers in 1989 (click for a full size image)


And 2012:


Top 200 for 2012 (click for full size image)

There are some obvious differences. But before I comment, I am curious to see what you all see (that is the point of a word cloud, after all). I hope you will share your thoughts on what has or has not changed in 25 years (OK, 23). I'll try to add my thoughts in the comments after others have had a chance at analysis.

 

PS – if you're curious you can download my slides for the talk from figshare. The first 1/3 matches what you read above. The last 2/3 mostly matches themes I've hit on before here in my posts on DE, although students might enjoy the next-to-last slide on advice to students.

 

Poll: What should a community ecology class cover?

This fall I will be teaching a graduate-level community ecology class for the first time. Most people would say that community ecology is one of the five or so main subdisciplines of ecology along with physiological ecology, population ecology, ecosystem ecology and maybe behavioral ecology.

In the 1970s community ecology was an "in" field. Then in the 1980s and 1990s, my perspective is that community ecology became passé. I started graduate school in 1997, and I well remember how all my graduate student peers would say things like "I study species interactions" rather than use the phrase "community ecology". Now community ecology feels very much like a reinvigorated, "cool" field again, but in part because the lines have blurred with topics like macroecology and global change ecology.

So it has been an interesting exercise for me to think through what exactly should be covered in a community ecology class. It's a bit of a definitional exercise in pinning down what I think community ecology is today. There is definitely more than enough material to fill a semester these days, so choices must be made. There are two great textbooks on community ecology, by Mittelbach and by Morin (both reviewed by Jeremy). So I can look at the tables of contents there, but there are some noticeable differences from the choices I will make.

So I thought it would be fun to take a reader survey to see what topics people think belong in an early graduate (e.g. first-year graduate student) community ecology class. There are 30+ topics. Each topic could easily take 1 week to cover (in fact each could easily be an entire semester seminar), and here at Maine we typically have a 15-week semester, so assuming we'll squeeze a few topics together, you can pick up to 20 topics (it would be no fun if you could check everything!). I'm sure there are other ways to organize/slice&dice these topics, but this is a reasonable approximation. What would you prioritize in a community ecology class? What are your top 20 priorities for an introductory graduate-level community ecology class? Take our poll (NB: I have NOT randomized the order presented, to keep related topics close to each other, but please make sure you read to the end and don't just bias towards the first things you see):

 

Four rules for long distance collaborations

One trend in ecology, science, and life generally is that we increasingly do work with people who are not physically in the same location.

Some examples of collaborating remotely that are part of my academic life include:

  1. Students in different locations – some of these have been my fault (i.e. I moved to a new university and left behind students I was supervising and needed to find a way to continue supervising). Some have been driven by the student's circumstances (often involving spousal or SO constraints). (See Yi Han's post on Yvonne Buckley's website for another discussion of remotely advising students)
  2. Working groups – although the whole point of a working group is to get people together in one place, working groups invariably demand working remotely a good chunk of the time too. I have a post planned for the near future on how to make a successful working group, but one piece is certainly just the generic problem of collaborating remotely.
  3. Collaborations assembled for reasons of complementary expertise among people in different locations to do research. One of my best and most productive collaborations right now is with two people in Scotland and one in Vermont. Aside from student/adviser-type papers, it is getting increasingly rare these days to see multi-author papers where all the authors are at the same university or in the same city.

It is claimed that technology makes us "one world". I'm pretty sure this is overhype on the part of the technologists :-) But it is true that Skype and equivalents, Dropbox and equivalents, Google Docs and equivalents, etc. have made things possible that weren't possible even in the days of telephone and email. Although even there, I remember a project 20 years ago where a co-worker and I were porting a complex (1 million lines of code) product to a new operating system (Windows NT, to date myself). I was in London, he was in Boston, but it was extremely efficient. Just as I was finishing for the day I would email him where I was, and he could pick up just as his day was starting, and all we needed was email and web-based source code management (and extremely rarely a telephone). But he was a close friend that I had worked with for years – we could practically anticipate each other's next move.

Which brings me to what I think is the most important aspect of long distance collaborations. The technology has changed. But the social challenges have not changed and remain huge. Indeed, if I had to boil down my rules for long distance collaborations to just one sentence it would be "Humans are still primates". The social dynamics are extremely important and should not be ignored under the illusion that a collaboration is merely an intellectual exchange of ideas easily handled by passive technology. Making a long-distance collaboration work requires VERY ACTIVE attention to social maintenance. I can guarantee you things will sour quickly if this is ignored.

So although the following four rules are just elaborations on this point, here are my four rules of long distance collaboration:

  1. They have to start with a significant component of face-to-face time. I don't think I've ever had a successful collaboration that began and remained primarily on Skype. Beginnings are delicate, critical times, and face-to-face meetings are the key to success in these delicate beginnings. This is built into working groups – indeed the raison d'être of places like NCEAS, sDiv, etc. is to make that quality face-to-face time at the beginning of a collaboration happen. This also applies to working with students remotely – I refuse to do it if we can't find a way to overlap in the same place for an extended period at the beginning (usually 1-2 years for PhD students, 3 months for postdocs). My successful collaborations on papers also involve people I already know in person from working groups, repeated discussions at ESA, etc. While talking science during this early face-to-face time is useful, what is really important is establishing a rapport. Eating together, socializing together, cracking jokes together. Taking an adventure together (be it to a scenic vista or a restaurant in a strange town). All these social trust-building activities are what matters most. Rationally minded people will scoff at this, but ignore it at your peril! Just be glad we're hairless primates and don't need to groom each other for lice to build social bonds!
  2. Schedule unstructured time – Beyond building social bonds and trust, another important feature of being in the same place is the occurrence of chance meetings and conversations that are not directed at a purpose. An obvious part of working remotely is talking by phone/Skype/email to move the project forward. But if you only have these goal-oriented discussions, things will not go as well. Thus it is important to schedule time where you are "just talking" and conversations can meander and go to new (and hopefully exciting and innovative) places. Such unstructured time also leaves room for the occasional joke, a "how are the kids?", etc., per #1. Being overly goal-oriented on the phone/Skype can kill a collaboration.
  3. Continue to make face-to-face meetings happen – Although #1 and #2 are the core ingredients, it is important in long-lasting collaborations to make sure that, even after #1 and #2 have happened, additional face-to-face, same-location time keeps happening. With remote students I try to make sure they spend at least a week per semester in the same building with me (and a month is better). With collaborators I try to get together at least once per year, sometimes only over dinner at ESA but often via multiple sessions of a working group or even travelling to meet (I just spent 3 days in a random hotel in the middle of generic suburban Connecticut as it was the best way to get four of us together).
  4. Make sure everybody has a quality work environment – This applies mostly to working with students or postdocs, but if they are not going to be in my lab, it is important that they have a productive work environment wherever they're located. Planning to work from home or from Starbucks is not a good idea. Students not in my lab all need to find a lab at a university where they are located so they have a desk and a weekly meeting with live, in-the-flesh people.

Those are my four core rules. I want to be clear that successful remote collaborations are relatively rare and hard. There are lots of studies that show that being next door is a lot better than being downstairs which is a lot better than being a couple of buildings over which is a lot better than being cross campus which is a lot better than being remote. As a physical setting, remote is at the bottom of the list. But there are times and circumstances when it can pay off (or where a collaboration you already invested in has to turn into a remote one). But in those times don’t kid yourself – you are starting the race behind and need to put extra energy into overcoming that deficit. Being highly proactive about #1-#4 in collaborations I care about is the formula I have learned over some successes and many failures (going all the way back into my business days).

What is your formula? Do you even participate in remote collaborations? If so what are the keys to making it work for you?

On the differences between natural resource and biology departments

Six weeks ago, in the comments on my post on research funding, several people noted that funding for TAs and RAs is different in natural resource departments than in ecology and evolutionary biology or biology departments. A reader, Steven Byrd, emailed me asking me to expand on the perceived differences, since he was about to make the switch, moving from a masters in a biology department to a PhD in a natural resource department. I myself have jumped this divide with nearly every move I've made – PhD in an EEB department, postdoc in Fish and Wildlife, tenure track at McGill in Biology, tenure track at Arizona in a School of Natural Resources. Since many people like myself and Steven cross this divide, or at least contemplate crossing it, at some point in their career, I thought it would be interesting to comment on the cultural differences I have observed and see what others think.

First a bit of background. This is specific to the US, but I know it is similar in Canada and believe it has parallels in Europe and Australia as well. Definitely curious to hear from our international readers. Most universities are organized into departments nested inside of colleges nested inside the university. Ecology is typically found in two locations. One is in an EEB or Biology department inside a College of Science (or, on a smaller campus, a College of Liberal Arts and Sciences). This college also has chemistry, physics and often some of atmospheric sciences, oceanography, geology, etc., and is focused on pure research without a focus on applications. The other is in the College of Agriculture, where there are usually departments like Wildlife and Fisheries, Forestry, often Soils, Crop Science, Range Management, Hydrology and some others that overlap with ecology, as well as things like plant sciences (plant breeding and pathology), animal husbandry, etc. The College of Ag is focused on applied questions, and at land grant universities in the US it is naturally where the agricultural extension agents are housed. The College of Ag is also where federal cooperative units with the USGS (which has a mission of biological inventory and survey) and the US Department of Agriculture are housed – the scientists in these units are employees of their respective federal agencies and are forbidden from teaching undergraduate classes, but otherwise are rather regular members of departments, doing research and having graduate students. On many campuses the forestry, wildlife, etc. departments have been shrinking and have been merged into unified "natural resource" departments. These departments have also been undergoing a major transformation in recent decades, from an emphasis on "hook and bullet" management of game animals for hunting and fishing to conservation of endangered species.

OK – so enough background. These departments all do ecology, but if you're contemplating a switch, what should you know about the differences between the Biology/Ecology and Evolutionary Biology/College of Science world and the Fish and Wildlife/Forestry/Natural Resources/College of Agriculture world? (From here on I will abbreviate this contrast as EEB vs NatRes.) The following are my own observations. They are general stereotypes based on the many departments I have visited and certainly do not apply to 100% of institutions, and in fact none of them apply to every place I've worked (and most of them don't apply to my current place at U Maine, which has several unique features with respect to this divide). But broadly speaking:

  • Research funding – EEB goes after NSF and maybe NASA or NIH. NatRes goes after USDA and an occasional NSF grant, but the majority comes from contract work for state and federal agencies (e.g. monitoring endangered species). As a result I think EEB tends to be a bit more boom-bust (and also divides people into haves and have-nots) while NatRes tends to be a bit more slow and steady.
  • Research topics – both sides are doing good ecology which is probably the most important point. But there are subtle differences. NatRes is more focused on collecting data and using sophisticated quantitative methods to make sense of the data. In EEB there is more of a split between pure field work and pure mathematical ecologists. In EEB there is also more of a focus on questions rather than information. Sometimes when I sit on NatRes committees I have to push students to ask questions that tie to theory (but many NatRes faculty are doing the same push), but sometimes when I sit on EEB committees I get bemused by how much handwaving there is about incorporating the latest trendy question (can you say phylo-spatial-functional trait coexistence?) without really thinking through the value of the work.
  • Reputational basis – evaluation for tenure, and more generally for reputation, is more multidimensional in NatRes. Papers and grants are still vitally important, but relationships with state and federal agencies, making a difference on the ground, outreach and education are all also important. EEB tends to be very one-dimensional on papers and grants. For these reasons the pressure levels might be slightly lower in NatRes (although no tenure track job on the planet is absent of stress). Certainly I think people in EEB are more likely to know and talk about their h-index.
  • Relationships between departments – in general EEB tends to think they do better science and look down on NatRes. NatRes tends to think EEBers have their heads in the clouds and are irrelevant. For the record, I’ve seen places where from an objective outside view, NatRes is clearly the superior department and places where EEB is clearly the superior department and places where they’re both good, but they all still tend to adopt this attitude towards each other. Which is unfortunate, because despite the fact that in my opinion both groups are doing exactly what their mission mandates and there are enormous synergies, on most campuses these judgmental attitudes prevail and there is very little interaction between the two groups (and they are often physically separated by large distances).
  • Undergraduate curriculum – NatRes are training undergrads to get jobs in state and federal agencies. For students to be hired by these agencies, they must have taken a very specific set of courses, so the whole curriculum is built around these national requirements. EEB tends to teach a lot of service courses (e.g. introductory biology, neurobiology, plant taxonomy) taken by people all over campus. The majority of undergrads majoring in Biology want to go into medicine/health sciences.
  • Graduate trajectory – in NatRes most students stop after a masters (again targeting jobs in state and federal agencies or maybe an NGO). If you want to get a PhD you usually need a masters first, preferably from another institution. In EEB – most students are doing a PhD, often without having gotten a masters first. Traditionally EEB departments see their graduate program as primarily for creating new professors, although I do think they are increasingly embracing the role of training people for conservation work as well.
  • Graduate funding – in EEB it is a mix of RAships from NSF grants and lots of TAships (coming from the service courses). In NatRes TAships are few and hard to come by so it is mostly work on the contracts with state agencies and any USDA grants. The TAships in EEB help to counter the boom-bust nature of pursuing NSF funding (i.e. provide backups when funding goes dry), so it can be very hard to have students in a NatRes department if you primarily pursue federal funding and don’t have a steady stream of state/federal contracts.
  • Internal departmental culture – EEB is much more bottom-up governed while NatRes is much more top-down governed. Both groups have regular faculty meetings and votes. But in NatRes the opinion of the department chair (and often an executive committee of 4-5 senior faculty) counts a lot more heavily, and I've seen people suffer heavy consequences from getting on the bad side of a department chair much more often in NatRes – EEB is the stereotypical herding of cats, where everybody just shrugs their shoulders and expects some people to be prima donnas. Also I think it might be fair to say that the proportion of old white males is slightly higher in NatRes than EEB (although this is changing, and nowhere in ecology does particularly well on race). I don't know a nicer way to say this, but some (and only some) NatRes departments still have more of a "good-old-boy club" feel. Some EEB departments might have more of an elitist attitude.
  • Relationships between the colleges – almost invariably the College of Agriculture is the second richest and most powerful college on campus (after the college of medicine if one exists). They always have new buildings, money floating around for various initiatives, etc. Within the College of Agriculture, NatRes is usually near the bottom of the ladder. In contrast, while colleges of science are usually less powerful, EEB/Biology is often the biggest and richest department within the college (especially when it's a joint Biology department with EEB and molecular/cellular biology). So NatRes tends to be the little fish in the big pond, while EEB tends to be the big fish in the small pond. There are advantages to both – mostly depending on whether resources are being allocated at the university level (e.g. buildings, which favors the College of Ag) or at the within-college level (e.g. various travel awards to students, which can tend to favor EEB).
  • Interior decorating – by far the most important distinction is what the hallways look like! EEB departments tend to be in generic university drab, with perhaps a glass display case of books by the faculty or maybe something out of the collections. NatRes departments have large stuffed mammals, often a bear, mounted upright in the wildlife half and gorgeous solid wood paneling in the forestry half.

Those are the differences that jump most immediately to my mind. As already stated, they are sweeping stereotypes and the landscape will differ in individual units. My only goal here is to provide a "quick reference" for people contemplating the switch. Overall, I find it highly regrettable that these cultural differences exist and that people don't work together better between these units. We are all doing ecology after all. And it makes me really appreciate the structure here at U Maine, where all of the biological sciences (from EEB to nursing and food sciences to forestry) are in one college – effectively a college of biology. More universities should move in this direction. Maine is also a place where people aren't very hung up on the basic-applied distinction – something else I wish more universities would foster.

I fear that somebody will get annoyed by my putting this down in black and white, but my intention is to help people new to the issues. Keep in mind that these are only approximately true, and that I love – repeat, love – my time spent in both types of units on multiple campuses. I nearly always end up finding a way to have cross appointments or whatnot so as to effectively end up in the middle between the two, which is where I am happiest.

What are your observations about the similarities and differences across the “divide” (which shouldn’t be as big a divide as it is)? How does this generalize in other countries? What about people at private universities or undergraduate education-focused universities in the US – which culture matches better to what you experience?

How to write a great journal article – act like a fiction author

There are a number of good posts out there on how to write a good journal article and even a whole blog devoted to the topic; many of them are linked to in the comments section of my post on writing style.

Here I want to elevate above the nuts-and-bolts sentence-level-detail of my post on writing style* and even elevate above the aforementioned posts that break down different sections of a paper and zoom out to 100,000 feet and think really strategically about writing a paper.

In my experience as a student committee member and as an associate editor for three journals, I must have seen many 100s if not at this point 1000s of pre-publication articles. And they are varied. But many of them are by already-good writers, in the sense of clear, fluid English and a solid understanding of the purpose of each of the four sections (intro, methods, results, discussion). But many (most?) of these are still missing something. Something which I think is the hardest thing to learn: to think about the paper as a cohesive unit.

Think about an artistic painting. For the artist, it is made up of 100s or 1000s of individual brush strokes, each one of which requires skill and artistry. And of course a painting typically has a few key objects – a building, a lake, a person – and the strokes have to render those convincingly. But the reason an artist makes a painting, and the reason we hang paintings in the Louvre and visit them by the millions, is none of those things. It is the overall gestalt effect – the message, the emotional impact. The whole is MUCH greater than the sum of the parts in a great piece of art.

It is no different with a paper. A day after reading it, you don't remember well-crafted sentences or a really clear introduction – you just have an overall gestalt. With an academic paper this gestalt usually includes a one-sentence summary of the factual content of the paper (and yes, it really is only one sentence). But it also includes the emotions and judgments hanging on that one sentence. Is it convincing or weak? Is it elegant? Clever? Surprising? Ultimately, much of the emotional gestalt we take from a paper is: was it convincing? Do I trust the author? It is my experience that first-time writers and even many more experienced writers are so caught up in the mechanics (the sentences and sections, in analogy to the brush strokes and objects in the painting) that they never think about the overall gestalt. And as a result the gestalt is rather poor. Which, fairly or not, reflects on the results of the paper. This of course is what distinguishes an art school student (working on mastering the details) from a great artist. And it is what distinguishes a publishable paper from a great paper, one that is remembered, one that has impact, and, dare we dream, a paper that will achieve the analog of being hung in the Louvre (whatever that might be – and no, it's not getting published in Science or Nature).

My main piece of advice will sound like it is tongue-in-cheek but it is in fact straight-up serious advice. Think and work like a fiction author! Wikipedia says that the main ingredients of fiction writing are: Character, Plot, Setting, Theme and Style. I'm sure there is debate, but these sound a lot like what I learned in high school, and I'm going to go with these. Notice that these are all unifying elements – they are things that cut across the introduction, middle, and ending/resolution of a fiction story. In short, they are what give the gestalt.

Let me address each of these in a little more detail as they relate to non-fiction, scholarly article writing:

  • Character – in fiction the characters need to be richly drawn to draw you into the story and make you care enough to keep reading and to remember them. The characters in a journal article are the questions you are asking. Introduce us to them. Spend a little time fleshing out their nuances. This is not achieved by a dump of literature citations, although that is a piece of it. You need to sound excited by your questions (which means you need to know what they are!). And you need to make them 3-D. And you need to dwell on them lovingly. None of this, by the way, means that you should write a long introduction any more than you should spend half your book introducing the characters. Just as in the best fiction, the characters (questions) should be introduced deftly and crisply, which requires work.
  • Theme – the take home message. In fiction it is a moral, or perhaps an emotion. In a journal article it is the one sentence take home message. You may think I’m joking, but most people really will take away only a single sentence summary of the paper, so you better know what you want it to be before you start writing. “Figuring it out as you write” is a terrible approach. Your paper will sound disjointed and like you didn’t know what your theme was before you started. So figure out your one sentence BEFORE you start writing. I am known in my lab group for mercilessly asking a student who is at the writing stage of a paper “what is your one sentence?”. I ask them before they start the presentation. I ask them immediately at the end of the presentation. And I ask them several more times during the discussion with the lab. It might seem impossible, but it is actually very achievable – it just requires setting this as an explicit task and spending some time (usually interactive with other people) to achieve it. It is a sine qua non for a paper that has a good gestalt. How can a fiction writer construct plot/story arc, characters, setting to all build towards a powerful theme if they don’t know what it is? No different in non-fiction.
  • Plot – a good piece of fiction has a clear sense of movement. It starts one place, gives a sense of motion at any point you are reading, and then you end up somewhere new. It's a big part of why people keep reading to the end. I call this the story arc. And the story arc is the thing that I find most often missing in journal articles. You need to take the reader along a very clear trajectory from question to conclusion. Just having the standard four sections is nowhere near enough. So many papers organized by the four sections still sound like a dump of everything you ever thought or did in connection to the paper. You need to work hard on story arc to make sure everything in the paper is pulling towards that one arc. This is why figuring out your one sentence before you write is so important. It lets you know what is superfluous and unnecessary so you can trim it away (most good writers will tell you that half the battle is knowing what to delete).
  • Setting – the place and culture in which things happen. In field experiments or observations this is pretty simple. Just as I cannot begin to fully understand or relate to a character unless I know their context, I won't really care if p<0.05** unless I can visualize the whole experiment in my mind. Almost everybody tells me that they used a 1m x 1m quadrat (or whatever their sample unit was), but many fail to tell me whether their replicates are 5m apart or 1km apart. Or whether they're on the same topography or randomized, surrounded by the same vegetation, etc. A well-drawn, information-packed diagram of the layout is something I often find myself requesting as a reviewer or editor.
  • Style – this is a broad category that covers everything from writing dialogue to what voice is used – but it is ultimately the techniques. The brush strokes. And for a non-fiction article, it is the clear writing I posted on last year.

My bottom line is this. Every word, every sentence, every paragraph, every section of the paper should be working together, like a well-synchronized team of rowers all pulling towards one common goal. The introduction should introduce the questions in a way that gives them emotional pull and leaves us desperate to know the answer. The methods and results should be a page-turning path towards the answer. And the discussion should be your chance to remind the reader of the story arc you have taken them on and draw sweeping conclusions from it. Any freeloading sentence or paragraph that pulls in a different direction should be mercilessly jettisoned (or at least pushed to supplemental material). Does this sound like a novel you would want to read? Yes, it does, and it probably sounds like a journal article you would want to read too.

I wish more people saw themselves as needing to use the skills of a story teller when they write a journal article. I of course don’t mean the connotations of dissembling or making things up that the word “story” carries. But I do mean the art of story-telling that knows where it is going and does it crisply so that it sucks us in and carries us along with just the right amount of time spent on details of character and setting. Where the characters (questions), the plot (story arc), the setting, the theme (the one sentence take home message) all work together to make a cohesive whole that is greater than the sum of the parts. Like anything in writing, you can do it if you work at it, but you do have to work at it (writing is not a gift handed to you by the gods)***. So go ahead, turn your next manuscript into a cohesive whole with great characters and a compelling story arc that leaves us deeply moved.

UPDATE, 22 June 2014: Comments on this post are now closed. This post was highlighted on "Freshly Pressed". Which is flattering, but has led to dozens of people who wouldn't otherwise have seen our blog trying to make non-substantive comments in order to promote their own blogs. We may or may not reopen comments on this post in the future.


* (which I badly violated in this sentence by stringing 5 nouns and more connective words in a row with no verb in sight and then running on for 45+ words in one sentence! – do as I say, not as I do :) )

**I probably won’t care about p<0.05 for a whole other set of statistical/philosophical reasons, but I leave that for another day!

*** just as an example of the messy, iterative process that writing is, which depends as much on the bas-relief process of what is removed as on what is added: I had a clear vision for this post – science writing should be more like fiction writing, with the same elements as a compelling story – which immediately led to a title and intro. Then when I started writing, I ended up with an outline that looked like

I – you have to know your main point

II – you should be like a fiction writer

IIa – character

IIb – plot

IIc – theme

etc

Well – I clearly had lost my way. While nothing I said was untrue or unimportant, I had bifurcated and complexified off my main theme. This is something I am very prone to do (as I think are most academics). So I deleted two whole paragraphs on I – you have to know what you want to write about – and then worked a much reduced version of it into the IIc theme section. Boom – back to a single story arc, a single sentence to remember, and a tighter, stronger piece. Not every edit is this easy, and this post could certainly benefit from more, but I hope it at least makes my point that you have to edit with a mentality of “does this add or distract from my main point” and be merciless if the latter.

Frogs jump? Researcher consensus on solutions for NSF's declining accept rates

Dynamic Ecology's readers have spoken in a clear voice! There is a clear consensus around what changes people favor to address the hopelessly declining grant award rates at NSF. In a post on Monday I described what I see as the long-term exogenous trends in our society (US specifically, but as commenters noted probably largely applicable globally) that affect NSF and that are putting NSF in a tight squeeze, leading to a current acceptance rate of 7.3% and every expectation that it will go much lower still. Basically, flat funding plus many pressures on researchers to apply for more grants (both more applications from old hands and pressure on many others to begin applying) leads to a trade-off in which the only variables NSF controls are the # of applications and the grant size in $.

I had a reader poll on what choices readers would like to see NSF adopt. To be clear, this poll is entirely unscientific in sample design – it's whoever reads the blog and answers. It is presumably mostly academic ecologists, and our readership skews early career and male (although how much more it does so than academic ecology in general is unknown), but beyond that I couldn't say what biases there are. There were 450 votes – I don't know how many voters, since each voter could vote up to 3 times and Polldaddy doesn't give me the details unless I pay them $200 (I know there are other choices – I'll probably use them next time, but Polldaddy is so convenient from inside WordPress). But to a first approximation there were presumably about 160-175 voters (some voters likely voted for only 1 or 2 choices). The results as of 11:30AM EST Wednesday (in my experience the vast majority of people who will read the post have read it by now) are:


Results of survey on solutions for declining accept rates at NSF. Note: since users could pick up to three choices, the 450 votes probably map to somewhat more than 150 voters, perhaps 160-175 total, each picking 1-3 choices.

Basically there are three groups of answers. In the first group, nearly everybody who voted was in favor of two changes:

  1. Reduce the average grant size from the current $500K to something more modest ($200K was the example in the poll). This would immediately increase accept rates by 2.5x (last year's 7.3% would have been 18.25% – see the quick arithmetic sketch just after this list). That's a pretty big difference. Several people noted cutting grant size would negatively affect graduate students (fewer RAships), faculty at institutions/departments without TAships, and postdocs. Presumably the choice for only a modest cut was partly driven by this. Personally I would take some of the money saved and put it directly into NSF predoc and postdoc fellowships (this money doesn't come with indirects and so is more efficient, and it also tips the balance of power to the students, which is desirable in my opinion).
  2. Limit the number of proposals by restricting each researcher to one grant in a fixed period. The example given in the main text was at most one grant per five-year period (once you've been awarded you cannot apply again). There are of course devilish details – do co-PIs count, do SKP count (senior key personnel = people whose CV is submitted but who draw no salary from the grant), etc.? And 5 years from award date or end of grant? etc. And while there is no perfect solution – nearly every solution will unfairly penalize some deserving person – there are certainly multiple good solutions, and this is not a reason not to implement it.
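To make the arithmetic behind #1 concrete, here is a tiny sketch (in Python, though it is really just multiplication). It assumes the budget identity from Monday's post (TotalBudget$ = NumberProposals * Accept% * GrantSize$) with the total budget and number of proposals held fixed, so the accept rate scales inversely with award size.

```python
# Tiny sketch of the accept-rate arithmetic, assuming the identity
# TotalBudget$ = NumberProposals * Accept% * GrantSize$ with the first two held fixed.
current_accept = 0.073      # last year's DEB accept rate
current_grant = 500_000     # roughly the current median total award

for new_grant in (200_000, 100_000, 50_000):
    new_accept = current_accept * current_grant / new_grant
    print(f"${new_grant:>7,} awards -> accept rate of about {new_accept:.1%}")

# $200,000 awards -> about 18.2% (18.25% exactly, as quoted in #1 above)
# $100,000 awards -> about 36.5% (the text below quotes 36.8%, presumably from unrounded inputs)
# $ 50,000 awards -> about 73.0%
```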

Again it is remarkable that nearly everybody who voted, voted for both of these options. These options together effectively amount to a vote to spread current funding around more widely. Also note that implementing #1 almost requires some version (possibly weaker than I proposed) of #2, or you will just compound the problem of more people submitting more applications to chase fewer dollars.

Three other choices were about evenly split. To a first approximation, almost everybody voted for the two choices above, and then split evenly among the following 3 choices with their 3rd vote. To wit:

  1. Reduce grant sizes even further to $50K (not the $200K from above). This would have allowed an acceptance rate of 73%. It would have also severely limited funding (after overhead it is about $35K, so roughly 3 months of summer salary, or 1 year of a PhD, or 1/2 year of a postdoc). My guess is that the thinking here is that these grants would mostly not be used for such things and instead would just cover the basics of fieldwork, travel to conferences, publishing papers, etc. In short, not so different from a Canadian NSERC Discovery grant. To me it is striking that across choices #1 and #3 reducing grant size got a combined 47% (recall 33% = everybody voted for it, if everybody used all 3 votes) – presumably a non-trivial number of people felt so strongly about this that they used 2 of their 3 choices to vote for reducing grant size.
  2. Limit number of proposals by only allowing "productive researchers" to submit – this of course raises the question of how you define a productive researcher. I threw out the example in the main text of 15 papers published in the last 5 years. Like #2 above, this will require an arbitrary definition that hurts some deserving people, but that alone is not a reason to avoid it – especially since once the rules are clear people can manage their lives around the rules (and one could imagine exemptions for early career researchers, special circumstances, etc). One reason to like this option is that studies have shown that past research success is one of the best predictors of future research success (better, for example, than panel evaluations of projects).
  3. Limit number of proposals by a lottery – Again, many details on how this would work. Is there a lottery to apply? Or just a lottery for the awards among those who applied? Or just a lottery among qualified scientists, however that is defined? Although the lottery seems absurd on the face of it, two recent studies cited in salient fact #1 of my original post suggest that, at least among those proposals ranked moderately high (the top 30% in the DEB case), panel scores were not that different from a lottery in predicting research outcomes. Presumably this is true for some of those that were just below the 30% cutoff and not true for the bottom 10-15%, with the line somewhere in between. Thus the lottery has the great virtue of calling a spade a spade and removing the stigma from losing in what already has a large component of lottery but is cloaked in the trappings of assessment.

Then there were two “no-hopers” – essentially nobody favored these choices:

  1. Business as usual – live with the low accept rates – this got only about 2% of votes (perhaps 5-6% of voters), meaning about 95% of voters oppose business as usual with ever-declining accept rates. In the metaphor of the original post, researchers are not frogs!! In the original post and comments a number of problems with very low accept rates (beyond the fact that they make life tough for researchers) were identified, including how they distort the selection process (more conservative, more clique-driven and of course more random), the waste of time writing 15-page proposals (at least 1 month of researcher time) for a 5% chance of success, etc.
  2. Limit proposals to certain career stages – this was the absolute least favorite choice. We academics are an egalitarian bunch. It also is not obvious that any one stage is inherently more productive.

I said in my original post I would wait to share my opinions until the poll results were in, to avoid driving the results. I'm sure my biases bled through in the last post and this one anyway, but hopefully not terribly. But personally, I agree with everybody else – I would be in favor of some combination of #1-#5 and opposed to #6 and #7. On cutting grant size, I of course presented arbitrary discrete options of $50K or $200K, but to me the optimum would probably be about $100K*. Over 3 years that gives $22K of direct costs per year. That's enough for field work (or computer equipment or whatnot for the field), travel to conferences, publishing fees and some consumables each year, with enough left over to give a bridge year to a student, a year to a postdoc, a year of tech time, etc. To make this viable, I would not put all of the savings into more grants (my $100K size gives an accept rate of 36.8% – I would aim for a 20-25% accept rate and put the rest into more fellowships given directly to PhD students and postdocs). The sublinear response of productivity/research outcomes to dollars input strongly argues we need to move down that curve to fewer dollars per researcher, where the slope of the curve, and hence the marginal research productivity bought per dollar spent, increases. By the same token, I think many feel, including me, that research dollars have gotten too concentrated in a few researchers' hands (but I know of no data on this). There are good arguments for concentrating (see my post on Shockley and lognormal productivity), but really, is a superstar lab with 18 students going to get more marginal value out of one more student than a very good lab that currently has 2-3 students? I doubt it.

I personally think #4 (limit by researcher quality) and #5 (limit by lottery) have more merit than people gave them credit for too, but they are more radical changes to the system.

It is worth noting that there is enormous consensus (at least among poll respondents) to reduce grant size non-trivially and put caps on the number of grants per researcher. And these are things that NSF could, if it wanted to, implement immediately. No Congress, no lengthy reform processes, etc. would be needed. A year or two of appropriate advance notice to researchers would be good. But beyond that, these changes are already within the purview of program officers to adjust budgets, and recall, as a commenter did, that a cap of at most 2 proposals per PI was put in place when the pre-proposals were introduced. It would probably require consensus across a unit to make the cap global and across multiple years, but that should be achievable. Finally, note that a single unit (say DEB, just for example…) could implement these as an experiment while the rest of NSF watched to see how it worked (this already happened/is happening with the pre-proposal process too). Presumably the main dynamics opposing these changes are just innate conservatism/keep-it-like-it-is and lobbying by the few-but-powerful who are getting large chunks of money under the current system (although I would be curious to know how many of them really think the current system is optimal).

I think more meta-research is needed too. Just what can panels successfully assess or not? Although Sam Scheiner disagreed with me in the comments on my last post, I know of very little evidence that panels can do much more than distinguish the very worst proposals from the rest (please give me citations if you think I'm wrong). If that is true we need to be scientists and deal with it, not avoid doing the research to find out because the current system is comfortable. Kudos to Sam and Lynnette for their paper. Similarly, the question of exactly how sublinear research productivity is with respect to grant dollars is vitally important but not yet very well resolved.

I have no idea what the next step is, but it seems to me that the long term trends and outlook are so extreme that something has to be done (only 5% favor business as usual). And there is such a strong consensus (nearly 100%, certainly *way* over 50%) on several concrete changes, which would have big impacts but would not require major restructuring, that I would be disappointed to see nothing change over the next 3 years.

Here’s hoping the community can work together to find a way to turn down the heat on the pot we’re all in!


* I am not unaware that different subdisciplines cost different amounts to do research ($100K goes less far in ecosystem science work in the tropics than it does in simple trapping or counting experiments at a site close to home). The implications of this are a whole other topic, which I am not touching here. For this post, if current DEB across all subprograms has a median of $500K, then it can change to a median of $100K with the differences in funding between fields left untouched.

 

Are US researchers slowly boiled frogs? – or thinking out of the box about the future of NSF

There is a belief that dropping a frog into hot water will cause it to react and immediately jump out, while putting it in a pan of cool water and slowly warming it will cause the frog to never notice until it is boiled. Here in Maine you hear the same debate about how to cook a lobster. Whether the frog myth is true or not is debatable (although testing it is clearly sadistic). But it has become a common metaphor for failing to notice or respond to small incremental changes which, taken in the aggregate, are terrible (fatal in the case of the frog). We seem to have a bit of the same thing happening with the primary basic science funding agency in the US (the National Science Foundation or NSF). In this piece I want to a) argue that due to macro trends that are not the fault of NSF, the agency and its researchers are in a frog-boiling scenario, and b) attempt to kick-start an out-of-the-box, big-picture discussion about what should be done about it (akin to the frog realizing it needs to take bold action and jump out of the pot).

But first, I've already said it, but let me repeat it to be abundantly clear: this is NOT a criticism of NSF. Every single program officer I've ever dealt with has been a highly dedicated and helpful professional (not to mention they are also researchers and one of us), and NSF regularly gets rated by government auditors as one of the most efficient and well-run branches of the government. Instead, these trends are being driven by macro forces beyond the control of NSF (or of us researchers). I'm sure NSF is just as aware of and unhappy about these trends as I am. I expect they also are having discussions about what to do about it. I have not been privy to those discussions and have no idea whether NSF would welcome the discussion I am promoting here or not, but I feel like this blog, with its tradition of civility and rational thinking, might be a useful forum.

Why researchers at NSF are like frogs being slowly boiled – the macro trends

I am going to focus just on the Division of Environmental Biology (DEB), although I don't think the story differs much anywhere else. I haven't always been able to obtain the data I would like to have, but I'm pretty confident that the big picture trends I am about to present are quite accurate even if details are slightly off. The core graph, which I've seen in various versions in NSF presentations for a while (including those used to justify the switch to the pre-proposal process), is this:

Trends in # of proposals submitted (green), # of proposals funded (blue), and success rate (red). This data is approximate (eyeball scanned from http://nsfdeb.wordpress.com/2013/03/11/deb-numbers-revisiting-performance-of-pi-demographic-groups-part-1/ provided by NSF). Linear trend lines were then added.


This graph confirms what NSF has been saying – the number of proposals submitted keeps going up without any sign of stopping, while the number of proposals actually funded is flat (a function of NSF funding being flat – see below). The result is that the success rate (% of proposals funded) is dropping. But adding trend lines and extending them to 2020 is my own contribution. The trend in success rate here is actually an overestimate because the 2009 stimulus year was left in. According to a naive, straight-line trend, success rate will reach 0% somewhere between 2019 and 2020! Of course nobody believes it will reach 0%. And the alternative approach of combining the other two trend lines gives roughly 200 proposals funded out of 2,000, or 10%, in 2020. But the trend line is not doing a terrible job; when I plug in the 2013 number from DEB of 7.3%*, it is not that far from the trend line (and is already below the 10% number). Nobody knows what the exact number will be, but I think you can make a pretty good case that 7.3% last year was on trend and the trend is going to continue going down. A few percent (2%?) by 2020 seems realistic. All of this is the result of inexorable logic. The core formula here is: TotalBudget$ = NumberProposals * Accept% * GrantSize$

NumberProposals is increasing rapidly. Although data are harder to come by, my sense is that GrantSize$ is roughly constant (at least after adjusting for inflation), with a good deal of spread but a median and mode right around $500,000. Maybe there is a saving grace in TotalBudget$? Nope:


Trends in NSF funding in constant 2012 dollars (data from http://dellweb.bfa.nsf.gov/NSFHist_constant.htm). Also see NSF’s own plot of the data at http://dellweb.bfa.nsf.gov/nsffundhist_files/frame.htm.

NSF appears to have had four phases: exponential growth in the early days (1950-1963), flat from 1963-1980, strong growth from 1980 to about 2003, and then close to flat (actually 1.7%/year over inflation) from 2003-2013 (again with a stimulus peak in 2009). Note that the growth periods were both bipartisan (as was the flat period from 1963-1980). Positive growth rates aren't terrible, and congratulations to NSF for achieving this in the current political climate. But when pitted against the doubling in NumberProposals, it might as well be zero growth for our purposes. It is a mug's game to try to guess what will happen next, but most close observers of US politics (given that the debate has shifted to a partisan divide about whether to spend money at all, and a resignation that the sequester is here to stay) are not looking for big changes in research funding to come out of Congress anytime soon (see this editorial in Nature). So I am going to treat TotalBudget$ as a flat line, beyond the control of NSF and researchers.

The number that probably deserves the most attention is NumberProposals. Why is this going up so quickly? I don't know of hard data on this. There is obviously a self-reinforcing trend – if reject rates are high, I will submit more grant applications to be sure of getting a grant. But this only explains why the slope accelerates – it is not an explanation for why the initial trend is up. And there is certainly a red-queen effect. But in the end I suspect this is some combination of two factors: 1) the ever-tighter job market (see this for a frightening graph on the ever-widening gap between academic jobs and PhDs), which has led to ever higher expectations for tenure – to put it bluntly, places that 20 years ago didn't/couldn't expect grants from junior faculty to get tenure now can place that expectation because of the competition; and 2) as states bow out of funding their universities (and as private universities are still recovering from the stock crash), indirect money looks increasingly like a path out of financial difficulties. Obviously #1 (supply) and #2 (demand) for grant-writing faculty reinforce each other.

So to summarize: TotalBudget$ = NumberProposals * Accept% * GrantSize$. TotalBudget$ is more or less flat for the last decade and the foreseeable future. NumberProposals is trending up at a good clip due to exogenous forces for the foreseeable future (barring some limits placed by NSF on the number of proposals). So far GrantSize$ has been constant. This has meant Accept% is the only variable left to counterbalance increasing NumberProposals. But Accept% is going to get ridiculously low in the very near future (if we're not there already!). Part of the point of this post is that maybe we need to put GrantSize$ and NumberProposals on the table too.
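To see the squeeze in one place, here is a minimal sketch that simply rearranges that identity as Accept% = TotalBudget$ / (NumberProposals * GrantSize$). The budget and proposal counts in the sketch are illustrative placeholders, not actual DEB figures.

```python
# Minimal sketch of the squeeze, rearranging the identity above:
#   Accept% = TotalBudget$ / (NumberProposals * GrantSize$)
# The budget and proposal counts below are illustrative placeholders, NOT DEB figures.
def accept_rate(total_budget, n_proposals, grant_size):
    return total_budget / (n_proposals * grant_size)

budget = 75_000_000     # hypothetical flat annual pot for full awards
grant = 500_000         # roughly the current median award

for n_proposals in (1_000, 1_500, 2_000, 3_000):
    print(f"{n_proposals:,} proposals -> accept rate {accept_rate(budget, n_proposals, grant):.1%}")

# With the budget and award size flat, every extra proposal comes straight out of the
# accept rate: 1,000 -> 15.0%, 1,500 -> 10.0%, 2,000 -> 7.5%, 3,000 -> 5.0%.
```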

Some salient facts for a discussion of what to do

In the next section I will list some possible solutions, and hopefully readers will contribute more, but first I want to highlight two very salient results of metaresearch (research about research).

  1. Review panels are not very good at predicting which proposals will lead to the most successful outcomes. Some claim that review panels are at least good at separating good from bad at a coarse grain, although I am not even convinced of that. But two recent studies showed that panel rankings effectively have no predictive power for variables like number of papers, number of citations, and citations of the best paper! One study was done in the NIH cardiovascular panel and the other was done in our very own DEB Population and Evolutionary Processes panel by NSF program officers Sam Scheiner and Lynnette Bouchie. They found that the r² between panel rank and various outcomes was between 0.01 and 0.10 (1-10% of variance explained) and was not significantly different from zero (and got worse when budget size, which was an outcome of ranking, was controlled for). UPDATE: as noted by author Sam Scheiner below in the comments – this applies only to the 30% of projects that were funded. Now traditional bibliometrics are not perfect, but given that they looked at 3 metrics and impact factor was not one of them, I think the results are pretty robust.
  2. Research outcomes are sublinear with award size. Production does increase with award size, but the best available (though still not conclusive) evidence, from Fortin and Currie 2013, suggests that there are decreasing returns: a plot of research production vs. award size is an increasing, decelerating curve (e.g. like a Type II functional response). This means giving an extra $100,000 to somebody with a $1,000,000 grant buys less of a productivity increase than giving an extra $100,000 to somebody with $200,000 (or obviously to somebody with $0). A toy version of this arithmetic is sketched just below.
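For concreteness, here is a hypothetical Type II (saturating) curve with made-up parameters; it is not fitted to Fortin and Currie's data, just a sketch of the shape they report.

```python
# Illustrative only: a hypothetical Type II (saturating) curve for research output
# vs. award size. The functional form and parameters are assumptions for the sake
# of the arithmetic, not estimates from Fortin and Currie 2013.

def output(award, a=10.0, half_sat=300_000):
    """Productivity proxy (e.g. papers) as a saturating function of award size ($)."""
    return a * award / (half_sat + award)

extra = 100_000
for base in (0, 200_000, 1_000_000):
    gain = output(base + extra) - output(base)
    print(f"marginal gain of an extra $100k on top of ${base:>9,}: {gain:.2f}")

# The same $100,000 buys the most extra output for the person with $0, less for the
# person with $200,000, and least for the person already holding $1,000,000.
```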

Possible solutions

Just to repeat: this is not a criticism of NSF. The exogenous drivers are beyond anybody's control and simple budgetary math drives the rest. There is no simple or obvious answer. I certainly don't have the answer. I just want to enumerate possibilities.

  1.  Do nothing – low Accept% is OK – This is the business-as-usual scenario. Don't make any drastic changes and just let the acceptance rate continue to drop to very close to zero. I actually think this might be the worst choice. Very low acceptance rates greatly increase the amount of randomness involved. They also ironically bias the panels to be conservative and select safe research (maybe even mediocre research) that won't waste one of the precious awards, which is not good for the future of science. I recall being part of a discussion on the editorial board of a major journal where we all agreed the optimal acceptance rate was around 25-30%. Anything higher and you're not selective. Anything lower and you start falling into traps of randomness and excessive caution. I think this is probably about the right number for grants too. Note that we are at about 1/4 of this rate. I personally don't consider the current acceptance rate of 7% acceptable, and I cannot imagine anybody considers the rates of 1-2% that we're headed towards acceptable. The other approaches all have problems too, but in my opinion most of them are not as big as this one.
  2. Drive down NumberProposals via applicant restrictions on career stage – You could only allow associate and full professors to apply on the basis they have the experience to make best use of the money. Alternatively you could only allow assistant professors to apply on the argument they are most cutting edge and most in need of establishing research programs. Arguably there is already a bias towards more senior researchers (although DEB numbers suggest not). But I don’t think this is a viable choice. You cannot tell an entire career stage they cannot get grants.
  3. Drive down NumberProposals via applicant restrictions on prior results - A number of studies have shown that nations that award grants based on the personal record of the researcher do better than nations that award grants based on projects. You could limit eligibility to those who have been productive in the recent past (15 papers in the last 5 years?). This of course biases against junior scientists, although it places them all on an equal footing and gives them the power to become grant-eligible. It probably also lops off the pressure from administrators in less research-intensive schools to start dreaming of a slice of the NSF indirect pie (while still allowing individual productive researchers at those institutions to apply).
  4. Drive down NumberProposals via lottery – Why not let the outcome be driven by random chance? This has the virtue of honesty (see fact #1 above). It also has the virtue of removing the stigma from not having a grant if people can't be blamed for it. This would especially apply to tenure committees evaluating faculty by whether they have won the current, less acknowledged, NSF lottery.
  5. Drive down NumberProposals via limitations on number of awarded grants ("sharing principles") - You could also say that if you've had a grant in the last 5 years, you cannot apply again. This would lead to a more even distribution of funding across researchers.
  6. Decrease GrantSize$ – The one nobody wants to touch: maybe it's time to stop giving out average grants of $500,000. Fact #2 strongly argues for this approach. Giving $50,000 to 10 people is almost guaranteed to go further than $500,000 to one person (see the sketch below). It gets over that basic hump of having enough money to get into the field. It doesn't leave much room for summer salaries (or postdocs – postdoc funding would have to be addressed in a different fashion), but it would rapidly pump up the acceptance rate to reasonable levels and almost certainly buy more total research (and get universities to break their addiction to indirects). Note that this probably wouldn't work alone without some other restriction on the number of grants one person can apply for, or everybody will just apply for 10x as many grants, which would waste everybody's time.
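Here is the split-the-pot version of the same toy arithmetic, using the same kind of hypothetical saturating curve (and made-up parameters) as the sketch under fact #2:

```python
# Same hypothetical saturating curve as above (made-up parameters); here comparing
# how much total output a fixed $500,000 pot buys when split ten ways vs. given to one PI.

def output(award, a=10.0, half_sat=300_000):
    return a * award / (half_sat + award)

one_big = output(500_000)            # one $500k grant
ten_small = 10 * output(50_000)      # ten $50k grants

print(f"one $500k grant: {one_big:.1f} units of output")
print(f"ten $50k grants: {ten_small:.1f} units of output")
# Under any concave (saturating) curve through the origin the split never loses,
# which is the arithmetic behind option 6 -- though the caveats about salaries and
# postdocs noted above still apply.
```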

What do you think NSF should do? Vote by picking up to three of the options above for how you think NSF should deal with the declining acceptance rates (and feel free to add more ideas in the comments):

I am really curious to see which approach(es) people prefer. I will save my own opinions for a comment after most votes have come in. But I definitely think it is time for the frogs (us) to jump out of the pot and take a different direction!


* Note that 7.3% is across all proposals to DEB. The blog post implies that the rates are lower on the core grants and higher on the non-core grants like OPUS, RCN, etc. They don’t give enough data to figure this out, but if I had to guess the core grants are funded a bit below 5% and the non-core grants are closer to 10%.

Scientists have to present a united front, right?

So on Friday, a group I've been working with (Maria Dornelas, Anne Magurran, Nick Gotelli, Hideyasu Shimadzu and others) came out with a paper in Science. We took 100 long-term monitoring datasets across 6 continents and many taxa and looked to see if there was a consistent trend in local alpha diversity over time. To our surprise there wasn't – on average across datasets the slope was zero and most datasets were close to zero (and the ones that weren't cancelled each other out). We also found that temporal beta diversity (turnover in species composition within one local community) changed much faster than any reasonable null model would predict.

From a "the earth is doomed" prophecy point of view these are mixed results. Local alpha diversity is not looking as bad as we expected (also see this recent paper by Vellend et al), but species composition is churning really fast (and although we didn't measure this, there is a reasonably good chance that it's already-widespread species moving in to replace rarer species that is driving this). This is probably bad news for those who care about the state of the planet. And a finding of no trend in local alpha diversity does NOT contradict declining global diversity (scale matters!). But all of us authors can quickly imagine certain subsets of the public cherry-picking results and trumpeting "scientists prove claimed modern mass extinction not occurring".

So I want to expand beyond this specific finding to the more general question: when scientists are working in domains that have strong implications for broad policy debates, how should they think about how their work will play in the policy context vs. how they should do their science? This plays out in questions of extinction, invasion, climate change, etc. It was very vividly played out in "climategate" and, before that, in creationism, where Stephen Jay Gould, testifying before state supreme courts about whether evolution was well understood and widely agreed upon by scientists, had to back off claims he had made in the academic world that his theories of punctuated equilibrium were revolutionizing and overturning traditional views of how evolution worked.

One view of the relationship of science to the general public is that the public cannot be trusted, so we scientists all have to band together and not show any internal disagreement in public. If we reveal even one crack in the edifice we are building, the whole thing will be pried apart. Proponents note that there are vested interests who don't play fair and will take things out of context. They note that modern 30-second sound bites and 140-character tweets don't leave room for communicating the complexity. This means dissent, nuance, exceptions to the rule, etc. should not be published in a way the general public will notice (it's OK to bury them in obtuse language in the middle of discussion sections). And indeed, colleagues had told me about the paper I described above that "you can't publish something like that". Lest you think I exaggerate, Mark Vellend shared with me a quote from a review of his aforementioned paper (published in PNAS, but this quote is from a prior review at Nature) that I reproduce here with Mark's permission:

I can appreciate counter-intuitive findings that are contrary to common assumptions. However, because of the large policy implications of this paper and its interpretation, I feel that this paper has to be held to a high standard of demonstrating results beyond a reasonable doubt … Unfortunately, while the authors are careful to state that they are discussing biodiversity changes at local scales, and to explain why this is relevant to the scientific community, clearly media reporting on these results are going to skim right over that and report that biological diversity is not declining if this paper were to be published in Nature. I do not think this conclusion would be justified, and I think it is important not to pave the way for that conclusion to be reached by the public.

This quote is actually a perfect example of the attitude I am trying to summarize playing a role right in the center of the peer review process.

This is definitely a common view, and reasonable people can disagree, but I just can’t get on board with this “united front” approach for a number of reasons:

  1. Ethically, a scientist is obligated to be honest. This includes not just honesty of commission (the things we say are true), which 99% of us practice. It also includes honesty of omission (not leaving out things we know to be true but find inconvenient). Indeed this might be central to the definition of what it means to be a scientist, instead of, say, a lobbyist or maybe even a philosopher.
  2. Practically, a scientist is most likely to be seen as an honest broker by the public when, at least some of the time, things contrary to the mainstream view get published. Or if not contrary, at least nuancing (the general belief isn't true under these particular conditions). Nobody believes somebody is objective when they can't see and deal with evidence contrary to their own beliefs. If we sound like a PR machine staying on message, we won't be trusted.
  3. Psychologically, an ecologist is most likely to be heard and paid attention to when they talk about good news related to the environment as well as all the bad news. Nobody can (or wants to) pay attention to a constant doomsayer.

For all of these reasons, I think it is a mistake to bury evidence that runs contrary to the general narrative that biodiversity is headed towards imminent destruction in every possible way in every corner of the earth. It's actually a good thing, for all of the ethical, practical and psychological reasons given above, to have scientists themselves putting out a more complex, nuanced view.

I can already hear the skeptics saying the public cannot handle complex and nuanced. But I think climate change is informative here. Take the original IPCC reports and how careful they were to break out all the different aspects of climate change and the different levels of uncertainty around them. Then look at the "united front" and "win" attitude that was perceived to have come out in climategate, and how strongly the public reacted (NB: climategate was an overblown tempest in a teapot, but it speaks exactly to my point about how the public perceives scientists – or more to the point, how the public wants to perceive scientists and how upset they get when the honest-broker role appears inaccurate). The public CAN hear and accept complexity, uncertainty, etc. (barring an extreme fringe that will always exist). It just takes a LOT of work to communicate it. But I don't think we as scientists have any other choice.

 

Is requiring replication statistical machismo?

A recent post of mine about why Biosphere 2 was a success stirred mixed reactions. But one of the most common negative reactions was that there was no replication in Biosphere 2, which of course EVERYBODY knows is a hallmark of good science. This actually spilled into a spirited discussion in the comments. So, do we need replication to do good science?

Anybody who has read some of my older posts (e.g. the one true route post, the statistical machismo post) will know that my answer is going to be no. I'm not going to tell a heliologist that they are doing bad science because they only have one sun (they do have the stars, but most of the phenomena they study, like sunspots, are not yet observable on other stars). Nor am I going to say that to people who have developed theories about why our inner solar system contains rocky planets and the outer solar system contains giant gaseous planets (although in the last 2-3 years we are actually getting to the point where we have data on other solar systems, these theories were all developed and accepted well before then). And Feynman's televised demonstration that a bad interaction between cold weather and a rubber O-ring led to the demise of the Space Shuttle Challenger definitely did not need and would not tolerate replication. Closer to home, I am not going to tell the people who have been measuring CO2 on top of Mauna Loa (aka the Keeling Curve, one of the most well known graphs in popular science today) that their science is bad because they only have one replicate. Nor am I going to tell people who study global carbon cycling to give up and go home because it is a well-mixed gas on only one planet (I mean come on, N=1, why waste our time!?). In short, no, good science does not REQUIRE replication.

Let me just state up front that replication IS good. The more replication the better. It always makes our inferences stronger. We DO need replication when it is feasible. The only problem is that replication is not always possible (sometimes even with infinite amounts of money, and sometimes only due to real-world time and money constraints). So the question of this post is NOT "do we need replication?" It IS "do we HAVE to have replication?" and "what do you do in these trade-off or limitation situations?" Give up and go home – don't study those questions – seems to be some people's answer. It's not mine. Indeed, any philosophy-of-science position which leads to the idea that we should stop studying questions that inconveniently fail to fit a one-stop-shopping approach to science is not something I will endorse. This is the statistical machismo I have talked about before – when one makes the statistics so beautiful AND difficult that few can achieve the standard you have set, and you can then reject others' work as WRONG, WRONG, WRONG. Careful thinking (and perusing the examples in the last paragraph) leads to a number of ways to do good, rigorous science without replication.

First let's step back and define what replication is and why it is important. Wikipedia has several entries on replication, which in itself is probably informative about the source of some of the confusion. When ecologists think about replication they are usually thinking about it in the context of statistics (the Wikipedia entry on statistical replication) and pretty quickly think of Hurlbert's pseudoreplication (also see Meg's post on the paper). This is an important context, and it is pretty much the one being violated in the examples above. But this definition is only saying you need replication to have good statistics (which is not the same as good science). Wikipedia also has an alternative entry on "replication – scientific method" which redirects to "reproducibility". This definition is the sine qua non of good science, the difference between science and pseudoscience. Reproducibility means that if you report a result, somebody else can replicate your work and get the same thing. If somebody is doing science without reproducibility, call them out for bad science. But don't confuse it with replication for statistics. Ecologists confuse these two all the time. Thus to an ecologist replication means multiple experimental units well separated in space (not well separated = pseudoreplication; not multiple = no replication = degrees of freedom too small). As I said, those are both good goals (which I teach in my stats class and push students to achieve). But they are not the sine qua non of good science.

It is instructive to think about an example that came up in the comments on the Biosphere 2 post: the LHC (Large Hadron Collider) and the hunt for the Higgs boson. Pretty blatantly they did not have ecological replication. Each LHC facility costs billions of dollars and they only had one (ditto for Biosphere 2). But the physicists actually had an extremely well worked out notion of rigorous reproducibility. Despite only having one experimental unit, they did have multiple measurements (observed particle collisions). Thus this is a repeated-measures scenario, but notice that since there was only one "subject" there was no way to correct for the repeated measures. The physicists made the assumption that, despite being done on one experimental unit, the measures were independent. But what I find fascinating is that the physicists had two teams working on the project that were "blinded" to each other's work (even forbidden to talk about the work with each other) to tackle the "researcher degrees of freedom" problem that Jeremy has talked about. They also had a very rigorous a priori standard of 5σ (p<0.0000003) to announce a new particle (I seem to recall that at 3σ they could talk about results being "consistent with" but not "proof of", but I haven't found a good reference for this). So, in summary, the Higgs search had an interesting mix of statistical replication (5σ), reproducibility (two separate teams) and pseudoreplication (uncorrected repeated measures) from an ecologist's perspective.
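As an aside of my own (nothing to do with the LHC teams' actual analysis pipeline), the p<0.0000003 figure is just the one-sided tail probability of a standard normal at 5 standard deviations:

```python
# Quick check of the 5-sigma convention mentioned above: the one-sided tail
# probability of a standard normal beyond 5 (and 3) standard deviations.
from scipy.stats import norm

for sigma in (3, 5):
    p = norm.sf(sigma)   # survival function = 1 - CDF, i.e. P(Z > sigma)
    print(f"{sigma} sigma -> one-sided p = {p:.2e}")

# 5 sigma gives p ~ 2.9e-7 (the p < 0.0000003 threshold for announcing a new particle);
# 3 sigma gives p ~ 1.3e-3 (roughly the "consistent with" but not "discovery" level).
```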

So what do we get out of statistical replication? The biggest thing is that it allows us to estimate σ² (the amount of variance). We might want to do this because variance is innately interesting. For instance, rather than ask whether density dependence exists, I would rather ask what percent of the year-to-year variance is explained by density dependence (as I did in chapter 8 of this book and as I argued one should do in this post on measures of prediction). Or we might want to quantify σ² because it lets us calculate a p-value, but this is pretty slippery and even circular – our p-value gets better and better as we have more replication (even though our effect size and variance explained don't change at all). This smaller p-value due to more replication is often treated as equally good science, but that is poppycock. Although there are valid reasons to want a p-value (see the Higgs boson), the pursuit of p-values quickly becomes a bad reason for replication. Thus for me, arguing to have replication in order to estimate σ² is a decidedly mixed bag – sometimes a good thing, sometimes a bad thing, depending on the goal.
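To see the circularity concretely, here is a small sketch that holds the standardized effect size fixed (at an arbitrary, assumed d = 0.2) and lets only the amount of replication change:

```python
# A sketch of the "p-values improve with replication even though the effect size
# doesn't" point. Fixed standardized effect size d = 0.2; only the sample size changes.
from scipy import stats
import math

d = 0.2  # assumed (hypothetical) effect size in standard-deviation units
for n in (10, 50, 200, 1000):
    t = d * math.sqrt(n)                  # t statistic for a one-sample test with sd = 1
    p = 2 * stats.t.sf(t, df=n - 1)       # two-sided p-value
    print(f"n = {n:4d}: t = {t:5.2f}, p = {p:.4f}")

# The effect (and the variance it explains) is identical in every row; only the
# replication differs, yet p marches from "nothing" to "highly significant".
```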

However – and to me this is the biggest message in Hurlbert's paper, but often forgotten against the power of the word "pseudoreplication" – the #1 problem driving everything else in the paper is the issue of confoundment. If you only have one site (or two or three), you really have to worry about whether you got the effect you observed because of peculiarities of that site and any weird covariances between your variable of interest and hidden variables (Hurlbert's demonic intrusions). Did you get more yield because of pest removal, as you think, or because the site is downhill and the soil is wetter? One way to kill the demon of confoundment is to have 100 totally independent, randomly chosen sites. But this is expensive. And it's just not true that it is the ONLY way to kill the demon. I don't think anybody would accuse the LHC of confoundment despite its having only one site. You could spin a story about how the 23rd magnet is wonky and imparts a mild side velocity (or spin, or I don't know my particle physics well enough to be credible here …) that fools everybody into thinking they saw a Higgs boson. But I don't hear anybody making that argument. The collisions are treated as independent and unconfounded. The key here is that there is no way to measure that or statistically prove it. It is just an argument made between scientists that depends on good judgement, and so far the whole world seems to have accepted the argument. It turns out that is a perfectly good alternative to hundreds of spatial replicates.

Let me unpack all of these examples and be more explicit about the alternatives to replication as ecologists think of it, i.e. far-separated experimental units (again, these alternatives are only to be used when necessary because replication is too expensive or impossible, but that occurs more often in ecology than we admit):

  1. Replication in time – repeated measures on one or a few subjects do give lots of measurements and estimates of σ² – it's just that the estimate can be erroneously low (dividing by too many degrees of freedom) if the repeated measures are not independent. But what if they are independent? Then it's a perfectly valid estimate. And there is no way to prove independence (when you have only one experimental unit to begin with). This is a matter for mature scientists to discuss and use judgement on, as with the LHC – not a domain for unthinking slogans about "it's pseudoreplicated". Additionally there are well-known experimental designs that deal with this, specifically BACI, or before/after control/impact (just Google BACI experimental design). Basically one makes repeated measures before a treatment to quantify innate variability, then repeated measures after the treatment to further quantify innate variability, and then compares the before/after difference in means vs. the innate variability (a minimal simulated sketch follows this list). The Experimental Lakes Area eutrophication experiments are great examples of important BACI designs in ecology, and nobody has ever argued those were inconclusive.
  2. Attention to covariates – if you can only work at two sites (one treatment and one control) you can still do a lot of work to rule out confoundment. Specifically, you can measure the covariates that you think could be confounding – moisture, temperature, soils, etc. – and show that they're the same or go in the opposite direction of the effect observed (and before that you can pick two sites that are as identical as possible on these axes).
  3. Precise measurements of the dependent variable – what if σ²=0? Then you don't really need a bunch of measurements. This is far from most of ecology, but it comes up sometimes in ecophysiology. For a specific individual animal under very specific conditions (resting, postprandial), metabolic rate can be measured fairly precisely and repeatably. And we know this already from dozens of replicated trials on other species. So do we need a lot of measurements the next time? A closely related case is when σ²>0 but the amount of error is very well measured, so we can do an error analysis that propagates the error bars through the calculations. Engineers use this approach a lot.
  4. We don't care about σ² – what if we're trying to estimate global NPP? We may have grossly inaccurate measurement methods and our error bars may be huge. But since we have only one planet, we can't do replication and estimate σ² – does that mean we should not try to estimate the mean? This is a really important number; should we give up? (Note: sometimes the error analyses mentioned in #3 can be used to put confidence intervals on such estimates, but they have a lot of limitations in ecology.) And note I'm not saying having no confidence intervals is good; I'm saying dropping entire important questions because we can't easily get confidence intervals is bad.
  5. Replication on a critical component – The space shuttle example is a good example of this. One would not want to replicate on whole space shuttles (even if human lives were taken out of the equation, cost alone is prohibitive). But individual components could be studied through some combination of replication and precise measurement (#3 above). The temperature properties of the O-ring were well known, and engineers tried desperately to cancel the launch. They didn't need replicate measures at low temperatures on the whole shuttle. Sometimes components of a system can be worked on in isolation with replication and still generalize to the whole system where replication is not possible.
  6. Replication over the community of scientists – what if you have a really important question that is at really big scales so that you can only afford one control and one experimental unit, but if it pans out you think it could launch a whole line of research leading to confirmation by others in the future? Should you just skip it until you convince a granting agency to cough up 10x as much money with no pilot data? We all know that is not how the world works. This is essentially the question Jeff Ollerton asked in the comments section of the Biosphere 2 post.
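Here is the minimal, simulated BACI sketch promised in #1. Everything in it (sites, effect size, noise level, sample sizes) is made up; the point is just that the treatment effect is the period-by-site interaction, judged against the innate before/after variability at both sites.

```python
# A minimal, simulated sketch of a BACI (before-after control-impact) analysis.
# The data are fake; the treatment effect is the period x site interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 20  # repeated measures per site per period

df = pd.DataFrame({
    "site":   ["control"] * (2 * n) + ["impact"] * (2 * n),
    "period": (["before"] * n + ["after"] * n) * 2,
})
baseline = {"control": 10.0, "impact": 11.0}   # sites allowed to differ from the start
effect = 3.0                                   # true treatment effect at the impact site
df["y"] = [
    baseline[s] + (effect if (s == "impact" and p == "after") else 0.0) + rng.normal(0, 1)
    for s, p in zip(df["site"], df["period"])
]

fit = smf.ols("y ~ C(period, Treatment('before')) * C(site, Treatment('control'))", data=df).fit()
print(fit.summary().tables[1])   # the interaction row estimates the BACI effect
```

In a real BACI analysis you would also worry about temporal autocorrelation within each site, which is exactly the "are the repeated measures independent?" judgement call discussed in #1.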

So, in conclusion: ecologists have an overly narrow definition of what replication is and of its role in good science. Large numbers of spatially separated experimental units are great when you can get them. But when you can't, there are lots of other things you can do to deal with the underlying reasons for replication (estimating σ² and avoiding confoundment). And these are not places for glib one-word ("pseudoreplication", sneeringly said) dismissals. They are places for complex, nuanced discussions about the costs of replication, how convincingly the package of alternatives (#1-#6) is deployed, and sometimes even how important the question is.

What do you think? Have you done work that you were told was unreplicated? How did you respond? Where do you think theory fits into the need for replication – do we need less replication when we have better theory? Just don't tell me you have to have replication because it's the only way to do science!

 

Policy relevant science: the unreasonable effectiveness of boundary objects

In a recent post on policy-relevant science I talked about boundary workers and boundary organizations. The boundary I am talking about is between science and policy, and the notion of the boundary between scientists and policy-makers is something receiving increasing attention from social scientists. Briefly (read the last post if you want more), the idea originated with people who spanned the inside and outside of a company, mutated into the boundary between scientists and others, and led to a new concept – boundary organizations (e.g. conservation NGOs, land trusts, etc).

But today, I want to talk about another idea that emerges from thinking about the boundary: the boundary object. As the name implies a boundary object is a thing, not a person or group of people, that helps to span the boundary. In the original corporate model, things like price lists and user licenses were boundary objects. In the science policy boundary, there are many possibilities – maps and models being two of the most commonly cited, but many many other objects can (should?) be thought of as boundary objects as well.

To be a good boundary object, an object needs several properties:

  1. Spans/lives in two worlds/translates – this is the most obvious point. It needs to have genuine uptake of the language and concepts of the scientists but also genuine uptake of the language and concepts of the policy makers. It probably needs to be enough of a compromise to make both sides a bit uncomfortable. A bit too certain for scientists. A bit too quantitative for policy makers. An ANOVA table or regression table does not span (full comfort for scientists, all the discomfort for the policy makers). A bar graph of standardized coefficients is a bit better. A decision (regression) tree is a lot better (and it makes both groups stretch their comfort zones).
  2. Central – a good boundary object needs to get to the heart of the matter and show the key variables enough to stimulate discussion and yes, provoke disagreement, or it is not doing its job. Just a map of the area without any layers is not a boundary object. A map that identifies existing resources, existing practices, proposed regulatory zones is a good boundary object.
  3.  Highly public – a boundary object needs to be easily available to everybody on both sides of the boundary – probably on the internet in this day and age. A drawing on a cocktail napkin from a discussion between two people is not a boundary object. But if it is scanned and put on the internet (or emailed to a group) it could be.
  4. Credible – a boundary object needs to be reasonably credible as objective and neutral. If it is seen as a tool for one side to win, it won’t be used. Indeed, even if it is unintentional (e.g. bad initial parameters in a model), just being perceived as biased can be the kiss of death to the life of a boundary object.
  5. Changeable/editable – boundary objects need to be changeable. As the discussion across the boundary changes and moves, the boundary object needs to capture and reflect those changes. In some cases, a boundary object can become the centerpiece of negotiation.

I stated earlier that both maps and what-if (scenario-driven) models are great boundary objects. Assuming they map or model the right variables, it is pretty clear how they meet the five criteria. Especially maps. The ideal model to serve as a boundary object has a number of clear input parameters that can be changed to see how the outcomes change. This is especially powerful when the model is fronted on the web where anybody can tweak the parameters and play with the model. A model is also powerful when its assumptions can be written down clearly (although just making clear what the inputs and outputs are is useful).

As the title of this post suggests, boundary objects can be extraordinarily, surprisingly successful in invoking boundary spanning. I’m sure almost any ecologist who has put themselves in a policy context (hearing, forum, etc) has seen the power of a map. I saw it a couple of weeks ago in my town – there was a public hearing on building a giant student apartment complex in close proximity to some wetlands. The whole conversation centered on the architectural drawings (which were mounted on a 3×5 poster board). And when a scientist got up and started talking about why he thought the soil survey was wrong, he didn’t just say it, he took the time to hook up to a projector and show soil maps. Maps just change the whole conversation. They don’t make people magically agree (of course!). But they make the conversation much more concrete, much less talking past people and not being heard, and ultimately much more productive.

Models are used much less often than maps in environmental policy in my experience (but still frequently). They can also be game changers. That doesn't mean people agree with the model. But it does mean people can start to understand what the most important variables are. And they can start to have concrete dialogues about what the right assumptions are. A great example where maps and models intersect is the diagrams being produced of sea level rise in response to climate change. To a large degree the map aspect dominates. But in more nuanced conversations outside of the press, they start to lead to error bars (what does that map look like if the seas go up only 20 cm vs a full 1 m), they start discussions about what we do and don't know about ice melt, etc. My whole job before I returned to academia was building models to be used as boundary objects in the business world. I spent 5 years of my life modelling the changes to mail sorting that would happen with automation (optical character readers and bar code sorters). These models served as a focal point for launching hundreds of discussions, from the impacts on unions, to the change in the number of facilities needed, to what types of machines to buy and what types of mail to incentivize in the future.

Maps and what-if models aren't the only useful boundary objects. I already mentioned decision trees (the output of a regression tree). While a single regression tree might be less trendy and statistically proper than a random forest, it is a WAY better boundary object (a toy example follows below). Managers intuitively get regression trees and can immediately start discussing limitations of the statistical model, matching it against their mental model of reality, and seeing policy implications. Another boundary object is forcing a quantitative weighting of priorities. This can be done with techniques as simple as voting on rank order, or as complex as the analytic hierarchy process. Having a discussion that concludes genetic diversity deserves 27.2% of our attention, taxonomic diversity 37.8%, and functional diversity 35% is totally arbitrary and inherently wrong by being so one-dimensional – but it is a fantastic way to have a constructive conversation! (Again, that theme of a good boundary object taking everybody out of their comfort zone.) Similarly, a "health of nature in Maine" index combining different factors with arbitrary weights would be stupidly oversimplified relative to the reality ecologists know, but a great boundary object. Even reports serve as boundary objects – think of the just-released IPCC report (of course the many maps, charts and models in it are each boundary objects), where the wording of the report itself stirred great debate and discussion on what we know, how certain we are, etc. Scenario generation (sensu this paper) is another, less quantitative, boundary object.
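To make the regression-tree point concrete, here is a toy sketch (simulated data, invented variable names) showing how a shallow tree prints as plain if/then rules that a manager can immediately argue with:

```python
# A sketch of why a single regression tree can work as a boundary object: the fitted
# rules print as plain if/then statements a manager can question directly. The data
# and variable names here are made up for illustration.
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeRegressor, export_text

rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "pct_forest_cover": rng.uniform(0, 100, n),
    "road_density":     rng.uniform(0, 5, n),
    "mean_summer_temp": rng.uniform(15, 30, n),
})
# fake response: species richness driven mostly by forest cover and roads
y = 20 + 0.3 * X["pct_forest_cover"] - 4 * X["road_density"] + rng.normal(0, 3, n)

tree = DecisionTreeRegressor(max_depth=3, random_state=0).fit(X, y)  # shallow = readable
print(export_text(tree, feature_names=list(X.columns)))
```

A random forest would very likely predict better, but you cannot print a random forest on one page and argue about it in a public hearing, which is exactly the boundary-object trade-off.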

As a slightly more detailed case study … even simple access to data can serve as a boundary object so long as the effort is made to genuinely put the data in the middle of the boundary, not just in the scientist's world. I'm working on a project for the Maine Sustainability Solutions Initiative to take a mix of freely available but technically complex data (e.g. shapefiles) and new data (e.g. model projections) produced by our researchers and just put it in a simple map and trend chart interface in a web browser. I keep getting told "well, those GIS layers are already available" or "you're missing the complexity behind the forecasts", but that kind of misses the point of a boundary object. It's about putting conversation-starting information out in a genuinely spanning (lives in two worlds) context. The average state legislator or town councilor is not going to pull out a GIS layer. But they will spend 5 minutes on the web. And if they do they will be able to get a trend chart of climate change in their town, or a map of land cover change over the last 20 years in their town, or the changing economy of their town (part of the appeal of maps and part of the spanning is that people always want to look at where they live). And they will start putting patterns together. Start comparing the past to the projected future. Start looking for more information on the assumptions behind the models. And have a lot of conversations they wouldn't have had. Time will tell if this specific project serves its purpose, but if past experiences with boundary objects are any guide, my money is that it will. This ties into the theme of how to make sure research gets across the boundary and doesn't just molder away in journals – which will be the topic of my next post on boundaries.

But my bottom-line experience is that getting a bunch of people with different opinions in a room and "just talking about it" or "letting everybody be heard" is vastly less effective than getting those same people in a room with a boundary object at the center of the discussion. It focuses things in very concrete ways, pushes towards negotiation and compromise, increases the level of understanding, and minimizes the amount of talking past each other between the sides.

One could speculate for a long time about the psychology of why boundary objects work (an irrational belief that anything coming out of a computer is correct, the ability to find "my house" on the map, focusing people in a constructive direction, genuine success at translation and spanning, etc.). These are interesting topics of study in their own right (and are being studied), but not my own field of research. I just notice how well they DO work. It's almost like magic (except of course the reality is that it's a lot of hard work behind the scenes). Hence the title of "unreasonable effectiveness".

What boundary objects have you used in your work? Were they effective? What made them effective? Any experiences with what made them more effective?