In (tentative) praise of shopkeeper science (UPDATED)

Posted on July 11, 2013 by Jeremy Fox

I grew up around grocery stores. My grandfather on my father’s side started Fox’s Market in Middletown, a small town in rural central Pennsylvania. He ran the business along with my dad and my aunt. Fox’s Markets was a success, eventually growing to four stores. No mean feat, given how competitive the grocery business is. Most people in the US have more than one grocery store within a few miles, even in a small town like Middletown. So profit margins are small, and to stay in business as an independent grocer like Fox’s you have to work hard and effectively. Growing up, I remember my dad often working in his home office late into the evening. I also remember how in touch he was with everything going on in the stores. He knew all the employees by name, and when walking through the stores he’d often talk to them about all sorts of little details of the business. Some of this was him gathering information and staying in touch with how things were going (e.g., how well was a particular item selling), and some of it was him making sure things were running as they should (e.g., were the shelves neatly stocked). Because my grandfather, father, and aunt were so closely involved with every day-to-day aspect of the business, Fox’s Markets was very much a reflection of them. Some of the items sold in the deli were our family recipes. Fox’s was very involved in supporting the local community, for instance by buying a lot of lambs from local farmers at the annual 4-H auction at the Farm Show. Even the look of the stores reflected them: my grandfather was a serious antique collector, and so the stores were furnished with his antiques. Old advertising signs, farm implements, mannequins…One store even had an antique carousel for children to ride.

My dad and aunt sold Fox’s Markets to another local independent grocer when they retired. My brother and I had both gone into other lines of work by that point. But while I ended up becoming a scientist rather than a grocer, in many ways I think I run my lab in the same way my grandfather, dad, and aunt ran Fox’s Markets. I don’t run a big lab. I have three graduate students, just down from four because my first one just finished. I consider that an ideal lab size–just large enough to have the critical mass for a “group”, but small enough that I have time to meet one-on-one with every student as often as needed (typically weekly during the fall and winter). In the summers I usually hire a couple of undergrad research assistants, and I supervise the occasional undergrad honors student (I’d supervise more but it’s hard to convince most undergrads that they can do ecology indoors…). While I wish I had the funds for a technician and/or postdoc (I’ve had both in the past, but infrequently), I don’t think I’d want several of them, as then I wouldn’t be able to stay in sufficiently close touch with everything that’s going on in my lab. And while I can imagine research projects I could spend massive amounts of money on, I don’t mind not having massive amounts of money, as I have plenty of good ideas for cheap research projects. Not all really good science is expensive science. To get access to large amounts of money, I’d probably have to join some kind of big collaborative project. I’d rather have the freedom to do the science I want to do, as I want to do it, contingent only on my own ability to convince others that it’s good science. So if I want to work in microcosms, or write a blog, or let my students work in systems totally different than mine, I can do that, just so long as it leads to good science.

In short, I do what might be called “shopkeeper science”. I run my lab more or less the way a shopkeeper might run his shop.

Shopkeeper science is under increasing pressure. Funding agencies in many jurisdictions are funding fewer individual investigators, in order to fund only the very best grants, or to fund “star” investigators and expensive collaborative projects (for instance, see here for an excellent summary of the Canadian situation). I think that’s a bad idea. It amounts to putting too many eggs into too few baskets, and it wastes money by throwing too much money at researchers who are probably not money-limited.

A recent analysis of Canadian data seems to back me up on this, although as I’ll discuss below the analysis has some limitations, and the policy implications are not as straightforward as they might seem. Fortin & Currie (open access) have documented the impact of researchers funded by Canada’s Natural Sciences and Engineering Research Council (NSERC, the equivalent of NSF in the US or NERC in the UK) in relation to their funding. They find that impact is only weakly related to funding, for several different measures of impact. They suggest that this is because really high impact work is rare (indeed, it’s rare almost by definition), not necessarily highly expensive, and difficult to predict in advance. Further, the impact-funding relationship is decelerating, indicating that per-dollar impact is lower for larger grant holders. And researchers who also hold grants from other agencies are not any more productive on average than those who hold only NSERC grants. The simplest interpretation is that, beyond some relatively low level of funding, researchers become increasingly limited by other factors (presumably time) rather than money. And while other measures of impact, such as number of students trained, probably do scale more tightly with funding, presumably taking funding away from big labs and giving it to smaller labs would lead to about the same number of students being trained, just by different people. So for a funding agency with a finite budget, seeking to maximize “bang for its buck”, Fortin & Currie suggest that the best policy is to give smaller pots of money to more researchers.* (UPDATE: Thanks to Carl Boettiger for bringing Fortin & Currie to my attention)

I’ll note in passing that other lines of evidence indicate that there’s deceleration in the impact-funding relationship even among highly-funded “elite” ecologists. And Fortin & Currie cite a previous study of the US NIH that found a decelerating relationship between impact and funding. So Fortin & Currie’s results aren’t necessarily just due to lack of a sufficient range of variation in funding levels, and seem like they could generalize to other funding agencies in other countries.

Fortin & Currie’s analysis does have some limitations (first noted to me by a Calgary colleague). First, their data come from a time period when NSERC funding was set very differently than it is today. It used to be the case that there was a lot of inertia in funding levels. Your funding level was ultimately determined by the amount of your first successful application, since at every renewal after that you could expect your grant to increase by about 9%, no matter what the impact of your work. Under the current system, in which there is little inertia in funding levels and in which 1/3 of your grant score depends on your track record over the past 6 years, there might actually be a stronger association between impact and funding. Second, the fact that they’re using data from the old funding system may explain why they fail to find much relationship between change in funding levels and change in impact. Observed changes in funding levels are concentrated around small positive values, due to the high inertia in the old system. Third, there may often be longer time lags between when work is funded and when it is published than Fortin & Currie’s analysis allows for. You could do more sophisticated analyses involving covariates to try to get at some of this. But offhand, I don’t see how any of those caveats would explain the observed deceleration in the impact-funding relationship.

A more important caveat is to do with the policy implications they draw. Fortin & Currie explicitly assume a world in which the impact-funding relationship for every investigator can be described by exactly the same curve (plus residual error), and use the data to estimate whether that curve is straight, accelerating, or decelerating. But of course, different people’s research impact (however measured) might scale with their funding level in different ways, for all sorts of reasons. For instance, some investigators might just be better scientists than others, able to produce more impact for any given level of funding. Or maybe the impact of some investigators increases slowly as their funding increases from low levels, but saturates at a high level, whereas the impact of other investigators increases rapidly as their funding increases from low levels but saturates at a low level. Etc. Let’s call all this “investigator heterogeneity”. Investigator heterogeneity is a really important thing to know about when it comes to optimally allocating funding. Even if the impact-funding relationship is decelerating for every individual investigator, and decelerating overall, it is not necessarily true that reallocating funding away from highly-funded investigators towards other investigators will increase bang for the buck.

Here’s a simple hypothetical illustration of that point. The figure below shows the relationship between impact and funding for three different hypothetical investigators: Dr. Green (highest line), Dr. Red (middle line), and Dr. Blue (lowest line). Every line takes the same power law form y=ax^b assumed by Fortin & Currie. All the relationships are decelerating (b=0.5 for all investigators). But I introduce investigator heterogeneity by allowing the parameter a to vary among investigators (a=1, 2, 3 for Drs. Blue, Red, and Green respectively). Allowing a to vary is an arbitrary choice; I could’ve varied b, or both a and b, to make the same point.

The filled black points show the funding levels for each investigator, and the corresponding impact. Notice that these points define a decelerating curve. This (not the green, red, and blue curves) is the sort of decelerating curve Fortin & Currie found. The fact that the green, red, and blue curves don’t all fall right on top of the curve defined by the black points represents investigator heterogeneity, which matters for reasons I’ll explain in a second. In this example, the three investigators share $50,000 of funding and produce a total impact of 822.1, for an impact/funding ratio of 0.016.

The open points show what happens if we reallocate that $50,000 equally to all three investigators. Obviously, the funding and impact of Drs. Red and Blue increase, while the funding and impact of Dr. Green drop. But here’s the key point: because of investigator heterogeneity, the increased impact of Drs. Red and Blue doesn’t fully compensate for the decreased impact of Dr. Green, much less overcompensate as it would if there were no investigator heterogeneity. Total impact drops to 774.6, for an impact/funding ratio of 0.015. Oops.

Put another way, if all investigators were identical as Fortin & Currie assume, changing their funding levels would shift their impact along the curve defined by the black points. But investigators aren’t all identical, and so that’s not how their impact shifts.
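For anyone who wants to check the arithmetic, here is a minimal Python sketch of the equal-split case, assuming only what the example states: impact y=ax^b with b=0.5, a=1, 2, 3 for Drs. Blue, Red, and Green, and a shared budget of $50,000. The variable and function names are mine, purely for illustration. The unequal starting allocation (the filled black points) is given only in the figure, so it isn't reproduced here.

```python
# Power-law impact model from the example: impact = a * funding**b
B = 0.5                                        # shared exponent (decelerating)
A = {"Blue": 1.0, "Red": 2.0, "Green": 3.0}    # investigator-specific scale a
BUDGET = 50_000.0                              # total funding to share


def impact(a, funding, b=B):
    """Impact of one investigator at a given funding level."""
    return a * funding ** b


def total_impact(allocation):
    """Summed impact across investigators for a {name: funding} allocation."""
    return sum(impact(A[name], x) for name, x in allocation.items())


# Equal split: every investigator gets one third of the budget.
equal_split = {name: BUDGET / len(A) for name in A}
total_equal = total_impact(equal_split)

print(round(total_equal, 1))            # 774.6
print(round(total_equal / BUDGET, 3))   # 0.015
```

Changing the values in `A` or the exponent `B` is an easy way to build the other examples mentioned below, in which equalizing funding raises total impact or lowers it further.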

Now this is an arbitrary example. I could’ve constructed an example in which allocating funding equally among investigators dramatically increases total productivity–or decreases it even more. And I could’ve shown an example in which funding is reallocated in some other way that doesn’t equalize everybody’s funding. But none of that matters, because my point is simply that investigator heterogeneity hugely complicates the task of optimal allocation of funding.

If I’m not mistaken, the optimal thing for a funding agency to do, at least if all investigators’ impact-funding curves are decelerating, is to fund investigators so that they all produce the same impact per dollar (correct me if I’m wrong!) (UPDATE: in the comments, Ric Charnov reminds me of something I should’ve remembered: the optimal funding allocation equalizes the marginal productivity of each investigator–the slopes of the curves.) But you can’t do that unless you know the shape of the impact-funding curve for each investigator, and I doubt you can reliably estimate those shapes from the available data.
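To make the equal-marginal-productivity rule concrete, here is a rough Python sketch for the hypothetical three-investigator example, assuming the same power law y=ax^b with a shared b=0.5. The slope of each curve is a·b·x^(b-1), and setting the slopes equal under a fixed budget gives funding shares proportional to a^(1/(1-b)), which is a^2 here. The function names are illustrative assumptions of mine, and the closed form only holds for this particular functional form.

```python
def optimal_allocation(a, budget, b=0.5):
    """Split `budget` so the slopes a_i * b * x_i**(b - 1) are equal for all i.

    Assumes every investigator follows impact = a_i * x**b with the same
    exponent 0 < b < 1; funding shares are then proportional to a_i**(1/(1-b)).
    """
    weights = {name: ai ** (1.0 / (1.0 - b)) for name, ai in a.items()}
    total_weight = sum(weights.values())
    return {name: budget * w / total_weight for name, w in weights.items()}


def marginal_impact(ai, x, b=0.5):
    """Slope of the impact-funding curve at funding level x."""
    return ai * b * x ** (b - 1.0)


a = {"Blue": 1.0, "Red": 2.0, "Green": 3.0}
alloc = optimal_allocation(a, 50_000.0)

total = sum(a[name] * alloc[name] ** 0.5 for name in a)
slopes = [marginal_impact(a[name], alloc[name]) for name in a]

print({name: round(x) for name, x in alloc.items()})
print(round(total, 1))   # 836.7
print(slopes)            # the three slopes agree up to rounding error
```

Under these assumptions the best achievable total impact is about 836.7, a little above the 822.1 of the original allocation and well above the 774.6 of the equal split. One quirk of this example: because b is the same for everyone, impact per dollar (a·x^(b-1)) is just the slope divided by b, so equalizing impact per dollar and equalizing slopes pick out the same allocation. The two criteria come apart once b varies among investigators or the curves have non-zero intercepts.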

So where does this leave the argument for shopkeeper science? I’m not sure. I still think that giving a few investigators large grants (or multiple large grants!) while giving most investigators nothing is a really bad idea. High impact science is sufficiently unpredictable that putting all your eggs in a few baskets just seems really unwise. But even if you grant me that point and agree that lots of people ought to get at least small grants, there’s still a huge range of possible allocations of funding among those people. And I have no idea how to choose among those allocations. Plus, there’s clearly much more that goes into determining the collective scientific impact and “bang for the buck” of a country’s scientists besides the funding allocation system used by its granting agencies.

p.s. Fortin & Currie find that a researcher’s own past impact is a much stronger predictor of their future impact than is their funding, and that changes in funding for individual researchers aren’t associated with changes in researcher impact. This could be, and probably is, for various reasons.

p.p.s. The policy implications of Fortin & Currie’s results actually are unclear for other reasons besides those discussed above. One could argue for redistributing the pot: cutting funding from highly-funded researchers and redistributing it to researchers who have less, or have nothing. I take it that’s more or less what Fortin & Currie would argue for. But as a Calgary colleague of mine suggested to me, one could also argue for shrinking the pot: just cut funding from highly-funded researchers and don’t spend it on science at all! On the grounds that, if scientists’ impact is only weakly related to their funding levels, you can just cut everyone’s funding down to some low level without causing much damage. My point about investigator heterogeneity might provide one way to push back against this argument for shrinking the pot, but I’m not sure.

*Not that there might not be reasons to do the opposite. For instance, here’s a provocative suggestion that giving out many modestly-sized grants leads to lots of false results because researchers can only afford modest sample sizes.

25 thoughts on “In (tentative) praise of shopkeeper science (UPDATED)”

colin! on July 11, 2013 at 1:37 pm said:

Hi Jeremy! Great post. I think the way Fortin & Currie treat impact/$ may be a little disconnected from the reality of the situation. The best microcosm ecologist may be able to produce high impact work on theoretical ecology on a small budget, generating high impact/$. However, the best ecosystem ecologist may have the same impact as the microcosm ecologist, but require a much larger budget to do ecosystem scale science (warming forest plots, steeping forests for years at elevated concentrations of CO2). The ecosystem ecologist has less bang/buck than the microcosm ecologist, but I doubt anyone would argue that a high impact ecosystem experiment was a waste of funds.

Sometimes we need to drop big bucks to scale our ideas up, and that may lead to less impact / $ due to scale. Two rockstar pubs in theoretical ecology and ecosystem ecology with similar impact may have wildly different price tags.

- Jeremy Fox on July 11, 2013 at 2:02 pm said:
  
  Fortin & Currie do discuss the cost of research issue a bit. For NSERC-funded researchers, costs aren’t actually nearly as variable as you might think. Most every NSERC Discovery Grant holder’s single biggest expense–often something like 75% of the budget–is people. Summer students, summer salary for grad students, etc. And people cost the same whether they’re doing microcosm work or ecosystem ecology or whatever. NSERC does have cost of research supplements for people whose research really is just inherently expensive. People who work in the Arctic often get such supplements for instance. But those supplements are actually a small fraction of the total budget. This isn’t to say that cost of research isn’t an issue–you’re absolutely right that it is–just that it might not be quite as big an issue as one might think.
  
  In Canada, and probably in other places, a lot of really expensive stuff is done by big collaborative networks of investigators rather than by individuals acting alone. For instance, you mention ecosystem ecology, which often is a pretty collaborative enterprise. In Canada and elsewhere, those collaborative networks typically are funded by different programs than ones dedicated to funding individual investigators. I think it’s tricky to compare the per-dollar impact of such work to that of individual investigators because it’s something of an apples-to-oranges comparison.
  
- Brian McGill on July 11, 2013 at 2:08 pm said:
  
  I agree with you that ecosystem science is important, so let’s take a specific field out of the discussion. But you are absolutely right that different fields cost different amounts of money per unit impact. In this world of increasingly tight money, I don’t think it is correct to be so glib in assuming that all fields deserve the amount of money they need to achieve the same level of impact as other fields. An economist thinking about optimal portfolio allocation would demand that fields that are inherently more expensive are obligated to justify that their extra money is important and well spent. Things like societal benefit, appropriateness/usefulness of proposed scales, age/maturity of the field (more specifically whether there is a lot of low hanging fruit leading to major discoveries available) all would enter the conversation.
  
  When the money for one experiment in field A could fund 6 experiments in field B, we need to be having open and candid conversations about how much money each field should get. I very much doubt the outcome would be that field A should get only the same amount of money (and thus 1/6 the impact) as field B, but it is not an obvious given that field A should get 6 times as much money just because it is 6 times as expensive either. In the end that is a really perverse incentive to do science as expensively as you can. Science funding is a close approximation to a zero-sum game. We might need to be a little more careful and picky about what we fund in expensive fields.
  
  Note that this argument applies across major disciplines too. Everything in physics is so expensive these days (often measured in billions of dollars) that anything in ecology looks cheap in comparison (math/stats is a really cheap research field too). Prudent portfolio allocation would have to ask if we are getting enough bang for the buck when we allocate money into the more expensive field of physics.
  
  - Jeremy Fox on July 11, 2013 at 3:15 pm said:
    
    Oh absolutely Brian. In suggesting that cost differences among disciplines or subdisciplines are not as big as one might think, I certainly didn’t mean to imply that all disciplines or subdisciplines should get the same amount of money! The allocation issue you raise–how a granting agency should allocate its money across all of science, given differences among fields not just in costs but in “impact” or “importance” (however defined)–is incredibly hard. Implicitly, every granting agency has some “revealed preference”–their actual allocation of money to different disciplines or subdisciplines. But I doubt those actual allocations are the function of some sort of well-defined optimization algorithm. For instance, NASA has a political constituency, because it’s a major employer in certain states. Legislators from those states would never allow a big chunk of NASA’s budget to be reallocated to, say, NSF, no matter what arguments one marshalled about the relative costs and benefits of manned spaceflight vs. the sort of work NSF funds. I agree that the conversation you suggest is necessary, but it’s a very difficult conversation to have. At least to have productively. Doing proper optimization of funding allocation across major disciplines is surely impossible, and so all sorts of other decision-making methods–political considerations, historical inertia, etc. etc.–rush in to fill the vacuum.
    
    One other minor point: another source of cost differences among disciplines is cost of major equipment. Which is why many granting agencies have separate programs to support purchases of things like field vehicles or MRI machines or next-gen sequencers or whatever. EDIT: And of course, allocating money to those programs vs. to other programs is just one more part of the allocation process, which as you correctly note is something that ought to be up for discussion.
    
    Along these lines, perhaps worth noting that in astronomy (where progress depends on very expensive instruments like telescopes and satellites), the whole field has a formal mechanism of priority-setting. The relevant US funding agencies take the results of that exercise very seriously. Whether something like this would work in other fields, or whether it would work across as well as within fields, I’m not sure. Just off the top of my head, I’d guess probably not…
  - Jeremy Fox on July 11, 2013 at 3:50 pm said:
    
    Another recent example of how decisions on how much money to allocate to different disciplines get made in practice:
    
    http://themonkeycage.org/2013/07/11/the-war-on-social-science-and-its-consequences/
Terry McGlynn on July 11, 2013 at 9:56 pm said:

A wonderful post on many counts. Thanks for taking the time to share this.

One issue with the functions in the figure is that all have an intercept at 0. There are lots of people who get stuff done with no funding. I wonder how the shape of the curve varies with the kind of science one does. Hmm.

- Jeremy Fox on July 11, 2013 at 10:09 pm said:
  
  Thanks Terry! Thought you might like this one.
  
  Yes, there’s no substantive reason for the functions in the figure to have zero intercepts. I just used zero-intercept power laws b/c that’s what Fortin & Currie used. Using some function that could potentially exhibit a non-zero intercept would indeed be more realistic. And that would give another way in which function shape could vary among investigators. For instance, mathematicians, and mathematical modelers in other fields, can probably do a lot of research just with pencil, paper, and a computer, so their intercepts would probably be well above zero.
  
  As to how the shape of the curve varies with the sort of science one does, good question! I think to answer it you’d need to distinguish between different sorts of needs for funding. For instance, if your research requires you to have some expensive piece of equipment, your productivity might be zero or near-zero without it. But if startup funds or some other funding allows you to make that one-off purchase, you might be able to remain quite productive (at least for a good while) on fairly low levels of ongoing funding. That would kind of describe my work. You can’t do my sort of microcosm work without good microscopes and refrigerator-incubators. But once you have those pieces of kit (and other minor kit like glassware), the ongoing running of the lab is pretty cheap–cheap consumables, and salary for summer undergrads. But then again, that doesn’t mean that my productivity as a function of funding is flat beyond some low level of funding. I can make just as much use of a postdoc or a technician as anyone, for instance. So yeah, figuring out the shapes of those curves is really tricky. I wonder if anyone’s ever tried to do it…
  
Jeremy Fox on July 11, 2013 at 10:16 pm said:

In Nature this week, there was an article in which various science policy types were asked what they’d cut from the US federal science portfolio in order to deal with sequester-generated budget cuts. One answer I found particularly interesting: do a randomized experiment. That is, cut stuff (even individual investigators!) at random and see what happens. Use the information you gain to then do optimal reallocation of funding. That’s certainly one way that you could try to get the information needed to estimate the shape of the curves in my figure, for instance.

jeffollerton on July 12, 2013 at 6:58 am said:

An interesting and thoughtful post, Jeremy. In relation to your last comment about randomness, I’ve long thought that the fairest way to award UK Research Council grants is to have an initial sift to remove the obviously unfundable and then to have a lottery draw to award funding to the rest. Keep drawing projects until all of the funding is used up. It would save a lot of time and effort on administering and reviewing the grants. Because, frankly, trying to differentiate between the quality and “importance” of the majority of submitted applications is, like all aspects of peer review, a highly subjective exercise. I sat on a NERC review panel recently where it was clear that the disparate ecologists around the table, from different fields, had widely differing views on some of the grants.

On a different topic, I note that the definition of “impact” used in the Fortin and Currie paper (“numbers of articles published, numbers of citations to those articles, the most cited article, and the number of highly cited articles”) differs completely from the definition that’s being used in the forthcoming Research Excellence Framework (REF) exercise for assessing UK research since 2008. For the REF, “impact” refers specifically to societal impact of the work, rather than academic impact. In fact the latter is not being judged at all (at least not explicitly). It’s a really, really hard thing to gauge and evidence, and on this side of the pond there’s a lot of concern about how REF Impact Statements are to be presented later this year. Unknown territory for us. But the link to level of funding that supported the research that resulted in societal impact should be possible to track and would make an interesting study post-REF.

- Simon Leather (@EntoProf) on July 12, 2013 at 7:13 am said:
  
  Great post Jeremy, and I like your idea Jeff – I’ve always thought that, as getting grants is a bit of a lottery already (so very many subjective decisions), we might as well make it a real lottery – it would save us all a huge amount of time and let us get on with research!
  
ric charnov on May 20, 2016 at 3:01 pm said:

Hi Jeremy; wasn’t reading the blog when this was published. Optimal allocation of a fixed dollar amount across the 3 investigators, where optimal means maximizing the ‘sum of the 3 impacts’, is where the marginal impacts are equal, not the impact/$: it’s the impact/‘last $ spent’, the slopes of the curves at the edge.
I used to use a similar model to amuse my behavioral ecology students by solving how they should allocate some fixed time to studying for 4 classes, where each class had a gain curve and one wanted to max sum of gains.
ric charnov

- Jeremy Fox on May 20, 2016 at 3:12 pm said:
  
  Ah, thank you Ric. Now that you point it out, I’m not sure why I didn’t just say that at the time. Brain cramp, apparently.
  