Hoisted from the comments: Ben Bolker on “other people’s data”

Posted on December 3, 2013 by Jeremy Fox

In the comments on Carl Boettiger’s recent guest post on “Data Science” culture, Ben Bolker notes a 2005 essay of his in BioScience on “other people’s data” (username “bbpapers”, password “research”). It’s about how the sort of work pursued by computationally- and mathematically-sophisticated ecologists seems to have shifted over the decades, from developing new theories to testing existing ones. That testing often involves making use of other people’s data, hence the title of the piece. Ben muses on the drivers and consequences of this cultural shift (if that’s the right term).

It’s a lovely piece. I wish I’d known about it earlier, so I could’ve included it in our old compilation of theoreticians eloquently explaining themselves to empiricists. To whet your appetite and encourage you to click through and read the whole thing, here’s an extended quote from the beginning:

Several years ago, at an Ecological Society of America (ESA) meeting, I overheard a colleague explain his work to a stranger as “help[ing] other people find more in their data than they knew was there.” Over the course of the meeting, I talked to several friends and colleagues who were in the same position, working on other people’s data, helping other people answer other people’s questions. Like me, these quantitative ecologists come out of good labs, have good jobs, and are not lacking for resources. Are we quantitative ecologists really spending our time answering other people’s questions, and not answering—or even asking—our own? If so, why?

Quantitative ecologists are only loosely anchored by the natural history of particular systems. Even the word “systems” is a giveaway; we see organisms as realizations of ideas, not as furry, feathery, or green individuals. Many of us came to ecology from physics, or mathematics, or statistics, because we loved its ideas. If we didn’t care about the organisms, we would have been content as mathematicians or physicists, but our true love was for the way that real ecological communities could embody general mathematical concepts of dynamics and variation. Our attachment to ideas gives us great flexibility, even more than other ecologists. Some of us are drawn to model systems, such as microcosms of flour beetles or plankton, where we can put ideas to searching experimental tests; others are drawn to the opposite extreme, that is, to long-term observational data from systems such as lynx populations or measles epidemics that challenge our ability to infer ecological processes from patterns. In either case, we are primarily interested in how we can use organisms to understand general principles rather than in the particular organisms themselves. This flexibility lets us pursue interesting questions wherever they lead.

Read on for telling remarks about everything from physics envy (we should envy physicists, but not for the reason you might think), to the trade-offs between asking one’s own questions and answering someone else’s.

In an old post I jokingly posed the question “What would Ben Bolker do?” Well, one thing Ben does is write wonderfully. Click through already!

13 thoughts on “Hoisted from the comments: Ben Bolker on “other people’s data””

Nathan Lemoine on December 3, 2013 at 2:45 pm said:

I just read through the Bolker paper and I thought it was pretty excellent, in particular the section on physics envy, which is a topic I’ve thought about a lot. I’ve always admired the ability of physicists to generate theories (that actually work with ‘unrealistic effectiveness’) then test them empirically. I have my doubts this is even possible in ecology (due to the sheer complexity of ecological systems). Regardless, I’ve always wondered if the larger divide between theory and experiments in ecology when compared to physics is due to the larger gap in mathematical.. savvy-ness.. between theoretical ecologists and experimental/empirical ecologists. Presumably (and I’m guessing), experimental physicists still have a pretty high level of mathematical sophistication to comprehend and then test theories. I know many experimental ecologists, definitely including myself, who really struggle with some of the more complicated theoretical ecology papers because no one told us in undergrad that a biology curriculum should require at least Calc I and II. It’s really hard to empirically test theories when we struggle to understand what the theory actually predicts (or flat out don’t read the papers because, let’s face it, they’re really hard sometimes. If I have the choice between struggling to read a single theoretical paper in a day or multiple other papers, I usually choose the latter). It’s also one thing to test general patterns (i.e. up vs. down) versus testing actual quantitative predictions (which are ostensibly more useful). I rarely see experiments that use theory to generate an explicit prediction of, say, effect size and then experimentally test that prediction.

- Simone Vincenzi on December 3, 2013 at 10:28 pm said:
  
  I have some simple comments.
  (1) What is mildly surprising to me (and I say mildly because it is common across fields, but still surprising) is that less theoretical guys (not referring to you Nathan, it is just the usual broad generalization) could very well spend the time spent pondering about the relative difficulty/utility/whatever of quantitative work vs less quantitative work in other ways, for example by opening that math/statistics/programming book/website and spending time getting new tools (it takes time and effort, I know).
  (2) Some theoretical papers are hard. A lot of things are hard, but valuable. Some papers are poorly or annoyingly written. Some authors do a disservice to the readership by “it is obvious that a leads to b after some simple manipulations (which will take 3 pages to explain, btw)”.
  (3) You wrote “I have my doubts this is even possible in ecology (due to the sheer complexity of ecological systems).” I disagree. It is not that with quant methods we want to understand everything, we want to understand more. Weather prediction, to use an example already used by Brian multiple times, is incredibly difficult since weather is a complex result of complex mechanisms plus their complex interactions. The use of quant methods in meteorology (applied physics) has allowed us to gain a better understanding + better prediction of weather. Can we predict the weather with reasonable accuracy one month in advance using physics (instead of statistics, like avg temperature for that day across years)? No. Does it mean that meteorology is useless?
  
  - Simone Vincenzi on December 3, 2013 at 10:45 pm said:
    
    By re-reading my reply, it sounded harsh for no reason, thus I want to reformulate point (1) this way. Basic quantitative methods can be learned even if we skipped quant courses in college/grad school/ post docing. It takes time, like a lot of other things. I do not think it is generally an excuse, if we think that at least basic quant methods in ecology/evolution/environmental science are valuable (they are). Then, everybody who wants to be relevant specializes.
    Regarding (2) I’d like to add that I was reading that Feynman himself had to spend considerable time trying to understand other people’s papers. Most of the time he preferred to rediscover the thing.
  - Nathan Lemoine on December 4, 2013 at 12:22 am said:
    
    I don’t think it sounded harsh. I agree that quantitative skills can be learned without formal training. In fact, I am someone who spends an incredible amount of my ‘free’ time reading stats/maths books (I’m currently working through a linear algebra textbook) and watching iTunes U courses because I want to be mathematically proficient. However, my point was less that ‘it can’t be done’ and more that the reality is that many biologists are coming out of undergrad much less mathematically prepared than physicists and are much more ill-conditioned to spend time trying to bridge theoretical and experimental biology than physicists. Many (most?) are too scared of math or don’t see the utility in it enough to spend the time learning (I tend towards the former, as someone who used to be the former but claim the latter). Also, as someone who is trying to be self-taught, there are some things that are just hard to pick up on your own.
    
    I also don’t like the meteorology example. I don’t know why, but it doesn’t seem like the right comparison to me.
mspencer2013 on December 4, 2013 at 12:03 pm said:

A footnote on page 183 in the second edition of Differential Equations with Applications and Historical Notes by Simmons (a great textbook for self-study, incidentally):

“As a boy [Sir George] Airy was notorious for his skill in designing peashooters; but in spite of this promising start … he developed into the excessively practical type of scientist who is obsessed by elaborate numerical computations and has little use for general scientific ideas.”

I think that most current quantitative ecologists are of this type, and our work will quickly be forgotten unless we start coming up with some original ideas.

- Jeremy Fox on December 4, 2013 at 12:51 pm said:
  
  “our work will quickly be forgotten unless we start coming up with some original ideas.”
  
  Yes, but I think that’s true of every scientist, quantitative or not. Most everybody’s work is going to be quickly forgotten, for various reasons (including that good original ideas are rare).
  
  - Simone Vincenzi on December 4, 2013 at 4:04 pm said:
    
    The inevitability of statistical distributions
Kit Prendergast on December 5, 2013 at 3:16 pm said:

Dear Jeremy Fox, what are your thoughts in response to this?
http://www.nature.com/news/global-change-ecology-must-evolve-1.14137

- Jeremy Fox on December 5, 2013 at 3:18 pm said:
  
  Brian commented on that a couple of weeks back: https://dynamicecology.wordpress.com/2013/11/15/friday-links-the-history-of-big-data-in-ecology-and-more/
  
cboettig on December 5, 2013 at 6:49 pm said:

I found Ben’s paper more pessimistic than I expected. While it is an excellent piece with much I agree with, I take issue with some of the generalizations.

I don’t feel that most of my theoretically-minded colleagues work “on other people’s questions”, even when they do work on “other people’s data”. Indeed, I’m a bit confused about the notion of “other people’s questions.” I feel that most ecologists work on questions that belong to the field at large, or at least to a subdomain, rather than being questions of interest to only one particular person.

Ben frets that “the slippery slope toward answering other people’s questions is important, and worth worrying about.” I certainly understand the concern about theoretically-minded people moving from the Bob May role to the laboratory technician role, but I think this has a lot more to do with perception than reality. Most interesting questions require both clever empiricism and clever theory. Some may appreciate the contribution of the empiricist and view the theoretical work as relatively routine statistics, others may see the empirical work as routine and feel the real contribution comes from the theoretical side (I leave it to the reader to decide which group is more likely to hold which opinion). So while this might look to some like Ben’s slippery slope towards “other people’s questions”, I contend that if the question is truly worth pursuing the contribution is greater than the sum of those parts, and there really is no cause to worry: this is good science going forward. If on the other hand the question is not so inspired, why then yes we may critique one party for wasting time on “other people’s questions”, but we might call a spade a spade and critique wasting time on bad questions instead.

Pingback: How to improve theory-empirical collaborations | Navigating the flood
Pingback: Stats vs. scouts, polls vs. pundits, and ecology vs. natural history | Dynamic Ecology
Kyle Coblentz on December 14, 2015 at 7:46 pm said:

On Mathematical education for Biologists: The mathematics required for the biology majors (Cell & Molec and Ecology & Evolution) at my undergrad institution (Tulane University) only went through the second semester of Calculus. This seems to be fairly normal for biology majors across universities (maybe even less math). I was fine with this up until my senior year when I became deeply interested in interaction strengths and their quantification (a pretty math-y matter). So, I began working on my math chops and continue to throughout grad school. However, I discovered that nearly every other “hard” science (physics, chemistry, math) and all of the engineering students at my undergrad institution were required to take three semesters of calculus, differential equations and linear algebra. I imagine the reason for this ‘double standard’ is historical, but I also imagine I would have been much better prepared quantitatively as an ecologist if this weren’t the case.

I understand that many biologists have no or little interest in mathematics, but are we doing ourselves a disservice by not requiring the same amount of math that nearly all of the other scientists and engineers have?
