Friday links: admin vs. you, what makes for good exploratory research, and more

Also this week: the pluses and minuses of preregistering your research, does more pressure to publish really make publication bias worse, the first “man on the street”, and more.

From Jeremy:

Andrew Gelman on what makes for good exploratory research. I think this is really important to think about. On the one hand, there are strong arguments that hypothesis development is at least as important as hypothesis testing, and that both are badly degraded if you try to treat the former as the latter (intentionally or not). On the other hand, exploratory studies can be done badly even when they’re not disguised as confirmatory studies. The fact that a study is exploratory doesn’t get it off the hook for being badly thought out or poorly designed or boring or whatever. And there’s the somewhat orthogonal question of whether we need to publish more exploratory studies, and if so in what form. Brian and I chatted about this a long time ago in the comments here.

Margaret Kosmala preregistered one of her research projects. Her thoughts on the pluses and minuses.

“The senior guy in my field acts like a total jerk sometimes. What should I do?” From economics, and somewhat specific to that field, but much of the advice generalizes.

“While centralized staff often create more work for faculty members, localized staff reduce it.” Yup. (ht Frances Woolley)

Quit saying that one field is “broader” than another.

A long while ago I linked to a PNAS paper by Fanelli and Ioannidis, finding that publication biases in the “soft” behavioral sciences are stronger in the US than in other countries. I was far from the only one who paid attention; the finding went viral and has become a key piece of evidence that higher pressure to publish leads to publication of shoddier research. I recently learned that the finding has been questioned: it doesn’t hold up under an alternative, at least equally reasonable method of analysis. In reply, the authors argue that the original result is robust. I’m not comfortable adjudicating without digging into the original data myself (though I confess I don’t really see the signal I’m supposed to be seeing in the graphs in Fanelli & Ioannidis’ reply…). Just wanted to pass it on for interested readers. (ht Andrew Gelman)

The Dutch Royal Academy is holding two special elections to which only women can be nominated.

The first reference by a scientist or mathematician to the importance of being able to explain one’s work to “the man on the street” has been traced. (UPDATE: link fixed)

7 thoughts on “Friday links: admin vs. you, what makes for good exploratory research, and more”

  1. RE: exploratory studies
    I would think that for exploratory studies of ecological questions the regional publications (Southeastern Naturalist et al.) or some society journals (Copeia) would be the perfect outlet.

    • The Gelman link and Brian’s early praise of exploratory studies are on the mark. I think this comment reflects a common misunderstanding of exploratory studies generally and a common opinion that exploratory studies are not of broad interest. By contrast, many papers published in top ecology and evolution journals are exploratory; they are embedded (naively?) within a hypothesis driven Intro and discussion because this is how the fields teach and mentor students (hypothesis driven research!). Most of these hypotheses I would consider extremely weak and fuzzy so just about any result is a “prediction”. On my more cynical days I wonder how many of the “predictions” are results that become predictions as the patterns emerge during the data exploration.

      • I think you, Brian, and I all agree that publishing exploratory studies dressed up as confirmatory studies is a bad idea.

        “On my more cynical days I wonder how many of the “predictions” are results that become predictions as the patterns emerge during the data exploration.”

        Years ago I linked to a sobering/depressing study in the business literature that looked at a whole bunch of dissertations in the field, and at the papers arising from those dissertations. People quite commonly dropped, revised, and even reversed their “hypotheses” when going from the dissertation to the published version, as needed to make the data match the hypotheses.

        Whether and how to publish exploratory studies if they weren’t dressed up as confirmatory is a different question, I think. One could take the view that since they’re really exploratory and we’re happily publishing them in leading ecology journals already, we should just keep publishing them but make sure they’re written and presented in such a way as to make clear their exploratory nature. Or, one could take the view that those studies were only published in selective, widely-read ecology journals because of their confirmatory disguises, and that they would be less worth publishing in such venues if they were openly exploratory.

        I’m unsure where I sit on this. I recently reviewed a paper for a leading ecology journal that was exploratory/hypothesis-generating and to its credit was quite open about it. I found it challenging to evaluate. The hypothesis developed was very interesting. But it was developed based on quite limited data, which had been compiled from various sources and only “fit together” if one made numerous strong assumptions that weren’t easily checkable. Had the data not been there at all, the hypothesis might’ve been regarded as mere armchair speculation. For a hypothesis to be worth publishing in a selective venue, so as to encourage others to further pursue it, I think there has to be some reason that it’s *worth* pursuing. Such as “there’s reason to think it might be true”. But to the extent that an analysis is exploratory, it doesn’t give us reason to think the hypothesis might be true (at least not very much reason). Which leads to the difficult question of “what makes a hypothesis-generating paper worth publishing?”

        It’s not clear to me that many ecologists would want to see lots of hypothesis-generating papers published in leading journals, if those papers were openly written that way. After all, we already have many such papers–they’re called theory papers. They’re mostly published either in theory journals, or in Am Nat. (At least, that’s what ecologists think: I don’t think that’s a healthy state of affairs, relegating purely hypothesis-generating papers to a separate subsection of the literature. But I do think that’s the state of affairs.)

        Of course, there’s more to exploratory work than hypothesis generation. Think for instance of Hatton et al. 2015 Science, which I consider a leading candidate for the George Mercer Award. It compiles a *lot* of data to demonstrate an unexpected pattern quite convincingly. And it rules out the most obvious hypothesis to explain that pattern, but isn’t able to generate any new hypothesis to explain it. I think it’s a great paper–but is it “exploratory”? Yes and no. No, in the sense that the data analysis isn’t tentative or preliminary. No in the sense that it does test a hypothesis. And no in the sense that it doesn’t develop a new hypothesis (it’s not hypothesis-generating). Yes, in the sense that it’s not presented as (and I don’t think was originally planned as) a hypothesis-testing study. And yes in the sense that it raises a question rather than answering one. It gives theory a target to shoot at.

  2. One application of thinking about exploratory or observational designs is for long-term monitoring in context of global change. We can all agree (I hope) that if our descendants 100 years from now were not to have access to consistent data runs about ecosystems over the intervening 100 years, that would be a pretty shocking failure on the part of our current generation.

    Ordinarily when a research program is getting under way, there’ll be a chorus of advice saying the first thing you have to do is spell out what the question is, and only then can you design the research. But the problem is, we don’t know what questions our descendants 100 years from now will have on their minds.

    The situation is often compounded because anytime it looks as though long-term funding might be available, lots of people will put up their hands for their favourite measurement or organism. (And I don’t blame them — I’m sure they sincerely believe that measurement or organism is really important, otherwise they wouldn’t be working on it.) But it’s going to be difficult to get consensus across the research community.

    Anyhow, I think it’s a really interesting challenge how to work out what sort of ecological monitoring systems to set up, in order for knowledge to be advanced some time in the future by people who aren’t born yet.

    • This is a very interesting question Mark.

      One possible answer is: the reason we don’t know what information our descendants 100 years from now will wish we had collected (or started collecting) is that they’ll be able to make creative use of whichever information we happen to have collected. Yes, 100 years from now, our descendants might well be glad we collected information X rather than information Y. But it might *also* be the case that, if we collect information Y rather than X, our descendants will be glad about *that*. That is, our descendants’ informational wants and needs aren’t necessarily well-defined. I have an old post that’s sort of about this:

      There’s an analogy here to paleontology. A common criticism of paleontology is that it’s just not possible to learn that much about extinct organisms from fossils. Too much information has been lost. And that’s true to some extent–but it’s *really* hard to say *precisely* to what extent. After all, probably nobody would’ve imagined that we could reliably infer, say, the coloration of a dinosaur–until some clever paleontologist figured out how to do it (“black stripes”, as I recall).

    • “Anyhow, I think it’s a really interesting challenge how to work out what sort of ecological monitoring systems to set up, in order for knowledge to be advanced some time in the future by people who aren’t born yet.”

      And of first order importance as well. It’s the large scale inventory and monitoring programs, often by government agencies, that collect the really important, primary, observational data that are going to inform us as to what’s actually happening out there. Particular studies on this or that species or system by 100 different research teams here and there cannot help but suffer from non-standardized methods and the lack of a unifying direction and standardized data collection and analysis protocol. Then, after the fact, we have to try to make sense of all that stuff by various meta-analyses, possibly without the raw data even being available, working from tables and graphs, etc. NOT the way to make progress.

      Although a lot of these monitoring data are inherently of a time series nature, I don’t think that’s even necessary, as long as there’s a commitment to collect quality data whenever it IS collected. Even just a couple of high quality time points can do wonders–it’s always the first few data points that inform the most, and if you document it well, you set up the possibility of future repeats whenever the resources come available. I’m doing exactly that right now using two forest demography data sets collected just one time each, over a century or more ago, but documented well enough to allow very specific analyses of several kinds. One of these was not even collected by people interested in forests.

      I agree with your 2nd comment that we know what to measure–it’s obvious. We’re not trying to test exotic ecological theories here, we’re trying to simply document what’s going on out there, because that’s always the critical step one and it kills two birds with one stone as well. The real problem is getting a system in place (agencies, money, coordination, commitment) to collect the desired biological and physical data in a very coordinated, and perhaps model-designed, way. Thus even now, the best of monitoring systems, e.g., FIA, are completely independent of, say, climatological or bird monitoring or air quality systems.

      Until you bring large and coordinated resources together under a definite and committed plan, you just end up with a random and weird set of jigsaw puzzle pieces and a lot of arguing.

  3. Yes I agree, and it’s a consoling thought that with luck they’ll be much brighter than us and be able to make good use of almost any sort of data. (Though it’s also possible they’ll be naturally peevish and will spend their lunchbreaks complaining about their forebears.)

    The principles I’ve been able to identify so far are pretty basic:

    1. Cheap information is better than expensive information, because we’ll be able to collect more of it.
    2. Go for things that seem obviously important to the ecosystem because they contribute biomass and physical structure — like say trees, in the case of forest ecosystems.
    3. Don’t worry about within-year and within-decade dynamics — our descendants will be able to study those for themselves.
