Ask Us Anything: descriptive research vs. hypothesis-driven research

A while back we invited you to ask us anything. Here are our answers to our next question, from Pavel Dodonov: should ecological research always be prediction/hypothesis-driven, or should there be more space for descriptive research?

Jeremy’s answer: The question of whether there should be “more space” for descriptive research does puzzle me a little, because there’s never been more space for it than there is now! It’s never been easier to publish descriptive, hypothesis-free research (or any sort of research). That’s what megajournals like Plos One are for–to publish anything technically sound. And if you just want to get your work out there for others to evaluate for themselves without having to go through peer review, well, that’s what preprint servers are for.

Of course, there’s no guarantee that others will pay attention to your descriptive work after it’s published, but that’s a different issue. I think the science that gets the most attention ideally should be the work that answers important, interesting questions, and that identifies new important, interesting questions. Which will sometimes be descriptive work (e.g., Brian’s recent very high-profile collaborative work describing local-scale changes in biodiversity), but sometimes not. Rather than worry whether we pay enough attention to work using any particular methodology (descriptive work, hypothesis-driven work, quantitative work, field work, work to develop new methods, or whatever), I think it’s better to ask whether science’s “attention concentration” mechanisms concentrate attention on the work that most advances our understanding of important, interesting questions.

Brian’s answer: A lot of people have strong opinions on this. Many graduate programs treat hypothesis-driven research as a mark of research quality. So does NSF (good luck getting a proposal funded that admits you are going to do primarily descriptive research). But I am very blasé about the distinction. I much prefer to think about the questions and where the questions come from. Good questions make good science. As an example, most of you have probably taught or at least TA’d a class where students are supposed to come up with a hypothesis and test it (because that is what we pretend good scientists do). But the hypothesis is usually very uninformative (“I’ll add X to this system and response variable Y will increase”). It is a hypothesis, but it is not a good question. What makes good science is a really good question, which comes from a really good knowledge of the system, a lot of reading of the literature, and a creative mind. Those three tools can combine to produce good hypothesis-driven or descriptive research. And many good questions are inherently descriptive.

8 thoughts on “Ask Us Anything: descriptive research vs. hypothesis-driven research”

  1. Follow-up question to both of you:

    Do you think the way we publish hypothesis-driven and descriptive research should differ?

    I ask because I just finished reviewing a natural history manuscript for a small journal. It was solid but unspectacular, and all the useful information was clear from its one table. Any readers interested in a review or a meta-analysis would find what they need in that same table. Yet the authors still went through the rigmarole of writing an introduction and discussion to justify the paper’s existence. It seems like such a waste of time and effort that could be used differently.

    I know that some journals, like Ecology, now have a category for data papers, and other journals, like Nature’s Scientific Data, publish only data. But is this a fad, or will it fundamentally split the way we publish our work in the future?

    • Good question.

      Ethan White and Morgan Ernest like to say that your data *on their own* aren’t nearly as valuable as you think they are. By which they mean that most datasets only gain value from being combined with other datasets in a meta-analysis (there are exceptions, obviously). If that’s right–and I think it’s a good point–then the best thing to do with many datasets is just add them to a centralized public repository, like eBird or GenBank or DataDryad. Perhaps without even publishing an associated peer-reviewed paper.

      Your question leads to a further question: what’s worth publishing? Is it actually optimal to just publish *everything* and let everyone use search engines or whatever to filter it? Just because something is worth publishing in the sense of “making it available” (e.g., by adding it to a data repository) doesn’t mean it’s worth publishing in the sense of “making it available in a way meant to draw the attention of others” (e.g., by publishing it as a paper in a journal). Personally, I wouldn’t consider it optimal to publish everything in a way meant to draw the attention of others. But then again I’m old so I would say that.

      We have an old conversation on this in the context of exploratory analyses. Brian argued that exploratory analyses can be valuable in their own right and should not be dressed up as hypothesis testing, and I agree. But that raises the further questions of what makes for a good exploratory analysis and whether all exploratory analyses are worth publishing. Starts here: https://dynamicecology.wordpress.com/2013/10/16/in-praise-of-exploratory-statistics/#comment-19951.

    • In the last year at Global Ecology and Biogeography we added a data paper category. It maintains the same scope as the rest of the journal (the data need to cover big spatial or temporal scales). But we only ask for 2500 words, mostly focused on methods. So there is slightly more opportunity/burden for intro/results/discussion than in an Ecological Archives data paper, but definitely less than in a full paper. Ironically, we find most authors want to put that intro/results/discussion in – we have to work to scope them down.

      Anyway, you can see where my bet is.

      When the EiC team goes through new submissions, several times we have said that the “result” is very descriptive but it’s a great dataset. Now we have a way to capture those. And authors do too.

      In general we have to find ways to reward people whose skills and joy lie in collecting data rather than in putting them into conceptual framings, and I think data papers are the best way to do it. I haven’t seen many promotion and tenure committees who will take a Dryad accession number on a CV as a major contribution. Papers are our currency (for better and worse), and data papers are a nice compromise.

      • I’d really like to echo Brian’s opinion that the solution likely lies in providing publishing opportunities for good data, but not because of the need for compromise between current culture and the need for accessible descriptive data.

        Sometimes data just aren’t good. I don’t mean that in the sense that a given dataset, well gathered and curated, can hold little new information. I mean that sometimes the way data were collected, the way they were tidied (or not tidied), or the way a database has been populated can make a dataset (at best) useless or (at worst) misleading.

        I can’t count the number of times I’ve read a paper that reports “all sequences were lodged in GenBank under accession numbers AAXX-AAXY” and thought to myself, “wow, those sequences could be really useful; I could use them (in addition to my own) to answer some neat questions.” More often than not, the sequences aren’t useful (e.g. sequencing read errors haven’t been edited, the wrong region was amplified, the sequence doesn’t translate, etc.), and it’s clear that the submitter didn’t verify or check their submission in any way. (Even a simple automated check, like the sketch at the end of this comment, would catch the last of those problems.)

        If some review or vetting went into the submission of data, then this problem would probably be less common. Someone needs to be asking people how their data were collected and how they verified the data’s accuracy, and requiring a brief “Methods” section in a dedicated data-cataloguing journal would probably sort that out very nicely.

        (Granted, I’m aware that eBird/Dryad have some moderation processes, but numerous sequence catalogues like GenBank/BOLD don’t.)
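
        To make the “doesn’t translate” point concrete, here is a minimal sketch of the kind of automated check a data-cataloguing journal could run, assuming Biopython is available and the submitted sequences are meant to be in-frame coding regions (the file name is hypothetical):

        ```python
        # Minimal vetting sketch: flag submitted "coding" sequences that contain
        # internal stop codons when translated. Assumes Biopython; the FASTA
        # file name is illustrative, and a real pipeline would check much more.
        from Bio import SeqIO

        def has_internal_stop(record):
            """Translate a putative coding sequence and flag internal stop codons."""
            seq = record.seq[: len(record.seq) - len(record.seq) % 3]  # drop any partial codon
            protein = str(seq.translate())  # standard genetic code by default
            return "*" in protein[:-1]      # a terminal stop codon is fine

        flagged = [rec.id for rec in SeqIO.parse("submission.fasta", "fasta")
                   if has_internal_stop(rec)]
        print(f"{len(flagged)} sequence(s) fail to translate cleanly: {flagged}")
        ```

        A real vetting pipeline would also need to check that the right region was amplified (e.g. by comparison against reference sequences), but even a check this simple would catch the submissions that don’t translate.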

  2. And to follow up on Brian’s comment, I think the extreme motivation to reframe every project as hypothesis-driven research does damage to a field, because it’s a bit like Feynman’s cargo cult science. By this, I mean it forces researchers to generate very qualitative and weak hypotheses (“I hypothesize that there will be a positive effect”) as opposed to encouraging a strong-inference-style prediction (“this result is consistent with hypothesis A but not with B, C, or D”) or a quantitative prediction (“the model predicts extinction of X in 3.7 +/- 0.12 years”). Forcing hypothesis-driven research often encourages poor scientific practices. With a weak hypothesis, and assuming that everything affects everything, you have a 50% chance of being in the correct direction (see the simulation sketched below) – although your p-value may be > 0.05 :-(. If the question isn’t ready for strong inference or quantitative modeling, an explicitly exploratory approach should be the default.
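
    Here is a back-of-the-envelope simulation of that 50% point, assuming every manipulation has a real effect whose sign is effectively a coin flip; all parameter values are invented for illustration:

    ```python
    # If effect signs are coin flips, the vague hypothesis "Y will increase" is
    # directionally "right" about half the time, regardless of understanding.
    import numpy as np

    rng = np.random.default_rng(42)
    n_studies, n_per_group = 10_000, 20

    # Each study manipulates something with a real but randomly signed effect.
    true_effects = rng.choice([-0.5, 0.5], size=n_studies)

    confirmed = 0
    for effect in true_effects:
        control = rng.normal(0.0, 1.0, n_per_group)
        treatment = rng.normal(effect, 1.0, n_per_group)
        if treatment.mean() > control.mean():  # observed direction matches "increase"
            confirmed += 1

    print(f"Weak hypothesis 'confirmed' in {confirmed / n_studies:.0%} of studies")
    ```

    (And with 20 samples per group and a true effect of half a standard deviation, many of those directionally “confirmed” studies would still come back with p > 0.05, exactly as above.)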

    • I agree weak hypotheses are painful to read and not particularly useful. But even worse are “implicit hypotheses”: the author’s notions that motivate the work, never explicitly laid out in the Intro but revealed in the Discussion, where you find the authors claiming their results “support niche theory and do not support neutral theory” or some such. A weak, vague, or squishy hypothesis just takes up page space.
      But a strongly articulated hypothesis is a wickedly useful tool: hypotheses have a structure – assumptions + logic + predictions – that lays bare the argument and points out the collection of things that are testable. This is one reason why teaching hypotheses early is doubly useful: just as your first coding course forces you to think incrementally and logically, crafting hypotheses over and over, inductively or deductively, forces one to think about the relationship between data and conclusion, and about how to unpack abstract logic and nouns into samples, units, and statistical tests.
      I am involved in Ecology’s new Scientific Naturalist feature, which celebrates natural history. At the same time, it asks authors to take these new data and convert them into an interesting hypothesis. Every project needn’t start with a hypothesis. But if it doesn’t, I want the authors to induce a new one.
