Ask us anything: what’s a scientific hypothesis, anyway?

Better (very) late than never, hopefully: here’s our last “ask us anything”!

What’s a scientific hypothesis, beyond an educated guess? What’s the best way to ask the question (how, what, if…else, etc.)? (from Bruno)

What’s the difference between a prediction and a hypothesis? (from Kim)


Some people would say a hypothesis is an educated guess. It’s hard to get philosophers to engage in defining a hypothesis, for reasons not clear to me – neither the Stanford Encyclopedia of Philosophy nor the Philosophy of Science A-Z by Psillos has an entry on “hypothesis”, despite using the word often in other entries (FYI these are my two go-to references on philosophy of science – the first is free online). I would define a hypothesis more precisely as “a claim about how the world works that is not yet well tested but is testable”. This is similar to what most dictionaries give for “hypothesis”. If something is well tested it is a theory rather than a hypothesis. If it is not testable it is not scientific. “Claim about how the world works” is obviously rather general, but it is at a minimum a declarative statement about a pattern or process.

What follows a hypothesis in the scientific method is pretty clear – namely a deduction of an observable outcome from the hypothesis and then an empirical test of whether this outcome is observed (an experiment is the most controlled way of testing empirically, but experiments are neither perfectly controlled nor required to test empirically). This deduced outcome is what we call a prediction. So in contrast to a hypothesis, a prediction is a specific statement of something observable (usually) in the future, while a hypothesis is a more general claim about how the world works. In practice, a prediction is often more quantitative as well. Predictions can range from strong (there will be 492 individuals of Alces alces in this plot next year) to weak (the abundance will go up). I have argued before for the strong quantitative predictions of regression rather than the weak prediction of “they’re different” implicit in an ANOVA. What if the prediction bears out? Is the hypothesis therefore true? Most people would say not so fast, but in different ways. Popper would say you can never prove a hypothesis is true, just fail to disprove it over and over. Lakatos would say that sometimes hypotheses can lead to a failed prediction, but that doesn’t automatically result in throwing out the hypothesis. Rather, the ultimate decision rests on the cumulative weight of evidence, reached through a social process, with attention to how risky and bold the confirmed predictions are.

Furthermore, predictions can be made in ways other than deduction from a hypothesis. Wild guesses (e.g. Nostradamus) are one way that doesn’t have much to do with science. Much more controversial is prediction by statistics (e.g. prediction using just a regression line or using machine learning). Rob Peters argued strongly that this was superior to the hypothetico-deductive method (or at least a more accurate description of how we work).

So if predictions and tests follow a hypothesis in the scientific method, what precedes a hypothesis? This is in my opinion the most interesting part, but it is relatively overlooked. And there is not much agreement. Some versions of the scientific method would say a question precedes a hypothesis and thus a hypothesis (a declarative claim about how the world works) is an answer to a question. In this scenario I don’t think there is one right or best way to frame the question. Other serious philosophers of science say a hypothesis comes from inductive logic – making many observations and generalizing from things that are observed frequently. My own favorite goes back to Peirce and abduction (something Marco mentioned in the comments recently; a recent conversation with a colleague reignited my interest in it, and I hope to write a post on it soon).

So to summarize, hypothesis followed by deduction of a prediction followed by empirical test of the prediction is nearly universally agreed upon as part of the scientific method. What goes before and after is more controversial. Do questions, observations and induction, or abduction precede the hypothesis? And what exactly do we conclude if the prediction is or is not confirmed? (Most would agree, though, that if many predictions are confirmed repeatedly, the hypothesis becomes more certain and is called a theory.) The scientific method is not as simple and clear-cut a recipe as some would have us believe!


What Brian said.

I’d also note that scientists have various goals: explanation, understanding, prediction, and the ability to manipulate or control a phenomenon. These don’t always go hand in hand, at least not in all circumstances.

And whether you need a hypothesis or prediction may depend on what sort of work you’re doing. There’s a place for exploratory work that lacks either–it gives theory a “target to shoot at”. For instance, early experiments on biodiversity and ecosystem function, like Naeem et al. 1994, were out ahead of theory. They didn’t really have any hypotheses or predictions worthy of the name, to my mind. Ok, they had vague verbal hypotheses, like the “rivet hypothesis” analogizing species performing some ecosystem function of interest to rivets on an airplane wing (lose enough of them, and the wing falls off). But those hypotheses were too vague in terms of both their assumptions and predictions to do any real intellectual work. For instance, Naeem et al. present the “rivet hypothesis” as predicting a linear increasing relationship between biodiversity and ecosystem function. But why is that the rivet hypothesis, rather than that ecosystem function will either be constant and high, or else zero if biodiversity drops below some critical level, analogous to how an airplane wing either remains attached and functional if it has sufficiently many rivets, or falls off if it has too few? And if you don’t know exactly what the hypothesis is (what it assumes, and what follows logically from those assumptions), how can you test it? The point isn’t to criticize Naeem et al.–it’s a great paper. I’m just using it as an example to illustrate two points: (i) there’s a place for work lacking any hypotheses or predictions, and (ii) there’s not much point in having vague hypotheses or predictions, or pretending you have a hypothesis or prediction when in fact you don’t. If there’s no rigorous deductive connection between your assumptions and the hypotheses that derive from them, then rejecting or failing to reject the hypothesis doesn’t teach you anything about the truth or falsehood of your assumptions. And if your prediction is vague, like “variable X will increase with variable Y”, that’s not much more helpful than a coin flip. Your prediction could well hold just by dumb luck. And as Brian’s noted before, pretending you had hypotheses or predictions when in fact you didn’t (because you were really doing exploratory work) is at best unhelpful and at worst will lead you into mistakes associated with circular reasoning.
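To make the coin-flip point concrete, here is a minimal simulation sketch. It assumes a deliberately artificial setup that is not taken from any real study: X and Y are independent standard normal draws, so there is no true relationship at all, and the function name, sample size, and number of trials are arbitrary choices for illustration. It counts how often the sample covariance nonetheless comes out positive, i.e. how often the vague prediction “Y increases with X” would appear to hold by dumb luck.

```python
import random


def fraction_positive_trend(n_trials=2000, n_points=20, seed=1):
    """Fraction of simulated datasets in which Y appears to increase with X,
    even though X and Y are generated independently (no true relationship)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = 0
    for _ in range(n_trials):
        x = [rng.gauss(0, 1) for _ in range(n_points)]
        y = [rng.gauss(0, 1) for _ in range(n_points)]
        mean_x = sum(x) / n_points
        mean_y = sum(y) / n_points
        # The sign of the sample covariance is the sign of the fitted
        # regression slope, so a positive value means "Y goes up with X".
        cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        if cov > 0:
            hits += 1
    return hits / n_trials


if __name__ == "__main__":
    print(fraction_positive_trend())
```

Under these assumptions the fraction should land close to one half: a purely directional prediction is “confirmed” about as often as a coin flip would confirm it. A sharper prediction, such as a specific slope or magnitude, would be far less likely to hold by chance.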

Other examples could be given. For instance, there’s no point to the intermediate disturbance hypothesis (IDH) in my view. We either can’t agree on what the assumptions and predictions are, or else the assumptions don’t actually imply the predictions they’re widely thought to imply. And you can see this in how the IDH, and criticisms of it, mostly get cited in the literature–as window dressing that doesn’t actually do any intellectual work. And the comment thread on this post suggests to me that the hypothesis that species interactions are stronger and more specialized in the tropics is too vague to be testable. Because when Angela Moles and Jeff Ollerton pointed out that the data reject it, the response was mostly to argue about what the hypothesis “really” is. There may be a place for vagueness in science, but I think it’s a fairly small one, and it’s restricted to early on in a line of research (as with Naeem et al. 1994). Yes, hypotheses often need to get revised and refined over time. But there’s a thin line between productive revision and refinement of hypotheses, and just refusing to give up on one’s pet hypotheses. I’d argue it’s a bad sign that, after decades of work on the IDH, or the species-interactions-are-stronger-and-more-specialized-in-the-tropics hypothesis, we still can’t agree on what the hypothesis is or whether data X test it.

In passing: I like Brian’s example of Nostradamus. In the past, I’ve used Ouija boards to make the same point. If you think that “testing predictions” is the be-all and end-all of science, why not get your predictions from a Ouija board?

14 thoughts on “Ask us anything: what’s a scientific hypothesis, anyway?”

  1. Brian and Jeremy, really enjoyed these posts – it’s always a little startling to me how little time is given by practicing ecologists to sorting out what hypotheses and predictions are. It seems to me that they often think of this as the province of science philosophers – that the practical bits are already sorted out and now we’re down to the abstract hairsplitting. My take on hypotheses and predictions falls almost entirely in line with yours. One way that I try to clearly delineate the separation between hypotheses and predictions is that a hypothesis should never refer explicitly to the test you will use but, ultimately, the predictions should refer explicitly to your test (of course, they won’t early in the process before the experiment/study has been designed). I think one thing that has added to the confusion has been terminology – some people talk about biological and statistical hypotheses in the same way that I talk about hypotheses and predictions (I think).
    Jeremy, you make the critical point that hypotheses can be vague and not very useful until they are tied to an agreed-upon set of predictions – hypotheses are really most precisely defined by the predictions they make. If a hypothesis makes a vague set of predictions that would also be consistent with many other hypotheses, it implies that the hypothesis needs to be better defined. Two scientists might use identical words to describe their hypotheses, but if they would lead to different and conflicting predictions then either one of them is deductively wrong or they have hypotheses that are different (in some unstated way).
    It seems to me that both the IDH and “interactions stronger in the tropics” are statements about how the world works that are testable and reasonable candidates for scientific investigation – but they do require a consensus on which data would support and which would refute. (In fact, I thought that the main problem with IDH was that there was little empirical support – is the bigger problem that there is no agreement on what the hypothesis implies about patterns in nature?). Best, Jeff

    • “(In fact, I thought that the main problem with IDH was that there was little empirical support – is the bigger problem that there is no agreement on what the hypothesis implies about patterns in nature?)”

      Hi Jeff,

      I think it’s both. If you actually write down and solve an explicit mathematical model, you find that the disturbance-diversity relationship is quite sensitive to model assumptions, and to parameter values. See Miller et al. 2011, for instance.

  2. Here’s a common example of a non-hypothesis in ecology: that variable X will affect variable Y. No details about the expected direction or nature of the effect; just that there will be an effect. Basically, writing as if your statistical null hypothesis and your substantive scientific hypothesis were one and the same thing. This is just a way of dressing up an exploratory study as a hypothesis-testing study. You often see such non-hypotheses in grant proposals, but you see them in papers too.

      • I’m just reviewing a major grant proposal right now, from a top person in EEB, that includes several examples of non-hypotheses. Which is what prompted me to comment.

        In fairness, I may well have used non-hypotheses myself at some point. We are all sinners.

  3. Hmmm, well, I am scratching my head and wondering if what you are getting at is one of the issues philosophers have ruminated on for centuries. The following was taken from Wikipedia:

    A priori knowledge or justification is independent of experience, as with mathematics (3+2=5), tautologies (“All bachelors are unmarried”), and deduction from pure reason (e.g., ontological proofs).

    A posteriori knowledge or justification is dependent on experience or empirical evidence, as with most aspects of science and personal knowledge.

  4. An hypothesis follows the formation of a theory that explains a body of observation. The hypothesis is not a guess, nor is it a prediction. It is a test.

    The hypothesis is used to test the theory as an adequate explanation of the body of observations. It is an “If… then:” statement, as: “If the theory is true, then we would expect to find that.” If we find “that” then the hypothesis is “true” and our theory is supported.

    However, if we don’t find “that,” it could be because “that” doesn’t exist, or it could be because we haven’t looked everywhere yet. We have no way of knowing. This is a weak hypothesis test.

    A stronger hypothesis test would be the null hypothesis, which states, “If the theory is true, then we would not expect to find that.” If we don’t find “that,” then the hypothesis is true and the theory is supported. If we do find “that,” then the hypothesis is false, and the theory is revealed to be a less adequate explanation of the body of observations.

    This is the hypothetico-deductive method of theory confirmation, expressed by Douglas Adams as “Look first, think later, then test.”

    • Can’t quite agree with some of your definitions, but this is definitely a domain where different people have different opinions.

    • I agree completely. If one can demonstrate the null supports the theory, then you have achieved much, much more than the other way around. Often it is difficult to construct these kinds of hypotheses and I believe that is due to maybe not having developed the theoretical approaches to their fullest potential.

  5. “And if your prediction is vague, like “variable X will increase with variable Y”, that’s not much more helpful than a coin flip.” Following a short, quasi-systematic review of factorial experiments in the ecology literature, I’ve largely concluded that this is why NHST is so attractive: most “hypotheses” (what I’d call exploratory questions) aren’t well grounded in a rigorous theoretical model and so are too weak to predict effect magnitude. As a result, effect sizes are ignored and the focus is simply on “is there an effect”.

