Friday links: new p-hacking data, grant lotteries, things ecologists will never experience, and more

Also this week: a blogging anniversary, betting on replication, Shakespeare vs. dead animals, Brian and Jeremy have a link fight, and more. Also terrible philosophy puns.

From Brian (!):

Do the countries your coauthors come from affect the impact factor of the journal you get into? Apparently yes, according to this piece from Emilio Bruna.

In the always entertaining and provoking Ecological Rants blog, there is a quote from Thomas Piketty’s book (which is setting the economic world on fire on the topic of income inequality with its careful empirical compilation of historical data). The quote is pretty harsh about economists’ obsession with little toy mathematical models that don’t inform about the real world. Krebs argues this critique applies to ecology as well (and cites no less than Joel Cohen, one of the great theoretical ecologists, who regularly chides ecologists for their physics envy). While I am an advocate for more math education in biology, I have to confess a certain sympathy with the quote. We’re so busy obsessing over equilibrium math models and small-scale manipulative experiments that we’re missing a lot of the story that is sitting in front of us in the massive amounts of data that have been and could be assembled. (There’s a controversial statement to make you sit up on a Friday.)

Following up on my post about NSF’s declining acceptance rates, there is a well-argued blog post by coastalpathogens suggesting we should just revert to a lottery system (one of my suggestions, but not one that received a lot of votes in the poll).

From Meg:

Things ecologists are unlikely to learn firsthand: it’s hard to fly with a Nobel Prize. (Jeremy adds: is it hard to fly with the Crafoord Prize?)

The Chronicle of Higher Education had an article on increasing scrutiny of some NSF grants by Congressional Republicans (subscription required).

From Jeremy:

Link war! Brian, I’ll see your Thomas Piketty quote, and raise you Paul Krugman. Krugman’s long advocated the value of deliberately simplified toy models as essential for explaining important real-world data, making predictions, and guiding policy. See this wonderful essay on “accidental theorists” (and why it’s better to be a non-accidental theorist), this equally-wonderful essay on how badly both economists and evolutionary biologists go wrong when they ignore “simple” mathematical models, and this one in which Krugman explains his favorite toy model and how it let him make several non-obvious and very successful predictions about the Great Recession. Oh, and as important as Piketty’s empirical work is, it’s worth noting that even very smart and sympathetic readers have had a hard time figuring out what his implicit model is. If your model’s not explicit (and if you don’t care much for doing experiments), then your big data might as well be pig data. While I’m at it, I’ll raise you R. A. Fisher too.*

Statistician Andrew Gelman has been blogging for 10 years. I was interested to read his comments that there used to be more back-and-forth among blogs 10 years ago, and that these days that only happens in economics. I share the impression that economics is the only field that has a blogosphere. I also share Andrew’s view that Twitter is no substitute for blogs. Twitter has its uses. But “in depth conversation and open-ended exploration of ideas” is not one of them.

Speaking of Andrew Gelman, he passes on a link to a new preprint on the distribution of 50,000 published p-values in three top economics journals from 2005-2011. I’ve skimmed it, and it seems like a pretty careful study, which avoids at least some of the problems of similar studies I’ve linked to in the past. The distribution has an obvious trough for marginally non-significant p-values, and an obvious bump for just barely-significant p-values. The authors argue that’s evidence not just of publication bias, but of p-hacking (e.g., choosing whichever of a set of alternative plausible model specifications gives you a significant result). They estimate that 10-20% of marginally non-significant tests are p-hacked into significance. The shape of the distribution is invariant to all sorts of factors: the average age of the authors, whether any of the authors were very senior, whether a research assistant was involved in the research, whether the result was a “main” result, whether the authors were testing a theoretical model, whether the data and/or code were publicly available, whether the data were from lab experiments, and more.
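For anyone who wants to poke at the "trough and bump" idea with their own pile of p-values, here is a minimal sketch in Python. To be clear, this is not the preprint's method, which I haven't reproduced. It is a simple stand-in, often called a caliper-style comparison: count the p-values in a narrow window just below 0.05 and in an equally narrow window just above it, then ask with an exact binomial test whether the split is more lopsided than 50:50. The function names, the window width of 0.01, and the toy p-values are all my own assumptions for illustration.

```python
from math import comb


def caliper_counts(p_values, threshold=0.05, width=0.01):
    """Count p-values just below and just above the threshold.

    The window width is an arbitrary illustrative choice, not taken
    from the preprint.
    """
    below = sum(1 for p in p_values if threshold - width <= p < threshold)
    above = sum(1 for p in p_values if threshold <= p < threshold + width)
    return below, above


def binomial_two_sided(k, n, prob=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k out of n.
    """
    pmf = [comb(n, i) * prob**i * (1 - prob) ** (n - i) for i in range(n + 1)]
    observed = pmf[k]
    # Small relative tolerance so ties are not lost to rounding.
    return min(1.0, sum(x for x in pmf if x <= observed * (1 + 1e-9)))


# Made-up p-values, purely for illustration (not real data).
toy_p_values = [0.041, 0.043, 0.044, 0.046, 0.047, 0.048, 0.049, 0.049,
                0.051, 0.056, 0.2, 0.6]

below, above = caliper_counts(toy_p_values)
print("just significant:", below, "just non-significant:", above)
print("two-sided binomial p-value:", binomial_two_sided(below, below + above))
```

A lopsided split in favor of the "just significant" window is what a bump-plus-trough would look like in miniature. On its own it cannot tell publication bias apart from p-hacking, and the 50:50 baseline is only a rough approximation, since the true density of p-values need not be flat near 0.05.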

One more from Gelman: You can now bet real money on whether a bunch of replication attempts in psychology will pan out. I think it would be really fun, and very useful, to have something like this in ecology.

Most tenure-track jobs do not have 300+ applicants (and even the few that do tend to have an unusually-high proportion of obviously-uncompetitive applicants).

Speaking of tenure-track job searches: soil ecologist Thea Whitman with a long post on what it was like to interview (successfully!) for a tenure-track job. Go read it, it’s full of win.

Shakespearean insult or animal on display at Harvard’s Museum of Natural History?

Philosophy student karaoke songs.

*I’m guessing that Brian saw this response from me coming from 10 miles away, but I figure he (and y’all) would have been disappointed if I didn’t actually follow through and provide it. My boring predictability (er, clockwork reliability) is one of my most endearing features. That, and my refusal to take second place to any ecologist when it comes to making half-baked analogies with economics. [looks over at Meg, sees her rolling her eyes, coughs awkwardly] 🙂 In seriousness, I actually do see what Brian means and probably don’t disagree with him that much here. And for what it’s worth, I think current trends in ecology are mostly running in the direction Brian would like to see them run (e.g., away from MacArthur-style toy models of a single process).


14 thoughts on “Friday links: new p-hacking data, grant lotteries, things ecologists will never experience, and more”

  1. Jeremy – the thought that my quote and opinion would tweak you never crossed my mind🙂

    The fact that people don’t understand Piketty’s implied model is beside the point. He went back and collected highly compelling empirical data that has completely dominated the conversation and forced people to wake up and pay attention to ideas that had been under-appreciated before. His work has been enormously influential with no model but really powerful data.

    In the end, science is about a good balance between induction and deduction. Science spent 1000+ years in the dark ages because it only used deduction (and just Aristotle’s deductions at that). The Renaissance and Enlightenment were about a good balance between induction and deduction returning. I think it is important to have an ongoing conversation about whether we have the balance right at any point in time in ecology (a balance which has changed over the decades).

    What I think is so interesting is where small-scale manipulative experiments fit into this. They in many ways seem the ideal blend of induction and deduction, and they’re the strongest form of inference available to us, so I’m not suggesting we abandon them, but I can’t help but feel they cause us to ignore obvious facts sitting in front of our faces about the natural world. I guess I’m in a more quantitative vein of the people arguing for placing more value on natural history.

      • Brian, Jeremy: thanks for your economics ‘link-off’, I quite enjoyed that. I’ve been working my way through “Capital in the 21st Century” (the audiobook edition is a great companion during weekend chores). Piketty and colleagues are a great example of the light that can be shed on standing disputes in a field through the use of both more extensive and more long-term data.

        However, I was completely surprised to see your links suggesting that Piketty’s approach is somehow an alternative to “simple models”, or that his models are only implicit. I have found his work every bit as much a nice example of the use of simple models, which I believe he makes quite explicit, as it is an example of long-term data. (Okay, perhaps he is a bit more explicit in stating the underlying models in his Science paper with Emmanuel Saez, but he outlines these rather clearly, if more verbally, in the book too.)

        So to me, his work is a good example of both your points: the value of more comprehensive, careful, and long-term data, but also the value in using models to understand that. One more thing I enjoyed: he provides several good illustrations of how popular summary statistics (things like the Gini coefficient, Pareto distribution) are often useless or misleading in understanding the question at hand. Possible lessons for ecologists there too.

        Lastly, he is very careful to distinguish between “what the model predicts”, when we extrapolate trends into the future, vs any actual forecasts of what the future will really look like, when we remember what is not in the model (e.g. social shocks), reaching largely the same conclusion Brad DeLong does at the end of your linked post. Again something we might think about in the context of predictive ecology.

    • Perhaps a minor point, Brian, but it’s not strictly true that “Science spent 1000+ years in the dark ages because it only used deduction (and just Aristotle’s deductions at that)”. It’s more accurate to say that “Europe spent 1000+ years in the dark ages….” because all of the interesting stuff, such as inventing chemistry, medicine, astronomy, etc., was going on in the Middle East and China. A lot of which we (re)discovered during the Renaissance 🙂

  2. It’s actually slightly depressing to me that so far twice as many people have clicked Steve Vamosi’s piece about number of applicants per job than have clicked Thea Whitman’s piece about her interview. She’s a great example to follow, her piece is chock full of good advice, and reading it will put a huge smile on your face. Steve’s piece just debunks a myth that shouldn’t need debunking in the first place.

    • Had a quick look, it’s a very interesting mix of serious bets on serious matters by serious people, highly speculative scientific bets (like bets about extraterrestrial life), fun bets (like whether the US will win the men’s World Cup before the Red Sox win the World Series again), and nutters getting taken to the cleaners by sane people (like the guy who bet in 2007 that the US would have no Federal government by 2012).

  3. That piece on grant lotteries is very cogent, Brian, thanks. If an agency isn’t going to go to the NSERC system (or something like it), then next best might well be to just identify the top 20% or so of proposals and then choose the “winners” from that group via lottery.

  4. Thanks for the link, Jeremy! I got a lot of useful ideas and advice from this blog and others when I was preparing, and wanted to do my part to try to add something to the information out there.
