Also this week: fake it ’til you make it (look like you work 80 hours/week), great Canadian minds think alike, evolutionary biologists vs. ecologists, E. O. Wilson vs. the OED, Wyoming vs. data, the evidence on anonymity and openness in peer review, subtle gender biases in award nomination, and much more. Lots of good stuff this week; you might want to get comfortable first. Or skim, if you must. But whatever you do, stick with it until the end so you can read about a runaway trolley speeding towards Immanuel Kant. 🙂
From Brian (!):
A neat example of the importance of nomination criteria for gender equity is buried in this post about winning Jeopardy (an American television quiz show). For a long time only 1/3 of the winners were women. This might lead Larry Summers to conclude that men are just better at recalling facts (or clicking the button to answer faster). But a natural experiment (scroll down to the middle of the post, to “The Challenger Pool Has Gotten Bigger”) shows that the nomination criteria were the real problem. In 2006 Jeopardy changed how it selected contestants. Before 2006, you had to self-fund a trip to Los Angeles to participate in tryouts to get on the show. That took a certain chutzpah/cockiness: laying out several hundred dollars with no guarantee of even being selected. And 2/3 of the winners were male because more males were making the choice to take that risk. Then the show switched to an online test. Suddenly more contestants were female, and half the winners were female. It seems so subtle and removed from the key point (who wins the quiz show), but an airline flight vs. an online test makes a huge difference. What are accidental but poorly designed nomination criteria doing in academia? Several bloggers, including Meg and Morgan, have commented on how the nomination process can have a big impact on equitable gender outcomes in an academic context.
This article on how some men (and some, though fewer, women) fake 80-hour work weeks is interesting. To me, the most interesting part was the end:
But the fact that the consultants who quietly lightened their workload did just as well in their performance reviews as those who were truly working 80 or more hours a week suggests that in normal times, heavy workloads may be more about signaling devotion to a firm than really being more productive. The person working 80 hours isn’t necessarily serving clients any better than the person working 50.
The article is based on a study in the corporate world, but definitely applies to academia, too. (ht: Chris Klausmeier)
Apparently I wasn’t the only woman to have a post appear on Monday about how it’s possible to be a #scimom and the importance of role models! I really enjoyed this piece by anthropologist and historian Carole McGranahan. (My piece from Monday is here.)
Hilda Bastian (an academic editor at PLOS) takes a deep dive into all of the comparative and experimental evidence on anonymity and openness in peer review. It’s a blog post rather than a paper, so it hasn’t been reviewed itself; I’m trusting her to have done it right (FWIW, it has all the signs of trustworthiness). I love that she’s up front about the serious design and sample size problems of many studies. That’s one of the main take-homes, actually–on several issues, the available evidence sucks, so you can’t draw conclusions about those issues. And I love that she’s looking at all the available evidence, not just focusing on whichever study (or appalling anecdote) gets talked about most or supports her views (she favors openness over anonymity). Among her conclusions:
- Reviewers often see through author blinding
- Revealing reviewer identities causes many reviewers to decline to review, but may make reviews somewhat better
- Author blinding can reduce, increase (yes, increase), or have no effect on gender bias. But the evidence is pretty unreliable and hard to interpret.
Stephen Heard on why scientific grant funding should be spread fairly evenly among investigators. Echoes an old post of mine (we even independently came up with equivalent graphical models!), though Stephen goes beyond my post in considering how uncertainty in predicting PIs’ future productivity should affect funding allocation.
Caroline Tucker comments on the opposing papers deriving from the ASN meeting’s debate on ecological vs. evolutionary limits on continental-scale species richness. Haven’t read them myself yet, but judging from her comments I’m wondering if the competing hypotheses are too vaguely defined to actually be testable. Whenever people disagree on whether evidence X even counts as a test of hypothesis Y, that makes my ~~spidey sense~~ vague hypothesis sense tingle.
The always-thoughtful Arjun Raj muses on when to retract a paper. Not as easy a call as you might think.
This is old but I missed it at the time: great This American Life episode on the fuzzy boundary between bold science and crackpottery, as exemplified by a collaboration between an NIH-funded cancer researcher and a musician. A meditation on the importance–and frustration–of looking for evidence against your ideas (“severe tests”) rather than evidence for them. Here are my related old posts on pseudoscience and scientific lost causes. (ht Andrew Gelman)
His own recent claim to the contrary, no, E. O. Wilson did not coin the term “evolutionary biology”, though it’s possible that he helped to popularize it.
Dismantling the evidence behind the most-viewed TED talk ever. The first bit (before the p-curve stuff) would be a good example for an introductory stats course.
Speaking of good examples for an intro stats course, here’s Nate Silver committing the most common and serious statistical mistake made by people who should know better: letting the data tell you what hypothesis to test, and then testing that hypothesis on the same data. This mistake goes by various names (circular reasoning, “double-dipping”, the “Texas sharpshooter fallacy”). Here, Silver notices an unusual feature of some ice hockey data, and then calculates a very low probability that the feature would occur by chance. Which is very wrong (and no, the fact that P is way less than 0.05 here does not make it ok). Every dataset has some “unusual” features, just by chance. You can’t notice whichever feature that happens to be, and then test whether that particular feature would be expected to occur by chance alone. Because if the dataset had happened to exhibit some other “unusual” feature, you’d have done the test on that feature instead (Andrew Gelman calls this “the garden of forking paths”). It’s the equivalent of hitting a golf ball down a fairway, and then declaring that it’s a miracle that the ball landed where it did, because the odds are astronomical that the ball would land on that particular spot by chance alone (can’t recall where I read that analogy…). Nate Silver’s on record saying that frequentist statistics led science astray for a century. But ignoring its basic principles (here, predesignation of hypotheses) isn’t such a hot idea either. Come on, Nate, you’re better than this.
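If you want to demo this in that intro stats course, a quick simulation makes the point. Here’s a minimal sketch of my own (not Silver’s hockey data or analysis): generate 30 pure-noise “team statistics”, pick whichever one looks most unusual after seeing the data, and compute its p-value as if it had been pre-specified. The numbers (30 teams, 10,000 simulations) are arbitrary choices for illustration.

```python
# Toy simulation of "double-dipping": test the most unusual feature you
# noticed in the data, as if you'd pre-specified it. Pure noise throughout,
# so any "significant" result is a false positive by construction.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(42)
n_teams, n_sims, alpha = 30, 10_000, 0.05
false_positives = 0

for _ in range(n_sims):
    stats = rng.standard_normal(n_teams)             # pure-noise "data"
    most_extreme = stats[np.argmax(np.abs(stats))]   # feature chosen after seeing the data
    p = 2 * norm.sf(abs(most_extreme))               # naive p-value, as if pre-specified
    if p < alpha:
        false_positives += 1

print(f"nominal alpha: {alpha:.2f}")
print(f"actual rate of 'significant' findings: {false_positives / n_sims:.2f}")
# Expect roughly 1 - 0.95**30 ≈ 0.79, not 0.05: the most "unusual" feature
# is nearly always "significant" even though the data are pure noise.
```

The fix, of course, is predesignation: decide what you’re testing before you look at the data, or test the noticed feature on fresh data.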
In praise of linear models. From economics, but non-technical and applicable to ecology.
Wyoming just criminalized gathering environmental data if you plan to share the data with the state or federal government. IANAL, but I can’t imagine this passing constitutional muster. But in a weird way, I’m kind of impressed with Wyoming here. Go big or go home, as the saying goes–even when it comes to data suppression. (ht Economist’s View)
This is from last month but I missed it at the time: Paige Brown Jarreau summarizes her doctoral dissertation on why science bloggers blog and what they blog about. Looks like Meg, Brian, and I are typical in some ways, but atypical in other ways.
There’s an out of control trolley speeding towards Immanuel Kant. You have the ability to pull a lever and change the trolley’s path so it hits Jeremy Bentham instead…
(ht Marginal Revolution)