Note: this post grew out of an email exchange I had with Stephen Heard last week. Stephen suggested an idea for a post that we both wanted to write. We decided to write our posts independently and post them on the same day. So read this post and then go see what Stephen has to say. I’m curious to see what he has to say too! I predict we said much the same thing, but I hope I’m wrong because that would be more fun. 🙂
Last week I linked to a major case of scientific fraud in psychology. It involved a study of the odometer readings people report to their car insurance companies. Here’s a histogram of one of the key variables in the study:
These data are obviously fake. You ask thousands of people to report how many miles they drove over some period of time, and you get a uniform distribution between 0 and 50,000 miles? Pull the other one, it’s got bells on.
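For readers who want to see why a flat histogram is so suspicious, here is a minimal sketch of a uniformity check. It is not the analysis anyone ran on the actual study. The function name `uniformity_chi_square`, the sample size, the bin count, and the gamma distribution standing in for "plausible" mileage are all my own assumptions, chosen only for illustration.

```python
import random


def uniformity_chi_square(values, low, high, n_bins=10):
    """Pearson chi-square statistic for values against a uniform
    distribution on [low, high], using n_bins equal-width bins."""
    counts = [0] * n_bins
    width = (high - low) / n_bins
    for v in values:
        # Clamp so a value exactly equal to `high` lands in the last bin.
        idx = min(int((v - low) / width), n_bins - 1)
        counts[idx] += 1
    expected = len(values) / n_bins
    return sum((c - expected) ** 2 / expected for c in counts)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n = 5000

# Data that look like the suspicious histogram: flat from 0 to 50,000.
fake_like = [rng.uniform(0, 50_000) for _ in range(n)]

# Hypothetical stand-in for real mileage (an assumption, not real data):
# right-skewed, with most drivers well below the maximum.
real_like = [min(rng.gammavariate(2.0, 6000.0), 49_999.0) for _ in range(n)]

stat_fake = uniformity_chi_square(fake_like, 0, 50_000)
stat_real = uniformity_chi_square(real_like, 0, 50_000)

print(f"chi-square vs. uniform, flat data:   {stat_fake:.1f}")
print(f"chi-square vs. uniform, skewed data: {stat_real:.1f}")
```

With 10 bins the statistic has 9 degrees of freedom, so values in the single or low double digits are consistent with a uniform distribution, while skewed data produce a far larger value. The unusual thing about the odometer data is that they would pass this test, and self-reported mileage should not.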
This is a common feature of scientific frauds that involve fake data. Often, the fake data do not stand up to even casual scrutiny.
Which is puzzling. If you’re going to commit scientific fraud, presumably you want to get away with it. So why commit fraud in such a transparently obvious way?
After all, plenty of frauds in other walks of life are designed to stand up to scrutiny. Think of art forgery. Art forgers are very skilled artists. They go to great lengths to make their forgeries stand up to visual inspection by expert art historians who are on the lookout for forgeries. It’s a striking contrast with the laughable obviousness of many scientific frauds.
Or is it? Because there’s an important sense in which the shoddiest scientific fraud and the most careful art forgery are exactly the same. Both are designed to stand up to the scrutiny they’re likely to receive.
If you’re forging a Rembrandt, you know that your painting is going to be closely inspected by expert art historians. That’s why you have to go to great lengths to make it look like a Rembrandt. Whereas if you’re faking the data in a scientific paper, the data are unlikely to be closely inspected by anyone. Heck, until fairly recently your data were unlikely to be inspected by anyone because you weren’t expected to show them to anyone! And even these days, when post-publication data sharing is increasingly the rule in many fields, it’s still rare for the shared data associated with any given paper to be inspected by anyone. Not even casually, never mind closely. So a Rembrandt forgery and a typical fake dataset are similarly obvious fakes, relative to the differing levels of scrutiny they’re likely to receive.
The commonality here is well illustrated by the fact that many art forgeries are obvious–if scrutinized in a novel way. For instance, the linked piece on art forgery discusses forger Wolfgang Beltracchi, who was exposed via chemical analysis of the pigments he used. His forgery of a 1914 painting used a titanium-based pigment that didn’t exist in 1914. From a visual perspective, it’s not at all obvious that the painting is a fake. But from the perspective of chemical analysis, the fakery is laughably obvious. Chemical analysis of pigments wasn’t a routine method of scrutiny when Beltracchi forged the painting in question; the chemical analysis was only conducted decades later. Something similar seems to have happened in the scientific fraud case I linked to last week. The fake odometer data were originally written up for publication in 2012. The fake data weren’t shared publicly until 2020, in association with a paper that failed to replicate the 2012 result. Data sharing, and replication attempts, are a lot more common now than they were in 2012. They’re novel forms of scrutiny to which the 2012 paper wasn’t originally subjected.
This is a general principle of fraud in all walks of life. Frauds are usually only designed to beat whatever fraud-prevention measures the fraudster knows they’re likely to be subjected to. Hence Dan Davies’ “Golden Rule” of financial fraud detection, which applies to non-financial frauds as well. Paraphrasing, the rule says, in part, that if you think something might be fake, you need to check it out in a way it hasn’t already been checked.
p.s. Before anyone points it out: yes, there are some fraudsters in science, and other walks of life, whose frauds aren’t designed to stand up to the scrutiny to which the fraudster knows they’ll be subject. For instance, consider plagiarism by undergraduate students. I talk to my undergraduate biostats students at length about what plagiarism is, and why it’s against university rules. I tell them that I have ways of detecting plagiarism on lab assignments, even if they plagiarize from an old assignment in an online repository, or only plagiarize part of the assignment, or plagiarize from someone in a different lab section, or paraphrase to try to hide their plagiarism, etc. I point out the various resources that are available to support them if they’re struggling with their coursework, and encourage them to use those resources rather than resorting to plagiarism. And I tell them that, every semester, I give this same speech and still catch some students plagiarizing lab assignments. And yet, every semester, some students plagiarize anyway, and it’s often dead obvious. I don’t know why they do it, and I’ve given up trying to understand it. People aren’t always rational. Sometimes they aren’t even “predictably irrational”.* Sometimes they’re just panicked, or lazy, or drunk, or dumb, or whatever. And sometimes, people act in ways that are completely inscrutable. The point of this post is that the obviousness of many scientific frauds has a rational explanation (nobody’s likely to check for fraud), not that the obviousness of scientific frauds always has a rational explanation.
p.p.s. Now I’m wondering: what are the least obvious scientific frauds in history? The ones that best stand up to scrutiny from many different angles? My first thought was duplication and relabeling of images across papers, from back in the days before automated image matching software. Here’s another candidate, though it apparently involved elaborate steps to cover up the fraud after investigation began, rather than elaborate steps to disguise the fraud in the first place. I’d be curious to hear from paleontological commenters as to how careful a fake Piltdown Man was, relative to the methods of scrutiny available at the time.
*Sorry, couldn’t resist this…obvious…joke. 😉