Book review: The Silwood Circle

This is the story of “the Silwood mafia”.

At least, that’s the term I first heard, during my time as a postdoc at the NERC Centre for Population Biology (CPB), located at Imperial College London’s Silwood Park campus. The CPB no longer exists, but at the time it was one of the very best places in the world to do population and community ecology. I arrived there knowing something of the history of the CPB, and of the Silwood Park campus more broadly. For instance, some of the pioneers of modern ecology, folks like Mick Crawley and Mike Hassell, were still on the Imperial College faculty and pursuing active research programs (some of them are still there today). And others like Bob May and John Lawton still maintained close ties to Silwood. But it wasn’t until after I arrived that I fully realized just how important and influential the leading Silwood-associated ecologists had been in shaping the development and direction of ecology, both in Britain and internationally. And it wasn’t until after I arrived that I realized that this influence inspired a fair bit of jealousy and resentment in some quarters. To the point where some outsiders began referring to them as the Silwood mafia.*

In The Silwood Circle, Imperial College London historian Hannah Gay charts the careers of a small, close-knit, informal “core” group of ecologists with similar but non-competing research interests: Richard Southwood (the informal leader of the group and a mentor to the others), Robert May, Gordon Conway, Mike Hassell, Roy Anderson, Mick Crawley, John Lawton, John Beddington, John Krebs, and David Rogers. If that sounds like a “Who’s Who” list of many of the most famous senior ecologists in the world, well, that’s no accident. Gay has a longstanding interest in generational groups and how they make their way in science, and how the social dimension of science affects its direction. Given those interests, she could hardly have picked a better case study. This isn’t history by the winners–but it’s history about the winners.

The backdrop of the story is the changing face of ecology as a whole. In the first half of the 20th century, ecology was very much a peripheral discipline, widely seen as the domain of amateurs if indeed it was noticed at all. By the end of the century, it was widely seen as much more central, and had become professionalized. In the 1950s and 60s, growing environmental awareness created a demand and opportunity for professional scientific advice on a broad range of ecological topics. And pioneers like Hutchinson, Slobodkin, MacArthur, and Holling had begun to point the way towards a new ecology, placing a much heavier emphasis on mathematical theory, hypothesis testing, manipulative experiments, and rigorous statistical analysis. The members of the Silwood circle recognized that opportunity early and seized it. Their work blazed the trails that many other ecologists subsequently followed.

A theme of the book is the interplay of chance and determinism in the fates of scientists and their ideas. The Silwood circle emerges as Exhibit A for the truth of the aphorisms that “chance favors the prepared mind” and “it’s best to be good and lucky”. For instance, Mick Crawley was lucky to hear about an upcoming opening on the faculty at Imperial College. He heard about it at a pub lunch with John Maynard Smith and Mike Hassell, an illustration of the importance of social connections. But it wasn’t blind luck that had caused him, years earlier, to do a Ph.D. involving a lot of statistics and computer programming. That was a very unusual choice at the time–but it was a prescient one. Silwood was ahead of the curve in terms of hiring quantitatively-oriented ecologists, but quantitative skills soon were much in demand throughout ecology. Through both the example of his own research and several textbooks, Mick Crawley taught a whole generation of ecologists how to think quantitatively and statistically. And it wasn’t luck that caused him to say, as soon as he heard about the opening, “I could do that job”. Put another way, to say that Crawley had a “lucky opportunity” is to say something about him as well as his luck, since for someone else without Crawley’s skills, hearing about that job opening wouldn’t have been an opportunity at all. Or consider physicist Robert May. He was fortunate to get the mentoring and advice he did, which helped him meet some very good biologists. But those meetings would’ve come to nothing if May hadn’t had some really good ideas and been very good at linking those ideas to big questions biologists wanted to answer. Stories like this come up throughout the book.

Correctly, Gay doesn’t just write off the members of the Silwood circle as naked careerists, or the multifarious formal and informal ways in which they helped each other as a nepotistic conspiracy. For instance, she notes the importance of interpersonal trust for enabling people to make decisions in the face of imperfect information and the inherent unpredictability of the future. For example, later in their careers several members of the Silwood circle became trusted government advisers. And in most cases that trust was justified, though Gay discusses the controversy over an arguable exception (the handling of the vCJD crisis, in which Southwood was involved). And if the joke that the title “F.R.S.” stood for “Friends of Richard Southwood” rather than “Fellow of the Royal Society” has more than a little truth to it, well, ambitious scientists have always worked together to advance their shared causes. Including by nominating one another for awards like membership in the Royal Society. The Origin of Species would hardly have been noticed had Darwin’s many friends not worked behind the scenes to support it, and one another. For better or worse, science is done by people, and so you can’t effectively support ideas without also supporting the people who hold them.

Gay notes that members of the Silwood circle shared various traits, which she tentatively suggests help explain their success. For instance, all were keen naturalists as children–collecting insects, watching birds, etc. I found this interesting because they subsequently took ecology in a direction that (incorrectly in my view) is seen by some as inimical to natural history or even field work. For instance, May’s landmark book Stability and Complexity in Model Ecosystems is famously abstract and divorced from system-specific biological details. And the Ecotron controlled environment facility that John Lawton spearheaded at Silwood was criticized for purportedly being unnatural, unhelpful, and diverting resources from more traditional ecological work (Lawton 1996). Gay suggests that other shared traits include ambition, competitiveness (May in particular is famously competitive), a heightened yet selective awareness of what others are doing and thinking, and a greater than usual willingness to make evaluative judgments about other scientists and their work. For instance, members of the Silwood circle mostly didn’t like the systems ecology work of the Odums.

The book also is a case study in how to build a topnotch research group that’s greater than the sum of its parts. Tellingly, it was not a matter of the people in charge just trying to hire clones of themselves. Richard Southwood didn’t himself pursue the same sort of research as many of the people he hired or mentored. Southwood was no modeler or statistician, for instance. But he was very good at spotting talent, and at nudging people in productive directions.

Gay also tries to give the reader a sense of the science itself. What questions, fundamental and applied, were circle members trying to address, how did they address them, and what was new and important about their approaches? In this I think she’s fairly successful, but it’s tough for me to judge since as a professional ecologist I wasn’t the intended audience for these explications.

The book isn’t hagiography. For instance, Gay is matter of fact about how the Silwood circle included no women. She talks at greatest length about Southwood’s own attitudes towards women. He seems to have viewed women as capable of playing only a supporting role in the advancement of science.

One thing I wonder is whether it would’ve been possible for the Silwood circle, or its equivalent, to achieve equal success in the US. Back when I first heard the pejorative term “Silwood mafia”, I remember thinking how weird it was that a small cohort of people associated with one university campus could ever attain so much influence that they’d be seen as a “mafia”. I mean, a place like UC-Davis has lots of hugely-famous and influential ecologists–but nobody ever talks about a “Davis mafia”! Don’t get me wrong, there are schools of ecological thought associated with particular places in the US. For instance, according to Meg, people sometimes talk about the “Yale school” and “Wisconsin school” of limnology. But the US just seems like too big a pond to ever be dominated by any one “school” of fish, as it were.

I probably read the book with more interest than most, because of my personal connections to Silwood and my pre-existing admiration for several members of the Silwood circle. But I think the book will be of broad interest to many. Certainly, it’s unusual and intriguing to read a serious, detailed socio-historical study of people whom you’ve met, and who are mostly still alive. And it was interesting to hear someone articulate not just a description of how science works as a social activity, but why it works that way and why that way of working can often be good for science (rather than bad, or merely inevitable rather than good or bad). You may find it slow-moving in places. Gay describes in great detail the career trajectory of every member of the circle, down to the level of key conferences they attended, specific projects on which they worked, their movements from one job to another, etc. And there’s no juicy gossip and little in the way of personal anecdotes. Gay’s goal was to produce a serious study of the sociology of science, not dig dirt or even write potted biographies. I think she succeeded, and I recommend The Silwood Circle.

*Dick Southwood, the father figure of Silwood ecology, occasionally referred to the group he mentored as the “Silwood Mob”. He meant it affectionately.

R isn’t just R anymore

Something I’ve heard very little about in ecology (despite the field’s rapid and near complete embrace of R) is the rapid forking of R into multiple versions. For those who aren’t software engineers, forking is when the body of code implementing a program splits into two (or more) different, alternate, even competing versions or “branches” (with different people behind each branch).

After years of social cohesion around the “R Core Group” version, R has recently seen a number of forks:

  1. pqR (short for pretty quick R)
  2. Renjin
  3. FastR
  4. CXXR
  5. Riposte
  6. TERR

There are a number of differences between these versions. Renjin and FastR are reimplementations in Java (original R is written mostly in C), and CXXR is a rewrite in C++. This might not matter to most ecologists, but it should lead to some performance and memory advantages. TERR is also a bit of an outlier in that it is being developed commercially and is targeted at big data (bigger than fits in memory), which is a well-known weakness of R (yes, before I get flamed, I know there are a number of open source packages in R that try to deal with this, but it is not built into R from the ground up). Some are clearly at more advanced stages than others (e.g. FastR and Riposte just take you to GitHub source-code pages, while the others have friendlier home pages with explanations). pqR and CXXR build on top of core R and therefore have very high odds of working with whatever package you want to use. TERR and Renjin are not innately compatible but have put a lot of effort into building compatibility with common R packages. FastR and Riposte don’t yet seem to have good answers on package compatibility, but they are still in early stages. In general, pqR is the most conservative – just tweaking the core R product for speed – and probably the best at compatibility. A nice review of these six alternatives (if you’re a programmer) is found at 4D Pie Charts in part 1, part 2 and part 3 (skip to Part 3 if you want the bottom line and not all the computer programming details).

The one thing they all have in common is trying to speed up R. This matches my own experiences (and is why I never use R for my own personal research unless pressured into it by group consensus in, say, a working group). It is just really slow. Not a big deal if you have a field dataset with a few hundred rows. But even the comparatively small datasets I work with, like the Breeding Bird Survey and US Forest Inventory (a few million rows), really bring R to a crawl (and again, yes, I know there are tools, but I have better things to do with my time). Matlab and Python are both noticeably faster on most real world tests (no programming language is fastest on every test). Recently I was implementing a relatively complex MLE (maximum likelihood estimation) routine (on detection probabilities – so a complex formula, but still analytically formulated), something you’d think R would be awesome at – and to my surprise the same code in Matlab ran 10-100 times faster than R (subsecond vs 30 seconds).
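
To make the kind of timing I’m talking about concrete, here’s a minimal, hypothetical sketch of timing a simple likelihood fit in base R with system.time(). This is not my actual detection-probability code – the model, the simulated data, and all the numbers are made up purely for illustration:

    # Hypothetical example: time a simple maximum-likelihood fit in base R.
    # (Not the detection-probability model from the post; data are simulated.)
    set.seed(1)
    n <- 1e5
    x <- rnorm(n)
    y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))  # fake detection data

    negloglik <- function(par) {
      p <- plogis(par[1] + par[2] * x)           # detection probability
      -sum(dbinom(y, size = 1, prob = p, log = TRUE))
    }

    system.time(fit <- optim(c(0, 0), negloglik, method = "BFGS"))
    fit$par  # should land near c(-0.5, 1.2)

In my experience it is exactly this kind of repeated likelihood evaluation where the speed differences between implementations show up.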

To me, the most fascinating aspect of this is the social one. Why all these forks? Now, forks happen in almost every open source project (people disagreeing and taking their toys and going home is human nature). But as a long term watcher of open source, I would say the number and seriousness of the forks is unusual. I don’t think this is a coincidence. As stated in the last paragraph, core R has not risen to the challenge of performance, which is something people crave, and there is only so much you can do to fix this with add-on packages. Or, put in different terms, the rate of innovation in the 3rd party packages of R has been exceptionally high (one of the reasons for its uptake), but the rate of innovation in the core (affecting fundamental issues like ease of use, performance, memory management and size of data) has been slow*.

As an example, consider pqR. Although pqR author Radford Neal has taken the high road in what he says, he very clearly offered his first batch of improvements to the R core group and they mostly got rejected with not much more than a “we don’t do things that way here” attitude (http://tolstoy.newcastle.edu.au/R/e11/devel/10/09/0813.html and http://tolstoy.newcastle.edu.au/R/e11/devel/10/09/0823.html – and, reading between the lines, good old fashioned “not invented here” syndrome). His frustration levels clearly reached the point where he decided to work from the outside rather than the inside for a while. While I have never offered patches to the core, my own experiences with trying to submit a package to CRAN shocked me. The number of people involved was ridiculously small (Brian Ripley replied to one of my emails), the openness to different ways of thinking was zero (the pedantry about “you have to do X” even when X made no sense in my context, coming from obviously smart people, surprised me), and the rudeness levels were extreme, epic and belonged in a television soap opera, not science.

All of this has created a muddle for the poor person who just wants to do some statistics! Now you have to figure out which version to use (and package/library writers have to make their software work with several versions of R).

I have not yet tried any of the alternatives (as noted I mainly use R for teaching and Python and Matlab for personal research – I’m not taking students to the bleeding edge of technology without a reason). But, given that I basically don’t know what I’m talking about 🙂, my recommendations would be:

  • If R is fast enough for you – ignore the whole thing and wait for the dust to settle (which it will in 2-3 years, probably either with one clear, widely accepted alternative or with changes folded back into core R after they decide they’d better start listening to other people)
  • If you really need a faster R right now, try pqR and Renjin (and maybe CXXR if you’re gung ho). Both are freely available, seem to offer real speed improvements, seem to have high compatibility with packages (although pqR is probably higher) and are moderately mature.

I am really curious to hear from our readers. Have you heard about the alternative implementations of R? Do you care about them or are they off your radar? Has anybody tried one? 


*The GUI (graphical user interface) is another area where internal innovation has been slow, but fortunately 3rd parties can step up, and eventually have. If you don’t already use RStudio, you should.

The one true route to good science is …

… a myth up there with the unicorn and guaranteed ways to lose weight without eating less or exercising more. But that doesn’t stop people from espousing their version of this myth (Peters 1991, Paine 2010, Likens and Lindenmayer 2011). The usual form is to attack some now trendy but supposedly horrendous version of science and then mildly conclude that the way the author does science is the only really good way to do science. In the latest version of this archetype, two esteemed ecologists, David Lindenmayer and Gene Likens (hereafter L&L), penned an almost vitriolic piece attacking “Open-Access Science”, “Big Science” and I don’t know what all else (which I’m going to call, as shorthand, “new-fangled ecology” for now)*. Now I respect the work of both of these scientists. Gene Likens is deservedly a member of the US National Academy of Sciences. Likewise, Lindenmayer is a fellow of the Australian Academy of Science, and were he American (and if the US NAS did not have a bias against conservation biology) you could make a good argument that he would be in the US NAS too. These are people who have earned the right to be listened to. So I don’t want to engage in just a slam down on their slam down.

But they did call me a “parasite” and implied it is likely I am doing “context-free, junk science”, which I feel at least entitles me to a response 🙂 And their arguments are oft repeated (although they’re usually expressed in a politer fashion), so the polite versions of the claims do deserve a thoughtful response. Stripped of vitriol, some of their arguments are even made by my own fellow blogger (but of course there the dialogue was civil even when I disagreed in the comments) (UPDATE from Jeremy: I think Brian actually meant this old, not very good post of mine). So let me briefly respond to three points they raise in their piece, then I want to use their piece as a launching point to talk about larger issues.

Three commonly repeated concerns about “new-fangled ecology” that are raised in especially vivid terms by L&L are:

  1. People who analyze data without collecting data aren’t pulling their fair share (or in L&L terms “There is also the emerging issue of a generation of what we term here as “parasitic” scientists who will never be motivated to go and gather data because it takes real effort and time and it is simply easier to use data gathered by others.”) – Ouch. Full confession – by their definition I am a parasite. I have never published on a dataset I personally have collected. But I respectfully disagree. The usual counterarguments are valid, and usually are along the lines of noting that ecology is at the extreme end of the spectrum in terms of believing in individual ownership of data. Meteorology, astronomy, particle physics, even economics all see the data as a public good to a much greater degree than ecology. Of course in their cases, the collection of the data is funded by the government, wholly unlike in ecology 🙂 And taxonomy didn’t exactly go down the tubes when journals required placing sequences in GenBank before you could publish. Notwithstanding all of that, I am sympathetic to the plight of somebody who has spent years collecting a dataset. I personally don’t think compelling an individual to share is the right path (as distinct from institutions like LTERs or NEON, which are and should be compelled to share their data). So to me the most compelling reason I am not a parasite is that I have students and faculty (from my institution and literally from across the world) emailing me and coming to my office to ask me questions about what kind of statistics to use on their hard-won empirical data, or how to set up a simulation, or what theoretical context they can place their data in. This happens almost daily, certainly multiple times a week. Sometimes I can answer in 5 minutes but often it’s an hour. Occasionally I am formally on their committee and thus in some sense obligated and compensated for doing it. And sometimes it turns into a full-blown collaboration where I am a coauthor. But the vast majority of the time, neither of these apply. Should I start calling my friends, colleagues and students parasites? Hardly! This is called being a good scientific citizen. If you want to spend time setting up a formal credit system where every hour of my consulting time earns me 100 lines of a dataset, go ahead. Personally, I’m pretty happy with the current system. Academics don’t live in a world where everything can be put into one unit of currency and traded. We work much better in a spirit of generosity and openness, freely giving (and yes, taking), building synergies across our distinct skill sets and circumstances.
  2. Data before questions (or in L&L terms “do[ing] science backwards … now that we have all the data, what question shall we ask? … junk science”) – From the tone of the article one envisions a couple of troll-like ecoinformaticians skulking into a secret room at ESA and chuckling about the data they’ve stolen from the poor field ecologists and then saying “now that we’ve captured the data, what do you think we should do with it?” and the other troll replying – “Gee questions are hard – we should have stolen the questions from the field scientists too”. OK, a flight of fancy there. In all seriousness, I don’t know what secret rooms of troll-like ecoinformaticians L&L hang around with, but I’ve been in a lot of those rooms at ESA and NCEAS and elsewhere and I have NEVER heard a conversation that went like “here’s some data, what question can I ask”. If it happened, then I would agree that that was junk science, but it doesn’t happen. I have also edited at least 100 ecoinformatic papers and I’ve never seen a hint of this thought process in those papers either. The conversations I hear go much more like “I am asking question X and I just cannot for the life of me find the appropriate data – do you have any suggestions?” (exactly proving that even ecoinformaticians put primacy on the question). But “what kind of question can I ask now that I have data X?” – I have literally never heard this among ecoinformaticians. There’s really nothing left to say on this topic except show me some hard facts and stop criticizing how you imagine other people do science. On the other hand, I have had a few (just a few) students who showed up in my office with a dataset they collected after two years in the field who didn’t seem too clear on which question motivated them to collect the data … 🙂 (I’m just saying!).
  3. Using data without a detailed knowledge of how it was collected and the ecology of the organisms is dangerous (or in L&L terms “Our extensive experience from a combined 80 years of collecting empirical data is that large data sets are often nuanced and complex, and appropriate analysis of them requires intimate knowledge of their context and substance to avoid making serious mistakes in interpretation.”) – As you might expect from the more moderate tone of the language from L&L, this is probably the most reasonable concern. Who would be opposed to users of data having more detailed knowledge of the data collection and the organisms? No one. But L&L go on to say “There is an increasing number of examples where increased knowledge is missed or even where substantially flawed papers are being published, in part because authors had limited or no understanding of the data sets they were using, nor any experience of the ecosystems or other entities about which they have written.” Sadly they don’t provide any citations to support this claim, which makes it hard to refute. Surely if this were a scourge of ecology there would be a few dozen examples? Certainly there have been some meta-analyses published where people dispute the authors’ interpretations (they discuss one such example later, but fail to recognize that the discussion is still ongoing and not decisively settled as flawed). But these disputed interpretations are found in all areas of ecology. And if you talk to an experienced ecoinformatician, they take great care to know the data. I can guarantee that you don’t want to be part of some of the conversations I have had about the details of datasets like the BBS, US Forest Inventory or even the Barro Colorado 50 ha tropical tree data. They get into incredibly dry, boring detail about survey methods, variations between years, spatial heterogeneity, species that are well or poorly sampled, etc. Ethan White has even set up a wiki to capture such knowledge in public form. Lack of knowledge about the context of data is not evident! But ultimately, I think this point is misguided because it is not part of a one-directional goal (“more knowledge=better”) but part of a trade-off – more knowledge=smaller spatiotemporal scales and fewer parts of the world and taxonomic groups covered. If I am comparing 10 regions (or 10 orders of organisms), it is unavoidable that I will know less about each specific dataset. First because it is really unlikely that one person could collect all that data. And even if they did, they’ve probably forgotten quite a lot about the first dataset by the time they’ve collected the 10th dataset 12 years later. Such cross-region and cross-taxa comparisons are obviously important for the advancement of science. Does somebody who spent 1000 hours collecting a dataset really want to argue that no general principles can be drawn from it and that it cannot meaningfully be compared and contrasted with a dataset from another part of the world or part of the taxonomic realm? Down this road ultimately lies a Simberloffian view (Simberloff 2004) that there are no general principles in ecology and the best we can do is spend our whole lives studying one place. That may work for some people (and more power to them – we need that view), but it’s not why I got into science (and I doubt it is why agencies are giving me funding).

So this brings me to the larger point I want to make that goes beyond the L&L piece, beyond the handful of papers I cited in the beginning, to what I perceive as an unfortunately all too common attitude in ecology. I call it the “not my kind of science=bad science” attitude. The bottom line is we throw around the “bad science” label at other ecologists way too often.

Try this thought experiment. Imagine a congressional staffer (or worse a congressperson) reading the L&L piece. What do you think their reaction is? Do you think it made them more or less likely to increase funding for ecology? Nobody knows for sure. There might be a few percent who actually thought “I don’t know if those L&L guys are right or wrong but at least they’re policing each other and having a strong internal debate about what is good science over in ecology”. But I’m pretty sure that would be an exceedingly rare response. I’m pretty sure much more common responses are “Scientists always say they have special methods for finding truth but they cannot even agree amongst themselves” or “Those ecologists are always bickering amongst themselves over petty philosophical disagreements and never stepping up all hands on deck to solve the problems society needs to solve”.

I have been told that in the 1970s, during the strong environmental movement in the US (that is when the Clean Air Act, Clean Water Act and Endangered Species Act were passed), there was a move afoot in Congress to create an NIE (National Institute of the Environment) similar to the NIH (the National Institutes of Health – which, by the way, is the major funding agency for all medical-related research in the US). But during congressional hearings, different ecologists started showing up and arguing about whose type of ecology was rigorous or not. This story is hearsay, but it sounds very credible to me.

Truly good science is a rare thing everywhere and junk science happens everywhere, so being able to find not-good-science in a field is a poor reason to label a field as “bad science”. Not only is there no one true route to good science, but good science inherently involves many independent routes converging. We as ecologists need to stop shooting ourselves in the foot and pulling out the mantra “bad science” as a way to put down the other side in our divides (theoretical vs empirical, ecosystem vs population, animal vs plant, etc). It might help win a battle but it is losing the war (for funding, for respect, for scientific progress). Fields like physics and astronomy have been vastly more successful at attracting funding for work that, at this point, is probably of less immediate urgency to society than ecology. There are many reasons, but at least one of them is that they work together. Beyond the prosaic matter of funding, the healthiest branches of science, the ones making the most progress, are those where people reach across diverse fields and value the multiple perspectives and approaches, using each to its strength. I would like to see ecology become such a field. But it’s not going to be as long as we keep pulling out the “bad science” card every time we see a few extra dollars or pages in a good journal going in a direction different than our own.


* So here goes the world’s longest footnote – feel free to skip it if you aren’t interested:

As I mentioned, I am not clear exactly what it is that L&L are critiquing (they mention open data and big science, but some of their critiques seem not relevant to either of those). It does seem to me that several distinct ideas have been conflated into some sort of “new-fangled ecology” that L&L and others have been criticizing of late. So let me unpack “new-fangled ecology” into 5 distinct ideas. Each can be done alone or in any combination with one or more of the other ideas.

  1. Big-science – This is a project that requires many people of diverse skills to perform. Big here is # of participants. Physics with >1000 PhDs searching for the Higgs boson is the best example, but NEON is no shrinking violet either. Big science is an inexorable trend in all fields of science and this is probably a good thing. The days of Einstein dreaming up 3 Nobel-worthy papers while working as a patent clerk, or MacArthur reinventing theoretical ecology while being bored in the army, are over.
  2. Big-data – This involves really large datasets. As our capacity to store and process large data has grown exponentially, so has our capacity to fill such data storage. However, I would argue that ecology does not have any truly big-data. Big data is measured in terabytes and petabytes and exabytes. Gigabytes barely qualify. Yet most “big” datasets in ecology are under 100MB. They fit in memory. Ecologists are, however, collecting data at increasingly large spatiotemporal scales, and this is noteworthy, but it probably needs its own name (Big-scale?) distinct from Big-data, which is already well-claimed by the computer scientists.
  3. Data-mining – Using machine learning to find patterns in the data. This is the source of the data-before-the-questions idea. But genuine data-mining in ecology is exceedingly rare. Despite my post praising exploratory statistics, I see data-mining as one step further and one step too far. Good exploratory statistics still starts with clear questions and even tentative theories (that will not be formally tested) in mind. The one place you find data-mining in ecology is in applied, purely predictive contexts, e.g. what will the malaria-carrying mosquito population be next year. I have no problem with saying that data-mining should stay in this limited domain. And it is seriously misguided to think that all (or most) data-oriented ecologists are doing true context-free data-mining.
  4. Open-data/Metadata – The push to have datasets be: 1) clearly documented (aka metadata) and 2) available for public access (e.g. downloadable on the internet or in journal archives). This is not black and white – one can argue for more or less open-data requirements. It would be hard to argue that ecology wouldn’t benefit from more open-data, but that doesn’t have to mean every single dataset has to be immediately published on the internet the day after it is collected. Also most pushers of open-data are strong advocates of appropriate methods for giving credit to data collectors.
  5. Meta-analysis/synthesis – The push to do NCEAS-style analysis across many datasets to assess the generality of many individual research projects. This goes all the way back to some of the first meta-analyses on all the competition projects by Gurevitch 1992 and Goldberg & Barton 1992. Again, nobody is saying all science should be synthetic. On the contrary, meta-analysis implicitly assumes the need for individual data-collection experiments. But it’s hard to argue that it’s not good to stop once in a while and summarize all the data we’ve collectively gathered in a formal, rigorous, quantitative way.

To repeat – you can have any one of these alone or in any combination. Which of these involve good science? All of them (even data-mining in the right context). Which of these involve junk science? All of them. In what way is this different from experimental ecology, observational ecology, phylogenies, analytical model development, etc.? It’s not! All subdisciplines of ecology (and of science) and all distinct methodologies involve good science and junk science.

Book review: Darwin Deleted by Peter J. Bowler

What if Charles Darwin had died on the Beagle voyage?

Hardly a far-fetched possibility. Some of Darwin’s family and friends thought the journey risky and tried to talk him out of it. Darwin was a lousy sailor, and the ship passed through severe storms; Darwin could easily have been swept overboard. Argentina was having a civil war while Darwin was there; he could’ve been shot. He could’ve caught some tropical disease, worse than the Chagas disease he’s widely believed to have contracted. Heck, when trying out the bolas used by Argentinian gauchos to hunt rheas, he tangled the legs of his own horse. The horse managed to stop without throwing him, and he and the gauchos had a good laugh. But it wouldn’t have been funny if the horse had gone down and he’d broken his neck.

In Darwin Deleted, eminent historian of evolutionary biology Peter Bowler asks what would’ve happened if, as was quite possible, Charles Darwin had died young. How would history have been different?

Darwin Deleted is what’s known as counterfactual history: trying to understand why history happened as it did by asking if or how things would’ve been different under some alternative scenario. The approach is most popular in military history (I once read a fun book of popular counterfactual histories that was heavy on military examples). The counterfactual method is controversial. Some historians argue that answering counterfactual questions is not only impossible in practice but meaningless in principle. But I suspect that, to scientific readers, the counterfactual method will seem a natural choice. As scientists, we’re taught that the best way to reveal causality is via manipulative experiments. Historians can’t run manipulative experiments, so the best they can do is run the thought-experiment equivalent. Historians are the last people who should fall into the fallacy of post hoc ergo propter hoc (“after this, therefore because of this”). Consideration of counterfactuals is a form of mental discipline that helps prevent you from falling into this trap. Plus, the counterfactual method is a close cousin to Stephen Jay Gould’s famous notion of “rewinding the tape of life” and playing it again. Gould was interested in whether things would come out differently starting from the same initial conditions, due to the accumulation of stochastic events. Here, Bowler discusses the likely consequences of a single stochastic event.

Of course, scientists also will be quick to wonder how to evaluate Bowler’s counterfactual claims. How can we tell if he’s right about what would’ve happened following Darwin’s early, untimely demise? We don’t have a time machine or a bunch of identical Earths, so can’t actually perform a controlled, replicated experiment. And we don’t know enough about history to build and validate a dynamical model that could then be tweaked to simulate alternative scenarios, analogous to what’s done with climate models. So the best we can do is engage in grounded speculation–plausible storytelling.

And while one could argue that plausible storytelling is all too easy–lots of quite different stories might all seem plausible–Bowler’s speculations are very well-grounded and so very plausible. Bowler knows a massive amount about Darwin and his times, and not just in Britain. One key line of evidence for Bowler’s arguments comes from his comparative approach. By looking at how evolutionary biology developed in countries like France and Germany, where Darwin’s ideas never gained much of a foothold or did so only in distorted form, we can get some insight into how evolutionary biology would’ve developed in the UK and the US if Darwin wasn’t around to propose his ideas.

Another key line of evidence comes from the fact that the idea of evolution by natural selection was highly controversial and remained so for decades. There were plenty of alternatives to evolution by natural selection that were widely-discussed at the time, so we have a good sense of what ideas would’ve been pursued instead, had natural selection never been proposed. Indeed, Bowler makes the case that natural selection was never an especially popular hypothesis even before it was nearly eclipsed in the late 19th century. Paleontologists and anatomists (as distinct from biogeographers) didn’t even share Darwin’s very modern sense that explaining adaptation was the key goal for evolutionary biology. For many biologists at the time, the key thing was to document the fact of evolution, to describe the tree of life. Interest in doing this was driven by many factors independent of Darwin, like Chambers’ Vestiges, discoveries of fossils, and comparative anatomy and embryology. From that perspective, adaptation of organisms to their local environments is an unimportant epiphenomenon–even a nuisance because it obscures what we’d now call phylogenetic signal and what many at the time would’ve called the evidence of nature’s purpose, God’s hidden plan. Part of what makes it possible to write a counterfactual history with Darwin deleted is that, in our world, Darwin’s ideas didn’t actually have as much influence as you might’ve thought.

Bowler’s counterfactual is a great vehicle for discussing all sorts of interesting issues in the history of evolutionary biology. For instance, Darwin’s idea of evolution by natural selection often is claimed to have been “in the air” in Britain at the time, on the grounds that Alfred Russel Wallace independently hit on the same basic idea. But on the other hand, Wallace’s original version of the idea was somewhat different than Darwin’s, and much less well-developed. His original 1858 paper, like Darwin’s, famously went unnoticed. And Wallace was poor, and poorly-connected. It’s hard to see his initial idea ever gaining a foothold in the absence of Darwin, given that Darwin was rich and well-connected and still struggled to get people to understand and agree with his ideas. Plus, Wallace’s own subsequent development was heavily influenced by reading Darwin, which is another strike against the notion that, if Darwin had died young, Wallace would’ve stepped neatly into his intellectual place.

The picture that emerges is of Darwin as a man decades ahead of his time in his thinking about the mechanisms of evolution, sending natural selection out into a world that wasn’t ready for it. Because of that, the so-called Darwinian revolution caused a big stir but didn’t actually greatly alter the course of science during Darwin’s lifetime. Teleological neo-Lamarckian theories accepted the fact of evolution but saw it as progressing in a directional manner towards a predetermined, higher goal, analogous to embryonic development. In the absence of Darwin, these ideas–which were much easier for scientists, clergy, and the lay public to understand and accept–would’ve dominated the field even more than they actually did. Insofar as biologists were interested in explaining adaptation, they’d have done so much as most of them actually did in our world–via appeal to Lamarck’s notion of the inheritance of acquired characters. Without Darwin, society at large, and the Christian religion in particular, would’ve had decades to get used to the idea of evolution in a (to them) relatively congenial form. So that, when natural selection was finally proposed, it would’ve been much more easily integrated into both science and society. Indeed, Bowler goes so far as to suggest that what we call the “Modern Synthesis” would’ve emerged earlier had Darwin died young! With the side benefit that the fundamentalist revival in the US would’ve ended up settling on other bogeymen to vilify (perhaps geologists, suggests Bowler), or at least viewing evolution as only one bogeyman among many. In other words, no Darwin, no creationism or intelligent design.

One really big idea under the surface here is the notion that scientific progress has a “natural pace” and can’t be accelerated by geniuses. I wish Bowler had talked a bit about whether he believes that’s generally true, because I found that a fascinating idea.

Don’t misunderstand, Bowler’s a Darwin fan–he’s absolutely not out to denigrate Darwin or downplay his importance. But he’s a historian’s historian, keen to scrape away all the accumulated myth-making by both Darwin fans and haters and get history right.

Bowler also spends a fair bit of time exonerating Darwin from the charges that eugenics, “social Darwinism”, and the Nazis are his fault. Bowler argues that, if anything, the influence ran the other way–from eugenical ideas into evolutionary biology, and that Darwin’s ideas actually slowed the progress of eugenics. No Darwin doesn’t mean no Final Solution. There’s valuable stuff here, in particular the discussion of how the meaning of “social Darwinism” has changed hugely over time. But some of this felt a little like Bowler settling scores with fringe-y nutters who arguably are best ignored.

There are all sorts of interesting tidbits in the book. For instance, Bowler suggests the two men whom he thinks would’ve finally developed the idea of natural selection in a world without Darwin, around 1900. One is hugely famous, the other is obscure. No, not Galton. No, not Spencer. Keep guessing… 🙂

I don’t know that I fully buy Bowler’s story, and not just because no historical counterfactual can ever be conclusively established. Like Marek Kohn, I wonder if the notion of natural selection would’ve been revolutionary no matter when it was proposed. But that’s a quibble, because even if that’s right you’ve still got a counterfactual picture in which evolutionary biology in more or less its modern form develops at about the same time it did in our world. I also wonder if teleology would’ve been harder to eradicate from evolutionary biology in Bowler’s counterfactual world. Bowler’s a fan of modern evo-devo (though not its most extreme, we-need-a-whole-new-evolutionary-synthesis manifestations). But I think Bowler sees rather more similarities between modern mainstream “evo-devo” and 19th century teleological thinking than there really are. And so I think he may be underrating the difficulty of incorporating natural selection into teleological evolutionism.

The first chapter gives a great roadmap and summary of the argument. And the whole book is clearly written, so I never felt like I was getting lost in the weeds and losing track of the main thread.

After reading the book, I found myself wondering about other counterfactuals. For instance, what if Darwin had worked up the courage to publish the Origin in, say, 1844, the same year Vestiges was published? That year Darwin expanded his 1842 sketch of his theory and the evidence for it into a longer (200 pages) sketch. My understanding is that, except for the “Principle of Divergence”, the 1844 sketch basically contains every big idea that would eventually be published in the Origin in 1859. So Darwin could’ve published something similar to the Origin in 1844 if he’d wanted to. That would’ve put Darwin so far ahead of his time that he would’ve been publishing simultaneously with the speculative tract that in our world prepared the ground for the Origin 15 years later. What would’ve been the consequences? For instance, would Darwin have been tarred with the same brush as the then-anonymous author of Vestiges was, derailing his career as a gentleman naturalist? Would a world in which Darwin published even further ahead of his time look like a world in which he’d never published at all because he died young?

If, like me and most other biologists, you only know the potted history of the development of evolutionary biology, or at best you’ve only read Ernst Mayr’s interpretation of that history, you’ll learn a lot from this book and have a lot of fun doing so. Even if you don’t buy the counterfactual approach, it’s an entertaining and accessible way to learn a lot about Darwin and his times. I highly recommend it.

Book review: The Signal and the Noise by Nate Silver

I’m just back from a week in China. On the flight I read the Kindle version of Nate Silver’s The Signal and the Noise. Here’s my review.

Nate Silver, for readers who don’t know, writes the FiveThirtyEight blog, which used to be independent but is now part of the New York Times. The bread and butter of FiveThirtyEight is forecasting the outcome of US elections, based on polling data and other information. In the most recent US presidential election, FiveThirtyEight and other quantitative political blogs got a lot of press for confidently predicting a narrow victory by President Obama, when many non-quantitative political pundits were saying the election was too close to call, or even (if they were conservative) predicting a victory by the challenger, Mitt Romney. And FiveThirtyEight’s forecasting model didn’t just correctly predict the overall outcome, it correctly predicted the winner in all 50 states, and its predictions of the vote percentage for each candidate in each state were both accurate and precise.

This wasn’t Nate Silver’s first predictive success, within or outside election forecasting. As a baseball fan, I’m also familiar with his previous work for Baseball Prospectus developing PECOTA, a model for predicting the future performance of major league baseball players. PECOTA was the first system of its kind and remains very successful. Before that, he trained as an economist, spent some time working as a consultant for an accounting firm, and (in what seems to have been a very formative experience for him) spent a lot of time and made a lot of money playing online poker.

Now, he’s written a book about the general problem of predicting the future. Specifically, how we’re mostly really bad at it, with a few notable exceptions. He talks about his own experiences with predicting elections, baseball performance, and poker (playing poker well involves making predictions about what cards others might hold, based on limited information). And he also talks about the history of prediction in all sorts of other areas–the weather, hurricane tracks (two rare predictive success stories), earthquakes, computer chess, the stock market, the economy, gambling on sports, climate change, terrorist attacks, and more.

I liked the book a lot and recommend it for anyone interested in the broad problem of making predictions. And it is a broad problem; one of the real strengths of the book is how widely Silver casts his net to get insights into when predictions fail and when they succeed. He considers characteristics of the system one is trying to predict (e.g., is it chaotic). He talks about characteristics of the available data and background knowledge (e.g., is the system well-understood mechanistically, how much data do we have and on what variables). He talks about characteristics of the people trying to do the predicting (e.g., what incentives do they have to make good predictions, are they alert to common cognitive biases). He talks about what sort of predictions people are trying to make (e.g., predicting the time and location at which a particular event will occur, qualitative vs. quantitative predictions). And he talks about different techniques for generating predictions (e.g., betting markets, mechanistic models, statistical models). The book is filled with interesting nuggets I didn’t know about. It’s also very well-written and engaging. And I didn’t find any errors in discussions of subjects about which I know something (baseball, computer chess, the stock market, economics), which is reassuring.

What emerges is that there’s no universal recipe for making good predictions. Good prediction involves a lot of good judgment, by which I mean deciding how to weigh various general considerations in any particular case. A few things are always helpful, such as good mechanistic knowledge (which we have for weather, hurricanes, and poker), a large historical database of cases similar to the ones we’re trying to predict (which we have for baseball), and acknowledging all sources of uncertainty and error. And a few things are always unhelpful, most importantly our tendency to see “patterns” where there aren’t any and so overfit the data and make overconfident predictions. But in between, there are lots of things that are helpful in some circumstances but unhelpful in others. For instance, having more computing power has helped weather forecasters, who have long had exactly the right mechanistic model of the atmosphere but lacked the ability to simulate it at sufficiently fine spatial resolutions. More computational power also has helped in computer chess. But it hasn’t helped in earthquake prediction, because we lack the ability to even write down the correct mechanistic model, much less parameterize it. And having data on more predictor variables might sometimes be useful, but often it just means more noise that you have to filter out in order to find a signal, which dramatically increases the risk of overfitting. “Big Data” usually just means a bigger haystack you have to search to find the same needle. (Graduate students, heed this last lesson when designing your own projects! Don’t measure lots of variables just because you can, or because you feel like more data is always better!)
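
To make that last warning concrete, here’s a minimal sketch in R (with made-up data; this is not an example from Silver’s book) showing how piling on pure-noise predictors inflates the apparent fit of a model:

    # Hypothetical illustration: adding pure-noise predictors inflates in-sample fit.
    set.seed(42)
    n <- 50
    y <- rnorm(n)                                   # response with no real signal
    X <- as.data.frame(matrix(rnorm(n * 40), n))    # 40 predictors of pure noise

    few  <- lm(y ~ ., data = X[, 1:2])   # 2 noise predictors
    many <- lm(y ~ ., data = X)          # all 40 noise predictors

    summary(few)$r.squared    # small, as it should be
    summary(many)$r.squared   # much larger, purely from fitting noise

The in-sample R-squared climbs toward 1 as you add predictors even though there is no signal to find – precisely the overconfident pattern-finding Silver warns about – and out of sample the bigger model predicts no better, and usually worse.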

There are two aspects of the book I didn’t like. Neither is a huge deal, but both of them stuck in my craw a bit because I found them to be in such contrast to the rest of the book. I’ll talk about them both at length just because I feel like it. Don’t let the lengthiness of my comments here mislead you into thinking I didn’t like the book, because like I said I actually really liked it.

One thing I didn’t like is something Larry Wasserman and Brian also picked up on: Silver’s confusion about what it means to be “Bayesian” or “frequentist”. Silver spends a whole chapter ripping frequentist statistics as having seriously set back the progress of science, when in fact the proper way to make predictions–or indeed, any inferences whatsoever–is the Bayesian Way. There are several problems with this chapter:

  • Throughout the book, Silver himself adopts what’s clearly a frequentist notion of “probability” as an objective feature of the world. By “probability” Silver means how often something happens in a long series of trials. For instance, he praises weather forecasters because, in the long run, it rains X% of the time when they say there’s an X% chance of rain. What Silver means by “Bayesian” is “using Bayes’ Theorem”, but he doesn’t seem to realize that one can do that quite happily in many contexts while retaining a frequentist notion of probability. Silver’s general explication of Bayes’ Theorem and why it is useful is couched in terms of subjective or “epistemic” probability (though he doesn’t use that term), probability as a measure not of the world but of our uncertainty (lack of knowledge) about the world. This is unfortunate because in practice that’s not actually the definition of probability that he himself uses. You’d think from his general explication of Bayes’ Theorem that Silver cares about subjective Bayesian credible intervals–but it’s clear from the specific case studies he discusses that he actually cares about frequentist confidence intervals. For instance, he explicates Bayes’ Theorem using a toy example of estimating whether your spouse is cheating on you, given the fact that you found a strange pair of underwear in your dresser drawer (a toy numerical version of this sort of calculation is sketched just after this list). Of course, your spouse either is or isn’t cheating on you–the true probability is either 1 or 0–but you can’t be sure which. Silver says this is a problem of epistemic uncertainty. But in practice, he treats your particular case as one of many such cases. That is, he views you and your spouse as members of a frequentist statistical population, so that one can objectively estimate quantities like the prior probability of your spouse cheating on you, at least roughly. Silver interprets this probability as the frequency with which people (presumably people sufficiently similar to you in relevant respects) are cheated on by their spouses.
  • Silver makes the mistake, unfortunately all too common among certain sorts of Bayesians, of identifying “frequentist” with “mistaken or unhelpful applications of classical frequentist null hypothesis tests”. That is, he identifies all of frequentist statistics with the worst examples of it. This is unfair on multiple grounds. First of all, the purpose of null hypothesis testing often is not to aid prediction in Silver’s sense. Predicting the future is hugely important, but it’s not the only hugely important thing to do in science (you’d never know from Silver’s book that the Higgs boson was discovered and confirmed using classical frequentist statistical procedures). So Silver here is doing the equivalent of criticizing a car because it can’t fly. Second, he neglects to mention anywhere that frequentists themselves are among the strongest critics of many of the same practices he criticizes, such as failure to correct for multiple comparisons or using one’s data to suggest hypotheses that are then tested on the same data. In contrast, throughout the book we meet all sorts of bad Bayesians, such as people who are too attached to their prior beliefs and refuse to update them in light of new evidence. Whom Silver criticizes not for being bad Bayesians, but for…not being Bayesian at all. Just as no true Scotsman would ever commit a heinous crime, apparently no true Bayesian would ever engage in any of the bad applications of Bayesianism Silver ably criticizes. Not sure why Silver identifies frequentism with the worst of frequentism, and Bayesianism with the best of Bayesianism, but he does.
  • Silver blames a focus on frequentist null hypothesis testing for many problems in scientific practice that just have nothing to do with that, and that would not be fixed if tomorrow everybody adopted Silver’s preferred methods. Most published research findings are false not because of null hypothesis testing, but because of publication biases, and because of all the hidden biases that Silver himself quite rightly identifies and that are not fixed just by using Bayes’ Theorem. Bayesians of any stripe are just as capable as frequentists of finding excuses to exclude inconvenient data points on spurious grounds, just as subject to bad incentives and cognitive biases, just as inclined to fit overcomplicated models with too many predictor variables, etc.
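
For readers who haven’t seen the cheating-spouse example, here’s a minimal sketch of the kind of calculation Silver walks through, written out in R. The numbers are made up for illustration (they are not necessarily the ones Silver uses); the point is just how the prior and the two likelihoods combine:

    # Toy Bayes' Theorem calculation in the spirit of Silver's cheating-spouse example.
    # All numbers are illustrative, not Silver's.
    prior          <- 0.04  # P(cheating), before seeing any evidence
    p_e_given_h    <- 0.50  # P(strange underwear | cheating)
    p_e_given_noth <- 0.05  # P(strange underwear | not cheating)

    posterior <- (p_e_given_h * prior) /
                 (p_e_given_h * prior + p_e_given_noth * (1 - prior))
    posterior  # about 0.29: the evidence raises the probability but hardly settles the matter

Note that nothing in this arithmetic forces a subjective reading of the prior: you can read the 0.04 as a frequency (the fraction of relevantly similar spouses who cheat), which is exactly the reading Silver slips into in his case studies.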

The other aspect of the book I didn’t like was the chapter on forecasting climate change. I found the chapter very confusing; I have no idea what Silver was trying to say. The whole chapter bounces randomly from one issue to the next, with later sections often appearing to contradict earlier sections. Silver waffles between expressing what he calls healthy skepticism about predictions of climate change, and strong confidence. He waffles on whether the mechanistic complexity of GCMs (global climate models) is a virtue or a vice. He waffles between contrasting climate scientists unfavorably with weather forecasters, and saying that climate scientists are actually doing quite well given that they’re faced with an inherently much harder prediction problem. He presents in a positive way the work of “reasonable skeptics” like Scott Armstrong–and then in other places criticizes Armstrong’s approach. In some places he presents the lack of much increase in global mean temperature over the last 10 years as a reason for healthy skepticism about predictions of global warming, and in other places he dismisses it as expected stochasticity. Silver strongly criticizes climate scientists for developing and providing consensus views, and emphasizing how many scientists agree with that consensus. Which is weird, for two reasons. One is that a consensus view is precisely what the IPCC is designed to provide, for rather obvious political reasons. It’s not as if the IPCC could’ve been based on prediction markets or some other non-consensus-based means of aggregating disparate views. Second, climate scientists quite often emphasize how much they all agree precisely because politically-motivated attacks on their work claim, falsely, that disagreement exists where in fact it doesn’t. How else are climate scientists supposed to respond to false claims that they disagree on some point besides saying “Um, no we don’t”? Silver also criticizes the IPCC for changing their minds from one report to the next, which is totally bizarre coming from someone as keen on formal and informal Bayesian updating as Silver is. New information came in, the IPCC updated its reports accordingly–and this is bad? And finally, Silver doesn’t have any plausible suggestions of his own as to how climate scientists could do better. At the end, he just suggests that climate scientists stick to making predictions and completely recuse themselves from the political process and from any engagement with politically-motivated attacks on their work. But tellingly, he doesn’t say anything about how scientists could actually do this. I mean, is Silver suggesting that scientists should boycott the IPCC, decline to provide policy advice when asked for it, decline to give media interviews, decline to comment when their emails are hacked, not start websites like RealClimate, or what? Bottom line, this chapter did not convince me that Silver has much to teach the IPCC about climate forecasting, the clear communication of uncertainties in climate forecasts, or the best way to make use of scientific information in political decision-making.

I think many readers of this blog will like this book. If you like FiveThirtyEight, you'll like this book. If you're interested in prediction, this book's right up your alley. If you want your students to develop good judgment and a healthy attitude about their own approach to statistics and modeling, this book will probably be at least as valuable to them as any technical textbook you might recommend. And if you're looking for a good range of examples (both positive and cautionary) to use in your stats courses, this book is a good place to turn.

Stuart Hurlbert rips Sokal & Rohlf and the state of biostatistical training (UPDATEDx2)

In a recent issue of Limnology and Oceanography Bulletin, Stuart Hurlbert reviews (UPDATE: link fixed, venue corrected) the new (4th, 2012) edition of Sokal & Rohlf’s classic biostatistical text, Biometry (HT Carl Boettiger). The first sentence of the review gives you the flavor:

Reader be forewarned: were it allowed the title of this review would be “A readable but overblown, incomplete and error-ridden cookbook”.

Tell us how you really feel, Stuart! And to think that sometimes I worry that I'm too tough on other people's work…

You should click through and read the whole thing. But if you’re not so inclined, here’s a brief summary of Hurlbert’s beefs with Sokal & Rohlf (the book, not the people; I’ll refer to the book as Sokal & Rohlf because that’s what everyone does). Hurlbert says his beefs apply to all editions, not just the most recent one:

  • No coverage of experimental design, or sampling design of observational studies. Relatedly, and worse, incorrect or confusing implications about experimental design and sampling design. For instance, there are no formal definitions of key terms like “experiment”, “experimental unit”, “block”, “repeated measures”, etc. Worse, observational studies often are described using experimental terms like “treatment”, “control”, and “randomized block design”. This leads to serious confusion, even about matters as basic as what an experiment is.
  • Too much emphasis on “statistical gimmickry” of little or no practical use, such as standardized effect sizes.
  • Superficial, cookbook-type treatment of many procedures, with no conceptual framework for understanding why one might want to use those procedures.
  • Incorrect, incomplete, and confusing coverage of other matters, from when it's appropriate to use a one-tailed test, to whether to correct for multiple comparisons (Hurlbert apparently believes you should never do so, and so slams Sokal & Rohlf for insisting on it; see the short sketch after this list for what such a correction actually does), and much else besides.
  • Rigid adherence to Neyman-Pearson null hypothesis testing, at the expense of estimation and more refined, quantitative assessment of the evidence for or against any given hypothesis.*
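
For anyone who hasn't thought about this issue in a while, here's a minimal sketch in R of what the kind of correction Hurlbert objects to actually does. The p values are made up for illustration; p.adjust() is part of base R:

    # ten made-up p values, as if from ten separate tests on the same data set
    p <- c(0.003, 0.012, 0.04, 0.049, 0.11, 0.22, 0.35, 0.48, 0.61, 0.94)
    # uncorrected, four of the ten are "significant" at the 0.05 level
    sum(p < 0.05)
    # Bonferroni and Holm corrections inflate the p values so as to control the
    # familywise error rate across all ten tests; only the smallest p value survives
    p.adjust(p, method = "bonferroni")
    p.adjust(p, method = "holm")

Whether that trade-off (fewer false positives, at the cost of less power for any individual test) is ever worth making is exactly what Hurlbert and Sokal & Rohlf disagree about.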

The only value Hurlbert sees in Sokal & Rohlf is as a reference manual of "recipes" for carrying out various statistical procedures. He concludes by blaming the popularity of Sokal & Rohlf for what he sees as decades of poor statistical practice in biology. He also laments that no current biostatistical textbook teaches an appropriately modern philosophy of statistics clearly, with a focus on principles, and without errors.

What do you think of all this? I have to say I found it kind of surprising, but not because I revere Sokal & Rohlf. I’ve mostly used it as a reference manual myself. I’d certainly never try to teach from it at any level, if for no other reason than it’s way too voluminous. I guess I always assumed, without really thinking about it, that it was always intended, and mostly used, as a reference manual. Was I wrong to assume that? And while I find Sokal & Rohlf old-fashioned in some ways (e.g., randomization, bootstrapping, and generalized linear models render classical non-parametric tests and data transformations largely irrelevant), that never really bothered me. The first edition came out in 1969; of course it’s going to be old-fashioned. And I don’t know that it’s fair to pick on Sokal & Rohlf and blame it for the purportedly terrible statistical practices of modern biologists, even though the book certainly is popular. Insofar as our statistical practices are terrible (and I don’t know if they are or not), there’s surely plenty of blame to go ’round. And can’t you also give Sokal & Rohlf credit for helping to encourage more biologists to use statistics in the first place? But I’ve never really thought about Sokal & Rohlf all that much, and I actually haven’t cracked it open in years, so I’m sort of a curious bystander here.
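
To illustrate that parenthetical aside: here's a minimal sketch in R of a randomization test for a difference between two group means, the kind of thing one might now reach for instead of a classical rank-based nonparametric test or a data transformation. The data are made up, and this is just the basic idea, not anyone's recommended recipe:

    # made-up skewed data, the classic excuse for reaching for a rank-based test
    set.seed(1)
    x <- rlnorm(20)                      # "control" group
    y <- rlnorm(20, meanlog = 0.5)       # "treatment" group
    obs <- mean(y) - mean(x)             # observed difference in group means
    pooled <- c(x, y)
    # shuffle the group labels many times and recompute the difference,
    # building up the null distribution
    null <- replicate(9999, {
      shuffled <- sample(pooled)
      mean(shuffled[21:40]) - mean(shuffled[1:20])
    })
    # two-sided p value: proportion of differences at least as extreme as observed
    mean(c(abs(null), abs(obs)) >= abs(obs))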

As an aside, I found it interesting that such vociferous criticism of Sokal & Rohlf came from someone from basically the same school of statistical thought. Hurlbert isn’t a Bayesian of any stripe, nor is he advocating for computationally-intensive methods, for instance. His criticisms of Sokal & Rohlf mostly aren’t criticisms of what the book sets out to do, they’re mostly criticisms of the book’s execution.

What do you think? Does Sokal & Rohlf deserve the criticism Hurlbert heaps on it? More broadly, what do you see as the biggest problems with how modern biologists teach and use statistics? And what textbook(s) should we be using in our courses in order to fix those problems? (Again, Hurlbert says there’s no biostatistics textbook that’s readable, strong on general principles, and error-free!)

My interest in this isn’t purely academic. I’m not just looking to grab some popcorn and watch proponents and detractors of Sokal & Rohlf argue. 😉 As I noted in a previous post, this fall I’m taking over teaching the introductory undergrad biostats course in my department. So for the first time, I need to think seriously and in great detail about exactly what introductory biostatistical material to teach and how to teach it. I’ve settled on a textbook (Whitlock & Schluter), and I have a tentative list of lectures and the major changes I want to make to the existing labs. But nothing beyond that. And even getting that far has required a lot of thought, in particular about precisely the issues Hurlbert raises. How much emphasis to place on general, unifying principles vs. coverage of specific tests. How much emphasis to place on black-and-white rules of good statistical practice vs. equipping students to make informed judgment calls. Etc.

It occurs to me that teaching biostatistics is something like teaching children good behavior. You start out by teaching kids black-and-white rules, like "don't lie" and "don't hit your sister." And it's only later that kids learn that good behavior often isn't black-and-white. Sometimes it's not only ok to lie (or to hit your sister!), it's positively a good idea, morally. Heck, there are lots of tricky moral situations that you aren't even taught about at all until you're older. And that's without even getting into competing, mutually-incompatible philosophies as to what good behavior consists of, and what makes it good! So you tell me–what should we be teaching our "kids" about biostatistics if we want to start them down the road towards responsible "adulthood"? ("Don't hit your sister" becomes "Don't fail to correct for multiple comparisons!")

*Hurlbert actually thinks Sokal & Rohlf should've based their book on what Hurlbert calls the "neoFisherian" approach. I confess I'd never heard the term "neoFisherian", which is Hurlbert's own recent coinage. Hurlbert has a 2009 paper if you want to find out what he means by "neoFisherian" and why he thinks Neyman-Pearson hypothesis testing is so outdated that it should no longer be taught (UPDATE: link fixed). As far as I can tell, what Hurlbert means by "neoFisherian" doesn't sound too far from Deborah Mayo's notion of "error statistics" (which itself is actually not all that far from Neyman-Pearson, or even from some forms of Bayesianism). But it's a little hard to tell, because much of Hurlbert's paper focuses on what seem to me to be rather nit-picky details of current practice (like conventions for reporting P values). Anyway, I think it would've been helpful for Hurlbert to briefly lay out his own philosophy in his review, rather than just refer to it with a term most readers won't have encountered.

Book review: The Pseudoscience Wars by Michael Gordin

A little while back I read The Pseudoscience Wars: Immanuel Velikovsky and the Birth of the Modern Fringe by Michael Gordin, a history professor at Princeton. It's a case study of how professional scientists react to what they see as "pseudoscience"–something that has many of the trappings of real science but is not real science, at least not in the view of professional scientists. It's a good read, working well as a narrative about the life and times of an unusual character. The story contrasts in interesting ways with that of other "pseudosciences" with which I'm more familiar, like "creation science" or opposition to climate science. And the book has implications for contemporary issues I care about, in particular peer review and scientific publishing. I recommend it highly. If you want to know more about why, read on!

The book tells the story of Immanuel Velikovsky (1895-1979), a Russian-born Jew and Freudian psychoanalyst who emigrated to the US in 1939. Velikovsky's research on ancient myths, originally intended as a project in the psychological interpretation of ancient history, eventually led him to publish Worlds In Collision in 1950. In that book, Velikovsky drew on purported evidence from ancient texts to argue that in the 15th century BCE, Venus was ejected from Jupiter as a comet and passed close to Earth. This changed Earth's orbit and axis and resulted in massive global catastrophes recorded in ancient texts. The book became a bestseller and caused a firestorm of controversy. The storm revived in the 1960s and early '70s, when Velikovsky (who had continued to pursue and publish his ideas) became a countercultural hero to US college students. But despite Velikovsky's best efforts, his ideas mostly died with him. He and the massive public controversy surrounding his work are mostly forgotten today, even though only a few decades have passed. Gordin tells the story of Velikovsky and his ideas by drawing heavily on Velikovsky's correspondence and extensive unpublished writings, as well as on the correspondence of his allies and opponents.

Gordin’s book isn’t about whether Velikovsky’s claims were true. Gordin’s interests lie elsewhere. He uses Velikovsky’s story to argue that calling something “pseudoscience” is an act of boundary drawing–and such acts are always contestable. It’s infamously difficult to unambiguously separate “science” from “non-science”, and Gordin argues that it’s actually impossible. One intriguing way he makes this argument is to talk about Velikovsky’s own difficulties drawing boundaries around his own work, policing his own fringes. Velikovsky attracted followers and supporters–but those followers and supporters brought to the table their own ideas, their own interpretations of Velikovsky’s work, and their own suggestions about how it should be pursued. Any discipline, field, or program of inquiry, in order to exist at all, has to draw boundaries that define what it is and what it isn’t.

One way the boundary between science and non-science gets drawn is via peer review. Gordin talks at length about this, and what he found may surprise you. This was one of the most interesting and thought-provoking aspects of the book for me. Gordin notes that one reason astronomers were so upset by Worlds In Collision is that it was originally published as nonfiction by Macmillan, then the most respected publisher of scientific books (especially textbooks) in the US. Getting published by Macmillan was a stamp of serious approval. Outraged astronomers initially accused Macmillan of failing to have Velikovsky’s book peer reviewed–but in fact, Macmillan had had it peer reviewed, in the usual way. The reviewers were all perfectly reasonable choices, and while they all had serious reservations about the correctness of Velikovsky’s claims, they all recommended that the book be published, so that its claims could be exposed to scrutiny. Velikovsky correctly accused his opponents of organized efforts to suppress his ideas (astronomers organized a boycott of Macmillan, and in fear of losing its textbook market Macmillan sold the rights to Worlds in Collision to another publisher). But Velikovsky was no exemplar of openness. As noted above, he himself spent much effort exercising oversight on anyone who wanted to promote his ideas, and purging those who deviated from orthodoxy.

As a blogger, I’m well aware of widespread concern, mostly among more senior scientists, that blogging is just a way for people to do an end run around peer review and publish crazy ideas. So I was very interested to read senior scientists back in the 1950s and ’60s complaining about how it had become too easy for anyone to publish anything, and quite explicitly longing for the days when science was “aristocratic” rather than “democratic”. And remember: their complaint was prompted by a book that went through peer review! Further back, think of cases like Vestiges of the Natural History of Creation, a bestseller in its time (at the dawn of science as a profession), despite widespread and vociferous criticism from the leading experts of the day. I conclude that scientists have been complaining about how it’s too easy for anyone to publish “pseudoscience” for as long as there have been people who call themselves scientists! Which of course is exactly what you’d expect on Gordin’s thesis that, in order for any distinct discipline or intellectual activity to exist, its practitioners have to police its boundaries. So if you long for the “good old days” when it was difficult or impossible for “pseudoscientists” to publish their work and get the public to notice it, sorry–those days never existed!

Gordin is good on the larger context of the time, arguing fairly persuasively that the reason astronomers chose to attack Worlds In Collision publicly (thereby giving it a lot of free publicity and surely helping to boost its sales) was then-fresh memories of the rise of Lysenko's biology in the Soviet Union. It was felt that the failure of scientists to speak out forcefully and publicly enough against Lysenko had helped him rise to power. Gordin also draws interesting contrasts between the progress of Velikovsky's ideas and those of "creation scientists". American "creation science" in something like its present form has its roots in the 1960s, and its founders actually had some brief contact with Velikovsky, due to their shared interest in the effects of purported recent global catastrophes. For various reasons "creation science" was better able to police its own boundaries than Velikovsky was able to police his, and so became a movement that outlived its founders and still exists today.

For me, Gordin's book functioned in part as a companion piece to The Price of Altruism, Oren Harman's recent biography of George Price. Although he had a chemistry Ph.D. and worked on the Manhattan Project, George Price was very much a fringe scientist, with half-brilliant, half-crazy ideas on all sorts of things (most of which he failed to fully develop or publish). He was also a difficult personality. So, not so different from Velikovsky in many ways. Except that Price eventually came up with some big ideas that did pan out scientifically (the application of game theory to animal behavior, and the Price equation). Although even then, those ideas only gained a foothold with the help of established evolutionary biologists (John Maynard Smith and Bill Hamilton, respectively). I suspect that many scientists reading Gordin's book might resist his thesis–that there's no clear, bright line separating science from pseudoscience, or scientists from cranks–on the grounds that Velikovsky's ideas were just obviously not scientific, and Velikovsky himself obviously a non-scientist. This is the "I may not be able to define pornography, but I know it when I see it" view, applied to pseudoscience. But if that's your view, the example of George Price should give you pause. There are infinitely fine gradations from science to pseudoscience, and from scientist to crank. Ecologists and evolutionary biologists are comfortable thinking about such fine gradations in their own work–the boundaries between different ecosystems, or between true "species" and mere "varieties", are famously fuzzy. We should be "preadapted" to be comfortable with the fine gradations from someone like Maynard Smith, to someone like Price, to someone like Velikovsky.

Finally, a point Gordin doesn't make, but could have. Anyone who thinks that "pseudoscience" flourishes because the "general public" isn't sufficiently well-educated will have a tough time explaining the history of "pseudoscience" over the last 150+ years. The average educational level is higher today than it was in the 1950s, and much higher than it was in the 1850s. And yet "pseudoscience" has been around for all that time, and shows no signs of going away (or of getting any worse, as far as I can tell). Anti-vaccine campaigns, anti-GMO campaigns, intelligent design, opposition to climate science, unorthodox ideas about economics, health fads…like Velikovsky's catastrophism, none of that stuff flourishes because of a lack of education, or even a lack of the right sort of education. So if you want to oppose that stuff (and I do!), focusing on "better education" is unhelpful. Don't get me wrong, I'm all for good education! I just don't think that even the most well-educated society would be free of "pseudoscience", or even have less of it than we have right now.

Book review: Petchey and Beckerman’s Getting Started with R

I’ve decided to start doing occasional book reviews. There are only a few sources for book reviews in ecology and evolution, and they often miss books I’m thinking of buying. So once in a while I’ll help fill the gap by posting a review.

Full disclosure: I’m starting with a book by two good friends of mine: Owen Petchey and Andy Beckerman’s Getting Started with R: An Introduction for Biologists. I’m starting with this book because it’s short, and because they were kind enough to send me a free copy. That Owen and Andy are friends affects my decision to review the book in the first place, but doesn’t affect my review.

Owen and Andy are ace ecologists, and R evangelists. They’ve used many other stats packages in the past, but now they use R exclusively, and like it so much they decided to write a book about it. This book grows out of their years of experience offering multi-day training courses on how to use R, an open source, freely available, and extremely powerful statistical package. Like their courses, their book is aimed at biologists who know what statistics they want to do, but who have no experience with R and find the prospect of using command line-driven software daunting. If that describes you, you need this book!

It’s six chapters, covering just over 100 pages:

  • Chapter 1 explains why you would want to use R instead of other statistical software: it's free, it works on any computer with any operating system (Windows, Mac OS X, Linux), it's powerful, it's command line driven rather than menu driven (so it forces you to know what you're doing), and it makes beautiful figures. Chapter 1 also explains how to download and install R.
  • Chapter 2 gets you ready to work, covering topics like how to store your data so R can find it, the basics of how R works, and how to tell R to do what you want it to do.
  • Chapter 3 explains how to get your data into R, and how to graph and manipulate it.
  • Chapter 4 gets further into graphing, telling you how to make and customize common plots like bar graphs and scatterplots.
  • Chapter 5 walks you through how to do some common, basic statistical tests: chi-square, two-sample t-tests, and general linear models (regression, ANOVA, ANCOVA); see the short sketch after this list for a sense of what this looks like in practice. Coverage includes not just how to execute each test, but how to get and interpret the output you want, how to do assumption checks, etc. This isn't a statistics textbook: Owen and Andy assume that you understand the stats you want to do, and just need to be told how to do them in R. Chapter 5 also covers making publication-quality figures to report the results of these tests.
  • Chapter 6 adds some final comments and encouragement.
  • There's an appendix with some references (mainly R-based statistics textbooks).
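
To give a sense of the level the book works at, here's roughly the kind of workflow it walks you through. This is my own minimal sketch rather than an excerpt, and the file name and variable names are hypothetical:

    # get your data into R; "growth.csv", with columns "treatment" and "biomass", is made up
    growth <- read.csv("growth.csv")
    str(growth)                          # check that R read things the way you intended
    # a quick exploratory plot
    boxplot(biomass ~ treatment, data = growth)
    # a two-sample t-test and a general linear model
    t.test(biomass ~ treatment, data = growth)
    fit <- lm(biomass ~ treatment, data = growth)
    par(mfrow = c(2, 2))                 # assumption-checking plots, four to a page
    plot(fit)
    summary(fit)

All of that is base R, nothing fancy, which is exactly the book's pitch: get beginners doing the analyses they already understand, without drowning them in everything else R can do.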

Throughout, the book is illustrated with screenshots. The screenshots are annotated with circles, arrows, and little text boxes pointing out and explaining whatever it is that the main text is referring to. And of course all the R commands needed to do everything that the book covers are presented, and their outputs illustrated.

This is an excellent book. It’s totally functional–the focus is on explaining how to get R to do what you want it to do. Further, the focus is on the things that you’re likely to want to do–the book’s not cluttered with extraneous material. There’s an emphasis on good practice about basic things like workflows, so that you’ll always be able to go back later and reconstruct exactly how you processed and analyzed your data. Finally, the book is written in a deliberately conversational style, and pitched at the perfect level for its audience. Working your way through this book (which can be done in a day or two) is pretty much exactly like having Owen or Andy standing beside you, walking you step-by-step through your first experience with R.

I wish I'd had this book long ago. I've been using R for years, but I picked it up by teaching myself: reading the help files, working through a much longer R-based stats text, and leaning on an add-on package that lets you perform common statistical tasks with drop-down menus.* So until I got this book, I still had really basic gaps in my R knowledge (like "how to make a multi-panel figure"), which this book has now closed.
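
In case you're wondering how basic that gap was: in base R graphics, a multi-panel figure takes one line once somebody shows it to you. A minimal sketch (not taken from the book):

    # two panels side by side in one figure, using base R graphics
    par(mfrow = c(1, 2))
    hist(rnorm(100), main = "Panel A")
    plot(rnorm(100), rnorm(100), main = "Panel B")
    par(mfrow = c(1, 1))                 # reset to a single panel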

Want to learn R, or maybe just want to check it out, but scared about the steepness of the learning curve? Buy this book. Problem solved.

Need further convincing? See Graeme Ruxton’s review in TREE.

*Yes, there is an add-on that makes R menu-driven. Using it makes you a sinner. I am a sinner.