To me, one of the all-time most interesting meta-research papers (research about the process of research) was written by William Shockley way back in 1957 (“On the statistics of individual variations of productivity in research laboratories”).
Shockley won the Nobel prize for the invention of the transistor and later more or less singlehandedly launched Silicon Valley. He was, by any account, a not so nice person. He attempted to steal sole credit for the transistor when he had collaborated with two other people; in later life he veered into eugenics, culminating in founding a sperm bank only for Nobel prize winners. And you can find roots of his eugenics views in the paper I am about to talk about. Despite all this, I have returned many times to this one paper and found worthwhile new ideas in it (I guess it proves science shouldn’t be about personalities).
The paper was written when Shockley was the director of Bell Laboratories back in the 1950s, when it was one of the premier research centers in the world. He gathered statistics on all of the research employees in his lab as well as many other national labs, university departments and other research units and showed rather decisively that productivity, as measured by total number of publications, rate of publication and number of patents, is lognormally distributed (most researchers had low productivity and a few had extremely high productivity, just like the species abundance distribution in ecology). This observation remains true today and applies to other research-related quantities that are also lognormal, like the number of citations a paper generates (the median number of citations for a paper is zero!), the impact factor of journals, etc. It is where he went with this idea that I find interesting.
I want to start with what is for me the secondary point of the paper. Namely, Shockley was a lab manager responsible for setting salaries, and he mused on the implications of lognormal productivity for salaries. His observation was pretty simple: salaries are basically additive. The best researcher is likely to be paid only 50-100% more than the average, while their productivity will be an order of magnitude (1000%) or more above the average. Thus, his rather inescapable conclusion is that from a manager’s point of view, where one is trying to maximize productivity per dollar spent, one should exploit the additive salary/multiplicative productivity disparity by employing the people in the extreme right tail of the productivity curve, even if it means paying them 50-100% more. He was basically railing against the very rigid salary structures of large corporations and governments that prohibited this. Over time, in many intellectually based disciplines (academics, software development, engineering) top managers have followed this advice. And for better and for worse, this is the rationale of the “superstar” system at research-intensive universities, where the very top people seemingly can negotiate any salary and perks they want (it’s probably also the rationale by which the top departments and universities get disproportionately more money). Of course there are countervailing reasons why this might be a bad idea too: fairness, innate human dignity, encouraging a collaborative culture, the need for glue people who make the place hang together even if they’re not cranking out the papers (we still need department chairs and peer committees), etc. I’m not weighing in on what is the right way to do it, but the logic of the superstar system is laid rather bare in Shockley’s paper.
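The arithmetic behind this argument can be sketched in a few lines. The numbers are illustrative assumptions taken from the ranges quoted above (a 10x productivity multiple and a 50-100% salary premium), not data from Shockley's paper, and `output_per_salary_dollar` is a hypothetical helper name.

```python
def output_per_salary_dollar(productivity: float, salary: float) -> float:
    """Research output per unit of salary, both expressed relative to the average researcher."""
    if salary <= 0:
        raise ValueError("salary must be positive")
    return productivity / salary


# Baseline: the average researcher has productivity 1.0 at salary 1.0.
average = output_per_salary_dollar(1.0, 1.0)

# Assumed star researcher: 10x the output, paid 50% or 100% more than average.
star_low_premium = output_per_salary_dollar(10.0, 1.5)
star_high_premium = output_per_salary_dollar(10.0, 2.0)

print(f"average researcher:   {average:.2f} output per salary unit")
print(f"star at +50% salary:  {star_low_premium:.2f} output per salary unit")
print(f"star at +100% salary: {star_high_premium:.2f} output per salary unit")
```

Under these assumptions, even the doubled salary buys several times more output per dollar than the average hire. The sketch ignores all of the countervailing forces listed above.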
To me the more interesting discussion is on the mechanisms that might lead to a lognormal distribution in productivity. Shockley presents several possibilities. But the one on the left hand side of page 286 has stuck with me and informed my own approach to research.
Shockley suggests that producing a paper is tantamount to clearing every one of a sequence of hurdles. He specifically lists:
- ability to think of a good problem
- ability to work on it
- ability to recognize a worthwhile result
- ability to make a decision as to when to stop and write up the results
- ability to write adequately
- ability to profit constructively from criticism
- determination to submit the paper to a journal
- persistence in making changes (if necessary as a result of
Shockley then posits: what if the probability of a person clearing hurdle #i from the list of 8 above is p_i? Then the rate of publishing papers for this individual should be proportional to p_1 × p_2 × p_3 × … × p_8. This gives the multiplication of random variables needed to explain the lognormal distribution of productivity: the log of a product is a sum of logs, and a sum of many independent terms tends toward a normal distribution. (Shockley goes on to note that if one person is 50% above average in each of the 8 areas then they will be 2460% more productive than average at the total process.)
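Here is a minimal sketch of the multiplicative hurdle model in Python. The first function reproduces the “50% above average at all 8 hurdles” arithmetic. The simulation is my own illustration, not Shockley's analysis: it assumes each hurdle probability is drawn independently and uniformly between 0.2 and 1.0, and the function names are hypothetical.

```python
import math
import random
import statistics


def relative_productivity(hurdle_ratios):
    """Multiply per-hurdle ratios (person / average) to get overall relative productivity."""
    return math.prod(hurdle_ratios)


def simulate_productivities(n_people=10_000, n_hurdles=8, seed=0):
    """Draw independent hurdle probabilities per person and multiply them together.

    The uniform(0.2, 1.0) distribution is an arbitrary assumption for illustration.
    """
    rng = random.Random(seed)
    return [
        math.prod(rng.uniform(0.2, 1.0) for _ in range(n_hurdles))
        for _ in range(n_people)
    ]


# Someone 50% above average at each of the 8 hurdles.
ratio = relative_productivity([1.5] * 8)
print(f"1.5^8 = {ratio:.2f} times the average")

# A skewed distribution: a few very productive people pull the mean above the median.
sample = simulate_productivities()
mean_p = statistics.mean(sample)
median_p = statistics.median(sample)
print(f"simulated mean productivity:   {mean_p:.4f}")
print(f"simulated median productivity: {median_p:.4f}")
```

The product 1.5^8 is about 25.6, i.e. roughly 2460% more than average. In the simulated population the mean sits well above the median, which is the right-skewed signature of a lognormal-like distribution.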
But what I really like and take home from this paper is the hurdle model and how I find it a useful way to think about my paper writing productivity. The model says writing a paper is not about one thing. It is about a bunch of things. And – the really surprising point – all of those things count more or less equally. I think most academics have a mythos that people who are productive scientists are mostly good at #1 (coming up with ideas) and maybe #5 (writing). I don’t think most people think about the fact that being productive is about knowing when to stop or knowing what is an important result. And especially, #7 and #8 are about rejection and dealing with rejection. Did it ever strike you that being a productive scientist is 1/4 about dealing with rejection well? It probably should – recently at a meeting the VPR (vice president of research) on our campus pointed out that the person with the most grants on campus was the person who had been rejected on the most grants.
Another conclusion is that if you are really bad at just one factor (p_i close to zero for just one i), it sinks your overall productivity. This is inherent in the multiplicative model (it is analogous to the ecological concept of bet hedging*). Being moderately good at everything is better than being great at some things and terrible at others (the oft-heard “I’m terrible at writing but really good at coming up with ideas” doesn’t cut it, but nor does the opposite).
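A quick numerical sketch makes the point concrete. The two profiles below are invented for illustration: one person is moderately good at all 8 hurdles, the other is excellent at 7 and nearly hopeless at 1.

```python
import math
import statistics


def paper_rate(hurdle_probs):
    """Relative rate of finished papers under the multiplicative hurdle model."""
    return math.prod(hurdle_probs)


# Hypothetical profiles (probabilities of clearing each of the 8 hurdles).
balanced = [0.7] * 8               # pretty good at everything
lopsided = [0.95] * 7 + [0.05]     # great at seven hurdles, terrible at one

for name, profile in [("balanced", balanced), ("lopsided", lopsided)]:
    print(
        f"{name}: average skill = {statistics.mean(profile):.3f}, "
        f"paper rate = {paper_rate(profile):.4f}"
    )
```

The lopsided profile has the higher average skill but the lower paper rate, because a single near-zero factor drags the whole product down.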
In a way the hurdle model is disappointing – I have to be good at lots of areas to get papers out the door. But in another way it is really comforting – I don’t have to be great at any area – I just have to tend to my knitting and plug away at ALL aspects of the process and I’ll do alright. Bottom line – getting a paper published (or a grant or even tenure) is about being pretty good at clearing all of a lot of different types of hurdles but not exceptional at anything.
Concretely, I use this model when I’m worried about my productivity by trying to think about which hurdle is most holding me back (not on one paper but across the sweep of papers).
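One way to sketch that exercise in code: under the multiplicative model, raising p_i by a fixed amount multiplies the overall rate by (p_i + delta) / p_i, which is largest for the weakest hurdle. The self-assessed scores and function names below are entirely hypothetical, and the hurdle labels are my shorthand for Shockley's list.

```python
import math

# Hypothetical self-assessment: probability of clearing each hurdle.
my_scores = {
    "thinking of a good problem": 0.8,
    "working on it": 0.7,
    "recognizing a worthwhile result": 0.6,
    "deciding when to stop": 0.3,
    "writing adequately": 0.7,
    "profiting from criticism": 0.6,
    "submitting to a journal": 0.8,
    "persisting with changes": 0.5,
}


def gain_from_improving(scores, hurdle, delta=0.1):
    """Factor by which the overall paper rate grows if one hurdle improves by delta."""
    p = scores[hurdle]
    if p == 0:
        return math.inf  # any improvement on a zero beats everything else
    return min(1.0, p + delta) / p


def biggest_bottleneck(scores, delta=0.1):
    """Hurdle where the same small improvement yields the largest overall gain."""
    return max(scores, key=lambda hurdle: gain_from_improving(scores, hurdle, delta))


bottleneck = biggest_bottleneck(my_scores)
print(f"work on: {bottleneck}")
print(f"expected gain: {gain_from_improving(my_scores, bottleneck):.2f}x")
```

The design choice here is to compare equal absolute improvements across hurdles. That is itself an assumption: in practice some hurdles are much harder to improve than others.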
What do you think? Does the hurdle model work for you? Any steps Shockley should or shouldn’t have included?
*To grossly oversimplify, bet hedging notes that in a sequence of fitnesses over time, evolution maximizes the geometric mean (multiplicative model) rather than the normal arithmetic mean (additive model). As a result, fitness is increased both by increasing the fitness of each component and by reducing variance between components. Thus a steady-as-she-goes strategy without a lot of wild up and down swings is favored.
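A small sketch of the geometric versus arithmetic mean distinction, using made-up per-generation fitness values (growth multipliers) chosen so that both strategies have the same arithmetic mean:

```python
import statistics

# Hypothetical fitness (growth multiplier) in each of four generations.
steady = [1.1, 1.1, 1.1, 1.1]
volatile = [1.6, 0.6, 1.6, 0.6]

for name, fitnesses in [("steady", steady), ("volatile", volatile)]:
    print(
        f"{name}: arithmetic mean = {statistics.mean(fitnesses):.3f}, "
        f"geometric mean = {statistics.geometric_mean(fitnesses):.3f}"
    )
```

The two strategies have the same arithmetic mean, but the volatile one has a geometric mean below 1, so it shrinks over time while the steady one grows.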