Also this week: why modeling the process that generated the data is the least of your worries, pretty pictures of mammals, and more…

**From Jeremy:**

This will blow your mind (unless you’ve heard of it before). Say that you have sample data on three or more variables. They could be anything, they need not be related in any way. You want to estimate the true (population) means of these variables. Your best estimate is the vector of their sample means, right? *Wrong*. Better estimates can be obtained by shrinking the mean of each variable towards the grand mean of all of them. This is Stein’s Paradox. It’s a famous result in statistics, dating from 1955. It’s totally counterintuitive–until it’s explained properly, and then it makes total sense. And once you get it, you’ll have a much deeper understanding of everything from nonparametric smoothing to empirical Bayes methods. Check out this wonderful, totally non-technical paper on Stein’s Paradox from Brad Efron and Carl Morris. You’ll be glad you did.
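If you'd rather see the paradox than take it on faith, here's a minimal simulation sketch. It is not from Efron and Morris's paper. The true means, the known unit variance, the number of variables, and the function names are all assumptions made up for illustration. It uses the shrink-toward-the-grand-mean form of the James-Stein estimator, with a "positive part" tweak so the shrinkage factor never goes negative. As I understand it, that version uses k - 3 in the shrinkage factor and needs at least four variables; the three-or-more result is for shrinking toward a fixed point.

```python
import random

def james_stein(x, sigma2=1.0):
    """Shrink each observed mean toward the grand mean of all of them."""
    k = len(x)
    grand = sum(x) / k
    ss = sum((xi - grand) ** 2 for xi in x)
    if ss == 0:
        return [grand] * k
    # k - 3 (not k - 2) because the grand mean is itself estimated from the data.
    # max(0, ...) is the "positive part" variant: never shrink past the grand mean.
    c = max(0.0, 1.0 - (k - 3) * sigma2 / ss)
    return [grand + c * (xi - grand) for xi in x]

def total_sq_error(est, truth):
    """Summed squared error across all the means (the 'collective' loss)."""
    return sum((e - t) ** 2 for e, t in zip(est, truth))

def simulate(true_means, n_reps=5000, seed=1):
    """Average total squared error of raw means vs. shrunken means."""
    rng = random.Random(seed)
    raw = js = 0.0
    for _ in range(n_reps):
        # One noisy observation (think: one sample mean) per variable, variance 1.
        x = [rng.gauss(m, 1.0) for m in true_means]
        raw += total_sq_error(x, true_means)
        js += total_sq_error(james_stein(x), true_means)
    return raw / n_reps, js / n_reps

# Hypothetical, unrelated "true" means, invented for this example.
true_means = [0.3, -1.2, 2.0, 0.7, 1.5, -0.4, 0.9, 0.1, 1.1, -0.8]
raw_risk, js_risk = simulate(true_means)
print(f"sample means: {raw_risk:.2f}   shrunken means: {js_risk:.2f}")
```

The thing to notice is that the loss is summed over all the variables. The shrunken estimates do better collectively; any single one of them may well be worse than its own sample mean.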

Deborah Mayo argues that the replication movement shows signs of becoming a bandwagon. As she puts it, “non-significant is the new significant.” A quote to give you the flavor:

The authors seem to think that failing to replicate studies restores credibility, and is indicative of taking a hard-nosed line, getting beyond the questionable significant results that have come in for such a drubbing. It does not. You can do just as questionable a job finding no effect as finding one. What they need to do is offer a stringent critique of the other (and their own) studies. A negative result is not a stringent critique.

Papers that were publicly accused of reporting fraudulent data on the now-defunct website Science Fraud were seven times more likely to be retracted than otherwise-similar papers for which the accusations were privately reported to the relevant authorities. There are various possible explanations, but it sure does look like journal editors often move only in response to negative publicity. Which is unfortunate for various reasons, including the fact that it encourages people to immediately go public with accusations of fraud–accusations that don’t always stand up to scrutiny. But as one of the folks quoted in the linked piece notes, journals need to recognize that a significant fraction of people no longer trust them, and won’t be satisfied by stonewalling.

Via the BBC, a pretty good popular treatment of how human cultural practices have influenced our genetic evolution. Includes discussion of some of the best-studied examples. Good source of examples for undergraduate classes.

Rookie mistakes in statistical analyses of empirical data. Or, “why correctly modeling the data-generating process is the *last* thing you should be worried about.” From economics, but it applies to ecology too. (ht Worthwhile Canadian Initiative)

And finally, the winning shots from the Mammal Society’s Mammal Photographer of the Year competition, which focuses on British mammals. Dolphin 1, salmon 0. And THE BROWN HARE IS WATCHING YOU.🙂

From this side of the pond, the BBC link tells me:

“We’re sorry but this site is not accessible from the UK as it is part of our international service and is not funded by the licence fee. It is run commercially by BBC Worldwide, a wholly-owned subsidiary of the BBC, the profits made from it go back to BBC programme-makers to help fund great new BBC programmes.”

So how’s it funded internationally if not by the licence fee? By advertising presumably, which I’d happily look at in exchange for a view of this page!

Hmm, surprising. And yes, I presume it’s funded by advertising.

I like the link on Rookie mistakes in statistical analyses. I agree with all four points, but I’ll reserve judgement on whether “modelling the data generating process” is the least important. So many analyses in ecology just don’t make sense if you don’t think about the process generating your data. For instance, ignoring detection errors leads to biases in regression coefficients and probably incorrect standard errors too (their point #3). I know, I know, Brian thinks that’s bogus! Fitting a Poisson model with an offset for the size of the sample/length of the interval makes the coefficients directly interpretable as changes in the rate of the event per unit space/time.
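To make the offset point concrete, here's a rough sketch of a Poisson regression with an offset, fitted by hand with Newton-Raphson so it needs nothing beyond the standard library. The counts, transect lengths, habitat indicator, and the function name are all hypothetical, invented for illustration. With log(exposure) as the offset, exp(b0) is the event rate per unit length in the baseline habitat and exp(b1) is the ratio of rates between habitats.

```python
import math

# Hypothetical data: (events counted, transect length in km, habitat indicator 0/1)
data = [
    (3, 1.0, 0), (7, 2.5, 0), (4, 1.5, 0),
    (9, 1.0, 1), (21, 2.0, 1), (30, 3.5, 1),
]

def fit_poisson_offset(data, n_iter=50):
    """Newton-Raphson for the model log(mu) = b0 + b1 * x + log(exposure)."""
    total_y = sum(y for y, _, _ in data)
    total_exposure = sum(e for _, e, _ in data)
    # Start from the pooled rate, which keeps the early Newton steps small.
    b0, b1 = math.log(total_y / total_exposure), 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for y, exposure, x in data:
            mu = exposure * math.exp(b0 + b1 * x)  # offset enters as a multiplier
            g0 += y - mu             # score for b0
            g1 += (y - mu) * x       # score for b1
            h00 += mu                # entries of the information matrix X'WX
            h01 += mu * x
            h11 += mu * x * x
        det = h00 * h11 - h01 * h01
        # Solve the 2x2 system (X'WX) * step = score by hand.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

b0, b1 = fit_poisson_offset(data)
print(f"baseline rate per km: {math.exp(b0):.3f}")
print(f"rate ratio between habitats: {math.exp(b1):.3f}")
```

Because the only covariate here is a 0/1 indicator, the fitted rates should match what you'd get by dividing total events by total transect length within each habitat, which is a handy sanity check on the interpretation.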

I think the point about internal validity is one we ignore at our peril. So much of what’s published is mindless regression without thought to WHY those covariates are measured. We need better theory, or better understanding of the relevance of the theories we have.

Re: Stein’s Lemma. If you really want to estimate the true population parameters then using the mean for each sample is the BLUE (best linear unbiased estimator), contrary to what you imply in this post. However, if you want a combined estimator of the three population means then Stein’s lemma applies. I think this is an important distinction.

Yes, I know. I tried to write the post in a non-technical way to indicate that what you’re interested in is the vector of means collectively, not each mean individually. Apparently I didn’t do that very well. Sorry about that.

Hmmm. I cannot say I’m too familiar with Stein Estimators but from my understanding the goal isn’t to estimate the means of three different samples but to use the three samples to estimate the true population parameter (from which the means of each sample will deviate). So if one has some measure from the three different samples then the best estimate (in terms of minimizing expected error variance) is the Stein estimator and not the mean of the three samples. The mean of the three samples is BLUE but has large error variance while Stein finds the sweet spot on the bias-variance trade-off curve. This may be what Dominic said but I’m confused by his comment (and Jeremy’s response!)

Thanks for the great Efron article. I’ve now read it and see that Stein *is* used to estimate parameters from the different populations. Very nice. But this still leaves me confused by Dominic’s comment that if you want to estimate the true population parameters of the different pops, the mean of each is BLUE. Again, true, but this comes at the cost of higher expected error variance relative to biased estimators like Stein.

My only gripe was the use of the phrase “best estimate”. In statistics, by “best” we usually mean having the lowest variance among all unbiased estimators (as in BLUE). You can certainly reduce the variance of an estimator if you are willing to sacrifice unbiasedness. This is the case with many of the modern shrinkage methods in regression like LASSO.

@Dominic:

My informal wording closely follows that used by Efron and Morris in their paper. For instance, in their introductory passages they refer to Stein’s method as “better” and “more accurate” than just using the sample means. So I suggest that if you find my admittedly-informal phrasing seriously misleading, you should also have a beef with Efron and Morris’ phrasing.

Further, I linked to the Efron and Morris piece, and to the quite good Wikipedia page, for a reason: so that curious readers could click through. And I tried to write my comments in such a way as to encourage people to click through (which several dozen have done so far). I highly doubt that anyone would be seriously misled by my phrasing, simply because no one is going to rely on it. Anyone who cares enough about the topic to do anything other than forget my passage as soon as they read it is going to click through.

As I said in a previous comment, I don’t know that my informal phrasing is necessarily the best possible informal phrasing. But with respect, I think you’re misunderstanding the purpose of my passage, and judging an informal paragraph in a linkfest aimed at ecologists by the wrong standards.

@Jeremy. Fair enough, perhaps as a statistics PhD student I was a little too sensitive to the phrasing (you know how students can be!). But I do think this can be really confusing for your typical ecology student (I have consulted for quite a few).

“I think the point about internal validity is one we ignore at our peril. So much of what’s published is mindless regression without thought to WHY those covariates are measured. We need better theory, or better understanding of the relevance of the theories we have.”

Mindful regression is not likely to do much better with observational designs. The problem in identification is that for regression to result in coefficients that measure unbiased causal effects with error variances due only to sampling (and not missing confounders), either 1) all of the confounding factors must be in the model or 2) all confounding factors must be blocked using something like instrumental variables. Theory about why a covariate should be in a model doesn’t reduce bias or decrease (the real) error variance. For the most part, neither is really achievable in practice. That’s my pessimistic view at least, but I’d be happy if someone could convince me otherwise.

I agree that mindful regression won’t necessarily be better with observational designs. However, by “better theory” I was thinking more of actual theoretical models fitted to data, rather than regression at all.

Agreed. And I like Jeremy’s “Tony Ives Rocks” post on this sort of strategy.

(EDIT from Jeremy: here’s the post Jeff’s referring to: https://dynamicecology.wordpress.com/2013/08/06/esa-monday-review-tony-ives-rocks/)

Re Stein: does it only work for quantities with the same dimensions? The formulae in the Efron example involve adding sample means.

Good question. You’d think it would have to be restricted to quantities with the same dimensions, since otherwise how could you calculate Stein’s formula?