Ordinarily I’d save this for a Friday link, but it seemed sufficiently bloggable that I’m giving it its own post. Here’s econometrician (i.e. economics statistician) David Giles’ list of 10 things for econometricians to keep in mind when they analyze their data. I think it’s a really good list for ecology too.
- Always, but always, plot your data.
- Remember that data quality is at least as important as data quantity.
- Always ask yourself, “Do these results make economic/common sense?”
- Check whether your “statistically significant” results are also “numerically/economically significant”.
- Be sure that you know exactly what assumptions are used/needed to obtain the results relating to the properties of any estimator or test that you use.
- Just because someone else has used a particular approach to analyse a problem that looks like yours, that doesn’t mean they were right!
- “Test, test, test”! (David Hendry). But don’t forget that “pre-testing” raises some important issues of its own.
- Don’t assume that the computer code that someone gives to you is relevant for your application, or that it even produces correct results.
- Keep in mind that published results will represent only a fraction of all the results that the author obtained; the rest go unpublished.
- Don’t forget that “peer-reviewed” does NOT mean “correct results”, or even “best practices were followed”.
Economist Mark Thoma adds a couple more:
- Don’t take econometric techniques in search of questions. Instead, start with the important questions and then develop the econometrics needed to answer them.
- Model the process that generates the data.
I’ll add a few more:
- If any of your data-analytic choices are data-dependent, there’s a very good chance you’re compromising the validity of your inferences. Best practice is to pre-specify everything. (This may be a restatement of Giles’ #7, but I’m not sure because I’m not sure what he means by “pre-testing”)
- Multiple comparisons are a real problem.
- The best analysis is the simplest, easiest-to-understand analysis adequate for addressing the question asked. (I think of this as Brian’s “no statistical machismo” rule)
- Overfitting is just as bad as underfitting.
- These guidelines can conflict. For instance, the desire to model all of the processes that generated the data is one source of statistical machismo. And it’s not always obvious how to apply these guidelines. For instance, in ecology one can often dream up a plausible-sounding post hoc explanation for any statistical result, thereby rendering David Giles’ guideline #3 useless. This means that doing statistics well is a matter of making judgment calls about how to apply the “rules”, and about what rules to break. Good judgment is built on a foundation of both technical knowledge and experience.
So, what would you add to this list?
HT Economist’s View
With regard to #3, I would even say that in ecology we are often more excited by results that do not make common sense (though they still have to make some sense). Usually this happens when the results were expected (by colleagues at least) to be trivial.
A few more for ecologists come to mind:
1) I think you reference this above, but to put it more explicitly: make sure your data don’t violate the statistical assumptions of the test you’re using! So many things get to the review stage and it turns out the data are non-independent, highly heteroskedastic, non-normally distributed, etc.
2) When using R, don’t just use functions and packages at face value. Take the time to read the package documentation and understand what each function does. A very common complaint I hear about R is that the aov() function gives Type I sums of squares instead of Type III. People are surprised when someone points it out to them for the first time. BUT, if they’d read the notes for the aov function, it flat out says that it isn’t the appropriate function to use on unbalanced data (precisely for that reason: Type I and Type III are the same if the data are balanced). I think this goes along with blindly using code that other people give you. Take the time to understand each and every thing the code is doing!! Better a little pain up front than discovering that your entire analysis is wrong.
3) My last point is a general question for Jeremy about multiple comparisons. It’s kind of a philosophical “where do you draw the line?” question. For the sake of argument, let’s say I have 3 very similar sets of community data (each with multiple sites and 5 important environmental variables) and I’m running the exact same 5 linear regression models on each set of data and publishing them together in one paper. Do I use a corrected alpha based on 15 tests, or 5 tests? I’ve heard arguments going both ways. I tend to err on the side of being conservative, but that’s more to placate potential objections from reviewers….
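For concreteness, the two candidate corrections being weighed here can be computed directly. This is just the arithmetic, not a verdict on which family size to use (the 5 and 15 come from the scenario above):

```python
# Per-test alphas that hold the family-wise error rate at 0.05,
# for the two family sizes in question (5 tests vs. 15 tests).
alpha = 0.05

def bonferroni(alpha, m):
    # Conservative: divide the family-wise rate equally among m tests.
    return alpha / m

def dunn_sidak(alpha, m):
    # Exact under independence: solve 1 - (1 - a)**m = alpha for a.
    return 1 - (1 - alpha) ** (1 / m)

for m in (5, 15):
    print(f"m={m}: Bonferroni={bonferroni(alpha, m):.5f}, "
          f"Dunn-Sidak={dunn_sidak(alpha, m):.5f}")
```

Dunn-Šidák is always slightly less conservative than Bonferroni, which is presumably why it gets suggested as a compromise; neither calculation answers the philosophical question of which tests belong in the same “family”.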
Re: your #1 and 2, Alan Stewart-Oaten has a great article on rules and judgment in statistics, which if I recall correctly uses testing of assumptions and type I/II/III SS as examples. Briefly, he suggests that neither issue is cut-and-dried:
EDIT: he talks about multiple comparisons too; I’d forgotten that.
I’ll echo Jeremy
Re #1 – you have to really know what you’re doing. Many violations of assumptions work just fine if you plow ahead and ignore them. E.g., ANOVA and regression are famously robust to non-normal data as long as the data are not skewed (kurtosis is not a big deal). Other violations affect some outputs (e.g. p-values) but not others (e.g. estimates of coefficients), non-independence being in this category, or the opposite (I think heteroskedasticity is in this category). Especially when the alternative is to just give up and not do the analysis, one needs to be very careful about insisting, to a pedantic level, that all assumptions be met.
Re #2 – AMEN! And I’ll go one step further: take steps to assure yourself that the correct answers are being calculated! I can name two R packages on CRAN relevant to ecology that return total garbage results. Reputation (the status of the author and the number of users of the package) is probably the best guard against this for those not able to do their own checks.
Re #3 – now you’re into philosophy. In all seriousness, you can’t answer your question without a discussion of how inference is done in science. Which is why my rule of thumb is to worry about multiple comparisons only when you have a dozen or more tests, and even then, if the majority of p-values are significant and much less than 0.05, it is kind of a waste of time to do a correction.
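Brian’s point in Re #1 is easy to check by simulation. A minimal sketch in Python (stdlib only; the sample size, error distribution, and replicate count are arbitrary choices, and a normal approximation stands in for the exact t reference distribution): under the null, with heavy-tailed but symmetric errors, the regression slope test still rejects at close to the nominal 5% rate.

```python
import random
from statistics import NormalDist

# Null simulation: y is unrelated to x, but the errors are Laplace
# (heavy-tailed, symmetric) rather than normal. If the slope test is
# robust to this violation, it should reject ~5% of the time.
random.seed(1)
n, n_sim, alpha = 60, 2000, 0.05
x = [random.gauss(0, 1) for _ in range(n)]
mean_x = sum(x) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)

def slope_pvalue(y):
    """Two-sided p-value for the OLS slope (normal approximation)."""
    mean_y = sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    se = (rss / (n - 2) / sxx) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(b / se)))

def laplace_error():
    # Symmetric heavy-tailed error: exponential magnitude, random sign.
    return random.expovariate(1.0) * random.choice((-1, 1))

false_pos = sum(
    slope_pvalue([laplace_error() for _ in range(n)]) < alpha
    for _ in range(n_sim)
)
print(f"empirical type I error rate: {false_pos / n_sim:.3f}")
```

Swapping in a skewed error distribution, or shrinking n, is a quick way to probe where the robustness starts to break down.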
I knew if I waited long enough Brian would point out the robustness of many approaches to violations of their assumptions, and how pedantry about assumptions is an example of bad judgment about applying the “rules” of statistics. 🙂
Regarding the multiple comparisons problem:
[PDF: multiple2f.pdf]
“…multilevel models address the multiple comparisons problem and also yield more efficient estimates, especially in settings with low group-level variation, which is where multiple comparisons are a particular concern.”
The title of the paper seems to refute Jeremy’s 2nd point but it actually acknowledges the problem and provides a better solution than ad hoc corrections (which have been rightly criticized here in the past).
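The mechanism in that quote can be sketched in a few lines. This is a toy version of partial pooling with made-up numbers and an assumed known between-group standard deviation (a real multilevel model would estimate it from the data):

```python
# Toy partial pooling: each raw group estimate is pulled toward the
# grand mean, which is how multilevel models tame extreme comparisons.
group_means = [2.1, -1.7, 0.3, 4.0]  # hypothetical raw per-group estimates
se = 1.0    # assumed standard error of each group mean
tau = 0.8   # assumed between-group standard deviation

grand_mean = sum(group_means) / len(group_means)
# Precision-weighted compromise: w -> 1 recovers the raw estimates,
# w -> 0 pools everything completely into the grand mean.
w = (1 / se**2) / (1 / se**2 + 1 / tau**2)
pooled = [grand_mean + w * (m - grand_mean) for m in group_means]
print([round(p, 2) for p in pooled])
```

The most extreme groups get shrunk the hardest, so fewer group differences look spuriously “significant” — which is the sense in which the multilevel model “addresses” multiple comparisons without an explicit alpha correction.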
It’s for this reason that I just said “multiple comparisons are a problem”, without specifying what to do about them.
My own view is that there are various ways to deal with multiple comparisons and it’s a judgment call which one to use.
Fair enough, though I’d argue that many folks will see that and automatically think Bonferroni adjustments are a necessary and useful tool. I’m not sure I agree with Stuart Hurlbert’s caustic attitude on the subject, but the multiple comparisons problem has historically veered into the realm of “statistical machismo”.
“Pre-testing” probably refers to preliminary analyses run on a partial data set (when you’re too curious and cannot wait for the experiments to actually finish), or to the very beginning of a statistical investigation.
I’d be inclined to think changes may occur in the course of an analysis, when you eventually decide that other covariates better suit your question, or when preliminary analyses show that some covariates are collinear above thresholds fatal to the analysis (and you didn’t expect it).
In this sense, “pre-analyses” are a common step in analysing data.
On the multiple comparisons, I’m curious as to your thoughts on this Hurlbert piece where he asserts that they aren’t the problem we make them out to be, and that this is more of a historical artefact of imprecise language than a real thing. It’s something I still go back and forth on at times.
I’ve seen Hurlbert’s piece. I confess I don’t fully understand his point of view.
Apparently it remains an open discussion. I tend to think that if you’re testing multiple independent data sets, it shouldn’t be a serious problem. But if you use the same (e.g., environmental) covariate across multiple tests, as is commonly done in ecological studies, you should seriously start thinking about correcting for multiple comparisons (IMO Dunn-Šidák seems like a reasonable compromise) or, preferably, using multilevel models as mentioned above.
X) Whatever you do, there will be an informed person who will think that you’ve done the wrong thing.
bam! nailed it.
/end obvious joke 🙂
One more piece of advice I forgot:
For analyses of any complexity (or heck, even quite simple analyses), run the analysis on white noise and see what you get. This is a really good sanity check. You should get nothing; if you don’t, that’s bad.
Note that if you used some kind of model selection process (including informal things like eyeballing your data before deciding what analysis to do), you need to repeat the entire model selection process on the white noise, not just refit whatever final model you ended up with in the original analysis.
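A minimal sketch of that check in Python (stdlib only; all sample sizes are arbitrary). It runs the full “screen 20 candidate predictors, keep the best one, test it” procedure on pure noise; the selected predictor comes out “significant” far more often than the nominal 5%:

```python
import math
import random
from statistics import NormalDist

random.seed(2)
n, k, n_sim, alpha = 40, 20, 300, 0.05

def corr(x, y):
    # Pearson correlation between two length-n lists.
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def pvalue(r):
    # Two-sided p-value for a correlation via Fisher's z-transform.
    z = 0.5 * math.log((1 + r) / (1 - r)) * (n - 3) ** 0.5
    return 2 * (1 - NormalDist().cdf(abs(z)))

hits = 0
for _ in range(n_sim):
    y = [random.gauss(0, 1) for _ in range(n)]           # pure-noise response
    xs = [[random.gauss(0, 1) for _ in range(n)] for _ in range(k)]
    best_r = max((corr(x, y) for x in xs), key=abs)      # "model selection"
    hits += pvalue(best_r) < alpha                       # test only the winner
print(f"'significant' in {hits / n_sim:.0%} of pure-noise datasets")
```

An honest analysis that pre-specified a single predictor would land near 5% here; the inflation comes entirely from letting the data pick which test gets run.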
I’d add: Make sure you’ve read and digested at least 2 good, independent, current experimental design/statistics text-books before starting anything 🙂
Run the analysis several different ways, and if they all come out the same, you’re set (i.e., robustness is good). If they’re different, you have to figure out why.
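One way to operationalize that for a simple regression: fit the same slope two ways, one parametric and one robust, and compare. A sketch with made-up data (the true slope of 2 is an assumption of the example):

```python
import random
import statistics

# Hypothetical data: true slope of 2 plus modest normal noise.
random.seed(3)
n = 30
x = [random.uniform(0, 10) for _ in range(n)]
y = [2 * xi + random.gauss(0, 1) for xi in x]

# Way 1: ordinary least squares slope.
mx, my = sum(x) / n, sum(y) / n
ols = (sum((a - mx) * (b - my) for a, b in zip(x, y))
       / sum((a - mx) ** 2 for a in x))

# Way 2: Theil-Sen slope (median of all pairwise slopes; outlier-robust).
theil_sen = statistics.median(
    (y[j] - y[i]) / (x[j] - x[i])
    for i in range(n) for j in range(i + 1, n) if x[i] != x[j]
)
print(f"OLS: {ols:.2f}  Theil-Sen: {theil_sen:.2f}")
```

If the two disagreed badly, that would be the cue to go figure out why (outliers, leverage points, nonlinearity) before trusting either estimate.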