Yesterday I presented what I tongue-in-cheek (or arrogantly – take your pick) called “10 commandments for good data management”. In that post I laid out what I believe to be best practices for managing and analyzing scientific data. Key points were to separate a data entry copy from an analysis copy of the data and to organize them differently, to use row-column organization of raw data, to use a star schema, and to denormalize data before analysis.
Here I present a worked example. It is from a hypothetical survey of raptors (data actually generated by a computer simulation). It records abundances for a number of species of raptors at a number of sites and on a number of days. The sites are unimaginatively named alpha, beta, gamma, etc. Dates are in American (mm/dd/yyyy) format. Species names are real species names for raptors in North America. Abundances are made up. There is also data on temperature for each of those sites for each of those days, and some ancillary information on sites (including lat/lon coordinates). It is a constellation schema in the terms of yesterday’s post. One fact table is abundance with dimensions of time, site, and taxa. The other fact table is measure with dimensions of time and site. The dataset also has a number of data entry errors of the types typically seen.
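To make the constellation schema concrete, here is a minimal sketch of the denormalization step in plain Python. It is not the code from the worked example itself: the coordinates, dates, counts, and temperatures are invented for illustration, and the names `sites`, `taxa`, `abundance`, `measure`, and `denormalize` are assumptions chosen to mirror the tables described above.

```python
from datetime import datetime

# Dimension tables, keyed by identifier (all values are invented).
sites = {
    "alpha": {"lat": 44.9, "lon": -68.7},
    "beta": {"lat": 45.2, "lon": -69.1},
}
taxa = {
    "Buteo jamaicensis": {"common_name": "Red-tailed Hawk"},
    "Pandion haliaetus": {"common_name": "Osprey"},
}

# Fact table 1: abundance, with dimensions time, site, and taxa.
abundance = [
    {"date": "06/01/2015", "site": "alpha", "species": "Buteo jamaicensis", "count": 3},
    {"date": "06/01/2015", "site": "beta", "species": "Pandion haliaetus", "count": 1},
    {"date": "06/02/2015", "site": "alpha", "species": "Pandion haliaetus", "count": 2},
]

# Fact table 2: measure, with dimensions time and site.
# The 06/02/2015 reading for alpha is deliberately missing.
measure = {
    ("06/01/2015", "alpha"): {"temperature": 18.5},
    ("06/01/2015", "beta"): {"temperature": 17.0},
}


def denormalize(abundance, measure, sites, taxa):
    """Join both fact tables and the dimensions into one flat table."""
    rows = []
    for fact in abundance:
        row = dict(fact)
        # Parse the American mm/dd/yyyy date into a real date object.
        row["date"] = datetime.strptime(fact["date"], "%m/%d/%Y").date()
        # Attach the site and taxa attributes to every abundance record.
        row.update(sites[fact["site"]])
        row.update(taxa[fact["species"]])
        # Attach the temperature for the same day and site, if recorded.
        row.update(measure.get((fact["date"], fact["site"]), {"temperature": None}))
        rows.append(row)
    return rows


flat = denormalize(abundance, measure, sites, taxa)
for row in flat:
    print(row)
```

Each row of `flat` now carries everything needed for analysis in one place, which is the point of denormalizing: the data entry copy stays in separate tables, while the analysis copy is a single wide table. A missing temperature shows up as `None` rather than silently dropping the abundance record.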
Usually when I am asked to give a few words to describe myself I say macroecologist or large-scale-ecologist. And I might on other days say biodiversity scientist or global change scientist. But a lot of days I would say “ecoinformatician”. Ecoinformatics is the subset of bioinformatics that applies to ecology – that is to say informatic (data) techniques applied to ecology. Some of you may know that I spent 9 years in business before returning to my PhD. But not many know that most of what I was doing was business informatics. Helping companies understand their data. It wasn’t planned. I just have always liked seeing what the data has to tell me. But it turned out to be great training as ecology dived into informatics just as I hit graduate school.
Not surprisingly given my background, I spend a lot of time being asked to make recommendations on how to work with data. I’ve also been involved in some very large data projects like BIEN. Here I don’t want to focus on the large (often social) issues of really big projects (my slides from ESA 2015 on the next 100 years of ecoinformatics are on figshare if you’re interested). Here I want to focus on the much smaller single person or lab-scale project. This post attempts to summarize what I have learned to be best practices over both my business informatics and ecoinformatics careers. I am intentionally going to stay tool or software agnostic. In this post I really want to emphasize a frame of mind and mental approach that can be implemented in literally dozens of different software packages. In a second post tomorrow, I will give a worked example in R since I know that has the highest popularity in ecology.
I attended the BES Macroecology meeting in Oxford last Thursday and Friday. It was a great meeting. Check out a storify of the conference tweets for details. I suppose it says something about me, but every time I get >24 hours of all macroecology, I get reflective on trends I see. As I noted last year, macroecology is in a self-aware and self-reflective adolescence. And this was evident again this year. A great deal of the conversation was on topics like “what is macroecology?”, “is macroecology working?”, “should we move past pattern to process?”, and “how does macroecology relate to conservation and the public/policy dialogue?”. For those of you who hang around macroecology, these seem like perennial conversations. There was a great conversation with many enlightening thoughts shared on all of these topics. What follows are my own thoughts on the state of macroecology (as observed in the non-self-reflective science talks in the conference and building on the self-reflective thoughts others shared).
Last week the 2015 ISI Impact Factors were announced. Hopefully this was not a date circled on your calendar. But if you were on an editorial board you could not escape a quick announcement of your journal’s new impact factor, whether it gained or lost in rank relative to other journals, and cheers and (email) back-slaps all around or solemn faces and vows to do better. And in my experience authors will now switch allegiance in which journals they submit to so as to follow those ranked highest in impact factor. Is this justified?
This post has evolved substantially over its writing. It started from a good post over on EEB and Flow by Marc Cadotte arguing that ecology needed a more robust culture of critique to weed out bad papers, and arguing that comments/critiques to the journals that published the original papers were an important way to do this. Despite strongly agreeing with the first part, I instinctively disagreed with the latter part. (And have been thinking about critique letters a lot lately in my role as Editor-in-Chief at an ecology journal just as Marc has)*. But unpacking why I don’t like critique letters has led to a lot of musings on how ecology works, how the human mind works, and my own answer to the specific question of how best to steer the field away if you see a bad paper. And just maybe along the way I stumbled on a strategy or two for killing zombie ideas!
(This is a guest post from Isla Myers-Smith, early-ish career academic at the University of Edinburgh, with a conversation at the end with Gergana Daskalova, an undergraduate in her lab)
Sometimes I like to worry about why I have chosen a scientific career path and the meaning of life and big esoteric questions that really have no particular answer. I have wondered many times why do I push myself so hard to succeed in science? I know the pipeline is leaky for early career scientists and many choose to leave the Ivory Tower to make different contributions with their careers, but at least for now, I have stuck with the halls of academia and here is why.
A few weeks ago I suggested that I am a scientist because I put numbers on things. Although even I recognize some limits to that argument, I was quite serious in suggesting that measurement and numeration is a central feature of being a scientist. I am not seriously suggesting that skyping is a central feature of being a scientist. But sometimes it feels like it!
Mark recently wrote a piece musing on the true fact that many ecologists have evolution envy, wishing to find simple general rules in ecology that match the elegance of evolution, which was itself a play on the more common phrase physics envy. He is certainly right this exists. On the other hand, in the comments, I noted that I had the opposite reaction. As an undergraduate I was a math major looking for a field that I could apply math to. And I instinctively avoided physics or chemistry (or hydrology and other applied versions of physics), instead being attracted to fields like business, economics and ecology. And as a graduate student I ultimately gravitated to ecology over evolution because of its complexity and honesty about that complexity. I think ecology, economics, business, sociology (and evolution although they ignore it too often for my tastes), especially in contrast to a field like physics, have one thing in common. They’re complicated because multicausality rules. And I wanted to go into a field that had that kind of challenge. In short, I thought multicausality was fun!
Scientists still enjoy a fairly high reputation in society as a whole (notwithstanding creationists and climate deniers). It is worth pausing to ask why scientists are still given credibility in this increasingly doubting age.
I have argued before that writing a paper for submission to a journal is about a lot more than having done some work that you can describe in methods and results sections. It is certainly about the nuts-and-bolts mechanics of good writing at the sentence level. But more than anything, it is about having a story to tell and taking readers on a journey along the arc of that story.
I’ve gotten a lot of insight into how to communicate this story arc working as an Editor in Chief for Global Ecology and Biogeography. I have to make quick decisions on more than 600 papers a year about whether to send them out for review. This means I’ve gotten very good at skimming papers and learning what captures their core essence and how importance and excitement are communicated. Cover letters are certainly important (that’s a post for another day). And figures and figure legends are also important. And you had better have sound methods (although an associate editor is more likely to screen that carefully). But I have increasingly also realized that there are five really pivotal paragraphs in any paper. If you get those five paragraphs right, you are likely to have and communicate the story arc in a way that grabs attention.