Also this week: Contrarianism! Academia isn’t broken! It’s not actually that important for the vast majority of data to be made available and accessible in a standardized form! And also lots of things that aren’t contrarianism but are still thought-provoking! And a video interpretation of my blogging! And a picture of Dr. Evil! And more! Ok I’ll stop now!
Most popular programming languages across all programming (not just academic ecology). R just edges out Matlab in popularity, but Python crushes both (it's in a virtual tie with Java, C, and C++). Personally, I am midstream in switching from Matlab to Python, with R pulled in for teaching or specialized stats (or to conform with norms in working groups). Python is clearly a juggernaut in scientific computing in general, but I'm curious how many other ecologists are using it?
First, a reminder as the semester begins that I have a post that contains links to videos for teaching ecology.
Times Higher Education had a very interesting article entitled “Work less, do more, live better”, which focuses on similar themes as my post on how you do not need to work 80 hour weeks to succeed in academia. It includes interesting information on historic work patterns, efforts by companies to restrict how much their employees work in an attempt to prevent burnout, and the importance of exercise and sleep. It says, “This article is the opposite of a call to arms, it’s a call to leisure, a call to lay down your keyboard and take up your knitting needles, your surfboard, your pleasure reading and, especially, your walking shoes.”
And, in a similar vein, Inside Higher Ed had a piece on working 40 hours a week and succeeding in academia. It’s a great piece. Many of the points mirror ones I made in my post on not needing to work 80 hour weeks: if you track your time, you are likely to be surprised at how little work you are actually doing, and, if you do put in longer hours, you are likely to become less and less efficient/productive during those hours. The author (Trish Roberts-Miller of the University of Texas-Austin) says that she discovered as a grad student that “I was spending a lot of time in a fairly draining world of neither work nor play — not fun, and so not a world that rejuvenated me in any way, but also not really work, and so not a world in which I was getting anything useful done. I wasn’t exactly the long-suffering martyr I was imagining — in fact, I needed to work more.” She also talks about how the amount of time required for specific tasks she performs (e.g., grading) increases as she works longer hours each week. All in all, a very interesting read. ht: @scitrigrrl
Ed Yong tweeted that it would be really helpful if labs included information on their major accomplishments on their webpages, which got an impressively quick response from Emilio Bruna:
I think Emilio’s page is great. I’m interested in giving this a shot!
I really liked this post by Sarah Bisbing on the lessons she learned in her first year on the tenure track. There’s lots of great advice there. (Jeremy adds: hey, that’s my link!)
And I’m late on this, since I haven’t sent in links in a few weeks, but The Guardian featured a new twitter account, @LegoAcademics, which has used the new women scientist lego set to create scenes of academics acting out their daily lives. It doesn’t include my favorite so far, which is this one:
Terry McGlynn with his impressions of the four conferences he attended this summer, including the ESA. Touches on everything from the prevalence of students (lots more of them at ESA, apparently) to the apparent paucity of new Big Ideas:
Another thing that I noticed about the meetings is that, at each one, so many talks were about the same. exact. damn. idea. For example, I’m not joking when I estimate that about every other talk at the Tropical Biology meeting was about biodiversity along elevational gradients…When there is a new Big Idea out there, it makes sense that people are working on it and presenting that work. But here’s what caught my eye about most of these Big Ideas at the meetings. These are ideas that have already played out. The biggest discoveries tied to those Big Ideas have already happened.
Terry also talks about the difference between using talks to sell your science vs. using talks to sell yourself, and says he mostly saw the latter. I confess I don’t really see the distinction Terry sees. If it’s a distinction at all, it seems to me more like a distinction between different ways of selling one’s science. Some old thoughts on showmanship and salesmanship in scientific presentations and papers here and here.
Is ecology explaining less and less? Science has a writeup of a new study I was very impressed with at the ESA meeting. Quotes Brian. The EEB and Flow also comments (in a post that also hits on the theory vs. empiricism survey that we linked to last week).
Sarah Bisbing on the lessons she learned as a first year faculty member.
Academia–publishing, hiring, reviewing, granting–isn’t broken, at least not mostly. It’s merely competitive.
Here’s a pretty cogent argument from Arjun Raj that it’s not necessarily worth it to make any and all data freely available in a standardized form without any need to contact the original authors. Before y’all start flaming me (or him), click through and read it. He actually shares a lot of his own data (and code), and makes heavy use of data collected by others. His point is simply that there are no easy, dead-obvious answers here. Though note that the data sharing requirements many ecology journals have implemented seem less onerous than the sort of thing Raj is mostly thinking of. And at least some kinds of ecological data seem less likely to quickly be outdated or superseded than the sorts of data Raj works with.
Philosopher Deborah Mayo asks whether philosophical superficiality has harmed science. I asked the same question in an ecological context a while back.
A nice piece on Robert MacArthur, from the alumni magazine of his alma mater, tiny and unconventional Marlboro College. I didn’t know that MacArthur was part of their first class of students, and that his father started the science labs there. And this is a candidate for the most MacArthur sentence ever written:
But the college possessed two things the younger MacArthur brother found particularly to his liking: an excellent mathematics teacher in the person of Alan Paine, and an abundance of meadows and forests where he could do fieldwork to his heart’s content.
(ht Small Pond Science)
The Chronicle of Higher Education is going to try to track who gets every new North American tenure track job advertised this year in 11 fields–including ecology. (ht Small Pond Science)
The role of markets and incentives in the recovery of white rhinos in South Africa. Interesting, I didn’t know this. (ht Economist’s View)
A (humorous) taxonomy of papers. I just had a “Face Plant” myself not too long ago. 🙂
The 11 funniest economics papers ever. Anyone care to compile a similar list for ecology?
The REAL reasons academics use Twitter. 🙂
Sociologist Kieran Healy with the greatest syllabus ever, for “Sociology 710: Social Theory Through Complaining.” I love how it’s an equal-opportunity skewering. Commenters are invited to post suggestions for an ecology equivalent. You can get a good start just by using some of Healy’s, or slight variations thereon. “Ecology 710: Ecological Theory Through Complaining”. “Week 1: This class has nothing to do with my research”. “Week 3: It’s not like we can even predict anything”. “Week 9: What is theory without data?” 🙂
Marmot vs. GoPro. I ~~spit on~~ lick your attempt to take time lapse video! 🙂
In a similar vein, here’s a video of me blogging. Well, sort of blogging. And sort of me. 🙂
And finally, I was procrastinating this week and decided to make a meme for Brian’s “insidious evils of ANOVA” post:
Have a good weekend! 🙂
Hi Meg, thanks for the shout-out and kind words about the page. I loved Ed’s criticism and suggestion – he was right, of course – and since I was looking for an excuse to procrastinate…
By the way, I’d like to give props to my friend Gaby Hernandez, now a Professor of Graphic Design at Univ. of Houston-Downtown. Working with her on a project when she was an MFA student here at UF was an amazing introduction to the world of design thinking / user-centered design. Her portfolio is at http://www.portfolio.gabrielahdesign.com; check out the “Visualizing Science” link to see some of the amazing posters students made in her class on this topic. I’m actually working on a revamp of my page taking into account some of the stuff I’ve learned from her…if only I had time to finish it!
A question on another topic in your link-fest, one I also posted on Small Pond Science. Can anybody tell me what the % of student attendees was at ESA this year? At ATBC 2015 we had 31%. That was a bit lower than usual (data in the comments section of the link below), but it’s tough for me to agree or disagree with Terry because I have no idea what the numbers are like for other meetings. A pointer to data for ESA, the other ESA, Animal Behavior, etc. would be appreciated.
The data will probably be published in a future ESA Bulletin. Data from past meetings should be in old Bulletins I think.
Yeah, I had looked for it there. I thought it might be in this report, but could only find total attendees by category of presentation. Maybe it’s buried and I just couldn’t find it? http://www.esajournals.org/doi/pdf/10.1890/2013.councilreport
Well, you can agree or disagree with me on that 30% figure on its own, independent of comparisons to other meetings: is it too high or too low? I think 30% grad students at a major conference is too low, especially considering the somewhat pyramid-shaped distribution of age classes.
I don’t disagree that it’s (too) low – I would love it to be much higher. I’m just curious if it’s low relative to other conferences, which indicates ATBC has a problem, or if it’s similar to other meetings, in which case I think we all have a problem.
Brian, can you please elaborate as to why you are switching to Python? I use Matlab (occasionally writing C++ .mex functions for “heavy lifting”), with some R as well. Thanks!
Advantages of Python:
1. Free, and the powerful scientific libraries are also free, relatively well documented, and actively maintained.
2. Easy to teach. I teach a graduate simulation class that relies on Python. The students (mostly with no programming experience) can get up to speed reasonably fast.
3. The code is pretty easy to read for programmers who don’t use Python much. It has been compared to pseudocode (that actually runs).
Perhaps a disadvantage: some people think Python is slow relative to their favorite alternative language. You get different answers with different benchmarks, but performance is generally similar to R and Matlab in my experience. There are ways of increasing performance if that is important, but for many of us the savings in time for algorithm coding and testing are worth more than the gain in run time.
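On the performance point: for array-heavy ecological code, the usual first remedy is vectorization – pushing inner loops down into NumPy’s compiled routines – before reaching for compiled extensions. A minimal sketch with my own illustrative functions (not from the thread):

```python
import numpy as np

def growth_rates_loop(pops):
    """Per-step population growth rates via a plain Python loop."""
    return [pops[i + 1] / pops[i] for i in range(len(pops) - 1)]

def growth_rates_vectorized(pops):
    """Same computation, pushed down into NumPy's compiled routines."""
    pops = np.asarray(pops, dtype=float)
    return pops[1:] / pops[:-1]
```

Both return identical numbers; on large arrays the vectorized version is typically one to two orders of magnitude faster, though the exact factor depends on the workload.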
Well – the switch is going slowly. I have a large base of existing code in Matlab, and it is still the language I am most comfortable/quick in. I am by no means rejecting Matlab.
But long term, my reasons are:
1) Love Python – I’ve programmed in over 20 languages in my life and it is my favorite.
2) Clearly the long-term trend in science (not engineering) is away from Matlab toward R or Python. Many of my ecoinformatics colleagues who used to use Matlab don’t anymore – I don’t want to be a dinosaur.
3) I’m not an open source maniac, and every university I have been at has had a site license for Matlab, but it is hard to get around the advantages of free, especially when teaching students a skill they can take with them to NGOs, etc.
4) My sense is that the rate of library innovation is much higher in Python than in Matlab – now of course part of that is because Matlab already has such an excellent set of libraries (I use stats, mapping, and image processing most, and occasionally optimization). But when I go looking for code to do something not in the libraries (e.g., GAMs), it is increasingly hard to find anybody coding it up in Matlab (I think econometrics folks have moved in the same direction as ecology).
5) Python is a general-purpose language – e.g., I can code dynamic web pages in Python much more easily than in Matlab.
6) Don’t want to pick a fight but for me R is a non-starter – I can’t load the size of data I want in it and work with it well.
Thanks very much for the thoughtful replies! One last question. How easy/difficult is it to integrate Python with compiled code (e.g., C or C++, in order to speed up big simulations)?
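Not an answer from the thread, but one standard route is the standard library’s ctypes module, which can load any compiled shared library – your own C/C++ simulation kernel compiled with `-shared` works the same way as the system math library used below; Cython and numpy.f2py are the other common options. A minimal sketch:

```python
import ctypes
import ctypes.util

# Load the system C math library; on Linux the symbols may also be
# reachable through the main process (CDLL(None)) if find_library
# comes up empty.
libm = ctypes.CDLL(ctypes.util.find_library("m") or None)

# Declare the C signature of cbrt: double cbrt(double)
libm.cbrt.restype = ctypes.c_double
libm.cbrt.argtypes = [ctypes.c_double]

x = libm.cbrt(27.0)  # cube root, computed in compiled C code
```

Setting `restype`/`argtypes` is the easy-to-forget step: without it, ctypes assumes C `int` arguments and return values and silently gives wrong answers for doubles.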
I’ll just put in a plug for ecologists using Python. I still do the lion’s share of my coding in R, but I like using Python when I can.
Worth mentioning that a lot of ecologists use Python for GIS work – ArcGIS is essentially just a Python GUI, and knowledge of the language is very helpful for batch processing etc. Though, with the growing GIS capabilities of R that may change.
Good point – I should have added that to my list of reasons for switching to Python. Python is clearly becoming the choice for open source GIS (good libraries wrapping GDAL, and many GDAL command-line utilities are written in Python – e.g., gdal_calc.py), and Python is now the embedded language in ArcGIS. Notwithstanding the mapping libraries in R and Matlab (the latter of which I’ve used a lot – adequate but not more), Python is THE language for GIS.
Will Pearse has an interesting look at the r2 paper on his blog. It looks like the trend really only holds up well if you use the means of each year’s data. A nice argument for the use of open data.
Interesting. The trend is there, but there certainly is a *lot* of scatter around it when you look at the raw data. Will give this a shout-out in the next linkfest.
I think that a lot of the trend comes from a rapid increase in near 0 values in the last decade or so. It’s really pronounced and could easily drive a lot of the trend.
Looking at all R^2 data vs mean R^2 data shouldn’t (can’t) change the ‘trend’ (slope) if they’re using linear regression, but can change the R^2 associated with the line of best fit.
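One caveat on the raw-vs-means point: the slopes coincide exactly only when every year contributes the same number of R^2 values; with unequal yearly counts the two fits can differ. A quick synthetic check (simulated numbers, not the paper’s data):

```python
import numpy as np

rng = np.random.default_rng(42)
years = np.repeat(np.arange(1990, 2010), 50)   # balanced: 50 values per year
r2 = np.clip(0.5 - 0.01 * (years - 1990)
             + rng.normal(0.0, 0.15, years.size), 0.0, 1.0)

slope_raw = np.polyfit(years, r2, 1)[0]        # fit to every raw point

uniq = np.unique(years)
means = np.array([r2[years == y].mean() for y in uniq])
slope_means = np.polyfit(uniq, means, 1)[0]    # fit to yearly means
```

With the balanced design above the two slopes agree to numerical precision, while the R^2 of the line of best fit is of course much higher for the means.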
It’s still an interesting result, but disappointing that they chose to ‘hide’ all the variability in the raw data. I totally agree with Will Pearse (the blog link from downwithtime) that linear regression ain’t correct for data bounded by [0,1], particularly when so much of it is close to those boundaries! (However, the original article is behind a paywall and I can’t access it yet, so I apologise to the authors if I’m misrepresenting the methods here.)
It is important to remember, though, that these are only the REPORTED r2 values. Unlike the Møller & Jennions paper that back-calculated r2 values from the reported p/se/n, these are only the people who actually wrote an r2 value into their paper. That we had that many people who wrote r2s so close to zero into their paper without blushing is to me the main story. That and the increase to 11 p-values reported per paper. Variance or no, it remains to me a pretty clear story of caring too much about p < 0.05 and not enough about r2 much greater than 0.
I also find the scatter interesting, and the mean seems to be a really poor summary of what is going on. My guess is that the pattern is mostly driven by the selection of studies being published, with different criteria over time and maybe higher R2 standards to publish further back. Maybe less focus on R2 in recent times, and easier computing also means more models being tested and reported (could also explain the increase in p-values). The argument that the pattern is due to low-hanging fruit having been picked doesn’t really hold up if you look at the data – there is no shortage of studies with high R2 in recent times.