Non-academic careers for ecologists: data science (guest post)

Note from Jeremy: This is a guest post from Ted Hart, who holds a Ph.D. in ecology and did a postdoc at the University of British Columbia, but is now a data scientist in Silicon Valley. Thanks very much to Ted for offering to share his experiences (and click through on the link above for Ted’s blog, where he discusses his career path at greater length).

This is the latest in our series of posts on non-academic careers for ecologists. For previous posts in the series, go here. And if you’re an ecologist in a non-academic career we haven’t covered yet and want to write a guest post on it, drop me a line! (jefox@ucalgary.ca)

*************************************

1. When and how did you decide to go into data science?

When I moved to San Francisco (The joke being: A data scientist is a scientist who lives in San Francisco). In all seriousness though, I was recruited by my current company and it seemed like an opportunity I couldn’t pass up. The longer answer is that I took Jeremy’s story about almost leaving science to heart and cultivated a technical skill that would make me an appealing hire outside of academia. As I progressed further in my PhD, I began to realize the practical realities of getting a faculty job. While I still hoped to stay in academia at the time, I knew that I needed a contingency plan. However, I found that what started out as my contingency plan was my real interest. I enjoyed working with data, modelling, coding and doing research more than I liked other parts of academia (paper writing, grant writing, teaching, etc…).

2. Did you get advice (wanted or unwanted) from others about your non-academic career path? If so, what sort of advice did you get, and how did it affect you?

I did get some great advice that I sought out each time a post-academia opportunity came along. My PhD advisor had little advice about alternative careers at the time of my PhD, but he has been a great resource throughout all of my career decisions post-PhD. Much of the other advice I received came from personal connections I made in the ecological software development community. Whenever I was offered a job, I e-mailed the people I trusted and asked for their opinions on my career decisions. Their responses were always thoughtful and helpful. I think having more senior people who you look up to is invaluable in guiding big career decisions.

3. Tell readers a bit about your current position, how you found it, and what attracted you to it.

I can’t speak freely about what I do these days in too much detail due to the nature of my work. However, I can say generally that I work on a research team with other “academic refugees” from fields like physics, computer science and economics. We collaborate on projects in a way very similar to an academic environment. I got the job when a member of the team found me on LinkedIn and invited me to apply for the position. I think it just goes to show you never know where an opportunity will come from. As far as what attracted me, ironically it was a chance to return to doing research. Whereas my previous work was in informatics and data engineering, I felt that my current job would offer more room for intellectual growth. I think that giant data sets present a unique set of intellectual challenges. Whereas much of ecological research is about “how do I do a lot with sparse data”, industry data presents the opposite challenge: “how do I find meaning in the firehose of information?” It’s a different approach to questions.

4. In what ways do you find your current position to be a change from academia? Are there aspects of the position that are a “culture shock” or that have required some adjustment on your part?

If I had to choose one word it would be pace. Everything in industry happens faster. The deadlines are measured in weeks or days, not months or years. While some projects have a lot of lead time and I have the freedom to think deeply, others require a rapid turnaround time for a meeting with an executive or a quarterly report. You also get feedback much faster in industry. At my last grant-funded job, the cycle for funding was measured in years. If we were slow to produce results it didn’t matter much from a funding perspective, insomuch as we had money for X years already. However, in business if you make a bad decision and start losing money, people start to care really quickly. I think the biggest culture shock is in communication. Business has its own set of acronyms, metrics and general slang (like what is A/B testing?). Beyond semantic differences though, I face larger issues in communicating complexity in data. One of the biggest challenges is talking about uncertainty. Scientists are trained their whole careers to think about uncertainty and be comfortable with it; many others aren’t similarly trained. I can’t just give a presentation and say “here are the confidence intervals” because most people don’t know what that means. I wish I could say I’ve come up with an ideal way to explain this, but I really haven’t. I also found that academia has shaped my written language, and I’ve needed to become more colloquial in writing reports.

5. In what ways has your academic background helped you in your current position?

The training I received in modelling, data management, and the scientific method allows me to do my job every day. It has also taught me how to frame questions in a meaningful way and organize research projects effectively. Being a data scientist is very similar to doing any other kind of research. However, instead of a field site and sampling protocols, I write map-reduce queries. Either way I’m creating controls and collecting data; it’s just that I’m not quite as tan as when I was a field biologist. Obviously other parts of my background haven’t helped as much: I’m still pretty good at IDing aquatic insects and digging holes in the forest, but that doesn’t do me much good in my current job.

6. Any regrets about not pursuing an academic career path? Could you see yourself ever going back to academia at some point?

I don’t know if I’d say it’s a regret. I know that in many ways I wouldn’t enjoy aspects of a faculty job, and my current job retains parts of academia I love. However, I still have romanticized notions of being an academic. Like maybe if I’d stuck with it I would have an amazing breakthrough and get featured on an episode of Radiolab. I know that will never happen now. Another part of academia I’ll miss is the travel and conferences, and the sense of dispersed community. I love grabbing beers with fellow ecologists and instantly being able to connect over the mutual friends we have in the small community. I’ll miss the frenetic energy of ESA and seeing all the friends I’ve made over the years (My liver thanks me for not doing this anymore though). As far as returning, I don’t think I could ever go back to academia, if for no other reason than the highway to facultyville only has one on-ramp and many exit ramps. My impression is once you leave, there’s no going back, especially at my age. However, I certainly wouldn’t rule out returning to a role like the one I previously held at NEON, or some other agency if the right opportunity came my way.

7. Anything about your current position that came as a surprise to you?

I think just how much it feels the same as when I was in academia. I work with a team of really smart people, all with PhDs, to tackle hard problems. We generate hypotheses and collect data to test those hypotheses. I find it just as intellectually rewarding as anything I ever did in academia. My position also affords me a lot of freedom with my time; my hours are almost as flexible as when I was a grad student (this may be a function of the tech industry more than others though).

8. Anything else you want to say to readers considering data science, or a non-academic career path more generally?

A PhD in ecology or evolution is good preparation for a career in data science. The traits that allow someone to get a PhD also prepare them for success almost anywhere, especially in a data science position. You’re probably really smart, highly motivated, and an autodidact, and that can get you far professionally. However, despite all the qualifications most PhDs have, a big shortcoming of our training is that no one teaches us how to sell ourselves. You need to go into an interview in industry with the ability to wow not just your scientific peers, but also the upper level VP of whatever with how your scientific skills can provide concrete, actionable (<- business lingo) knowledge. On top of the scientific training and personal marketing, you’ll probably need to bolster some of your technical skills. I’ve made a shorthand list below of things you’ll need to know as a data scientist, but it’s been discussed in depth in other venues:

1). Know your databases. You’ll want to know SQL, and have a passing knowledge of a couple of NoSQL technologies like Neo4j or CouchDB. If you’ve heard about ‘big data’, you’ve no doubt heard of Hadoop. You’ll need to at least be familiar with Hadoop and its associated tools like Hive and Pig.

2). Know something other than R. R is great, but you’ll be well served to know another language or two. Python is a good second stop, followed by C, Java or JavaScript if you’re so inclined.

3). Don’t rage against the machine…learning. I’m assuming that most ecologists’ frequentist statistics chops are pretty good. In the world of data science, though, machine learning is everywhere. A good place to start is An Introduction to Statistical Learning with Applications in R. When I interviewed I had several comprehensive-exam-style questions about machine learning.

Finally, leaving academia isn’t easy or something to take lightly. You’ve probably devoted a decadeish of your life to training for a particular job, and changing course is a difficult decision. I know for me it was a long time coming. I’ve written about this in detail and in the end I made the right choice for my own happiness and my family’s.

8 thoughts on “Non-academic careers for ecologists: data science (guest post)”

  1. Great post Ted. In some ways we went opposite directions. I was a data scientist in a consulting company for 10 years before going back to school (although that was in the 90s so they didn’t call it data science yet). My experience completely matches yours in two regards: a) industry can absolutely be as intellectually interesting and stimulating as academia (at least in the right niches) which is something I don’t think many academics appreciate, and b) the pace in business is completely different. I think you’ve put your finger on a key similarity and a key difference.

  2. Thanks Brian.

    I’m curious, were you in academia before going into consulting and then came back? It seems like part of the weightiness of the decision is that academia is very unforgiving of moving back and forth between it and industry, at least in the biological sciences. Although I don’t think this is true for all disciplines.

    • I agree – academics are rather narrow-minded about experience “outside the walls”, making a return harder than it should be. In my case I went to industry straight out of my bachelor’s. So all of my graduate education through tenure track represented the smooth linear path search committees like to see. Even so, not all graduate schools were willing to take a chance on me, and there are days I get frustrated at getting treated like a 30-something junior faculty member rather than somebody with 25 years of relevant work experience under my belt. It is rather astonishing to me how academics see “real-world” experience as completely irrelevant.

      But of course not all of academia is like that. Engineering, Education, Computer Science, Business, and Nursing all have more of a revolving door approach. I think even the more forward-looking conservation units have that view.

    • Re: leaving academia and then coming back, you’re right that it’s very rare in ecology, Ted. I do know of one ecologist who took a year’s leave from a faculty position to be a day trader with a small boutique stock trading firm at which a couple of ex-ecologists and other “quant” types worked. But that’s unusual, and even in that case the individual hedged his/her bets by taking a leave of absence rather than quitting entirely.

  3. Great post Ted, thanks.

    I’m interested in your comment that a PhD in ecology or evolution is good prep for a career in data science. Reading stats blogs, it’s fairly common to run into angst from statisticians that “data science” too often ignores insights from statistics (e.g., sampling error, overfitting, multiple comparisons, etc.), coupled with worry that stats students aren’t being trained for careers in data science because they aren’t taught enough programming or practical statistical techniques. And conversely, it’s fairly common to see good machine learning/data science people also worry that data science too often ignores statistical uncertainty. So it does seem like an ecologist or evolutionary biologist with the right training would hit the “sweet spot” in terms of knowing both stats and programming.

    • I just recently made the same career move myself. After having finished my PhD in ecology, I now work as a data scientist for a small software company. I had to laugh about the business lingo (my company specialises in “actionable analytics” 😉 ).

      Anyway, I believe scientists, maybe particularly from life science/biology, might not only be in a sweet spot between stats and programming (many computer scientists and maths people also have skills in both, probably often more in at least one of the two) but are also used to messy real-world data. It’s one thing to know what you can do in theory if a distribution is normal or not; in practice you often have to make decisions like whether a distribution is “normally distributed enough”. Being used to data that is bad in many ways is an important experience most biologists probably have plenty of.

  4. Ted – as someone who tries to teach his students the rudiments of coding in the course of learning good data analysis (in R – I don’t have time to get them into Python or otherwise), I’m curious where you see opportunities for teaching machine learning and databases in a curriculum. I try and keep everything applied and relevant to what they are doing, as the more divorced from an actual example I go, the more I watch students withdraw from the material. Where do you see these topics as entering in?

  5. Great article. Considering this path myself. Master’s in ecology, then possibly a Ph.D. in quantitative ecology research or a Master’s in applied statistics. Good to see that ecology Ph.D.s who have worked with statistics are highly valued in places like Silicon Valley. Makes sense that they would be.
