As I wrote yesterday, I have slowly shifted from using Systat and SAS to using R. I now do all of my analyses and make my figures in R, but I still regularly bump up against things I don’t know how to do. These things generally fall into one of three categories:
- manipulating a dataframe,
- trying to figure out how to do an analysis that I haven’t done in R before,
- trying to make pretty figures.
This has me wondering how to best learn new skills in R. I know I am not alone in trying to figure this out! So, please let us know in the comments what approaches have worked for you and/or people you know!
As my lab was initially shifting to R, we had a series of stats boot camps at lab meeting, where we learned how to import data to R and some of the basics of working with data in R. We then also had different lab members teach everyone else how to do some analyses in R that we were all likely to need at some point (e.g., survival analysis). That worked really well at first, but now we’ve run into the problem of having had some turnover in the lab. As new people join the lab, how do we get them up to speed? And what about things that not everyone needs to know how to do?
As I’ve learned R, my general approach to trying to learn new things has been (roughly in order):
- Look back through old code if I think I maybe have done something similar before,
- Search on something like Cookbook for R (especially if my question relates to graphics),
- Look in Crawley or Zuur,
- Wish I could just download all the R and stats knowledge from Ben Bolker’s brain into mine,
- Consult Dr. Google, which often leads to Stack Overflow,
- If still stuck, ask on Twitter (usually remembering to add the #rstats hashtag),
- Email someone who might be able to help me. (I try hard not to do this last step, though, because I don’t want to bother other folks.)
Based on this tweet from Hadley Wickham:
I am definitely doing it right when learning new things in R!
I’ve also been trying to keep this in mind:
As came up in the comments on yesterday’s post, yes, sometimes it’s a battle to figure out how to make a figure in R, but that knowledge is useful in the future.
Usually, I can figure out what I need, but it sometimes takes a really long time. Sometimes I give up and resort to a less elegant approach. With dataframe manipulation, that less elegant approach is usually brute force. For example, I recently wanted to assign a unique ID to each infection-lake-year category, so that I could make one big box plot containing data from all of them. I couldn’t figure out how to do this and it was nearing the end of the day, so I just manually went in and told R that rows 1:20 should be “A”, 21:39 should be “B”, etc. It worked, but it means that if something about the data changes, I will need to remember to change the row indexing. And it means I can’t easily use that code again for a similar purpose in the future. For figures, the brute force approach for me generally involves moving things into PowerPoint and rearranging figure panels or centering labels there. I will come back to the specific topic of figures in a post next week, but my ideal would be to not need to move to another program at all. I’m getting closer to that, but I’m not all the way there. (Comments on yesterday’s post suggest maybe not all advanced R users view this as something to aim for.)
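For readers facing the same infection-lake-year problem, here is one possible way to avoid hard-coding row ranges. It is only a sketch in base R: the data frame `dat`, its column names (`infection`, `lake`, `year`, `size`), and all of the values are made up for illustration, since the real data aren't shown here. The idea is to let `interaction()` build one factor level per combination that actually occurs, so the IDs follow the data rather than the row numbers.

```r
# Hypothetical data: column names and values are invented for illustration.
dat <- data.frame(
  infection = c("infected", "infected", "uninfected",
                "uninfected", "infected", "uninfected"),
  lake      = c("North", "North", "North", "South", "South", "South"),
  year      = c(2013, 2013, 2013, 2013, 2014, 2014),
  size      = c(1.2, 1.4, 1.9, 2.1, 1.1, 2.0)
)

# interaction() creates one factor level per infection-lake-year combination;
# drop = TRUE discards combinations that never occur in the data.
dat$group <- interaction(dat$infection, dat$lake, dat$year,
                         drop = TRUE, sep = "-")

# Optional short letter labels ("A", "B", ...), one per group.
# LETTERS only has 26 entries, so this assumes 26 or fewer groups.
dat$group_id <- LETTERS[as.integer(dat$group)]

# One box per infection-lake-year combination.
boxplot(size ~ group, data = dat)
```

Because the grouping is computed from the columns themselves, adding or removing rows should not require touching any row indices, and the same few lines can be reused on another data frame with different column names.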
As I’ve thought about how to learn these techniques, I’ve wondered how others learn how to program, especially in R. And, more specifically, I wonder what I could be doing differently to pick up R faster.
One idea I’ve considered would be to have an R lounge – a room reserved where people can come and work on analyses, with the idea that they could interrupt others or get interrupted by others to ask about a problem they’re running into. But I don’t think this would be really useful. It would only work if some people who know a lot in R were generous with their time and came and worked there. And, when I am trying to figure something out, I want to know the answer approximately 10 minutes ago, so waiting until others come by would drive me up the wall.
Another option would be to post to Stack Overflow. I certainly often find helpful suggestions by looking through posts there. But I feel like there’s a culture to it that I haven’t learned, and that makes me hesitant to wade in. (For example, sometimes the reply to a post is a curt indication that the question has already been asked and answered elsewhere, or an admonishment for selecting an answer too quickly.) Plus, something about posting there feels a little too public to me (which, yes, might seem weird for someone who blogs and tweets to say!). I tend to feel like any specific problem I post would seem so incredibly basic.
In the end, I haven’t come up with a better option than slowly battling through, task-by-task. It still feels incredibly slow sometimes, but maybe that’s just the nature of the beast.
How did you learn R? What would you recommend to people who are complete R novices? (When I mentioned writing this post on Twitter, Zhian Kamvar recommended swirlstats, which looks great.) What about people who’ve mastered the basics but are trying to learn more?