Last week, some of us here at Calgary had a “best R tips” meeting. Everyone was asked to bring their best tips for using R. Here’s a compilation. Add yours in the comments. Because the intertubes can never be clogged with too much R content. 🙂
Hadley Wickham and Garrett Grolemund’s free book, R for Data Science.
esquisse package – plot with drag & drop options. Works with ggplot2.
praise – the praise function just gives you random praise
The Rcmdr package makes R menu-driven for most common statistical operations and associated graphing. And it has a window that shows you the code you would’ve typed to do what you just did by clicking buttons in drop-down menus. I used to use it more than I do these days, but I still like it for making simple plots of group means and their standard errors (say, to illustrate an ANOVA). It’s the only easy way I know to make a plot of means and standard errors in R. It also makes it easy to make a scatterplot with differently colored/shaped points for different groups, without having to learn ggplot2 or do any complicated coding in base R.
If you’re going to be filling in a matrix/array/etc. by repeatedly running some chunk of code (say, in a for loop), first create an empty matrix/array/etc. (or one filled with obviously-wrong data) outside the for loop. Then fill it in inside the for loop. This is much faster than creating the first row of the matrix/array/etc. inside the for loop and then growing it by one row each time through the loop. Because when you “grow” a matrix/array/etc. inside a for loop in R, R actually creates a whole new matrix/array/etc. from scratch rather than just adding the new row onto the existing matrix/array/etc.
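A minimal sketch of the preallocation tip, assuming a made-up task (storing the mean and standard deviation of 100 simulated samples). The names n_runs and results are just for illustration.

```r
set.seed(1)
n_runs <- 100

# Create the full matrix up front, filled with NA so unfilled rows are obvious
results <- matrix(NA_real_, nrow = n_runs, ncol = 2,
                  dimnames = list(NULL, c("mean", "sd")))

for (i in seq_len(n_runs)) {
  x <- rnorm(20)
  results[i, ] <- c(mean(x), sd(x))  # fill in row i of the existing matrix
}

# The slow pattern to avoid inside the loop:
# results <- rbind(results, c(mean(x), sd(x)))
```

Filling with NA rather than zeros makes it easy to spot any row the loop failed to reach.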
ICEC – “ignore code evaluate comments”. The principle that the code should be interpretable just from the comments. Also: try writing all the comments, and then write the code that does what the comments say the code should do.
If you just want to find all the elements of a vector that match a condition, you don’t need the which() function, you can just put the condition in brackets. For instance, x[x>8] will return all the elements in the vector x that are >8. And if you want to count the number of elements in a vector that match a condition, you can just use something like sum(x>8).
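A quick sketch of both idioms on a made-up vector. One caveat worth knowing, shown here as an aside: if the vector contains NA, logical subsetting returns NA for those positions, and sum() needs na.rm = TRUE.

```r
x <- c(3, 9, 12, 5, NA, 10)

# Elements greater than 8; the NA in x comes back as NA
big <- x[x > 8]

# Count of elements greater than 8, ignoring the NA
n_big <- sum(x > 8, na.rm = TRUE)

# which() drops NA positions, so it can still be handy when NAs are present
big_no_na <- x[which(x > 8)]
```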
If you’re in a hurry to quickly pull some data from an Excel spreadsheet into R, say for purposes of a quick exploratory analysis, just copy it and then read it in from the clipboard using x <- read.table(file = "clipboard", sep = "\t", header = TRUE). (That works on Windows; on a Mac, use pipe("pbpaste") in place of "clipboard".) Don’t @ me, reproducibility zealots. 🙂
Write a function for anything you’re going to do at least 3 times. Especially for ggplot2, if you’re making the same kind of plot over and over.
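As a small illustration, here is a hypothetical helper, se(), for a calculation that tends to get retyped a lot. It uses base R only so it runs anywhere; the same idea applies to wrapping a ggplot2 plot in a function.

```r
# Hypothetical helper: standard error of the mean, ignoring missing values
se <- function(x) {
  x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

se(c(2, 4, 4, 4, 5, 5, 7, 9))
```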
If you have a bunch of functions you like, put them all in one R script, then import it at the start of every session: source(path.to.your.R.script)
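A runnable sketch of the idea. To keep it self-contained, it writes a one-function script to a temporary file first; in practice script_path would point at a file you maintain yourself. The function cv() is made up for the example.

```r
# Write a tiny script of helper functions to a temporary file
script_path <- tempfile(fileext = ".R")
writeLines("cv <- function(x) sd(x) / mean(x)  # coefficient of variation",
           script_path)

# Run the script; the functions it defines become available in this session
source(script_path)
cv(c(10, 12, 8, 10))
```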
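%ni% is a handy negation of %in%. It isn’t built into base R, so you define it yourself, for instance with `%ni%` <- Negate(`%in%`) (aside: My jaw hit the floor when I learned this. How many other R commands or functions are negated by spelling them backwards?)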
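Since %ni% has to be defined before it can be used, here is a minimal sketch of one common definition, using Negate() from base R.

```r
# Define %ni% as the logical negation of %in%
`%ni%` <- Negate(`%in%`)

# TRUE for elements of the left-hand vector that are NOT in the right-hand one
c("a", "b", "z") %ni% c("a", "b", "c")   # FALSE FALSE TRUE
```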
styler package: makes your ugly R code look pretty. It’s an add-in in RStudio. Splits long lines of code for you. Makes other aesthetic changes.
To comment out a block of code in RStudio: highlight it, ctrl-shift-c (cmd-shift-c on a Mac). The same shortcut uncomments it again.
You can run a function from a package without attaching the package via library(). For instance, lme4::lmer() means “run the lmer function from lme4 without attaching lme4”. The :: operator is part of base R, so you don’t need devtools for this; the package just has to be installed. This keeps you from having to attach a whole package, which can mask functions you don’t want masked.
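A small sketch of the package::function syntax. It uses the stats and tools packages, which ship with R, so it runs without installing anything. The lme4 check at the end is only an illustration of how one might test that a package is installed.

```r
# Call functions through their namespace; no library() call needed
stats::median(c(1, 5, 9))
tools::file_ext("data.csv")   # tools is not attached by default

# The package must still be installed; one way to check, e.g. for lme4:
has_lme4 <- requireNamespace("lme4", quietly = TRUE)
```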
list.files() will list all the files in your working directory (or in any directory you pass to it). You can use it to read in all the files in a directory, or all the unique files with the help of the unique() function.
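A sketch of reading in every file in a folder. It assumes the files are CSVs with matching columns, and it creates two small stand-in files in a temporary folder so the example runs on its own. The folder and file names are made up.

```r
# Set up a temporary folder with two small CSV files (stand-ins for real data)
data_dir <- file.path(tempdir(), "r_tips_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, y = c(0.5, 1.5)),
          file.path(data_dir, "site_a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, y = c(2.5, 3.5)),
          file.path(data_dir, "site_b.csv"), row.names = FALSE)

# List the CSV files with their full paths, then read each one into a list
csv_files <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
all_data <- lapply(csv_files, read.csv)
names(all_data) <- basename(csv_files)

# Stack into one data frame (works here because the columns match)
combined <- do.call(rbind, all_data)
```

Setting full.names = TRUE matters when the folder is not your working directory, since read.csv() needs the whole path.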
There’s an R package to make your graphs look like xkcd. Yes, really. Yes, this stretches the definition of “best” tips. 🙂
R Easter eggs:
- Run ????""
- Run example(readline)
Looking forward to hearing your best tips. Also looking forward to the first comment saying that one of the tips in the post is the Wrong Way To Do It. 🙂