# R Cheat Sheet

R is a free software environment for statistical computing and graphics. These notes summarize the free R CodeSchool tutorial.

## Basics

• `R` is the command-line interpreter
• `install.packages("ggplot2")` to install additional packages
• Expressions are evaluated and displayed e.g. `1`, `1+1`, `"Hello World"`
• Booleans are e.g. `1==1`, `3>4`, `TRUE`, `T`, `FALSE`, `F`
• For variable assignment `x=1` or `x<-1`
• For help on a function use `help(sum)` , `help(package='ggplot2')` or `example(sqrt)`
• Operations are `+ - * / = <-`
• `NA` is used to express a missing or unknown data value. Expressions on NA return NA.
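
The basics above can be tried in one short session. This is a minimal sketch; the variable names (`x`, `y`, `missing_value` and so on) are made up for illustration.

```r
# Assignment: both forms bind a value to a name
x <- 1
y = 2

# Arithmetic expressions evaluate to a value
total <- x + y            # 3

# Comparisons yield booleans; note == (comparison) vs = (assignment)
same <- (1 == 1)          # TRUE
bigger <- (3 > 4)         # FALSE

# NA marks a missing value; arithmetic on NA gives NA
missing_value <- NA
result <- missing_value + 1

# Many functions accept na.rm = TRUE to skip missing values
clean_sum <- sum(c(1, NA, 3), na.rm = TRUE)   # 4
```

Typing a name on its own at the prompt (e.g. `total`) prints its value.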

## Vectors

• To create a vector, use the combine command `c(4,7,9)`
• All elements of a vector must have the same type; mixed values are coerced to a common type (e.g. to strings).
• `a:b` creates a vector of integers from a to b.
• `seq(a,b,s)` creates a vector of numbers from a to b in increments of s
• `myseq[3]` to access the third element, i.e. vectors are indexed starting at 1.
• Use a vector as an index to access multiple elements e.g. `myseq[c(1,3)]`
• The `names` function can be used to assign names to vector elements. Once names are assigned, they can be used as indices e.g.
```
names(myseq)=c('one','two','three')
myseq['two']
```
• `myseq + 1` adds one to all elements of the myseq vector.
• Scalar operations or functions on vectors typically produce other vectors e.g. `+`, `-`, `==`, `sin(myseq)`
• `head(myvec)` , `tail(myvec)` to show start or end of vector
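
The vector operations above fit together as follows. This is a small sketch; apart from `myseq`, the variable names are invented for the example.

```r
# Build vectors with c(), the : operator, and seq()
myseq <- c(4, 7, 9)
ints <- 1:5                          # 1 2 3 4 5
steps <- seq(0, 1, 0.25)             # 0.00 0.25 0.50 0.75 1.00

# Mixed types are coerced to a common type (here: character)
mixed <- c(1, "a", TRUE)             # "1" "a" "TRUE"

# Indexing starts at 1; a vector index selects several elements
third <- myseq[3]                    # 9
first_and_third <- myseq[c(1, 3)]    # 4 9

# Named elements can be looked up by name
names(myseq) <- c("one", "two", "three")
two <- myseq["two"]                  # 7, with the name "two" attached

# Arithmetic and comparisons apply element-wise
plus_one <- myseq + 1                # 5 8 10
is_seven <- myseq == 7               # FALSE TRUE FALSE
```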

## Plotting

• `barplot(myseq)` creates a bar plot of the `myseq` vector. `abline(h=y)` adds a horizontal line at height y.
• `plot(x,y)` plots x vs y e.g.
```
x=seq(0,20,.1)
y=sin(x)
plot(x,y)
```
• `contour(mymat)` plots a contour map of a matrix.
• `persp(mymat)` plots a matrix as a 3D surface in perspective.
• `image(mymat)` generates a heat map of a matrix e.g. `image(volcano)` for the built-in `volcano` matrix.
• `qplot(weights, prices, color=types)` - more attractive plotting using ggplot2 package.
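
The base plotting calls above can be run back to back. This sketch uses only base R graphics and the built-in `volcano` matrix; the sample vector is made up.

```r
# Bar plot of a small vector, with a horizontal line at its mean
myseq <- c(4, 7, 9)
barplot(myseq)
abline(h = mean(myseq))

# Scatter plot of a sine wave
x <- seq(0, 20, 0.1)
y <- sin(x)
plot(x, y)

# volcano is a matrix of elevations that ships with R
contour(volcano)   # contour map
persp(volcano)     # 3D perspective view
image(volcano)     # heat map
```

In an interactive session each call draws in the plot window. When run as a script, base R typically writes the plots to a file such as `Rplots.pdf` instead.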

## Matrices

• `matrix(0,3,4)` creates a 3×4 matrix with all elements 0.
• `matrix(1:12,3,4)` creates a 3×4 matrix with numbers 1-12.
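• `dim(myseq) = c(3,4)` sets the dimensions of `myseq`, e.g. to reshape a 12-element vector into a 3×4 matrix; `dim(mymatrix)` on its own returns the current dimensions.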
• `mymatrix[3,4]` returns an element of the matrix (row,column).
• `mymatrix[,2]` returns entire second column.
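
A short sketch of the matrix operations above. Note that `matrix()` fills column by column by default; the variable names other than `mymatrix` and `myseq` are invented here.

```r
# A 3x4 matrix of zeros, and one filled with 1..12 (column by column)
zeros <- matrix(0, 3, 4)
mymatrix <- matrix(1:12, 3, 4)

# Element access is [row, column]; leave one side blank for a whole row or column
corner <- mymatrix[3, 4]        # 12
second_col <- mymatrix[, 2]     # 4 5 6
first_row <- mymatrix[1, ]      # 1 4 7 10

# Assigning to dim() reshapes a vector into a matrix
myseq <- 1:12
dim(myseq) <- c(3, 4)
```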

## Data Sets

• `factor` is a collection type for categorized values - `myfac=factor(myvec)`
• `factor`s group unique string values as levels e.g. `levels(myfac)` shows the unique levels.
• `as.integer(myfac)` shows the levels as integers, which can be used to set the plot symbol e.g. `plot(weights, prices, pch=as.integer(types))`
• `legend("topright", levels(types), pch=1:length(levels(types)))` adds a legend that maps each level to its plot symbol
• A data frame collects sets of related values (i.e. sets of columns with values in the same order) e.g. `mydf=data.frame(weights,prices,types)`
• To extract a column, use double square brackets with the column index or name e.g. `mydf[['weights']]`, or just a dollar sign e.g. `mydf$prices`
• `merge` merges data sets by joining on shared column names
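
The factor, data frame and merge steps above can be sketched with a small data set. All values here are made up, and the `origins` table is a hypothetical second data set added only to show a join.

```r
# Made-up sample data (hypothetical values)
weights <- c(300, 200, 100, 250, 150)
prices  <- c(9000, 5000, 12000, 7500, 18000)
types   <- factor(c("gold", "silver", "gems", "gold", "gems"))

# Levels are the unique values, sorted by default
lv <- levels(types)             # "gems" "gold" "silver"
codes <- as.integer(types)      # 2 3 1 2 1

# A data frame keeps the columns aligned row by row
mydf <- data.frame(weights, prices, types)
w1 <- mydf[["weights"]]         # same column as mydf$weights
p1 <- mydf$prices

# merge() joins two data frames on their shared column name (here: types)
origins <- data.frame(types = c("gold", "silver", "gems"),
                      origin = c("mine", "mine", "cave"))
merged <- merge(mydf, origins)
```

The merged result has one row per row of `mydf`, with the matching `origin` value added as a fourth column.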

## Statistics

• `mean(myvec) median(myvec) sd(myvec)`
• `cor.test` tests for correlation (Pearson's product-moment)
• `line = lm(cola ~ colb)` calculates a linear model between cola and colb that can be plotted with `abline(line)`
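
As a rough illustration of the statistics functions above, the sketch below uses made-up numbers; `cola` and `colb` stand in for two columns of real data.

```r
# Made-up sample values (hypothetical)
myvec <- c(1, 2, 3, 4, 10)
avg <- mean(myvec)        # 4
mid <- median(myvec)      # 3
spread <- sd(myvec)       # sample standard deviation

# Two roughly linearly related columns (hypothetical)
colb <- c(1, 2, 3, 4, 5)
cola <- c(2.1, 3.9, 6.2, 7.8, 10.1)

# Pearson correlation test: the estimate and p-value are fields of the result
ct <- cor.test(cola, colb)
r <- ct$estimate
p <- ct$p.value

# Linear model of cola as a function of colb
line <- lm(cola ~ colb)
slope <- coef(line)[["colb"]]

# Scatter plot with the fitted line drawn on top
plot(colb, cola)
abline(line)
```

Printing `ct` or `summary(line)` at the prompt shows the full test and model output.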

## File Handling

• `list.files()` to list files in the current directory
• `source("file.R")` to load a file of code
• `read.csv('mydat.csv')` to load a csv file
• `read.table` to read text data with other separators
• `con<-url("http://google.com","r")` to read a webpage
• `x<-readLines(con)` to convert to a vector of lines
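
The file functions above can be tried without any existing files. This sketch writes made-up data to temporary files first, and uses a `file()` connection instead of `url()` so that it runs offline; the paths and variable names are illustrative only.

```r
# Write made-up data to temporary files so the example is self-contained
csv_path <- tempfile(fileext = ".csv")
tsv_path <- tempfile(fileext = ".txt")
sample_data <- data.frame(weights = c(300, 200, 100),
                          prices = c(9000, 5000, 12000))
write.csv(sample_data, csv_path, row.names = FALSE)
write.table(sample_data, tsv_path, sep = "\t", row.names = FALSE, quote = FALSE)

# read.csv() for comma-separated files
from_csv <- read.csv(csv_path)

# read.table() for other separators; header = TRUE keeps the column names
from_tsv <- read.table(tsv_path, sep = "\t", header = TRUE)

# Open a connection, read it as a vector of lines, then close it.
# file() is used here to stay offline; url() connections are read the same way.
con <- file(csv_path, "r")
x <- readLines(con)
close(con)

# list.files() on the temp directory shows the files just written
written <- list.files(tempdir())
```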