R Cheat Sheet
R is a free software environment for statistical computing and graphics. These notes summarize the free R CodeSchool tutorial.
Basics
- R is the command-line interpreter.
- install.packages("ggplot2") to install additional packages
- Expressions are evaluated and displayed e.g. 1, 1+1, "Hello World"
- Booleans are e.g. 1==1, 3>4, TRUE, T, FALSE, F
- For variable assignment x=1 or x<-1
- For help on a function use help(sum), help(package='ggplot2') or example(sqrt)
- Operations are + - * / = <-
- NA is used to express a missing or unknown data value. Most expressions involving NA return NA.
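The basics above can be tried at the prompt with a short sketch like the following. The variable names are illustrative only, and the na.rm argument of sum() is not covered in these notes; it is shown as one common way to skip missing values.

```r
# Expressions are evaluated and printed at the prompt
1 + 1
"Hello World"

# Comparison uses ==; a single = is assignment
is_equal <- (1 == 1)     # TRUE
is_greater <- (3 > 4)    # FALSE

# Both assignment forms bind a value to a name
x = 1
y <- 2

# Arithmetic involving NA yields NA
z <- NA + 1
is.na(z)                 # TRUE

# sum() returns NA too, unless missing values are dropped
with_na <- sum(c(1, NA, 3))
without_na <- sum(c(1, NA, 3), na.rm = TRUE)   # 4
```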
Vectors
- To create a vector, use the combine command c(4,7,9)
- All elements of a vector must be of the same type; mixed values are coerced to a common type (e.g. to strings).
- a:b creates a vector of integers from a to b.
- seq(a,b,s) creates a vector of numbers from a to b in increments of s.
- myseq[3] to access the third element, i.e. vectors are indexed starting at 1.
- Use a vector as an index to access multiple elements e.g. myseq[c(1,3)]
- The names function can be used to assign names to vector elements. Once names are assigned, they can be used as indices e.g. names(myseq)=c('one','two','three'); myseq['two']
- myseq + 1 adds one to all elements of the myseq vector.
- Scalar operations or functions on vectors typically produce other vectors e.g. +, -, ==, sin(myseq)
- head(myvec), tail(myvec) to show start or end of vector
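A minimal sketch of the vector operations listed above. The values in myseq and the other variable names are made up for illustration.

```r
myseq <- seq(5, 9, 2)            # 5 7 9
ints <- 1:4                      # 1 2 3 4
mixed <- c(1, TRUE, "three")     # all elements coerced to character

third <- myseq[3]                # 9, since indexing starts at 1
picked <- myseq[c(1, 3)]         # 5 9

# Assign names, then index by name
names(myseq) <- c("one", "two", "three")
two <- myseq["two"]              # the element named "two", value 7

# Operations apply element by element and return vectors
plus_one <- myseq + 1            # 6 8 10
is_seven <- myseq == 7           # FALSE TRUE FALSE

head(ints, 2)                    # first two elements
tail(ints, 2)                    # last two elements
```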
Plotting
- barplot(myseq) creates a bar plot of the myseq vector.
- abline(h=y) plots a horizontal line at height y.
- plot(x,y) plots x vs y e.g. x=seq(0,20,.1); y=sin(x); plot(x,y)
- contour(mymat) plots a contour map of a matrix.
- persp(mymat) plots the matrix as a 3D surface in perspective.
- image(volcano) generates a heat map of the matrix.
- qplot(weights, prices, color=types) - more attractive plotting using ggplot2 package.
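The base plotting calls above can be exercised together as in this sketch. The pdf(NULL) line is an assumption added so the script runs without a display and writes no file; at an interactive prompt it can be dropped. volcano is an elevation matrix that ships with R.

```r
pdf(NULL)                  # assumption: discard graphics output

x <- seq(0, 20, 0.1)
y <- sin(x)
plot(x, y)                 # x vs y
abline(h = 0)              # horizontal line at height 0

barplot(c(4, 7, 9))        # one bar per vector element

contour(volcano)           # contour map of the matrix
persp(volcano)             # 3D surface in perspective
image(volcano)             # heat map

invisible(dev.off())       # close the graphics device
```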
Matrices
- matrix(0,3,4) creates a 3×4 matrix with all elements 0.
- matrix(1:12,3,4) creates a 3×4 matrix with numbers 1-12.
- dim(mymatrix) returns the dimensions of a matrix; assigning to it, e.g. dim(mymatrix)=c(4,3), changes them.
- mymatrix[3,4] returns an element of the matrix (row,column).
- mymatrix[,2] returns entire second column.
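A short sketch of the matrix calls above. The names zeros and reshaped are illustrative; note that matrix() fills column by column by default.

```r
zeros <- matrix(0, 3, 4)           # 3 rows, 4 columns, all 0
mymatrix <- matrix(1:12, 3, 4)     # filled column by column

element <- mymatrix[3, 4]          # row 3, column 4: 12
second_col <- mymatrix[, 2]        # 4 5 6

# Assigning to dim() reshapes a vector into a matrix
reshaped <- 1:12
dim(reshaped) <- c(4, 3)           # now 4 rows by 3 columns
```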
Data Sets
- factor is a collection type for categorized values, e.g. myfac=factor(myvec)
- factors group unique string values as levels e.g. levels(myfac) shows unique levels.
- as.integer(myfac) shows levels as integers, which can be used to set the plot symbol type.
- legend("topright", levels(types), pch=1:length(levels(types))) adds a matching legend to the plot.
- A data frame collects sets of related values (i.e. sets of columns with values in the same order) e.g. mydf=data.frame(weights,prices,types)
- To extract a column, use double-square brackets with the column index or name e.g. mydf[['weights']] or just a dollar sign e.g. mydf$prices
- merge merges data sets by joining on shared column names
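The factor and data frame notes above can be sketched as follows. All values, and the origins data set with its origin column, are hypothetical and exist only to give merge something to join.

```r
weights <- c(300, 200, 100, 250, 150)
prices <- c(9000, 5000, 12000, 7500, 18000)
types <- factor(c("gold", "silver", "gems", "gold", "gems"))

levels(types)                  # "gems" "gold" "silver"
codes <- as.integer(types)     # 2 3 1 2 1

mydf <- data.frame(weights, prices, types)
by_name <- mydf[["weights"]]   # column by name
by_dollar <- mydf$prices       # same idea with $

# Hypothetical second data set sharing the "types" column
origins <- data.frame(types = c("gold", "silver", "gems"),
                      origin = c("mine", "mine", "river"))
merged <- merge(mydf, origins) # joins on the shared column "types"
```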
Statistics
- mean(myvec), median(myvec), sd(myvec) compute the mean, median and standard deviation.
- cor.test tests for correlation (Pearson's product-moment)
- line = lm(cola ~ colb) calculates a linear model between cola and colb that can be plotted with abline(line)
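A minimal sketch of the statistics calls above on made-up data. Since lm(cola ~ colb) models cola as a function of colb, colb goes on the x axis. As an assumption, pdf(NULL) is used so the plot needs no display.

```r
cola <- c(3, 1, 4, 1, 5, 9, 2, 6)   # made-up values
colb <- 1:8

avg <- mean(cola)                   # 3.875
mid <- median(cola)                 # 3.5
spread <- sd(cola)

# Pearson's product-moment correlation test
result <- cor.test(cola, colb)
result$estimate                     # correlation coefficient
result$p.value

# Linear model, drawn over a scatter plot
line <- lm(cola ~ colb)
coef(line)                          # intercept and slope

pdf(NULL)                           # assumption: discard graphics output
plot(colb, cola)
abline(line)
invisible(dev.off())
```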
File Handling
- list.files() to list files in current directory
- source("file.R") to load file of code
- read.csv('mydat.csv') to load a csv file
- read.table to read text data with other separators
- con<-url("http://google.com","r") to read a webpage
- x<-readLines(con) to convert to a vector of lines
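The file functions above can be tried without network access using a temporary file, as in this sketch. The file contents are made up, and file() stands in for url() as the connection; readLines() works the same way on either.

```r
# Write a small, made-up CSV into the session's temp directory
path <- tempfile(fileext = ".csv")
writeLines(c("weights,prices", "300,9000", "200,5000"), path)

found <- basename(path) %in% list.files(tempdir())

mydat <- read.csv(path)                              # header in first line
same <- read.table(path, header = TRUE, sep = ",")   # explicit separator

# Read the raw text through a connection, one element per line
con <- file(path, "r")
x <- readLines(con)
close(con)
```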