R is a free software environment for statistical computing and graphics. These notes summarize the free R CodeSchool tutorial.
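R is the command-line interpreter.
install.packages("ggplot2") installs additional packages.
1==1, 3>4, TRUE, T, FALSE, F are logical expressions and values (T and F are shorthand for TRUE and FALSE).
x=1 or x<-1 assigns the value 1 to x.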
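The logical values and the two assignment forms can be tried in a short session. This is only a sketch; the variable names (eq, gt, same, y) are made up for illustration.

```r
# Comparisons evaluate to logical values.
eq <- 1 == 1   # TRUE
gt <- 3 > 4    # FALSE

# T and F are ordinary variables pre-set to TRUE and FALSE,
# so spelling the values out is the safer habit.
same <- identical(T, TRUE)

# Both assignment forms bind the value 1 to a name.
x = 1
y <- 1

print(c(eq, gt, same, x == y))
```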
help(sum), help(package='ggplot2') and example(sqrt) show documentation and usage examples.
+ - * / = <- are the arithmetic and assignment operators.
NA is used to express a missing or unknown data value. Most expressions involving NA return NA.
c(4,7,9) creates a vector from the given values.
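A small sketch of how NA propagates, using a made-up scores vector. The na.rm argument and is.na are standard base R, but they are additions to these notes rather than something the notes cover.

```r
scores <- c(4, NA, 9)                     # hypothetical data, one value missing

na_plus_one <- NA + 1                     # arithmetic on NA gives NA
total <- sum(scores)                      # NA, because one input is unknown
total_known <- sum(scores, na.rm = TRUE)  # drop NA values before summing
missing_at <- is.na(scores)               # which positions are missing

print(total)
print(total_known)
```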
a:b creates a vector of integers from a to b.
seq(a,b,s) creates a vector of numbers from a to b in increments of s.
myseq[3] accesses the third element; vectors are indexed starting at 1.
myseq[c(1,3)] returns the first and third elements.
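The vector constructors and index forms can be combined as follows. The values are arbitrary and chosen only for illustration.

```r
v <- c(4, 7, 9)            # vector from explicit values
ints <- 5:9                # integers 5 through 9
myseq <- seq(5, 9, 0.5)    # 5.0, 5.5, ..., 9.0

third <- myseq[3]          # indexing starts at 1, so this is 6
picked <- myseq[c(1, 3)]   # first and third elements

print(myseq)
print(picked)
```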
The names function can be used to assign names to vector elements. Once names are assigned, they can be used as indices, e.g. names(myseq)=c('one','two','three'); myseq['two']
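A minimal sketch of named indexing, assuming a three-element myseq with made-up values. Assigning through a name is an extra step not mentioned in the notes.

```r
myseq <- c(10, 20, 30)                    # hypothetical values
names(myseq) <- c('one', 'two', 'three')

two <- myseq['two']       # look up an element by name
myseq['three'] <- 35      # names also work on the left-hand side

print(myseq)
```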
myseq + 1 adds one to every element of the myseq vector.
head(myvec), tail(myvec) show the start or end of a vector.
barplot(myseq) creates a bar plot of the myseq vector.
abline(h=y) plots a horizontal line at height y.
plot(x,y) plots y against x, e.g. x=seq(0,20,.1); y=sin(x); plot(x,y)
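The vector arithmetic and basic plotting calls might be used together like this. The data values and the type='l' argument are assumptions added for the example. When run non-interactively, R sends the plots to a file rather than a window.

```r
myseq <- c(3, 5, 8)             # hypothetical data
shifted <- myseq + 1            # element-wise: 4 6 9

barplot(myseq)
abline(h = mean(myseq))         # horizontal line at the mean height

x <- seq(0, 20, 0.1)
y <- sin(x)
plot(x, y, type = 'l')          # type = 'l' draws a line instead of points
abline(h = 0)                   # mark the x axis
```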
contour(mymat) plots a contour map of a matrix.
persp(mymat) plots the matrix as a 3D perspective surface.
image(volcano) generates a heat map of the matrix.
qplot(weights, prices, color=types) gives more attractive plotting using the ggplot2 package.
matrix(0,3,4) creates a 3×4 matrix with all elements 0.
matrix(1:12,3,4) creates a 3×4 matrix with the numbers 1-12, filled column by column.
mymatrix[3,4] returns an element of the matrix (row,column).
mymatrix[,2] returns the entire second column.
factor is a collection type for categorized values, e.g. myfac=factor(myvec)
factors group unique string values as levels, e.g. levels(myfac) shows the unique levels.
as.integer(myfac) shows each element's level as an integer code, which can be used to set the plot symbol, e.g. pch=as.integer(myfac)
legend("topright", levels(types), pch=1:length(levels(types))) adds a matching legend to the plot.
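A sketch tying the matrix and factor items together. The weights, prices and types values are invented. Using the integer codes as plot symbols (pch) is an assumption about what "plot type" refers to.

```r
m <- matrix(1:12, 3, 4)     # filled column by column
elem <- m[3, 4]             # row 3, column 4
col2 <- m[, 2]              # entire second column

# Hypothetical data
types <- factor(c('gold', 'silver', 'gems', 'gold', 'gems'))
weights <- c(300, 200, 100, 250, 150)
prices <- c(9000, 5000, 12000, 7500, 18000)

lv <- levels(types)         # unique values, sorted
codes <- as.integer(types)  # one integer code per element

# One plot symbol per level, with a legend using the same numbering.
plot(weights, prices, pch = codes)
legend("topright", lv, pch = 1:length(lv))
```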
mydf=data.frame(weights,prices,types) creates a data frame from vectors of equal length.
mydf[['weights']] accesses a column by name, or just use a dollar sign, e.g. mydf$prices
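The two column-access forms return the same vector, as this sketch with invented values suggests.

```r
# Hypothetical data
weights <- c(300, 200, 100)
prices <- c(9000, 5000, 12000)
types <- c('gold', 'silver', 'gems')

mydf <- data.frame(weights, prices, types)

w1 <- mydf[['weights']]   # column by name
w2 <- mydf$weights        # same column, dollar-sign shorthand

print(mydf)
```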
merge merges data sets by joining on shared column names.
mean(myvec), median(myvec), sd(myvec) compute the mean, median and standard deviation.
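A minimal sketch of merge and the summary functions. The two data frames (targets, infantry) and their columns are hypothetical; the only shared column name is port, so that is the join key.

```r
# Hypothetical data sets sharing the column 'port'
targets <- data.frame(port = c('Cartagena', 'Porto Bello', 'Havana'),
                      prize = c(35000, 49000, 25000))
infantry <- data.frame(port = c('Porto Bello', 'Cartagena', 'Havana'),
                       troops = c(700, 500, 2000))

both <- merge(targets, infantry)   # rows are matched on 'port'

myvec <- c(2, 4, 4, 6, 9)
m <- mean(myvec)
md <- median(myvec)
s <- sd(myvec)                     # sample standard deviation

print(both)
```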
cor.test tests for correlation (Pearson's product-moment by default).
line = lm(cola ~ colb) fits a linear model of cola on colb that can be plotted with abline(line).
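The correlation test and linear fit might look like this on a small invented data set. The estimate and p.value fields are part of the object cor.test returns; the numbers themselves are made up.

```r
# Hypothetical, roughly linear data
colb <- c(1, 2, 3, 4, 5, 6)
cola <- c(2.1, 3.9, 6.2, 7.8, 10.1, 12.2)

result <- cor.test(cola, colb)
r <- unname(result$estimate)         # Pearson correlation coefficient
p <- result$p.value

line <- lm(cola ~ colb)              # cola modelled as a function of colb
slope <- unname(coef(line)['colb'])

plot(colb, cola)                     # predictor on the x axis
abline(line)                         # overlay the fitted line
```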
list.files() lists the files in the current directory.
source("file.R") loads a file of code.
read.csv('mydat.csv') loads a CSV file.
read.table reads text data with other separators.
con<-url("http://google.com","r") opens a connection to read a webpage.
x<-readLines(con) reads it into a vector of lines.
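A sketch of the file-reading calls that avoids needing a network connection. It writes a temporary CSV first (write.csv and tempfile are scaffolding added here, not part of the notes) and reads lines through a file connection. A url connection is read with readLines in the same way.

```r
# Scaffolding: create a small CSV file to read back (hypothetical contents).
path <- tempfile(fileext = '.csv')
write.csv(data.frame(id = 1:3, value = c(4, 7, 9)), path, row.names = FALSE)

fromcsv <- read.csv(path)                                # comma-separated
fromtable <- read.table(path, sep = ',', header = TRUE)  # explicit separator

con <- file(path, 'r')   # a local stand-in for url("http://...", "r")
x <- readLines(con)      # one vector element per line of text
close(con)

print(fromcsv)
```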