notes:r_cheat_sheet [2014/09/28 23:48] smthng created |
notes:r_cheat_sheet [2015/01/10 15:25] (current) smthng [File Handling] |
[[http://www.r-project.org/ | R]] is a free software environment for statistical computing and graphics. These notes summarize the [[http://tryr.codeschool.com/ | free R CodeSchool tutorial]].
| |
| ===== Basics ===== |
| |
| * ''R'' is the command-line interpreter |
| * ''install.packages("ggplot2")'' installs additional packages; ''library(ggplot2)'' then loads one for use |
| |
| * Expressions are evaluated and displayed e.g. 1, 1+1, "Hello World" |
| * Booleans are e.g. ''1==1'' , ''3>4'' , ''TRUE'' (or ''T''), ''FALSE'' (or ''F''). Equality is tested with ''=='', not ''='' |
| * For variable assignment ''x=1'' or ''x<-1'' |
| * For help on a function use ''help(sum)'' , ''help(package='ggplot2')'' or ''example(sqrt)'' |
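| * Arithmetic operators are ''+ - * /''; assignment uses ''='' or ''<-'' |
| * ''NA'' is used to express a missing or unknown data value. Most expressions on NA return NA; functions such as ''sum'' accept ''na.rm=TRUE'' to skip missing values. |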
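The basics above can be tried directly at the prompt. This is a minimal sketch; the variable names (''x'', ''y'', ''vals'' and so on) are made up for illustration.

```r
# Expressions are evaluated and printed as typed
1 + 1
"Hello World"

# Assignment: both forms bind the value 42 to a name
x <- 42
y = 42

# Comparisons yield logical values (note the double == for equality)
same <- x == y      # TRUE
bigger <- 3 > 4     # FALSE

# NA propagates through most expressions
vals <- c(1, 3, NA, 7)
total <- sum(vals)                      # NA
total_known <- sum(vals, na.rm = TRUE)  # 11, ignoring the missing value
```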
| ===== Vectors ===== |
| * To create a vector, use the combine command ''c(4,7,9)'' |
| * All elements of a vector must be of the same type; mixed values are coerced to a common type (e.g. to strings). |
| * ''a:b'' creates a vector of integers from a to b. |
| * ''seq(a,b,s)'' creates a vector of numbers from a to b in increments of s |
| * ''myseq[3]'' accesses the third element, i.e. vectors are indexed starting at 1. |
| * Use a vector as an index to access multiple elements e.g. ''myseq[c(1,3)]'' |
| * The ''names'' function can be used to assign names to vector elements. Once names are assigned, they can be used as indices e.g. <code> |
| names(myseq)=c('one','two','three') |
| myseq['two']</code> |
| * ''myseq + 1'' adds one to all elements of the myseq vector. |
| * Operators and functions applied to a vector typically work element by element and produce another vector e.g. ''myseq - 1'', ''myseq == 7'', ''sin(myseq)'' |
| * ''head(myvec)'' , ''tail(myvec)'' to show start or end of vector |
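A short sketch that exercises the vector operations listed above. The values in ''myseq'' and the other variable names are chosen only for illustration.

```r
myseq <- seq(5, 9, 2)                     # 5 7 9
names(myseq) <- c("one", "two", "three")

third <- myseq[3]               # 9 (indexing starts at 1)
pair <- myseq[c(1, 3)]          # first and third elements: 5 9
by_name <- myseq["two"]         # 7

plus_one <- myseq + 1           # 6 8 10, element by element
is_seven <- myseq == 7          # FALSE TRUE FALSE

mixed <- c(1, TRUE, "three")    # coerced to character: "1" "TRUE" "three"
first_few <- head(1:100, 3)     # 1 2 3
```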
| |
| ===== Plotting ===== |
| |
| * ''barplot(myseq)'' creates a bar plot of the ''myseq'' vector. ''abline(h=y)'' adds a horizontal line at height y to the current plot. |
| * ''plot(x,y)'' plots x vs y e.g. <code> |
| x=seq(0,20,.1) |
| y=sin(x) |
| plot(x,y) </code> |
| * ''contour(mymat)'' plots a contour map of a matrix. |
| * ''persp(mymat)'' plots the matrix as a 3D surface in perspective. |
| * ''image(mymat)'' generates a heat map of a matrix e.g. ''image(volcano)'' for the built-in ''volcano'' data set. |
| * ''qplot(weights, prices, color=types)'' - more attractive plotting using ggplot2 package. |
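The base plotting calls above can be combined as in the sketch below. It uses only base R graphics and the built-in ''volcano'' matrix; the ''expand'' value passed to ''persp'' is an arbitrary choice to flatten the surface. When run non-interactively, R sends the plots to a file instead of a window.

```r
barplot(c(4, 7, 9))             # one bar per vector element
abline(h = 5)                   # horizontal line at height 5 on the bar plot

x <- seq(0, 20, 0.1)
y <- sin(x)
plot(x, y)                      # scatter plot of the sine curve
abline(h = 0)                   # horizontal reference line at height 0

# volcano is a matrix of elevations that ships with R
contour(volcano)                # contour map
persp(volcano, expand = 0.2)    # 3D perspective view, flattened vertically
image(volcano)                  # heat map
```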
| ===== Matrices ===== |
| |
| * ''matrix(0,3,4)'' creates a 3x4 matrix with all elements 0. |
| * ''matrix(1:12,3,4)'' creates a 3x4 matrix with numbers 1-12. |
| * ''dim(mymatrix)'' returns the dimensions of a matrix; assigning to it, e.g. ''dim(myvec) <- c(2,4)'', reshapes a vector or matrix. |
| * ''mymatrix[3,4]'' returns an element of the matrix (row,column). |
| * ''mymatrix[,2]'' returns the entire second column; ''mymatrix[2,]'' returns the entire second row. |
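A minimal sketch of the matrix operations above. The names ''mymatrix'' and ''myvec'' follow the notes; the other names are illustrative.

```r
mymatrix <- matrix(1:12, 3, 4)   # 3 rows, 4 columns, filled column by column
dims <- dim(mymatrix)            # 3 4
elem <- mymatrix[3, 4]           # 12 (row 3, column 4)
col2 <- mymatrix[, 2]            # 4 5 6
row2 <- mymatrix[2, ]            # 2 5 8 11

# Assigning to dim() reshapes a vector in place
myvec <- 1:8
dim(myvec) <- c(2, 4)            # now a 2x4 matrix
```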
| |
| ===== Data Sets ===== |
| * ''factor'' is a collection type for categorized values - ''myfac=factor(myvec)'' |
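| * A ''factor'' groups the unique string values as levels e.g. ''levels(myfac)'' shows the unique levels. |
| * ''as.integer(myfac)'' returns the integer code of each value's level, which can be used to set the plot symbol e.g. ''plot(weights, prices, pch=as.integer(types))'' |
| * ''legend("topright", levels(types), pch=1:length(levels(types)))'' adds a legend that maps each level to its plot symbol |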
| * A data frame collects sets of related values (i.e. sets of columns with values in the same order) e.g. ''mydf=data.frame(weights,prices,types)'' |
| * To extract a column, use double-square brackets with the column index or name e.g. ''mydf%%[['weights']]%%'' or just a dollar sign e.g. ''mydf$prices'' |
| * ''merge'' merges data sets by joining on shared column names |
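The sketch below ties factors, data frames and ''merge'' together. All of the data is hypothetical, and the ''origin'' data frame with its ''found_in'' column is an assumption added only to have something to join against.

```r
# Hypothetical sample data for illustration
weights <- c(300, 200, 100, 250, 150)
prices  <- c(9000, 5000, 12000, 7500, 18000)
types   <- factor(c("gold", "silver", "gems", "gold", "gems"))

lv <- levels(types)            # "gems" "gold" "silver" (sorted alphabetically)
codes <- as.integer(types)     # 2 3 1 2 1

mydf <- data.frame(weights, prices, types)
w1 <- mydf[["weights"]]        # column by name
p1 <- mydf$prices              # same idea with the dollar sign

# merge joins on the shared column name, here "types"
origin <- data.frame(types = c("gold", "silver", "gems"),
                     found_in = c("mine", "mine", "cave"))
merged <- merge(mydf, origin)  # every row of mydf gains a found_in value
```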
| |
| ===== Statistics ===== |
| |
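| * ''mean(myvec)'', ''median(myvec)'' and ''sd(myvec)'' compute the mean, median and standard deviation of a vector |
| * ''cor.test(cola, colb)'' tests for correlation between two vectors (Pearson's product-moment by default) |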
| * ''line = lm(cola ~ colb)'' calculates a linear model between cola and colb that can be plotted with ''abline(line)'' |
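A small worked example of the statistics functions above. The numbers in ''cola'' and ''colb'' are made up and chosen to be nearly linear.

```r
cola <- c(2.1, 3.9, 6.2, 7.8, 10.1)   # made-up response values
colb <- c(1, 2, 3, 4, 5)              # made-up predictor values

m <- mean(cola)
med <- median(cola)
s <- sd(cola)

ct <- cor.test(cola, colb)    # Pearson's product-moment by default
ct$estimate                   # correlation coefficient
ct$p.value                    # significance of the correlation

fit <- lm(cola ~ colb)        # model cola as a linear function of colb
coef(fit)                     # intercept and slope
plot(colb, cola)
abline(fit)                   # draw the fitted line over the points
```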
| ===== File Handling ===== |
| |
| * ''list.files()'' to list files in current directory |
| * ''source("file.R")'' to load file of code |
| * ''read.csv('mydat.csv')'' to load a csv file |
| * ''read.table'' to read text data with other separators |
| * ''con<-url("http://google.com","r")'' to open a read connection to a webpage |
| * ''x<-readLines(con)'' to read the content into a vector of lines; ''close(con)'' when finished |
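To try the file functions without a network connection, the sketch below writes a small CSV to a temporary file and reads it back. The file contents and variable names are made up; ''file()'' stands in for ''url()'' here, on the assumption that a local connection is easier to experiment with.

```r
# Write a small CSV to a temporary file so the example is self-contained
path <- tempfile(fileext = ".csv")
writeLines(c("name,weight", "gold,300", "silver,200"), path)

dat <- read.csv(path)                                   # header row becomes column names
same_dat <- read.table(path, sep = ",", header = TRUE)  # separator given explicitly

# Connections are read the same way whether they point at a file or a URL
con <- file(path, "r")        # url("http://...", "r") would open a webpage instead
x <- readLines(con)           # character vector, one element per line
close(con)                    # release the connection when done

# list.files() also accepts a directory other than the current one
in_dir <- basename(path) %in% list.files(dirname(path))
```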