Bold: commands/functions in Week 8.

Basic Commands

Command Purpose Example
? pull up a help page ?seq
: colon operator(generate regular sequence) 1.5:4
[ ] subset a vector x[c(2,6,1,3)]
[[ ]] extract an element in a list x[[“bar”]]
or extract a column in a data frame data_frame[[“age”]]
$ extract an element in a list x$bar
or extract a column in a data frame data_frame$age
<- assignment x <- 5 + 7
c concatenate c(1.1, 9, 3.14)
seq, seq_along sequence generation seq(1,10,0.5)
length return the length of a vector length(pi:100)
rep replicate elements of vectors rep(c(1,2,3),5)
print prints its argument print(5:10)
# comment character # ignore the rest of line
class return the class of object class(pi)
as.numeric, as.integer,… explicit coercion as.integer(pi)
matrix create a matrix matrix(1:6, nrow=2, ncol=3)
dim return the dimension attribute of object dim( matrix(1:6, nrow=2, ncol=3) )
cbind, rbind create matrix by column-/row- binding vectors cbind(1:3,4:6)
list create a list list(name=“John”, age=20)
factor create a factor factor(“blue”,“green”,levels=c(“red”,“green”,“blue”))
levels print all levels of a factor levels(x)
is.na, is.nan check if the argument is NA (or NaN) is.na(as.numeric(“abc”))
data.frame create a data frame data.frame(foo = 1:4, bar = c(T, T, F, F))
nrow, ncol return the number of rows/columns of object nrow(data.frame(foo = 1:4, bar = c(T, T, F, F)))
names names of elements of a vector/list names(x)
colnames, rownames column/row names of a matrix/data frame colnames(m)
paste concatenate strings paste(“aa”,“bb”,“cc”,sep=“:”)
sum summation of elements in a vector sum(c(1,5,12))
prod product of elements in a vector prod(c(3,15,-2))
max, min maximum/minimum value in a vector max(c(23,12,12,3))
mean, median mean/median of a vector mean(c(1,5,1,2,7))
sort sorting sort(c(1,5,1,pi,2,7))
var, sd sample variance/standard deviation of a vector sd(c(1,5,1,3,2))
cor correlation cor(c(1,4,6,1), c(2,5,1,9))
which.min, which.max first index of minimum/maximum value which.min(c(4,1,3,4,1))
set.seed set a seed for random number generation set.seed(13218)
identical check if two objects are identical identical(2>1, 5>2)
which which indices are TRUE? which(100:2 > 36)
any any component in a vector is TRUE? any(10:0 < 0)
all all components in a vector are TRUE? all(1:100 > 0)
getwd display working directory getwd()
setwd set working directory setwd(“~/Documents”)
ls list all objects in working directory ls()
rm delete variables from working space rm(x,y), rm(list=ls())
head preview the top of a dataset head(airquality,10)
tail preview the bottom of a dataset  tail(airquality,15)
summary summary of an R object summary(x)
table generate contingency table table(x), table(x,y)
str structure of an R object str(airquality)
args print arguments of a function args(rnorm)
jitter add jitters to data jitter(1:10)
unique remove duplicate elements unique(c(1,2,3,4,3,2,1,0))
gl generate factor levels gl(2,100)

Logical Operators

Operator Name Example
>, < greater/less than 1:5 > seq(0,8,2)
== equality operator 1:5 == seq(0,8,2)
>=, <= greater or equal to/less than or equal to 1:5 <= seq(0,8,2)
!= not equal to 1:5 != seq(0,8,2)
! NOT !(1 > 2)
& AND (3:5 > 5:7) & (4:6 == 4:6)
| OR (3:5 > 5:7) | (4:6 == seq(2,6,2))
&& non-vectorized AND TRUE && c(TRUE,FALSE)
|| non-vectorized OR TRUE || c(TRUE,FALSE)
xor exclusive or xor(5==6, FALSE)
isTRUE Is the argument TRUE? isTRUE(6>4)


Control Structures

Examples can be found on pp. 63-70 of the text.

Command Purpose
if-else testing conditions and acting on it
for execute a loop for a fixed number of times
while execute a loop while the conditions are satisfied
repeat execute a loop indefinitely, must use break to exit the loop
break exit a loop
next skip the rest of the commands and jump to the next iteration


Loop Functions

Command Purpose Example
lapply loop over a list and evaluate function on each element lapply(1:10,sqrt)
sapply same as lapply but try to simplify the result sapply(1:10,sqrt)
vapply same as lapply but specify the output type vapply(1:10,sqrt,numeric(1))
apply apply a function over the margins of an array apply(matrix(1:10,2,5), 2, mean)
mapply multivariate version of lapply mapply(rep,1:4,4:1)
tapply apply a function over a subset of a vector tapply(InsectSprays$count,InsectSprays$spray,mean)
rowSums = apply(x,1,sum) rowSums(matrix(1:10,2,5))
rowMeans = apply(x,1,mean) rowMeans(matrix(1:10,2,5))
colSums = apply(x,2,sum) colSums(matrix(1:10,2,5))
colMeans = apply(x,2,mean) colMeans(matrix(1:10,2,5))


Mathematical Functions

Function Name Example
abs absolute value abs(3-6) = 3
sqrt square root sqrt(16) = 4
^ exponentiation 3^10 = \(3^{10}\) = 59049
exp exponential function exp(1.7) = \(e^{1.7}\) = 5.473947
log log function (base e) log(10) = 2.302585
log10 base 10 log (\(\log_{10}\)) log10(100) = 2
pi mathematical constant \(\pi\) pi = 3.141593
sin, cos, tan trigonometric functions (argument in radians) sin(pi/2) = 1
asin, acos, atan inverse trigonometric functions acos(1) = 0
sinh, cosh, tanh hyperbolic functions cosh(0) = 1
asinh, acosh, atanh inverse hyperbolic functions atanh(tanh(12)) = 12
round(x,n) round x to n decimal places round(pi,2) = 3.14
floor rounds down floor(14.7) = 14
ceiling rounds up ceiling(14.7) = 15


Probability Distributions

Every distribution has four functions. There is a root name, for example, the root name for the normal distribution is norm. This root is prefixed by one of the letters

Distribution Functions
Normal dnorm     pnorm     qnorm     rnorm
\(\chi^2\) dchisq     pchisq     qchisq     rchisq
Student t dt     pt     qt     rt
F df     pf     qf     rf


Reading and Writing Files

Command Purpose Example
read.table, read.csv, … read data from a text file data <- read.table(“foo.txt”)
write.table, write.csv, … write data to a text file write.csv(data,“foo.csv”)
readLines read text file line by line data <- readLines(“foo.txt”)
writeLines write text to file writeLines(“write something…”, “foo.txt”)
save write R objects to a file in binary format save(x,y,z,file=“out.rda”)
load reload datasets written with the function save() load(“out.rda”)
dput write R objects to a file in ASCII text dput(x,“out.R”)
dget load R objects written with dput() x2 <- dget(“out.R”)


Plotting (Base Graphics)

Command Purpose Example
plot plot data plot(dist ~ speed, data=cars)
hist plot a histogram hist(rnorm(1e5),freq=FALSE, breaks=100)
barplot create a barplot barplot(table(mtcars$cyl))
boxplot create boxplot boxplot(mpg~cyl,data=mtcars)
curve plot a function curve(x^2, xlim=x(-1,1))
legend add legend to a plot legend(1,2,c(“text1”,“text2”),col=1:2,lty=1:2)
abline add straight lines to a plot abline(1,2)
density make a density plot plot(density(rnorm(1e5)))
smoothScatter make a smooth scatter plot smoothScatter(x,y)


Plotting (Lattice Graphics)

The lattice library has to be loaded to R using library(lattice) before these commands can be used.

Command Name Example
xyplot make scatter plots xyplot(y ~ x | factor, data=data_frame)
histogram make histograms histogram(~ x | factor, data=data_frame)


Linear Regression

Command Purpose Example
lm perform linear regression lm(y ~ x, data=data_frame)
summary summarize the regression result summary(fit)
confint calculate confidence interval confint(fit, levels=0.9)
abline plot regression line (for simple regression) abline(fit,col=“red”)
predict predict new data predict(fit, newdata=data_frame)
resid get the residuals resid(fit)


Model Formulae

Here are some examples of formulae that can be used in linear regressions and the mdoels they represent.

Formula Model
y ~ x \(y=\beta_0+\beta_1 x\)
y ~ x-1 or y ~ 0+x \(y=\beta_0 x\)
y ~ x1 + x2 \(y=\beta_0 + \beta_1 x1 + \beta_2 x2\)
y ~ x1 + x1:x2 \(y=\beta_0 + \beta_1 x1 + \beta_2 x1\cdot x2\)
y ~ x1*x2 \(y=\beta_0 + \beta_1 x1 + \beta_2 x2 + \beta_3 x1\cdot x2\)
y ~ x1*x2*x3 \(y=\beta_0 + \beta_1 x1 + \beta_2 x2 + \beta_3 x3 + \beta_4 x_1\cdot x_2\)
\(+ \beta_5 x_1\cdot x_3 + \beta_6 x_2 \cdot x_3 + \beta_7 x_1\cdot x_2 \cdot x_3\)
y ~ I(x1+x2) \(y=\beta_0 + \beta_1 (x1+x2)\)
y ~ I(x1 + 4*x1*x2) + I(x2^3) - 1 \(y=\beta_1 (x1+4x1\cdot x2) + \beta_2 x2^3\)