Commands and Functions

Basic Commands

Command	Purpose	Example
?	pull up a help page	?seq
:	colon operator(generate regular sequence)	1.5:4
[ ]	subset a vector	x[c(2,6,1,3)]
[[ ]]	extract an element in a list	x[[“bar”]]
	or extract a column in a data frame	data_frame[[“age”]]
$	extract an element in a list	x$bar
	or extract a column in a data frame	data_frame$age
<-	assignment	x <- 5 + 7
c	concatenate	c(1.1, 9, 3.14)
seq, seq_along	sequence generation	seq(1,10,0.5)
length	return the length of a vector	length(pi:100)
rep	replicate elements of vectors	rep(c(1,2,3),5)
print	prints its argument	print(5:10)
#	comment character	# ignore the rest of line
class	return the class of object	class(pi)
as.numeric, as.integer,…	explicit coercion	as.integer(pi)
matrix	create a matrix	matrix(1:6, nrow=2, ncol=3)
dim	return the dimension attribute of object	dim( matrix(1:6, nrow=2, ncol=3) )
cbind, rbind	create matrix by column-/row- binding vectors	cbind(1:3,4:6)
list	create a list	list(name=“John”, age=20)
factor	create a factor	factor(“blue”,“green”,levels=c(“red”,“green”,“blue”))
levels	print all levels of a factor	levels(x)
is.na, is.nan	check if the argument is NA (or NaN)	is.na(as.numeric(“abc”))
data.frame	create a data frame	data.frame(foo = 1:4, bar = c(T, T, F, F))
nrow, ncol	return the number of rows/columns of object	nrow(data.frame(foo = 1:4, bar = c(T, T, F, F)))
names	names of elements of a vector/list	names(x)
colnames, rownames	column/row names of a matrix/data frame	colnames(m)
paste	concatenate strings	paste(“aa”,“bb”,“cc”,sep=“:”)
sum	summation of elements in a vector	sum(c(1,5,12))
prod	product of elements in a vector	prod(c(3,15,-2))
max, min	maximum/minimum value in a vector	max(c(23,12,12,3))
mean, median	mean/median of a vector	mean(c(1,5,1,2,7))
sort	sorting	sort(c(1,5,1,pi,2,7))
var, sd	sample variance/standard deviation of a vector	sd(c(1,5,1,3,2))
cor	correlation	cor(c(1,4,6,1), c(2,5,1,9))
which.min, which.max	first index of minimum/maximum value	which.min(c(4,1,3,4,1))
set.seed	set a seed for random number generation	set.seed(13218)
identical	check if two objects are identical	identical(2>1, 5>2)
which	which indices are TRUE?	which(100:2 > 36)
any	any component in a vector is TRUE?	any(10:0 < 0)
all	all components in a vector are TRUE?	all(1:100 > 0)
getwd	display working directory	getwd()
setwd	set working directory	setwd(“~/Documents”)
ls	list all objects in working directory	ls()
rm	delete variables from working space	rm(x,y), rm(list=ls())
head	preview the top of a dataset	head(airquality,10)
tail	preview the bottom of a datasetÂ	tail(airquality,15)
summary	summary of an R object	summary(x)
table	generate contingency table	table(x), table(x,y)
str	structure of an R object	str(airquality)
args	print arguments of a function	args(rnorm)
jitter	add jitters to data	jitter(1:10)
unique	remove duplicate elements	unique(c(1,2,3,4,3,2,1,0))
gl	generate factor levels	gl(2,100)
sample	take random samples	sample(LETTERS,10)
sample.int	take samples of integers	sample.int(1000,6)

Logical Operators

Operator	Name	Example
>, <	greater/less than	1:5 > seq(0,8,2)
==	equality operator	1:5 == seq(0,8,2)
>=, <=	greater or equal to/less than or equal to	1:5 <= seq(0,8,2)
!=	not equal to	1:5 != seq(0,8,2)
!	NOT	!(1 > 2)
&	AND	(3:5 > 5:7) & (4:6 == 4:6)
\|	OR	(3:5 > 5:7) \| (4:6 == seq(2,6,2))
&&	non-vectorized AND	TRUE && c(TRUE,FALSE)
\|\|	non-vectorized OR	TRUE \|\| c(TRUE,FALSE)
xor	exclusive or	xor(5==6, FALSE)
isTRUE	Is the argument TRUE?	isTRUE(6>4)

Control Structures

Examples can be found on pp. 63-70 of the text.

Command	Purpose
if-else	testing conditions and acting on it
for	execute a loop for a fixed number of times
while	execute a loop while the conditions are satisfied
repeat	execute a loop indefinitely, must use `break` to exit the loop
break	exit a loop
next	skip the rest of the commands and jump to the next iteration

Loop Functions

Command	Purpose	Example
lapply	loop over a list and evaluate function on each element	lapply(1:10,sqrt)
sapply	same as lapply but try to simplify the result	sapply(1:10,sqrt)
vapply	same as lapply but specify the output type	vapply(1:10,sqrt,numeric(1))
apply	apply a function over the margins of an array	apply(matrix(1:10,2,5), 2, mean)
mapply	multivariate version of lapply	mapply(rep,1:4,4:1)
tapply	apply a function over a subset of a vector	tapply(InsectSprays$count,InsectSprays$spray,mean)
rowSums	= apply(x,1,sum)	rowSums(matrix(1:10,2,5))
rowMeans	= apply(x,1,mean)	rowMeans(matrix(1:10,2,5))
colSums	= apply(x,2,sum)	colSums(matrix(1:10,2,5))
colMeans	= apply(x,2,mean)	colMeans(matrix(1:10,2,5))
replicate	repeated evaluation of an expression	replicate(20,rnorm(10))

Mathematical Functions

Function	Name	Example
abs	absolute value	abs(3-6) = 3
sqrt	square root	sqrt(16) = 4
^	exponentiation	3^10 = $3^{10}$ = 59049
exp	exponential function	exp(1.7) = $e^{1.7}$ = 5.473947
log	log function (base e)	log(10) = 2.302585
log10	base 10 log ($\log_{10}$)	log10(100) = 2
pi	mathematical constant $\pi$	pi = 3.141593
sin, cos, tan	trigonometric functions (argument in radians)	sin(pi/2) = 1
asin, acos, atan	inverse trigonometric functions	acos(1) = 0
sinh, cosh, tanh	hyperbolic functions	cosh(0) = 1
asinh, acosh, atanh	inverse hyperbolic functions	atanh(tanh(12)) = 12
round(x,n)	round x to n decimal places	round(pi,2) = 3.14
floor	rounds down	floor(14.7) = 14
ceiling	rounds up	ceiling(14.7) = 15

Probability Distributions

Every distribution has four functions. There is a root name, for example, the root name for the normal distribution is norm. This root is prefixed by one of the letters

p for “probability”, the cumulative distribution function (cdf)
q for “quantile”, the inverse cdf
d for “density”, the probability density function (pdf)
r for “random”, a random variable having the specified distribution

Distribution	Functions
Normal	dnorm pnorm qnorm rnorm
$\chi^2$	dchisq pchisq qchisq rchisq
Student t	dt pt qt rt
F	df pf qf rf

Reading and Writing Files

Command	Purpose	Example
read.table, read.csv, …	read data from a text file	data <- read.table(“foo.txt”)
write.table, write.csv, …	write data to a text file	write.csv(data,“foo.csv”)
readLines	read text file line by line	data <- readLines(“foo.txt”)
writeLines	write text to file	writeLines(“write something…”, “foo.txt”)
save	write R objects to a file in binary format	save(x,y,z,file=“out.rda”)
load	reload datasets written with the function save()	load(“out.rda”)
dput	write R objects to a file in ASCII text	dput(x,“out.R”)
dget	load R objects written with dput()	x2 <- dget(“out.R”)

Plotting (Base Graphics)

Command	Name	Example
plot	plot data	plot(dist ~ speed, data=cars)
hist	plot a histogram	hist(rnorm(1e5),freq=FALSE, breaks=100)
barplot	create a barplot	barplot(table(mtcars$cyl))
boxplot	create boxplot	boxplot(mpg~cyl,data=mtcars)
curve	plot a function	curve(x^2, xlim=x(-1,1))
legend	add legend to a plot	legend(1,2,c(“text1”,“text2”),col=1:2,lty=1:2)
abline	add straight lines to a plot	abline(1,2)
density	make a density plot	plot(density(rnorm(1e5)))
smoothScatter	make a smooth scatter plot	smoothScatter(x,y)

Plotting (Lattice Graphics)

The lattice library has to be loaded to R using library(lattice) before these commands can be used.

Command	Purpose	Example
xyplot	make scatter plots	xyplot(y ~ x \| factor, data=data_frame)
histogram	make histograms	histogram(~ x \| factor, data=data_frame)

Linear Regression

Command	Purpose	Example
lm	perform linear regression	lm(y ~ x, data=data_frame)
summary	summarize the regression result	summary(fit)
confint	calculate confidence interval	confint(fit, levels=0.9)
abline	plot regression line (for simple regression)	abline(fit,col=“red”)
predict	predict new data	predict(fit, newdata=data_frame)
resid	get the residuals	resid(fit)

Model Formulae

Here are some examples of formulae that can be used in linear regressions and the mdoels they represent.

Formula	Model
y ~ x	$y=\beta_0+\beta_1 x$
y ~ x-1 or y ~ 0+x	$y=\beta_0 x$
y ~ x1 + x2	$y=\beta_0 + \beta_1 x1 + \beta_2 x2$
y ~ x1 + x1:x2	$y=\beta_0 + \beta_1 x1 + \beta_2 x1\cdot x2$
y ~ x1*x2	$y=\beta_0 + \beta_1 x1 + \beta_2 x2 + \beta_3 x1\cdot x2$
y ~ x1x2x3	$y=\beta_0 + \beta_1 x1 + \beta_2 x2 + \beta_3 x3 + \beta_4 x_1\cdot x_2$
	$+ \beta_5 x_1\cdot x_3 + \beta_6 x_2 \cdot x_3 + \beta_7 x_1\cdot x_2 \cdot x_3$
y ~ I(x1+x2)	$y=\beta_0 + \beta_1 (x1+x2)$
y ~ I(x1 + 4x1x2) + I(x2^3) - 1	$y=\beta_1 (x1+4\cdot x2) + \beta_2 x2^3$

Formula	Model
y ~ x	\(y=\beta_0+\beta_1 x\)
y ~ x-1 or y ~ 0+x	\(y=\beta_0 x\)
y ~ x1 + x2	\(y=\beta_0 + \beta_1 x1 + \beta_2 x2\)
y ~ x1 + x1:x2	\(y=\beta_0 + \beta_1 x1 + \beta_2 x1\cdot x2\)
y ~ x1*x2	\(y=\beta_0 + \beta_1 x1 + \beta_2 x2 + \beta_3 x1\cdot x2\)
y ~ x1x2x3	\(y=\beta_0 + \beta_1 x1 + \beta_2 x2 + \beta_3 x3 + \beta_4 x_1\cdot x_2\)
	\(+ \beta_5 x_1\cdot x_3 + \beta_6 x_2 \cdot x_3 + \beta_7 x_1\cdot x_2 \cdot x_3\)
y ~ I(x1+x2)	\(y=\beta_0 + \beta_1 (x1+x2)\)
y ~ I(x1 + 4x1x2) + I(x2^3) - 1	\(y=\beta_1 (x1+4\cdot x2) + \beta_2 x2^3\)

Distribution	Functions
Normal	dnorm pnorm qnorm rnorm
\(\chi^2\)	dchisq pchisq qchisq rchisq
Student t	dt pt qt rt
F	df pf qf rf

Commands and Functions - Weeks 1-9