
R Studio Cheat Sheet

Function
    Code

Combine: create a vector with these values
    name = c(#, #, #, #..., #)

Importing files
    Session > Set Working Directory > Choose Directory
    name = read.table("name.txt", header=TRUE)

Call file (display all data)
    name

Call first ten rows
    name[1:10,]

Call second and fourth columns
    name[,c(2,4)]

General glimpse of data (first six rows)
    head(name)

Table of frequencies comparing two aspects
    tab = table(name$column1, name$column2)

In dataset A, find column B
    A$B

Scatterplot of a column's data (values against index)
    plot(name$columnname)

Labelled scatterplot
    plot(name$columnname, xlab="label", ylab="label", main="title")

Boxplot
    boxplot(name$columnname)

Histogram
    hist(name$columnname)

Barplot of labelled subjects
    x = c(number1, number2)
    names(x) = c("name1", "name2")
    barplot(x)

Clustered bar chart (beside makes it clustered)
    barplot(tab, beside=TRUE, legend=TRUE)

Mean (name is a numeric vector or column, e.g. name$columnname)
    mean(name)

Standard deviation
    sd(name)

First quartile (similar percentages for others)
    quantile(name, 0.25)

Five-number summary (+ mean)
    summary(name)

Correlation coefficient (numeric comparison)
    cor(name$column1, name$column2)

Least squares regression line (lm = linear model)
    model = lm(column1~column2, data=name)
    model
    Produces the Y-intercept (Intercept) and the gradient (column2)

Log transformation
    log.data = log10(data)
    (also sqrt(data), log(data) for base e, 1/data)
    hist(log.data)     produces a histogram of the transformed data

Compare two graphs
    par(mfrow=c(2,1))
    hist(data1)
    hist(data2)

Randomise order of subjects
    randomnumbers = runif(6)
    sort(randomnumbers, index.return=TRUE)
    -OR-
    names = c(1, 2, 3, "David")
    ran = runif(6)     random, uniform: one number per subject
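
The entries above can be tried without a data file. The following is a minimal sketch, not the author's own example: it assumes the built-in `mtcars` data frame as a stand-in for an imported file, and the subject names are hypothetical.

```r
# Stand-in for an imported file: the built-in mtcars data frame (an assumption).
name <- mtcars

first_ten <- name[1:10, ]        # first ten rows
cols_2_4  <- name[, c(2, 4)]     # second and fourth columns (cyl, hp)

# Frequency table comparing two aspects: cylinders against gears
tab <- table(name$cyl, name$gear)

# Numerical summaries of one column
m  <- mean(name$mpg)
s  <- sd(name$mpg)
q1 <- quantile(name$mpg, 0.25)

# Correlation and least squares line (mpg explained by wt)
r     <- cor(name$mpg, name$wt)
model <- lm(mpg ~ wt, data = name)
coef(model)                      # (Intercept) and the gradient for wt

# Randomise the order of subjects: one uniform random number per subject
subjects <- c("Ann", "Ben", "Cat", "Dan", "Eve", "Fay")  # hypothetical names
ran      <- runif(length(subjects))
sort.ran <- sort(ran, index.return = TRUE)
shuffled <- subjects[sort.ran$ix]
```

Drawing `length(subjects)` random numbers rather than a fixed count keeps the index vector the same length as the list being shuffled.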

Oliver Bogdanovski

Randomise order of subjects (continued)
    sort.ran = sort(ran, index.return=TRUE)
    names[sort.ran$ix]

Probability to left of point on normal distribution, P(Z ≤ z)
    pnorm(z)               e.g. pnorm(1.96) for standardised
    pnorm(z, mu, sigma)    for unstandardised

Point (quantile): given the probability, find the z-value, i.e. find c in P(Z ≤ c) = p given p
    qnorm(p)               e.g. qnorm(0.975) for P(Z ≤ c) = 0.975
    qnorm(p, mu, sigma)    unstandardised

Create a normal quantile plot
    qqnorm(data$column)

Find how many values in data are greater than a particular number n
    sum(dataname>n)
    Also: ==, !=, <, <=, >=

Generate normally distributed numbers
    rnorm(k, mu, sigma)    k = number of values
    rnorm(100, 3, 2)       100 values

Generate binomially distributed numbers
    rbinom(k, n, p)

Note: we can combine whole datasets (of equal n) by summing the two and dividing it all by 2

Find P(X = k) where X~B(n,p)
    dbinom(k, n, p)

Find P(X ≤ k) where X~B(n,p)
    pbinom(k, n, p)        (do 1-pbinom for strictly >)

Create range of values from x to y increasing by 1
    name = x:y

To create probability histogram of binomial distribution
(space=0 means no gaps between bars; names.arg gives x-values)
    xbin = 0:24            creates set of numbers
    fbin = dbinom(xbin, n, p)    n must be bigger than 24 in this case, doubled is best for normal
    barplot(fbin, names.arg=xbin, space=0)

Overlay normal curve
    lines(xbin, dnorm(xbin, 12, 3))    note the mean, 12, is half of 24

Hypothesis tests using z statistic (with σ)
    set up values (m, mu.0, sigma, n)
    z = (m - mu.0)/(sigma/sqrt(n))
    pnorm(z)

Confidence interval
    z.star = qnorm(0.975)  for 95% confidence
    use rest of formula

Find P(T ≤ x) in T~t(k), k degrees of freedom
    pt(x, k)               works like pnorm, also probability to left

Find P(T > x) in T~t(k)
    1-pt(x, k)

Find c in P(T ≤ c) = p in T~t(k)
    qt(p, k)               e.g. to find upper 0.5%, use qt(0.995, k)
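
As a worked illustration of the distribution functions above, here is a hedged sketch. All numbers are hypothetical, and the choice n = 48, p = 0.25 is an assumption made only because it gives mean 12 and standard deviation 3, matching the `dnorm(xbin, 12, 3)` overlay. The interval uses the usual known-sigma form, mean ± z.star × sigma/sqrt(n).

```r
# Normal distribution: probability to the left, and its inverse
p_left <- pnorm(1.96)            # P(Z <= 1.96)
z.star <- qnorm(0.975)           # the point c with P(Z <= c) = 0.975

# Binomial: X ~ B(48, 0.25) has mean 12 and sd 3 (assumed example values)
n <- 48; p <- 0.25
xbin <- 0:24
fbin <- dbinom(xbin, n, p)       # P(X = k) for k = 0..24
p_le_12 <- pbinom(12, n, p)      # P(X <= 12)
p_gt_12 <- 1 - pbinom(12, n, p)  # P(X > 12)

# z test with known sigma (all numbers hypothetical)
m <- 10.4; mu.0 <- 10; sigma <- 2; n.obs <- 64
z <- (m - mu.0) / (sigma / sqrt(n.obs))
p_upper <- 1 - pnorm(z)          # one-sided P-value for Ha: mu > mu.0

# 95% confidence interval for mu
ci <- m + c(-1, 1) * z.star * sigma / sqrt(n.obs)

# t distribution with k degrees of freedom
k <- 9
t_crit <- qt(0.995, k)           # cuts off the upper 0.5%
```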

Hypothesis Testing (with t-tests)

if Ha: μ > μ0 (one-sided)
    t.test(sampledata, mu=mu.0, alternative="greater")

if Ha: μ < μ0 (one-sided)
    t.test(sampledata, mu=mu.0, alternative="less")

if Ha: μ ≠ μ0 (two-sided)
    t.test(sampledata, mu=mu.0)

if Ha: μ ≠ μ0 (two-sided), choosing a 98% CI
    t.test(sampledata, conf.level=0.98)
    no mu needed, as it only affects the P-value, not the confidence interval, if that's all you're interested in
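
The t-test calls above can be checked against a hand calculation with `pt()`. This is a small sketch using a hypothetical sample and a hypothetical value for `mu.0`.

```r
# Hypothetical sample; mu.0 is the hypothesised mean
sampledata <- c(5.1, 4.9, 5.6, 5.8, 5.3, 5.0, 5.7, 5.4)
mu.0 <- 5

res.greater <- t.test(sampledata, mu = mu.0, alternative = "greater")
res.two     <- t.test(sampledata, mu = mu.0)

# The one-sided P-value reproduced by hand with pt()
n.obs  <- length(sampledata)
t.stat <- (mean(sampledata) - mu.0) / (sd(sampledata) / sqrt(n.obs))
p.hand <- 1 - pt(t.stat, df = n.obs - 1)

# 98% confidence interval; mu is not needed for the interval itself
ci98 <- t.test(sampledata, conf.level = 0.98)$conf.int
```

Because the sample mean here lies above `mu.0`, the two-sided P-value is twice the one-sided value.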

if using a parameter other than mu, like p, we need to do it manually
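
One way to do a test on a proportion p manually is the usual large-sample z test. The sketch below is an assumption about what "manually" could look like, not the author's method, and the counts are hypothetical.

```r
# Hypothetical data: 58 successes in 100 trials, testing H0: p = 0.5
x <- 58; n <- 100; p.0 <- 0.5
p.hat <- x / n

# z statistic, with the standard error computed under H0
z <- (p.hat - p.0) / sqrt(p.0 * (1 - p.0) / n)

p.greater  <- 1 - pnorm(z)              # Ha: p > p.0
p.less     <- pnorm(z)                  # Ha: p < p.0
p.twosided <- 2 * (1 - pnorm(abs(z)))   # Ha: p != p.0

# 98% confidence interval for p (standard error from p.hat)
z.star <- qnorm(0.99)
ci <- p.hat + c(-1, 1) * z.star * sqrt(p.hat * (1 - p.hat) / n)
```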
