Sample Project Report Structure

Assignment: Statistical Inference Course Project
Rodrigo Farruguia
April 1, 2016
Overview:
In this project you will investigate the exponential distribution in R and compare it with the Central Limit Theorem. The exponential distribution can be
simulated in R with rexp(n, lambda) where lambda is the rate parameter. The
mean of the exponential distribution is 1/lambda and the standard deviation is also
1/lambda. Set lambda = 0.2 for all of the simulations. You will investigate the
distribution of averages of 40 exponentials. Note that you will need to do a
thousand simulations.
Simulations
We draw 1000 samples of size 40 from an exponential distribution with lambda = 0.2,
using rexp. For this distribution, 1/lambda is both the mean and the standard
deviation.
# set the seed for reproducibility
set.seed(304)
# Variables taken from the overview.
lambda <- 0.2 # rate parameter used in all simulations
n <- 40 # number of exponentials per sample
numsims <- 1000 # number of simulations
Let's look at the means: sample and theoretical.

With a sample size of n, the theoretical mean of the sample averages is
$\mu_{\bar{X}} = 1/\lambda = 5$. The sample and theoretical means listed next are very close.
Mean
Mean from samples: 5.013
Theoretical mean: 5.000
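To put a number on "very close", one option is to express the gap between the two means in units of the simulation's standard error. The sketch below is only an illustration: it uses the values reported above, assumes the 1000 simulations are independent, and the names mc_se, gap and gap_in_se are made up for this example.

```r
# Values taken from the report; the standard-error formula assumes
# independent simulations (an assumption added for this illustration).
lambda  <- 0.2
n       <- 40
numsims <- 1000

theoretical_mean <- 1 / lambda   # 1 / 0.2
sample_mean      <- 5.013        # mean from samples, as reported

# The sd of one sample mean is (1/lambda) / sqrt(n); averaging numsims of
# them divides the sd again by sqrt(numsims).
mc_se <- (1 / lambda) / sqrt(n) / sqrt(numsims)

gap       <- sample_mean - theoretical_mean
gap_in_se <- gap / mc_se

print(round(c(gap = gap, mc_se = mc_se, gap_in_se = gap_in_se), 3))
```

Under these assumptions the gap is about half a standard error, which is well within what random variation alone would produce.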
The histogram shows the distribution of our sample means. The vertical lines
mark the mean of the sample means and the theoretical mean. Given the values
above, the two lines almost overlap each other in the graph.
[Figure: Histogram of the Sample Means (30 bins); x-axis: Sample Mean, y-axis: Frequency]
Let's now look at the variance: sample and theoretical.

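The actual variance is the variance of the simulated sample means. The
theoretical variance is the theoretical standard deviation 1/lambda (which for
the exponential distribution equals the theoretical mean) raised to the second
power, divided by the number of exponentials in each sample:
$\operatorname{Var}(\bar{X}) = \sigma^2 / n$, or equivalently
$\sigma^2 = \operatorname{Var}(\bar{X}) \cdot n$. As expected, the two values are very close.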
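The theoretical value can be worked out step by step. This is a short derivation, assuming the n = 40 exponentials in each sample are independent draws with variance $\sigma^2 = (1/\lambda)^2$ and lambda = 0.2 as in the overview (the \operatorname macro assumes the amsmath package):

```latex
\[
\operatorname{Var}(\bar{X})
  = \operatorname{Var}\!\left(\frac{1}{n}\sum_{i=1}^{n} X_i\right)
  = \frac{1}{n^{2}} \cdot n\,\sigma^{2}
  = \frac{\sigma^{2}}{n}
  = \frac{(1/\lambda)^{2}}{n}
  = \frac{25}{40}
  = 0.625
\]
```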
Variance
Variance from the sample: 0.637
Theoretical variance: 0.625
Do we have a normal distribution?
By the Central Limit Theorem, the averages of the samples should approximately
follow a normal distribution. We check this by plotting the distribution of the
sample means and comparing it with the normal distribution.
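The comparison can also be sketched with base R graphics and backed by a simple numeric check. This is only an illustration, not the code used for the report's figure: it assumes the normal curve has mean 1/lambda and standard deviation (1/lambda)/sqrt(n), and the names sample_means, mu, se and coverage are made up for this example.

```r
set.seed(304)                     # same seed as the report
lambda  <- 0.2
n       <- 40
numsims <- 1000

# One row per simulation, then average the 40 exponentials in each row.
sample_means <- rowMeans(matrix(rexp(numsims * n, rate = lambda), numsims, n))

mu <- 1 / lambda                  # assumed centre of the normal curve
se <- (1 / lambda) / sqrt(n)      # assumed spread of the normal curve

# Density-scaled histogram with the normal curve overlaid.
hist(sample_means, breaks = 40, freq = FALSE,
     main = "Sample means with normal curve", xlab = "Sample mean")
curve(dnorm(x, mean = mu, sd = se), add = TRUE, col = "red", lwd = 2)

# Share of sample means within 1.96 standard errors of mu; this should be
# close to 0.95 if the normal approximation is reasonable.
coverage <- mean(abs(sample_means - mu) <= 1.96 * se)
print(coverage)
```

A coverage near 0.95 supports what the plot suggests visually, although it only checks one aspect of normality.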
[Figure: Histogram, sample means fitting normal curve; x-axis: Sample mean, y-axis: Frequency]
[Figure: Normal probability plot; x-axis: theoretical, y-axis: sample]
Let's see how the sample means plot against the quantiles of the theoretical
normal distribution. The points fall approximately on a straight line, so the
distribution of the sample means is approximately normal.
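The reference line in such a plot can be drawn through the first and third quartiles of the data. A minimal base R sketch of that idea follows. It uses qqnorm rather than ggplot2, and the names sample_means, slope and intercept are made up for this example.

```r
set.seed(304)                     # same seed as the report
lambda  <- 0.2
n       <- 40
numsims <- 1000

# One row per simulation, then average the 40 exponentials in each row.
sample_means <- rowMeans(matrix(rexp(numsims * n, rate = lambda), numsims, n))

# Line through the first and third quartiles of sample vs. normal quantiles.
y <- quantile(sample_means, c(0.25, 0.75))
x <- qnorm(c(0.25, 0.75))
slope     <- unname(diff(y) / diff(x))
intercept <- unname(y[1] - slope * x[1])

qqnorm(sample_means, main = "Normal probability plot")
abline(a = intercept, b = slope, col = "red")  # same line as qqline(sample_means)
```

If the sample means are close to normal, the slope of this line should land near their standard deviation and the intercept near their mean.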
APPENDIX with code used to generate supporting graphs and calculations

### libraries
library(ggplot2) # plotting
library(knitr) # kable(), for rendering tables in the document
### means
experimentaldist <- matrix(data=rexp(n= numsims*n,rate=lambda), numsims,n )
experimentalmean <- rowMeans(experimentaldist)
actualmean <- mean(experimentalmean)
theoreticalmean <- 1/ lambda
actualvariance <- var(experimentalmean)
theoreticalvariance <- (1/ lambda)^2 /n
### first table
r1 <-data.frame("Mean"=c(actualmean,theoreticalmean),
row.names = c("Mean from samples ","Theoretical mean"))
kable(x = round(r1,3),align = 'c')

### graph 1 histogram
experimentalmeandata <- as.data.frame(experimentalmean)
ggplot(experimentalmeandata, aes(experimentalmean))+
geom_histogram(bins= 40, alpha=.5, position="identity", fill="green", col="black")+
geom_vline(xintercept = theoreticalmean, col="blue", linetype = "longdash",show.legend=TRUE)+
geom_vline(xintercept = actualmean, col="red", linetype = "longdash", show.legend =TRUE)+
ggtitle ("Histogram of the Sample Means (30 bins)")+
xlab("Sample Mean")+
ylab("Frequency")
### second table
r2 <-data.frame("Variance"=c(actualvariance, theoreticalvariance),
row.names = c("Variance from the sample ","Theoretical variance"))
kable(x = round(r2,3),align = 'c')
### graph 2 histogram

ggplot(experimentalmeandata, aes(experimentalmean))+
geom_histogram(aes(y=..density..),bins = 40, alpha=.5, position="identity", fill="green", col="b
geom_density(col="brown", size=1)+
stat_function(fun = dnorm, col = "red", args = list(mean = theoreticalmean, sd = sqrt(theoretica
ggtitle ("Histogram, sample means fitting normal curve ")+
xlab("Sample mean")+
ylab("Frequency")
### graph 3 line plot

qqplot.data <- function (vec) # argument: vector of numbers
{
y <- quantile(vec[!is.na(vec)], c(0.25, 0.75))
x <- qnorm(c(0.25, 0.75))
slope <- diff(y)/diff(x)
int <- y[1L] - slope * x[1L]
d <- data.frame(resids = vec)
ggplot(d, aes(sample = resids)) + stat_qq(col="blue") + geom_abline(slope = slope, intercept = int, co
}
qqplot.data (experimentalmean) +ggtitle ("Normal probability plot ")
