
Bootstrap Simulation

Definition
• A method for assigning measures of accuracy to sample estimates
• Allows estimation of the sampling distribution of almost any statistic using only very simple methods
• Belongs to the broader class of resampling methods
Situations where bootstrapping is useful
• When the theoretical distribution of a statistic of interest is complicated or unknown
• When the sample size is insufficient for straightforward statistical inference
• When power calculations have to be performed and a small pilot sample is available
Types of bootstrap scheme
Case Resampling
• The bootstrap is generally useful for estimating the distribution of a statistic (e.g. mean, variance) without relying on normal theory (e.g. z-statistic, t-statistic)
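As a minimal sketch of case resampling for a statistic that has no simple normal-theory formula, the R code below bootstraps the sample median. The helper name `boot_stat`, the seed, and the simulated sample `x` are hypothetical choices made for illustration; none of them come from the slides.

```r
# Hypothetical helper: k bootstrap replicates of an arbitrary statistic
boot_stat <- function(x, stat, k) {
  # each replicate: resample x with replacement (same size), then apply stat
  replicate(k, stat(sample(x, length(x), replace = TRUE)))
}

set.seed(1)                        # assumption: fixed seed for reproducibility
x <- rexp(30, rate = 1)            # assumption: made-up skewed sample of size 30
med_star <- boot_stat(x, median, 2000)

sd(med_star)                       # bootstrap standard error of the median
quantile(med_star, c(0.025, 0.5, 0.975))  # summary of the bootstrap distribution
```

The same helper works for other statistics by swapping `median` for, say, `var` or `mean`.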
Illustration 1
• A coin is flipped 10 times, giving the observations $x_1, x_2, \ldots, x_{10}$ below
0 0 1 0 1 1 1 1 1 1
(1 → head)
Use the bootstrap method to create an empirical bootstrap distribution of the sample mean.
Illustration 1
• Algorithm
1. Resample the data with replacement; the resample must be the same size as the original data set.
2. Compute the sample mean $\hat{\mu}$ of the resample.
3. Repeat steps 1 and 2 $k$ times.
4. Plot a histogram of $\hat{\mu}_i$, $i = 1, 2, \ldots, k$.
Illustration 1
Simulation in R
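# Draw k bootstrap resamples of `data` and plot the distribution of their means
bt <- function(data, k) {
  n <- length(data)
  # n x k matrix of indices drawn uniformly from 1..n, with replacement
  y <- matrix(sample.int(n, n * k, replace = TRUE), n, k)
  # column i of A holds the i-th bootstrap resample
  A <- matrix(data[y], n, k)
  mu <- apply(A, 2, mean)   # one bootstrap mean per column
  print(mu)
  hist(mu)
}
a <- c(0, 0, 1, 0, 1, 1, 1, 1, 1, 1)
bt(a, 100)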
Illustration 1
[Figure: Histogram of mu. x-axis: mu (0.3 to 1.0); y-axis: Frequency (0 to 300)]
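Because each resampled value equals 1 with probability 7/10 (the observed share of heads), $n$ times a bootstrap mean follows a Binomial(10, 0.7) distribution. The sketch below compares simulated bootstrap means with that exact distribution. The seed and the number of replicates `k` are assumptions chosen for illustration, not values taken from the slides.

```r
set.seed(123)                            # assumption: arbitrary seed
a <- c(0, 0, 1, 0, 1, 1, 1, 1, 1, 1)     # coin-flip data from Illustration 1
n <- length(a)
k <- 10000                               # assumption: number of bootstrap replicates

# simulated bootstrap means
mu <- replicate(k, mean(sample(a, n, replace = TRUE)))

# exact bootstrap distribution: n * mu ~ Binomial(n, mean(a))
support   <- (0:n) / n
exact     <- dbinom(0:n, size = n, prob = mean(a))
simulated <- as.vector(table(factor(round(mu * n), levels = 0:n))) / k

round(cbind(mu = support, exact = exact, simulated = simulated), 3)

# theoretical versus simulated mean and standard deviation
c(mean = mean(a),  sd = sqrt(mean(a) * (1 - mean(a)) / n))
c(mean = mean(mu), sd = sd(mu))
```

The two rows of summary statistics should agree closely, and the agreement tightens as `k` grows.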
Illustration 2
• Using the result of Illustration 1, create a bootstrap confidence interval.
Illustration 2
• Algorithm
1. Resample the data with replacement; the resample must be the same size as the original data set.
2. Compute the sample mean $\hat{\mu}$ of the resample.
3. Repeat steps 1 and 2 $k$ times.
4. Create the confidence interval for the proportion from $\hat{\mu}_i$, $i = 1, 2, \ldots, k$.
Illustration 2
Simulation in R
Confidence interval for a proportion
$$\hat{p} - Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} < p < \hat{p} + Z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}$$

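# Normal-approximation confidence interval for a proportion
CI <- function(data, alpha) {
  n <- length(data)
  pduga <- mean(data)                        # pduga: estimated proportion
  se <- sqrt((pduga * (1 - pduga)) / n)      # standard error of the proportion
  z <- qnorm(alpha / 2, lower.tail = FALSE)  # upper alpha/2 quantile, about 1.96 for alpha = 0.05
  c(pduga - z * se, pduga + z * se)
}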
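As a quick check of the formula, the following self-contained sketch applies it directly to the ten original coin flips, without any resampling. The helper name `wald_ci` is hypothetical and is used only so the block runs on its own.

```r
# Hypothetical helper: normal-approximation (Wald) interval for a proportion
wald_ci <- function(x, alpha) {
  n     <- length(x)
  p_hat <- mean(x)
  se    <- sqrt(p_hat * (1 - p_hat) / n)
  z     <- qnorm(1 - alpha / 2)            # about 1.96 when alpha = 0.05
  c(lower = p_hat - z * se, upper = p_hat + z * se)
}

a  <- c(0, 0, 1, 0, 1, 1, 1, 1, 1, 1)      # coin-flip data from Illustration 1
ci <- wald_ci(a, 0.05)
ci                                         # roughly 0.416 and 0.984
```

With only 10 observations the normal approximation is rough, which is one reason to look at a bootstrap interval as well.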
Illustration 2
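# Bootstrap the sample mean k times, then apply the interval function f to the means
bt2 <- function(data, k, f, alpha) {
  n <- length(data)
  # n x k matrix of indices drawn uniformly from 1..n, with replacement
  y <- matrix(sample.int(n, n * k, replace = TRUE), n, k)
  A <- matrix(data[y], n, k)   # column i is the i-th bootstrap resample
  mu <- apply(A, 2, mean)      # k bootstrap means
  # f receives the k bootstrap means, so inside CI the sample size n equals k
  hasil <- f(mu, alpha)        # hasil: the resulting interval
  print(hasil)
}
a <- c(0, 0, 1, 0, 1, 1, 1, 1, 1, 1)
bt2(a, 100, CI, 0.05)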
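For comparison, two other widely used ways to turn bootstrap means into an interval are the percentile interval and the normal interval based on the bootstrap standard error. The sketch below shows both. These are swapped-in techniques rather than the method used in the slides, and the helper name `boot_ci`, the seed, and `k = 2000` are assumptions for illustration.

```r
# Hypothetical helper: two common bootstrap intervals for the mean of x
boot_ci <- function(x, k, alpha) {
  n  <- length(x)
  mu <- replicate(k, mean(sample(x, n, replace = TRUE)))  # k bootstrap means
  z  <- qnorm(1 - alpha / 2)
  list(
    # percentile interval: empirical alpha/2 and 1 - alpha/2 quantiles of the means
    percentile = unname(quantile(mu, c(alpha / 2, 1 - alpha / 2))),
    # normal interval: original estimate +/- z * bootstrap standard error
    normal     = mean(x) + c(-1, 1) * z * sd(mu)
  )
}

set.seed(42)                             # assumption: arbitrary seed
a   <- c(0, 0, 1, 0, 1, 1, 1, 1, 1, 1)   # coin-flip data from Illustration 1
res <- boot_ci(a, 2000, 0.05)
res
```

Unlike a formula that divides by the number of replicates, the width of these intervals reflects the original sample size and does not shrink as `k` increases.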
Illustration 3
• Using Illustrations 1 and 2, test the hypothesis $p = 0.4$.
• $H_0: p = 0.4$ vs $H_1: p \neq 0.4$
Illustration 3
• Algorithm
1. Resample the data with replacement; the resample must be the same size as the original data set.
2. Compute the sample mean $\hat{\mu}$ of the resample.
3. Repeat steps 1 and 2 $k$ times.
4. Create the confidence interval for the proportion from $\hat{\mu}_i$, $i = 1, 2, \ldots, k$.
5. If the hypothesized value $p = 0.4$ lies inside the interval, do not reject $H_0$; otherwise reject $H_0$.
Illustration 3
Simulation in R
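# Bootstrap the mean, build an interval with f, and test H0: p = h0
bt3 <- function(data, k, f, alpha, h0) {
  n <- length(data)
  # n x k matrix of indices drawn uniformly from 1..n, with replacement
  y <- matrix(sample.int(n, n * k, replace = TRUE), n, k)
  A <- matrix(data[y], n, k)   # column i is the i-th bootstrap resample
  mu <- apply(A, 2, mean)      # k bootstrap means
  # f receives the k bootstrap means, so inside CI the sample size n equals k
  hasil <- f(mu, alpha)        # hasil: the resulting interval
  # do not reject H0 when h0 lies inside the interval
  conc <- ifelse(h0 >= hasil[1] & h0 <= hasil[2], "Do not reject H0", "Reject H0")
  list(CI = hasil, Conc = conc)
}
a <- c(0, 0, 1, 0, 1, 1, 1, 1, 1, 1)
bt3(a, 100, CI, 0.05, 0.4)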
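The decision step can be isolated in a few lines. The sketch below is a minimal version of the rule, with a hypothetical helper name `decide_h0` and two made-up intervals chosen only to show both outcomes.

```r
# Hypothetical helper: decision for H0: p = h0 given an interval c(lower, upper)
decide_h0 <- function(h0, ci) {
  if (h0 >= ci[1] && h0 <= ci[2]) "Do not reject H0" else "Reject H0"
}

# assumption: illustrative intervals, not results from the slides
decide_h0(0.4, c(0.45, 0.95))   # interval excludes 0.4
decide_h0(0.4, c(0.35, 0.95))   # interval contains 0.4
```

Failing to reject $H_0$ does not show that $H_0$ is true; it only means the data are compatible with $p = 0.4$ at the chosen level.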
Thank you 
