
UNIVERSITY OF EXETER

COLLEGE OF ENGINEERING, MATHEMATICS, AND PHYSICAL SCIENCES


ECM3728 Statistical Inference

Solutions to exercises for section 3

Preparatory exercises
1. We have x̄ = 5/10 = 0.5 and φ̂ = x̄² = 0.25. For the parametric bootstrap, we estimate the Ber(θ)
distribution by the Ber(θ̂) distribution, where θ̂ = x̄ = 0.5. The bootstrap version of the statistic
T = φ̂ − φ = X̄² − θ² is T* = X̄*² − θ̂², where X₁*, . . . , Xₙ* are independent Ber(θ̂) random variables.
The bias-corrected estimate is φ̂ − E(T*) and the estimated standard error is the square root of var(T*).
We approximate these quantities by resampling in the following code to obtain the bias-corrected
estimate 0.22 and estimated standard error 0.16.

x=c(0,0,0,0,0,1,1,1,1,1)
B=10000
n=length(x)
theta.hat=mean(x)
phi.hat=mean(x)^2
t.star=numeric(B)
for(b in 1:B) {
x.star=rbinom(n,1,theta.hat) # resample
t.star[b]=mean(x.star)^2-phi.hat
}
phi.hat-mean(t.star) # bias-corrected estimate
sd(t.star) # estimated standard error

Nonparametric resampling replaces line 8 with x.star=sample(x,replace=TRUE) and yields the same
results. (In this example, nonparametric resampling is identical to parametric resampling because both
correspond to Xᵢ* being 0 or 1 with probability 0.5.)
2. (a) We have T* = Σᵢ Xᵢ*²/n, where X₁*, . . . , Xₙ* are independent with common mass function
Pr(X₁* = xᵢ) = 1/n for i = 1, . . . , n because we are using the nonparametric bootstrap. Therefore,

   E(T*) = (1/n) Σᵢ E(Xᵢ*²) = E(X₁*²) = Σᵢ xᵢ² Pr(X₁* = xᵢ) = (1/n) Σᵢ xᵢ².

We are told that T is an estimator for the variance of the Xᵢ and so the bias of T is E(T) − var(X₁).
The bootstrap estimate of this bias is E(T*) − var(X₁*). We have

   var(X₁*) = E(X₁*²) − E(X₁*)² = E(T*) − {Σᵢ xᵢ Pr(X₁* = xᵢ)}² = E(T*) − x̄²

and so the estimated bias is E(T*) − {E(T*) − x̄²} = x̄², as required.

(b) The nonparametric bootstrap estimate of the variance of T is
   var(T*) = (1/n²) Σᵢ var(Xᵢ*²)
           = (1/n) var(X₁*²)
           = (1/n) {E(X₁*⁴) − E(X₁*²)²}
           = (1/n) {(1/n) Σᵢ xᵢ⁴ − ((1/n) Σᵢ xᵢ²)²}
           = (1/n²) {Σᵢ xᵢ⁴ − (1/n)(Σᵢ xᵢ²)²}
and the estimated standard error is the square root of this expression, as required.
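As a quick check (not part of the original solution), the following R sketch compares both closed-form expressions with direct resampling; the data vector here is an arbitrary illustrative choice.

x=c(1.2,-0.7,0.3,2.1,-1.5)                         # illustrative data, chosen only for this check
B=10000
n=length(x)
t.star=replicate(B,mean(sample(x,replace=TRUE)^2)) # bootstrap versions of T
mean(t.star)-(mean(x^2)-mean(x)^2)                 # resampling estimate of the bias
mean(x)^2                                          # closed form x.bar^2 from part (a)
sd(t.star)                                         # resampling estimate of the standard error
sqrt((sum(x^4)-sum(x^2)^2/n)/n^2)                  # closed form from part (b)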
3. Note that θ̂ = (Σᵢ yᵢ)/(Σᵢ xᵢ) = 117/24.4 = 4.795. The jackknife version of θ̂ is

   θ̂J = (1/n) Σᵢ θ̂i = 4.80,

where n = 6, θ̂i = nθ̂ − (n − 1)θ̂−i and

   θ̂−i = ((Σⱼ yⱼ) − yᵢ) / ((Σⱼ xⱼ) − xᵢ).

The jackknife estimate of the standard error is √v̂J = 0.33, where

   v̂J = (1/(n(n − 1))) Σᵢ (θ̂i − θ̂J)².

The jackknife 95% confidence interval for θ is θ̂J ± 2.571√v̂J = (3.96, 5.64), where 2.571 is the 97.5%
quantile of the Stu(n − 1) distribution. You may find it useful to record your calculations in a table
like the following one.

             xᵢ        3.4      7.2      4.6      1.3      5.5      2.4
             yᵢ         14       32       23        3       31       14
  (Σⱼ xⱼ) − xᵢ        21.0     17.2     19.8     23.1     18.9     22.0
  (Σⱼ yⱼ) − yᵢ         103       85       94      114       86      103
           θ̂−i       4.905    4.942    4.747    4.935    4.550    4.682
            θ̂i       4.247    4.061    5.033    4.095    6.019    5.361
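The following R sketch (not part of the original solution) reproduces these jackknife calculations from the data in the table.

x=c(3.4,7.2,4.6,1.3,5.5,2.4)
y=c(14,32,23,3,31,14)
n=length(x)
theta.hat=sum(y)/sum(x)                    # 4.795
theta.minus=(sum(y)-y)/(sum(x)-x)          # leave-one-out estimates
pseudo=n*theta.hat-(n-1)*theta.minus       # pseudo-values
mean(pseudo)                               # jackknife estimate, about 4.80
sd(pseudo)/sqrt(n)                         # jackknife standard error, about 0.33
mean(pseudo)+c(-1,1)*qt(0.975,n-1)*sd(pseudo)/sqrt(n) # 95% confidence interval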
4. We have φ̂ = (Σᵢ Xᵢ/n)² = (S/n)² and so

   φ̂−i = ((1/(n − 1)) Σ_{j≠i} Xⱼ)² = ((S − Xᵢ)/(n − 1))².

Therefore, the jackknife estimator is
   φ̂J = (1/n) Σᵢ {nφ̂ − (n − 1)φ̂−i}
      = n(S/n)² − ((n − 1)/n) Σᵢ ((S − Xᵢ)/(n − 1))²
      = S²/n − (1/(n(n − 1))) Σᵢ (S² − 2SXᵢ + Xᵢ²)
      = S²/n − S²/(n − 1) + 2S²/(n(n − 1)) − (1/(n(n − 1))) Σᵢ Xᵢ²
      = (1/(n(n − 1))) (S² − Σᵢ Xᵢ²)
      = (S² − S)/(n(n − 1))    (since Xᵢ² = Xᵢ for Bernoulli random variables)
      = S(S − 1)/(n(n − 1)).

For the data in exercise 1, φ̂ = (5/10)² = 1/4 and φ̂J = (5 × 4)/(10 × 9) = 2/9 ≈ 0.22.
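As a quick check (not part of the original solution), the following R sketch recomputes φ̂J from the pseudo-values for the exercise 1 data and recovers 2/9.

x=c(0,0,0,0,0,1,1,1,1,1)
n=length(x)
phi.hat=mean(x)^2
phi.minus=((sum(x)-x)/(n-1))^2           # leave-one-out estimates
pseudo=n*phi.hat-(n-1)*phi.minus         # pseudo-values
mean(pseudo)                             # jackknife estimate, 2/9 = 0.222
sum(x)*(sum(x)-1)/(n*(n-1))              # closed form S(S-1)/(n(n-1))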
5. (a) A Monte Carlo test would simulate a large number of samples of size n from the Poisson dis-
tribution with parameter θ0 and calculate the Wald test statistic for each sample. The p-value
would be the proportion of these resampled test statistics that exceed the observed value of the
test statistic.
(b) The following code yields a p-value of 0.042.
n=30
xbar=1.5
theta0=1
B=10000
t=n*(xbar-theta0)^2/xbar # observed test statistic
t.star=numeric(B)
for(b in 1:B) {
x.star=rpois(n,theta0) # resample
t.star[b]=n*(mean(x.star)-theta0)^2/mean(x.star) # test statistic
}
mean(t.star>=t) # p-value
6. We resample both x₁*, . . . , xₘ* and y₁*, . . . , yₙ* from Exp(μ̂0), calculate the test statistic for each resample,
and compute the p-value as the proportion of resampled test statistics that exceed the observed test
statistic. The following R code yields a p-value of 0.21.

m=5
n=5
xbar=14.2
ybar=32.3
B=10000
mu0=(m+n)/(m*xbar+n*ybar)
ratio=ybar/xbar
t=2*(m*log(m+n*ratio)+n*log(n+m/ratio)-(m+n)*log(m+n)) # observed test statistic
t.star=numeric(B)
for(b in 1:B) {

x.star=rexp(m,mu0) # resample x*
y.star=rexp(n,mu0) # resample y*
ratio=mean(y.star)/mean(x.star)
t.star[b]=2*(m*log(m+n*ratio)+n*log(n+m/ratio)-(m+n)*log(m+n)) # test statistic
}
mean(t.star>=t) # p-value

7. The code below yields the following 90% confidence intervals for φ: (−0.14, 0.46) (basic), (−0.055, 0.91)
(studentised) and (0.04, 0.64) (percentile). Only the percentile bootstrap guarantees an interval that
is a subset of (0, 1), the set of possible values for φ in this example.

x=c(0,0,0,0,0,1,1,1,1,1) # data
B=10000
n=length(x)
phi.hat=mean(x)^2 # point estimate
s=2*sqrt(mean(x)^3*(1-mean(x))/n) # standard error
phi.star=numeric(B)
t1.star=numeric(B)
t2.star=numeric(B)
for(b in 1:B) {
x.star=sample(x,replace=TRUE) # nonparametric resampling
phi.star[b]=mean(x.star)^2 # point estimate
s.star=2*sqrt(mean(x.star)^3*(1-mean(x.star))/n) # standard error
t1.star[b]=phi.star[b]-phi.hat # T* for the basic interval
t2.star[b]=(phi.star[b]-phi.hat)/s.star # T* for the studentised interval
}
phi.hat-quantile(t1.star,c(0.95,0.05)) # basic interval
phi.hat-quantile(t2.star,c(0.95,0.05))*s # studentised interval
quantile(phi.star,c(0.05,0.95)) # percentile interval

8. We simulate many samples of size 10 from a Ber(0.5) distribution and compute the percentile bootstrap
90% confidence interval for φ for each sample. Finally, we calculate the proportion of times that
φ = θ2 = 0.25 lies outside the interval. We find that φ lies below the lower limit about 5.3% of the time
and above the upper limit about 5.1% of the time, so the coverage is about 100 − 5.3 − 5.1 = 89.6%,
close to the nominal 90%. Here is the code, which takes a while to run because of the nested loops.

n=10
theta=0.5
phi=theta^2
B=1000
nsim=10000
lo=up=numeric(nsim)
for(i in 1:nsim) {
x=rbinom(n,1,theta) # simulate a sample from Ber(0.5)
phi.star=numeric(B)
for(b in 1:B) {
x.star=sample(x,replace=TRUE) # resample
phi.star[b]=mean(x.star)^2 # estimate phi
}
lo[i]=quantile(phi.star,0.05) # lower limit of percentile interval
up[i]=quantile(phi.star,0.95) # upper limit of percentile interval
}
mean(phi<lo) # proportion of times phi is below the lower limit
mean(phi>up) # proportion of times phi is above the upper limit

9. (a) Let Xᵢ = σZᵢ, where the distribution of Zᵢ contains no unknown parameters. Then X̄ = σZ̄ and
T = (σZ₀)/(σZ̄) = Z₀/Z̄, which depends only on the Zᵢ and hence is independent of σ. (A short
simulation sketch illustrating this pivotal property is given after the code at the end of this exercise.)
(b) i. Let σ = 1/θ and X = σZ. We show that Z has a distribution that is independent of θ. We
have

   Pr(Z ≤ z) = Pr(X ≤ z/θ) = ∫₀^{z/θ} f(x; θ) dx = ∫₀^{z/θ} θe^{−θx} dx = 1 − e^{−z}.

ii. If qp denotes the p-quantile of T = X₀/X̄ then 0.8 = Pr(q0.1 < X₀/X̄ < q0.9) = Pr(q0.1X̄ <
X₀ < q0.9X̄) and so an 80% prediction interval is (q̂0.1X̄, q̂0.9X̄), where q̂p is the p-quantile
of the bootstrap version of T, that is, of T* = X₀*/X̄*, where the Xᵢ* are independent with
distribution Exp(θ̂). The following code yields the interval (0.25, 6.75).
n=5
xbar=2.3
B=10000
theta.hat=1/xbar
t.star=numeric(B)
for(b in 1:B) {
x.star=rexp(n,theta.hat)
x0.star=rexp(1,theta.hat)
t.star[b]=x0.star/mean(x.star)
}
quantile(t.star,c(0.1,0.9))*xbar
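Here is the sketch referred to in part (a) (not part of the original solution): it simulates T = X₀/X̄ under two different values of θ and shows that its quantiles are essentially unchanged, illustrating that T is pivotal. The sample size, number of simulations and the two rates are illustrative choices.

n=5
B=100000
t1=rexp(B,1)/rowMeans(matrix(rexp(n*B,1),B,n)) # T = X0/Xbar when theta = 1
t2=rexp(B,5)/rowMeans(matrix(rexp(n*B,5),B,n)) # T = X0/Xbar when theta = 5
quantile(t1,c(0.1,0.9))                        # quantiles of T when theta = 1
quantile(t2,c(0.1,0.9))                        # essentially the same when theta = 5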

Extra exercises
1. The original estimate is φ̂ = 0.022, the bias-corrected estimate is 0.013 (parametric resampling) or
0.016 (nonparametric resampling), and the estimated standard error is 0.03 (parametric) or 0.02
(nonparametric). Here is the R code for parametric resampling, in which we resample data from Poi(θ̂),
where θ̂ = x̄.

x=c(3,3,4,7,2)
B=10000
n=length(x)
theta.hat=mean(x)
phi.hat=exp(-theta.hat)
t.star=numeric(B)
for(b in 1:B) {
x.star=rpois(n,theta.hat) # parametric resampling
t.star[b]=exp(-mean(x.star))-phi.hat
}
phi.hat-mean(t.star) # bias-corrected estimate
sd(t.star) # estimated standard error

Nonparametric resampling replaces line 8 with x.star=sample(x,replace=TRUE).


2. The original estimate is δ̂ = 1.64, the bias estimates are 0.098 (parametric resampling) and 0.068
(nonparametric resampling), and the estimated standard errors are 0.52 (parametric) and 0.43
(nonparametric). Here is the R code for parametric resampling, in which we resample x from Poi(γ̂) and y
from Poi(γ̂δ̂).

x=c(3,3,4,7,2)
y=c(9,10,7,7,2,4,4,7)
B=10000
m=length(x)
n=length(y)
delta.hat=mean(y)/mean(x)
gamma.hat=(sum(x)+sum(y))/(m+n*delta.hat)
t.star=numeric(B)
for(b in 1:B) {
x.star=rpois(m,gamma.hat) # resample x
y.star=rpois(n,gamma.hat*delta.hat) # resample y
t.star[b]=mean(y.star)/mean(x.star)-delta.hat
}
mean(t.star) # estimate bias
sd(t.star) # estimate standard error

Nonparametric resampling replaces lines 10 and 11 with

x.star=sample(x,replace=TRUE)
y.star=sample(y,replace=TRUE)
3. We have φ̂ = (Σᵢ Xᵢ/n)² and so the nonparametric bootstrap version of φ̂ is φ̂* = (Σᵢ Xᵢ*/n)²,
where X₁*, . . . , Xₙ* are independent with common mass function Pr(X₁* = xᵢ) = 1/n for i = 1, . . . , n.

The bootstrap estimate of the bias is E(T ∗ ), where T ∗ = φ̂∗ − φ̂. We have
   E(φ̂*) = (1/n²) Σᵢ Σⱼ E(Xᵢ*Xⱼ*)
         = (1/n²) {Σᵢ E(Xᵢ*²) + Σ_{i≠j} E(Xᵢ*Xⱼ*)}
         = (1/n²) {n E(X₁*²) + n(n − 1) E(Xᵢ*)E(Xⱼ*)}
         = (1/n²) {n (1/n) Σᵢ xᵢ² + n(n − 1) x̄²}
         = x̄² + x̄(1 − x̄)/n    (using xᵢ² = xᵢ for these Bernoulli data, so that Σᵢ xᵢ² = nx̄)
and φ̂ = x̄² so that the estimate of the bias is E(T*) = x̄(1 − x̄)/n. Therefore, the bias-corrected
estimate is

   φ̂ − E(T*) = x̄² − x̄(1 − x̄)/n = . . . = S(S − 1)/n² + S²/n³.
For the data in preparatory exercise 1, n = 10 and S = 5 so that the bias-corrected estimate is
5 × 4/10² + 5²/10³ = 9/40 = 0.225.
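As a quick numerical check (not part of the original solution), both forms of the bias-corrected estimate can be evaluated in R for these data.

x=c(0,0,0,0,0,1,1,1,1,1)
n=length(x)
S=sum(x)
S*(S-1)/n^2+S^2/n^3                 # closed form, 0.225
mean(x)^2-mean(x)*(1-mean(x))/n     # x.bar form, also 0.225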
4. The pseudo-values are φ̂i = nφ̂ − (n − 1)φ̂−i , where φ̂−i = exp{−(nx̄ − xi )/(n − 1)}. The following
code yields the jackknife estimate φ̂J = 0.012 and standard error 0.026.

x=c(3,3,4,7,2)
n=length(x)
pseudo=n*exp(-mean(x))-(n-1)*exp(-(sum(x)-x)/(n-1)) # pseudo-values
mean(pseudo) # jackknife estimate
sd(pseudo)/sqrt(n) # jackknife standard error

5. The estimate is the minimum observation. Therefore, when x(1) is omitted the estimate becomes
θ̂−1 = x(2) but when x(i) is omitted for any i = 2, . . . , n the estimate remains θ̂−i = x(1) . The
pseudo-values are

   θ̂i = nθ̂ − (n − 1)θ̂−i = nx(1) − (n − 1)x(2)   if i = 1,
   θ̂i = nθ̂ − (n − 1)θ̂−i = x(1)                  if i > 1,

and the jackknife estimate is

   θ̂J = (1/n) Σᵢ θ̂i
      = (1/n) (θ̂1 + Σ_{i=2}^n θ̂i)
      = (1/n) {nx(1) − (n − 1)x(2) + (n − 1)x(1)}
      = x(1) − ((n − 1)/n)(x(2) − x(1)).

The jackknife variance is

   v̂J = (1/(n(n − 1))) Σᵢ (θ̂i − θ̂J)²
      = (1/(n(n − 1))) {(θ̂1 − θ̂J)² + Σ_{i=2}^n (θ̂i − θ̂J)²}
      = (1/(n(n − 1))) [{n⁻¹(n − 1)²(x(2) − x(1))}² + (n − 1){n⁻¹(n − 1)(x(2) − x(1))}²]
      = . . . = (n − 1)²(x(2) − x(1))²/n²

and the standard error is √v̂J = (n − 1)(x(2) − x(1))/n, as required.
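The following short R check (not part of the original solution) compares the closed-form standard error with a direct pseudo-value calculation; the data vector is an illustrative choice (the x data from extra exercise 1).

x=c(3,3,4,7,2)                                 # illustrative data
n=length(x)
theta.hat=min(x)
theta.minus=sapply(1:n,function(i) min(x[-i])) # leave-one-out minima
pseudo=n*theta.hat-(n-1)*theta.minus           # pseudo-values
sqrt(sum((pseudo-mean(pseudo))^2)/(n*(n-1)))   # jackknife standard error, 0.8
(n-1)*(sort(x)[2]-sort(x)[1])/n                # closed form (n-1)(x(2)-x(1))/n, 0.8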

6. The Monte Carlo test simulates new data x* from the Bin(n, θ0) distribution and calculates the test
statistic for each new sample. The p-value is the proportion of simulated test statistics that exceed
the observed test statistic. The following code yields Monte Carlo p-values of 0.64 (Wald) and 0.81
(score). The χ₁² distribution yields p-values of 0.61 (Wald) and 0.63 (score).

x=5
n=20
theta0=0.3
B=10000
theta.hat=x/n # estimate theta
wa=n*(theta.hat-theta0)^2/(theta.hat*(1-theta.hat)) # Wald test statistic
sc=n*(theta.hat-theta0)^2/(theta0*(1-theta0)) # score test statistic
wa.star=numeric(B)
sc.star=numeric(B)
for(b in 1:B) {
x.star=rbinom(1,n,theta0) # simulate x
theta.star=x.star/n
wa.star[b]=n*(theta.star-theta0)^2/(theta.star*(1-theta.star))
sc.star[b]=n*(theta.star-theta0)^2/(theta0*(1-theta0))
}
mean(wa.star>=wa) # Monte Carlo p-values
mean(sc.star>=sc) #
1-pchisq(wa,1) # chi-squared p-values
1-pchisq(sc,1) #

7. We reject H0 when the test statistic is large. When H0 is true, x and y both have Poi(γ) distributions,
so we resample x* and y* independently from Poi(γ̂0), where γ̂0 = (mx̄ + nȳ)/(m + n). The following
code yields a p-value of 0.031, sufficient evidence to reject H0 at the 5% level.

x=c(3,3,4,7,2)
y=c(9,10,7,7,2,4,4,7)
B=10000
m=length(x)
n=length(y)
gamma0.hat=(sum(x)+sum(y))/(m+n) # estimate gamma
t=(mean(y)/mean(x)-1)^2 # observed test statistic
t.star=numeric(B)
for(b in 1:B) {
x.star=rpois(m,gamma0.hat) # resample x
y.star=rpois(n,gamma0.hat) # resample y
t.star[b]=(mean(y.star)/mean(x.star)-1)^2 # test statistic
}
mean(t.star>=t) # p-value

8. We use the test statistic t = β̂ (other choices are acceptable) and reject H0 when t is large. When H0
is true, each xᵢ has a N(α, σ²) distribution and so we resample xᵢ* from N(α̂0, σ̂0²), where α̂0 is the
sample mean and σ̂0 is the sample standard deviation of x₁, . . . , xₙ. The p-value is the proportion of
resampled test statistics that exceed the observed test statistic. The following code yields a p-value of
0.002, strong evidence to reject H0 and conclude that there is an increasing trend in the sea level.

x=c(103, 78,121,116,115,147,119,114, 89,102, 99, 91, 97,106,105,
136,126,132,104,117,151,116,107,112, 97, 95,119,124,118,145,
122,114,118,107,110,194,138,144,138,123,122,120,114, 96,125,
124,120,132,166,134,138)

z=(1931:1981)-1956
B=10000
n=length(x)
alpha0hat=mean(x) # estimate alpha under H0
sigma0hat=sd(x) # estimate sigma under H0
t=cov(x,z)/var(z) # observed test statistic
t.star=numeric(B)
for(b in 1:B) {
x.star=rnorm(n,alpha0hat,sigma0hat) # resample data
t.star[b]=cov(x.star,z)/var(z) # calculate test statistic
}
mean(t.star>=t) # p-value

9. GJJ 9.24(a) The basic bootstrap 90% confidence interval is (θ̂ − q̂0.95, θ̂ − q̂0.05), where q̂p is an estimate
of the p-quantile of θ̂* − θ̂. The empirical 5% quantile of θ̂* is θ̂*(0.05×50) = θ̂*(2.5) = (42.8 + 50.2)/2 = 46.5
and the empirical 95% quantile of θ̂* is θ̂*(0.95×50) = θ̂*(47.5) = (93.4 + 93.9)/2 = 93.65, so the basic
bootstrap interval is (77.6 − (93.65 − 77.6), 77.6 − (46.5 − 77.6)) or (61.55, 108.7).

GJJ 9.24(b) The percentile bootstrap 90% confidence interval is (θ̂*(2.5), θ̂*(47.5)) or (46.5, 93.65).

10. There is no parametric model and so we use nonparametric resampling in the following code to obtain
the 95% confidence intervals (0.93, 7.06) (basic), (4.15, 5.76) (studentised) and (2.53, 8.66) (percentile).

x=c(3.4,7.2,4.6,1.3,5.5,2.4)
y=c(14,32,23,3,31,14)
B=10000
n=length(x)
theta.hat=mean(y)/mean(x) # point estimate
s=sqrt(sum((y-theta.hat*x)^2))/sum(x) # estimated standard error
theta.star=numeric(B)
t1.star=numeric(B)
t2.star=numeric(B)
for(b in 1:B) {
x.star=sample(x,replace=TRUE) # resample x
y.star=sample(y,replace=TRUE) # resample y
theta.star[b]=mean(y.star)/mean(x.star)
s.star=sqrt(sum((y.star-theta.star[b]*x.star)^2))/sum(x.star)
t1.star[b]=theta.star[b]-theta.hat # T* for the basic interval
t2.star[b]=(theta.star[b]-theta.hat)/s.star # T* for the studentised interval
}
theta.hat-quantile(t1.star,c(0.975,0.025)) # basic interval
theta.hat-quantile(t2.star,c(0.975,0.025))*s # studentised interval
quantile(theta.star,c(0.025,0.975)) # percentile interval

11. We resample xᵢ* from N(α̂ + β̂zᵢ, σ̂²) for i = 1, . . . , n. The following code yields the 99% confidence
intervals (0.12, 1.02) cm/year (basic), (0.10, 1.05) cm/year (studentised) and (0.11, 1.01) cm/year (percentile).

x=c(103, 78,121,116,115,147,119,114, 89,102, 99, 91, 97,106,105,
136,126,132,104,117,151,116,107,112, 97, 95,119,124,118,145,
122,114,118,107,110,194,138,144,138,123,122,120,114, 96,125,
124,120,132,166,134,138)
z=(1931:1981)-1956
B=10000
n=length(x)

beta.hat=cov(x,z)/var(z) # estimate beta
alpha.hat=mean(x)-beta.hat*mean(z) # estimate alpha
mu.hat=alpha.hat+beta.hat*z # estimate expectations
sigma.hat=sqrt(mean((x-mu.hat)^2)) # estimate sigma
s=sigma.hat/sqrt((n-1)*var(z)) # estimate standard error
t.star=numeric(B)
beta.star=numeric(B)
s.star=numeric(B)
for(b in 1:B) {
x.star=rnorm(n,mu.hat,sigma.hat) # resample data
beta.star[b]=cov(x.star,z)/var(z) # estimate beta
alpha.star=mean(x.star)-beta.star[b]*mean(z) # estimate alpha
mu.star=alpha.star+beta.star[b]*z # estimate expectations
sigma.star=sqrt(mean((x.star-mu.star)^2)) # estimate sigma
s.star[b]=sigma.star/sqrt((n-1)*var(z)) # estimate standard error
t.star[b]=beta.star[b]-beta.hat # T* for the basic interval
}
beta.hat-quantile(t.star,c(0.995,0.005)) # basic interval
beta.hat-quantile(t.star/s.star,c(0.995,0.005))*s # studentised interval
quantile(beta.star,c(0.005,0.995)) # percentile interval

12. The parameter estimate is γ̂ = 1.64. The plug-in (equal-tailed) 80% prediction interval is defined by the
0.1- and 0.9-quantiles of the Be(γ̂) distribution, which the command qbeta(c(0.1,0.9),1.64,1.64)
gives as (0.17, 0.83). For the bootstrap intervals, we resample x₀*, x₁*, . . . , xₙ* from the Be(γ̂) distribution.
The following code yields the intervals (0.13, 0.92) (studentised) and (0.13, 0.87) (PIT). These intervals
are wider than the plug-in interval, as expected.

x=c(0.33,0.55,0.27,0.88,0.59)
B=10000
n=length(x)
gamma=0.5*(0.25/var(x)-1)
t1.star=numeric(B)
t2.star=numeric(B)
for(b in 1:B) {
x.star=rbeta(n,gamma,gamma) # resample x
x0.star=rbeta(1,gamma,gamma) # resample x0
gamma.star=0.5*(0.25/var(x.star)-1) # estimate gamma
t1.star[b]=(x0.star-mean(x.star))/sd(x.star) # T* for studentised interval
t2.star[b]=pbeta(x0.star,gamma.star,gamma.star) # T* for PIT interval
}
mean(x)+quantile(t1.star,c(0.1,0.9))*sd(x) # studentised interval
qbeta(quantile(t2.star,c(0.1,0.9)),gamma,gamma) # PIT interval

13. We resample xᵢ* from N(α̂ + β̂zᵢ, σ̂²) for i = 0, 1, . . . , n, where z₀ = 1982 − 1956 = 26 is the covariate
for the year we want to predict. The following code yields the PIT bootstrap 90% prediction interval
(102, 167) cm.

x=c(103, 78,121,116,115,147,119,114, 89,102, 99, 91, 97,106,105,
136,126,132,104,117,151,116,107,112, 97, 95,119,124,118,145,
122,114,118,107,110,194,138,144,138,123,122,120,114, 96,125,
124,120,132,166,134,138)
z=(1931:1981)-1956
z0=1982-1956 # covariate for X0
B=10000

n=length(x)
beta.hat=cov(x,z)/var(z) # estimate beta
alpha.hat=mean(x)-beta.hat*mean(z) # estimate alpha
mu.hat=alpha.hat+beta.hat*z # estimate expectations of X1,...,Xn
mu0.hat=alpha.hat+beta.hat*z0 # estimate expectation of X0
sigma.hat=sqrt(mean((x-mu.hat)^2)) # estimate sigma
t.star=numeric(B)
for(b in 1:B) {
x.star=rnorm(n,mu.hat,sigma.hat) # resample X1,...,Xn
x0.star=rnorm(1,mu0.hat,sigma.hat) # resample X0
beta.star=cov(x.star,z)/var(z) # re-estimate beta
alpha.star=mean(x.star)-beta.star*mean(z) # re-estimate alpha
mu.star=alpha.star+beta.star*z # re-estimate expectations
mu0.star=alpha.star+beta.star*z0 # re-estimate expectation of X0
sigma.star=sqrt(mean((x.star-mu.star)^2)) # re-estimate sigma
t.star[b]=pnorm(x0.star,mu0.star,sigma.star) # calculate statistic
}
qnorm(quantile(t.star,c(0.05,0.95)),mu0.hat,sigma.hat) # prediction interval

