
STAT 135 Solutions to Homework 4: 30 points

Spring 2015

Problem 1: 10 points
In each of the following cases, (i) write down the likelihood function of θ, (ii) show that the corresponding T(X) is a sufficient statistic, (iii) compute θ̂_MLE and (iv) compute E(θ̂_MLE).

1. X1, ..., Xn iid Poisson random variables with rate λ = θ + 1, with T(X) = Σ_i X_i
(i)
The likelihood is given by

$$\mathrm{lik}(\theta) = \prod_{i=1}^{n} \frac{e^{-(\theta+1)}(\theta+1)^{X_i}}{X_i!} = \frac{e^{-n(\theta+1)}(\theta+1)^{\sum_{i=1}^{n} X_i}}{\prod_{i=1}^{n} X_i!}$$

(ii)

Recall that a necessary and sufficient condition for T to be sufficient for θ is that

$$f_\theta(x_1, \dots, x_n) = g_\theta(T)\, h(x_1, \dots, x_n)$$

i.e. that the density can be factored into a product such that one factor, h, does not depend on θ, and the other
factor, which does depend on θ, depends on (x1 , ..., xn ) only through T .
In this case, we have that

$$f_\theta(x_1, \dots, x_n) = \frac{e^{-n(\theta+1)}(\theta+1)^{\sum_{i=1}^{n} x_i}}{\prod_{i=1}^{n} x_i!} = e^{-n(\theta+1)}(\theta+1)^{\sum_{i=1}^{n} x_i} \cdot \frac{1}{\prod_{i=1}^{n} x_i!} = e^{-n(\theta+1)}(\theta+1)^{T} \cdot \frac{1}{\prod_{i=1}^{n} x_i!}$$

So our density satisfies the theorem, since g_θ(T) = e^{-n(θ+1)}(θ+1)^T depends on (x1, ..., xn) only through T(x) = Σ_i x_i, and h(x1, ..., xn) = 1/∏_i x_i! does not depend on θ.

(iii)

The log-likelihood is given by
$$\ell(\theta) = \log(\mathrm{lik}(\theta)) = -n(\theta+1) + \sum_i X_i \log(\theta+1) - \sum_i \log(X_i!)$$

So differentiating with respect to θ gives


$$\frac{\partial \ell}{\partial \theta} = -n + \frac{\sum_i X_i}{\theta + 1}$$
Setting to zero to identify the value of θ that maximizes the log-likelihood, we get that
$$-n + \frac{\sum_i X_i}{\hat\theta_{MLE} + 1} = 0$$
Thus, rearranging, we get

$$\hat\theta_{MLE} = \frac{\sum_i X_i}{n} - 1$$
(iv)
$$\begin{aligned}
E(\hat\theta_{MLE}) &= E\left(\frac{\sum_i X_i}{n} - 1\right) \\
&= \frac{E\left(\sum_i X_i\right)}{n} - 1 \\
&= \frac{\sum_i E(X_i)}{n} - 1 \\
&= \frac{\sum_i (\theta + 1)}{n} - 1 \\
&= \frac{n(\theta + 1)}{n} - 1 \\
&= \theta + 1 - 1 = \theta
\end{aligned}$$

So our MLE is unbiased.
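As a quick check, we can verify this by simulation, in the same spirit as the R code used for Problem 2 below; the true value θ = 2, the sample size n = 50, and the number of replications are arbitrary illustrative choices, not part of the problem:

# sketch: check by simulation that the Poisson MLE mean(X) - 1 is unbiased for theta
set.seed(1)
theta <- 2                            # arbitrary true value, for illustration only
n <- 50                               # sample size per replication
mle <- replicate(1000, {
  x <- rpois(n, lambda = theta + 1)   # Poisson with rate theta + 1
  mean(x) - 1                         # the MLE of theta derived above
})
mean(mle)                             # should be close to theta = 2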

2. X1, ..., Xn iid random variables with exponential distribution, i.e. with density f(x|θ) = θe^{−θx}, x > 0. T(X) = Σ_i X_i
(i)

$$\mathrm{lik}(\theta) = \prod_{i=1}^{n} \theta e^{-\theta X_i} = \theta^{n} e^{-\theta \sum_{i=1}^{n} X_i}$$

(ii)

Again, recall the necessary and sufficient condition for T(X) to be a sufficient statistic and note that our likelihood
function can be written as such a product, where

$$g_\theta(T) = \theta^{n} e^{-\theta T}$$

and
h(x1 , ..., xn ) = 1
(iii)
The log-likelihood function is given by
$$\ell(\theta) = \log(\mathrm{lik}(\theta)) = n \log\theta - \theta \sum_{i=1}^{n} X_i$$

Differentiating, we have
$$\frac{\partial \ell}{\partial \theta} = \frac{n}{\theta} - \sum_{i=1}^{n} X_i$$

And setting to zero to identify the value of θ which maximizes the log-likelihood, we have that
$$\frac{n}{\hat\theta_{MLE}} - \sum_{i=1}^{n} X_i = 0$$

which, rearranging, yields

$$\hat\theta_{MLE} = \frac{n}{\sum_{i=1}^{n} X_i}$$
(iv)

 
$$E(\hat\theta_{MLE}) = E\left(\frac{n}{\sum_{i=1}^{n} X_i}\right) = n\, E\left(\frac{1}{\sum_{i=1}^{n} X_i}\right)$$

The hint tells us that Σ_{i=1}^n X_i ∼ Γ(n, θ), so if we let Y = Σ_{i=1}^n X_i, we simply need to find E(1/Y) where Y ∼ Γ(n, θ). We can do this as follows:


$$\begin{aligned}
E\left(\frac{1}{Y}\right) &= \int_{0}^{\infty} \frac{1}{y} \cdot \frac{\theta^{n}}{\Gamma(n)}\, y^{n-1} e^{-\theta y}\, dy \\
&= \int_{0}^{\infty} \frac{\theta^{n}}{\Gamma(n)}\, y^{n-2} e^{-\theta y}\, dy \\
&= \frac{\Gamma(n-1)\,\theta}{\Gamma(n)} \int_{0}^{\infty} \frac{\theta^{n-1}}{\Gamma(n-1)}\, y^{n-2} e^{-\theta y}\, dy \\
&= \frac{\Gamma(n-1)\,\theta}{\Gamma(n)} \qquad \text{(since the integrand is just the } \Gamma(n-1, \theta) \text{ density)} \\
&= \frac{\theta}{n-1} \qquad \text{(since } \Gamma(k) = (k-1)!\text{)}
\end{aligned}$$
Thus

$$E(\hat\theta_{MLE}) = \frac{n}{n-1}\,\theta$$
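In other words the MLE overestimates θ by a factor of n/(n − 1), although the bias disappears as n → ∞. As a quick check, the following simulation sketch (with an arbitrary illustrative choice of θ and a deliberately small n so the bias factor is visible) estimates E(θ̂_MLE):

# sketch: check by simulation that E(theta.hat) is about n/(n-1) * theta
set.seed(1)
theta <- 2                         # arbitrary true rate, for illustration only
n <- 10                            # small n makes the bias factor visible
mle <- replicate(10000, {
  x <- rexp(n, rate = theta)       # exponential with density theta * exp(-theta * x)
  n / sum(x)                       # the MLE of theta derived above
})
mean(mle)                          # should be close to n/(n-1) * theta = (10/9) * 2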

Problem 2: Let X1 , ..., Xn be iid random variables, uniformly distributed over
(θ, 2θ). 10 points
1. Show that a sufficient statistic for θ is T (X) = (mini Xi , maxi Xi )
Note that we can write the density for the Uniform(θ, 2θ) distribution as

$$f_\theta(x) = \frac{1}{\theta}\,\mathbf{1}(\theta \le x \le 2\theta)$$

where $\mathbf{1}(A) = \begin{cases} 1 & \text{if } A \text{ is true} \\ 0 & \text{otherwise} \end{cases}$
so

$$f_\theta(x_1, \dots, x_n) = \prod_{i=1}^{n} \frac{1}{\theta}\,\mathbf{1}(\theta \le x_i \le 2\theta)$$

where we note that for the product to be non-zero, we need θ ≤ x_i ≤ 2θ for all i = 1, 2, ..., n, which is equivalent to max_i x_i ≤ 2θ and min_i x_i ≥ θ. So

$$f_\theta(x_1, \dots, x_n) = \frac{1}{\theta^{n}}\,\mathbf{1}\left(\max_i x_i \le 2\theta,\ \min_i x_i \ge \theta\right)$$

Thus T = (min_i X_i, max_i X_i) is a sufficient statistic by the factorization theorem, where g_θ(T) = (1/θ^n) 1(max_i x_i ≤ 2θ, min_i x_i ≥ θ) depends on the x_i's only through T = (min_i x_i, max_i x_i), and h(x1, ..., xn) = 1.

2. Show that an unbiased estimator for θ is θ̂ = (2/3) X1


Note that E(X_i) = (1/2)(θ + 2θ) = 3θ/2, so

$$E(\hat\theta) = E\left(\frac{2}{3} X_1\right) = \frac{2}{3} E(X_1) = \frac{2}{3} \cdot \frac{3\theta}{2} = \theta$$

so θ̂ is an unbiased estimator for θ.

3. Compute θ̂_MLE
Note that the likelihood is given by
$$\mathrm{lik}(\theta) = \frac{1}{\theta^{n}}\,\mathbf{1}\left(\max_i X_i \le 2\theta,\ \min_i X_i \ge \theta\right)$$

which is clearly maximized when θ takes its smallest possible value such that θ ≤ X_i ≤ 2θ for all i = 1, ..., n, specifically the smallest θ satisfying X_i/2 ≤ θ for all i = 1, ..., n. Thus

$$\hat\theta_{MLE} = \frac{\max_i X_i}{2}$$

4. Using T (X), find an unbiased estimator of θ whose mean-squared error is at least as good
as that of θ̂
Recall that the Rao-Blackwell theorem tells us that, given an estimator θ̂, we can find an estimator whose MSE
is at least as good as that of θ̂ if we know a sufficient statistic T . In particular, that estimator is θ̃, which can be
calculated using
$$\tilde\theta = E(\hat\theta \mid T) = \frac{2}{3}\, E\left(X_1 \,\Big|\, \min_i X_i,\ \max_i X_i\right)$$
we can write this as
$$\tilde\theta = \frac{2}{3}\, E\left(X_1 \,\Big|\, a = \min_i X_i,\ b = \max_i X_i\right)$$

Note that using the law of total probability, we can separate the expectation into several cases as follows:

$$\begin{aligned}
E\left(X_1 \,\Big|\, a = \min_i X_i,\ b = \max_i X_i\right)
&= E\left(X_1 \,\Big|\, \min_i X_i = a,\ \max_i X_i = b,\ X_1 = a\right) \times P\left(X_1 = \min_i X_i\right) \\
&\quad + E\left(X_1 \,\Big|\, \min_i X_i = a,\ \max_i X_i = b,\ X_1 = b\right) \times P\left(X_1 = \max_i X_i\right) \\
&\quad + E\left(X_1 \,\Big|\, \min_i X_i = a,\ \max_i X_i = b,\ a < X_1 < b\right) \times P\left(X_1 \ne \min_i X_i,\ X_1 \ne \max_i X_i\right) \\
&= a \times \frac{1}{n} + b \times \frac{1}{n} + \frac{a+b}{2} \times \left(1 - \frac{2}{n}\right) \\
&= \frac{a+b}{2}
\end{aligned}$$

(The third conditional expectation equals (a + b)/2 because, given that X_1 is neither the minimum nor the maximum, X_1 is uniformly distributed on (a, b).)
Thus,

$$\tilde\theta = \frac{2}{3} \cdot \frac{\min_i X_i + \max_i X_i}{2} = \frac{\min_i X_i + \max_i X_i}{3}$$
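To see the Rao-Blackwell improvement numerically, the short simulation sketch below (with arbitrary illustrative values θ = 2 and n = 20) compares the MSE of the crude unbiased estimator (2/3)X1 from part 2 with the MSE of the conditioned estimator (min_i X_i + max_i X_i)/3; the latter should be far smaller:

# sketch: compare the MSE of (2/3)*X1 with the MSE of (min + max)/3
set.seed(1)
theta <- 2                                  # arbitrary true value, for illustration only
n <- 20
crude <- rb <- numeric(10000)
for (i in 1:10000) {
  x <- runif(n, theta, 2 * theta)
  crude[i] <- (2 / 3) * x[1]                # unbiased estimator from part 2
  rb[i] <- (min(x) + max(x)) / 3            # Rao-Blackwellized estimator from part 4
}
mean((crude - theta)^2)                     # MSE of the crude estimator
mean((rb - theta)^2)                        # MSE of the Rao-Blackwellized estimator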

5. Can you use T(X) to improve θ̂_MLE as well? Why?


Note that since the MLE is a function of the sufficient statistic, the conditional expectation does not change the estimator. In particular, the improved estimator is simply

$$E\left(\hat\theta_{MLE} \,\Big|\, T\right) = E\left(\frac{\max_i X_i}{2} \,\Big|\, (\min_i X_i,\ \max_i X_i)\right) = \frac{\max_i X_i}{2}$$

which is just the ML estimator. Thus in this case Rao-Blackwell does not improve the MLE.

6. How do you think the resulting estimator from (4) compares to θ̂_MLE in terms of the mean-squared error? You can simulate the experiment to help you interpret.
Note that for this example the regularity conditions underlying the usual asymptotic theory for the MLE (a density that is differentiable as a function of θ, with a derivative that is jointly continuous in x and θ, and a support, i.e. the range over which the distribution is defined, that does not depend on θ) are certainly not satisfied: our support is (θ, 2θ), which clearly depends on the parameter θ. Consequently the asymptotic properties of the MLE (the properties which make the MLE so nice) do not apply, which is why the MLE histogram below does not look normal. As a result, we might expect the improved estimator from (4) to achieve a smaller MSE than the MLE.

set.seed(123)
library(ggplot2)
library(grid)
library(gridExtra)

# let's simulate for theta = 2
theta <- 2

mle <- c()
est <- c()

# do 1000 simulations
for (i in 1:1000) {
  # draw a sample of size 500 from the uniform(theta, 2*theta) distribution
  sample <- runif(500, theta, 2 * theta)
  # calculate the MLE estimate
  mle[i] <- max(sample) / 2
  # calculate the theta.hat estimate from (4)
  est[i] <- (min(sample) + max(sample)) / 3
}

# put results into a data frame
est.df <- data.frame(mle = mle, est = est)

# plot histograms
gg.mle <- ggplot(est.df) +
  geom_histogram(aes(x = mle), col = "white", binwidth = 0.0015) +
  scale_x_continuous(limits = c(1.985, 2.015)) + ggtitle("MLE")
gg.est <- ggplot(est.df) +
  geom_histogram(aes(x = est), col = "white", binwidth = 0.0015) +
  scale_x_continuous(limits = c(1.985, 2.015)) + ggtitle("Estimate from (4)")
grid.arrange(gg.mle, gg.est, ncol = 2)

[Figure: side-by-side histograms of the 1000 simulated estimates, "MLE" (left, x-axis mle) and "Estimate from (4)" (right, x-axis est), with counts on the y-axis and x-axis values ranging from about 1.99 to 2.01.]

# calculate the bias for the MLE
bias.mle <- mean(mle) - theta
bias.mle

## [1] -0.001982282

# calculate the bias for the estimator from (4)
bias.est <- mean(est) - theta
bias.est

## [1] -2.870242e-05

# the estimate from (4) has much smaller bias (MLE has a bias 69 times larger than the estimate!)
bias.mle/bias.est

## [1] 69.06322

# calculate the MSE for the MLE
mse.mle <- (theta - mean(mle))^2 + var(mle)
# calculate the MSE for the estimate from (4)
mse.est <- (theta - mean(est))^2 + var(est)

# the MLE has MSE at least twice the size of that of the estimate from (4)
mse.mle/mse.est

## [1] 2.263549

Problem 3: We want to compute the probability that the Sun will rise tomorrow, given that we know it has risen every day for the last 500 years. Let us denote by Xi = 1 the event that the sun rose on day i and Xi = 0 otherwise, i = 1, ..., n. Given a value p ∈ [0, 1], we model Xi ∼ B(p) and assume that the Xi are independent conditionally on p; thus we do not consider any cosmological model whatsoever. 10 points
1. Show that the likelihood of observing Xi = xi, i = 1, ..., n given p is

$$\mathrm{lik}(p) = P(X_1 = x_1, \dots, X_n = x_n \mid p) = p^{s}(1-p)^{n-s},$$

where s = Σ_i x_i.
We have that

$$\mathrm{lik}(p) = \prod_{i=1}^{n} p^{x_i}(1-p)^{1-x_i} = p^{\sum_i x_i}(1-p)^{\sum_i (1-x_i)} = p^{s}(1-p)^{n-s}$$
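As an aside, this likelihood is easy to inspect numerically; the sketch below uses arbitrary illustrative values (n = 10 observed days with s = 9 successes, not the numbers from the problem) and shows that p^s (1 − p)^(n − s) is maximized at p = s/n:

# sketch: evaluate the Bernoulli likelihood p^s * (1 - p)^(n - s) on a grid of p
n <- 10                                     # illustrative values, not from the problem
s <- 9
p <- seq(0, 1, by = 0.001)
lik <- p^s * (1 - p)^(n - s)
plot(p, lik, type = "l", ylab = "lik(p)")   # likelihood curve as a function of p
p[which.max(lik)]                           # maximized at p = s/n = 0.9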

2. We make no prior assumptions on p, except for the fact that the experiment (Sun rising or not) is allowed to succeed or fail. Therefore we choose as prior distribution p ∼ U[0, 1]. Show that in that case the posterior distribution for p is

$$f(p \mid X_1 = x_1, \dots, X_n = x_n) = \frac{p^{s}(1-p)^{n-s}}{\int_0^1 p'^{\,s}(1-p')^{n-s}\, dp'}$$

Recall that the posterior distribution for Θ given X is given by

$$f_{\Theta \mid X = x}(\theta \mid x) = \frac{f_{X \mid \Theta=\theta}(x \mid \theta)\, f_{\Theta}(\theta)}{\int f_{X \mid \Theta=\theta'}(x \mid \theta')\, f_{\Theta}(\theta')\, d\theta'}$$

Thus, since θ = p ∼ U[0, 1], we have f(p) = 1 and f(x|p) = p^s (1 − p)^{n−s}, and it follows immediately that the posterior distribution for p given X is given by

$$f(p \mid X_1 = x_1, \dots, X_n = x_n) = \frac{p^{s}(1-p)^{n-s}}{\int_0^1 p'^{\,s}(1-p')^{n-s}\, dp'}$$
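The normalizing integral in the denominator can also be checked numerically against the closed form quoted in the next part; the sketch below does so for arbitrary illustrative values of n and s:

# sketch: check the normalizing constant of the posterior numerically
n <- 10                                                    # illustrative values, not from the problem
s <- 9
integrate(function(p) p^s * (1 - p)^(n - s), 0, 1)$value   # numerical value of the integral
beta(s + 1, n - s + 1)                                     # closed form B(s+1, n-s+1)
factorial(s) * factorial(n - s) / factorial(n + 1)         # equals s!(n-s)!/(n+1)!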

3. Since $\int_0^1 p'^{\,s}(1-p')^{n-s}\, dp' = \frac{s!(n-s)!}{(n+1)!}$, it results that p | (Xi = xi) is distributed following a Beta distribution with parameters α = s + 1 and β = n − s + 1. Using the fact that if Y ∼ Beta(α, β) then E(Y) = α/(α + β), show that

$$P\left(X_{n+1} = 1 \,\Big|\, \sum_{i=1}^{n} X_i = s\right) = \frac{s+1}{n+2}$$

Using the law of total probability, we have that

$$\begin{aligned}
P\left(X_{n+1} = 1 \,\Big|\, \sum_{i=1}^{n} X_i = s\right)
&= \int_0^1 P\left(X_{n+1} = 1 \,\Big|\, p,\ \sum_{i=1}^{n} X_i = s\right) f\left(p \,\Big|\, \sum_{i=1}^{n} X_i = s\right) dp \\
&= \int_0^1 P\left(X_{n+1} = 1 \mid p\right) f\left(p \,\Big|\, \sum_{i=1}^{n} X_i = s\right) dp
\end{aligned}$$

where P(X_{n+1} = 1 | p, Σ_{i=1}^n X_i = s) = P(X_{n+1} = 1 | p) since the X_i's are independent when conditioning on p (given in the question set-up). Continuing on, since P(X_{n+1} = 1 | p) = p, we have
$$\begin{aligned}
P\left(X_{n+1} = 1 \,\Big|\, \sum_{i=1}^{n} X_i = s\right)
&= \int_0^1 p\, f\left(p \,\Big|\, \sum_{i=1}^{n} X_i = s\right) dp \\
&= E\left(p \,\Big|\, \sum_{i=1}^{n} X_i = s\right) \\
&= \frac{s+1}{n+2}
\end{aligned}$$

since p | (Xi = xi) follows a Beta(s + 1, n − s + 1) distribution.
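This is the classical rule of succession. Plugging in numbers for the sunrise problem: if we take 500 years to be roughly 500 × 365 = 182,500 observed days (an approximation that ignores leap years), with the Sun having risen on every one of them, a couple of lines of R give the posterior probability that it rises tomorrow:

# sketch: the rule of succession applied to 500 years of sunrises
n <- 500 * 365           # approximate number of observed days (leap years ignored)
s <- n                   # the Sun rose on every observed day
(s + 1) / (n + 2)        # posterior probability the Sun rises tomorrow; about 0.9999945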
