STAT 400: Point Estimation
July 20, 2016
Definition: Point estimation. An important goal of statistics is to make inferences about population parameters based upon random samples of data.

Let $\theta$ be an unknown population parameter with parameter space $\Omega$.

Let $u(X_1, X_2, \ldots, X_n)$ be a statistic, which is an estimator of $\theta$.

We want $u(X_1, X_2, \ldots, X_n)$ to be an unbiased estimator of $\theta$, such that $E[u(X_1, X_2, \ldots, X_n)] = \theta$.

Let $\hat{\theta}$ denote a sample estimate of $\theta$.
We will consider two methods for fitting sample data to known probability distributions:
maximum likelihood and method of moments.
Definition: Maximum likelihood estimation (mle) chooses $\hat{\theta}$ to maximize the probability of observing the sample. Let $X_1, \ldots, X_n$ be a sample of independent observations with pdf or pmf $f(x_i; \theta)$.

The likelihood, $L(\theta)$, or probability of observing the sample given $\theta$, is

$$L(\theta) = \prod_{i=1}^{n} f(x_i; \theta).$$

It is usually easier to maximize the log-likelihood,

$$\ell(\theta) = \ln L(\theta) = \sum_{i=1}^{n} \ln[f(x_i; \theta)].$$

Setting $\frac{d}{d\theta} \ell(\theta) = 0$ and solving for $\theta$ yields the mle $\hat{\theta}$.
Example 1 Suppose a random variable $X$ has the pmf $f(x; \theta)$ tabulated below, where $\theta \in \{1, 2, 3\}$, and we observe the independent values $X = 1$ and $X = 4$.

              x = 1    x = 2    x = 3    x = 4
  theta = 1    0.6      0.1      0.1      0.2
  theta = 2    0.2      0.3      0.3      0.2
  theta = 3    0.3      0.4      0.2      0.1

The likelihood function is $L(\theta) = P(X = 1) P(X = 4)$ and takes on values $L(1) = 0.12$, $L(2) = 0.04$, and $L(3) = 0.03$, which implies $\hat{\theta} = 1$.
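As a quick check, the pmf table and the likelihood computation above can be reproduced in a few lines of Python; because the parameter space is finite, the mle is found by direct search.

```python
# pmf table from Example 1: pmf[theta][x] = f(x; theta)
pmf = {
    1: {1: 0.6, 2: 0.1, 3: 0.1, 4: 0.2},
    2: {1: 0.2, 2: 0.3, 3: 0.3, 4: 0.2},
    3: {1: 0.3, 2: 0.4, 3: 0.2, 4: 0.1},
}
sample = [1, 4]  # the observed values X = 1 and X = 4

def likelihood(theta):
    """L(theta): product of f(x_i; theta) over the observed sample."""
    L = 1.0
    for x in sample:
        L *= pmf[theta][x]
    return L

# The mle is the theta with the largest likelihood: here theta-hat = 1.
theta_hat = max(pmf, key=likelihood)
```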
Definition: Method of moments (MOM) is based upon the fact that a distribution is determined by its moments. One way to fit sample data to a distribution is to match moments and then solve for parameter values as a function of sample moments.

If there is one parameter $\theta$, the first sample moment is set equal to the theoretical expected value, $m_1 = E(X)$.

If there are $r$ population parameters that we want to estimate, we match the first $r$ moments and solve the corresponding system of $r$ equations. Define the $r$th sample moment as

$$m_r = \frac{1}{n} \sum_{i=1}^{n} x_i^r.$$

One advantage of MOM arises when MLE is analytically intractable, but MOM may be problematic when $n$ is small.

Let $\tilde{\theta}$ be the MOM estimator of $\theta$.
Stepanov, Culpepper
Example 2 Consider a single observation $X = k$ of a Binomial random variable with $n$ trials and unknown probability of success $p$, $P(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$.

a. Obtain the method of moments estimator of $p$, $\tilde{p}$.

In this case, the sample mean is $k$, because there is only a single realization, and $E(X) = np$. Consequently, setting $k = n\tilde{p}$ gives $\tilde{p} = \frac{k}{n}$.

b. Obtain the maximum likelihood estimator of $p$, $\hat{p}$.

$$\ell(p) = \ln \binom{n}{k} + k \ln(p) + (n - k) \ln(1 - p)$$

$$\frac{d\ell}{dp} = \frac{k}{p} - \frac{n - k}{1 - p} = 0 \implies \hat{p} = \frac{k}{n}$$

In this case, $\hat{p} = \tilde{p}$.
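A numeric sanity check (with illustrative values $n = 10$, $k = 3$, not from the text): the binomial log-likelihood evaluated over a fine grid of $p$ is maximized at $p = k/n$.

```python
import math

n, k = 10, 3  # illustrative: n trials, k successes observed

def log_lik(p):
    """l(p) = ln C(n, k) + k ln(p) + (n - k) ln(1 - p)."""
    return math.log(math.comb(n, k)) + k * math.log(p) + (n - k) * math.log(1 - p)

# Grid search over (0, 1); the maximizer agrees with k / n = 0.3.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=log_lik)
```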
Example 3 Let $X_1, \ldots, X_n$ be a random sample of size $n$ from the distribution with pdf,

$$f(x; \theta) = \begin{cases} \frac{1}{\theta} x^{(1-\theta)/\theta}, & 0 \le x \le 1 \\ 0, & \text{otherwise} \end{cases}$$

for $\theta > 0$.

a. Find $E(X)$.

$$E(X) = \int_0^1 x \cdot \frac{1}{\theta} x^{\frac{1}{\theta} - 1}\, dx = \frac{1}{\theta} \int_0^1 x^{\frac{1}{\theta}}\, dx = \frac{1}{\theta} \cdot \frac{1}{\frac{1}{\theta} + 1} \left. x^{\frac{1}{\theta} + 1} \right|_0^1 = \frac{1}{1 + \theta}$$

b. Find the method of moments estimator of $\theta$, $\tilde{\theta}$.

Setting $E(X) = \bar{X}$ implies $\tilde{\theta} = \dfrac{1 - \bar{X}}{\bar{X}}$.
"
# 1
` () = n ln +
(8)
n
1 X
ln xi
i=1
(9)
X
n
1
1
= n ln +
ln xi
i=1
(10)
ln xi = 0 =
=
ln xi
d
n i=1
2 i=1
(11)
= 1 3 = 2
(12)
(13)
d. What is $E(\hat{\theta})$?

We must find $E(\ln X)$, which we can find using integration by parts by letting $u = \ln x$, $du = \frac{1}{x}\,dx$, $dv = \frac{1}{\theta} x^{\frac{1}{\theta} - 1}\,dx$, and $v = x^{\frac{1}{\theta}}$,

$$E(\ln X) = \int_0^1 \ln x \cdot \frac{1}{\theta} x^{\frac{1}{\theta} - 1}\, dx = \left. \ln x \cdot x^{\frac{1}{\theta}} \right|_0^1 - \int_0^1 x^{\frac{1}{\theta} - 1}\, dx = 0 - \left. \theta\, x^{\frac{1}{\theta}} \right|_0^1 = -\theta$$

Therefore,

$$E(\hat{\theta}) = -\frac{1}{n} \sum_{i=1}^{n} E[\ln X_i] = -\frac{1}{n} \cdot n \cdot (-\theta) = \theta,$$

so $\hat{\theta}$ is unbiased.
e. Is $\tilde{\theta}$ unbiased?

Note that $\tilde{\theta} = \dfrac{1 - \bar{X}}{\bar{X}} = g(\bar{X})$ is a strictly convex function of $\bar{X}$. By Jensen's inequality,

$$E(\tilde{\theta}) = E[g(\bar{X})] > g(E(\bar{X})) = g\!\left(\frac{1}{1 + \theta}\right) = \theta,$$

so $\tilde{\theta}$ is a biased estimator!
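The contrast between the unbiased mle and the upward-biased mom can be seen in a short simulation (the values of theta, n, and the replication count are illustrative, not from the text). If $U \sim$ Uniform$(0,1)$, then $X = U^{\theta}$ has the pdf above, which gives a direct way to sample.

```python
import math
import random

random.seed(1)

theta, n, reps = 2.0, 20, 4000  # illustrative values
mle_vals, mom_vals = [], []
for _ in range(reps):
    # X = U**theta has pdf (1/theta) * x**(1/theta - 1) on (0, 1)
    xs = [random.random() ** theta for _ in range(n)]
    mle_vals.append(-sum(math.log(x) for x in xs) / n)  # theta-hat
    xbar = sum(xs) / n
    mom_vals.append((1 - xbar) / xbar)                  # theta-tilde

mle_mean = sum(mle_vals) / reps  # close to theta (unbiased)
mom_mean = sum(mom_vals) / reps  # strictly above theta, per Jensen
```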
Example 4 Let $X_1, \ldots, X_n$ be a random sample of $n$ independent variables from a population with mean $\mu$ and variance $\sigma^2$.

a. What is $E(\bar{X})$?

$$E(\bar{X}) = \frac{\sum_{i=1}^{n} E(X_i)}{n} = \mu.$$

b. What is $E(s^2)$, where $s^2 = \frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2$?

Recall that $E(\bar{X}^2) = Var(\bar{X}) + \mu^2 = \frac{\sigma^2}{n} + \mu^2$.

$$E(s^2) = E\left[\frac{1}{n - 1} \sum_{i=1}^{n} (X_i - \bar{X})^2\right] = \frac{1}{n - 1}\left[\sum_{i=1}^{n} E(X_i^2) - n E(\bar{X}^2)\right]$$

$$= \frac{1}{n - 1}\left[n(\sigma^2 + \mu^2) - n\left(\frac{\sigma^2}{n} + \mu^2\right)\right] = \sigma^2$$
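A simulation illustrates the result (the Normal population and the sample settings below are illustrative, not from the text): dividing by $n - 1$ gives a mean near $\sigma^2$, while dividing by $n$ is biased low.

```python
import random

random.seed(2)

mu, sigma, n, reps = 5.0, 3.0, 10, 5000  # illustrative Normal(5, 3^2) population
s2_vals, vn_vals = [], []
for _ in range(reps):
    xs = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)
    s2_vals.append(ss / (n - 1))  # sample variance, divisor n - 1
    vn_vals.append(ss / n)        # divisor n, biased low

mean_s2 = sum(s2_vals) / reps  # close to sigma**2 = 9
mean_vn = sum(vn_vals) / reps  # close to (n - 1)/n * sigma**2 = 8.1
```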
Example 5 Let $X_1, \ldots, X_n$ be independent random variables with a sample of size $n$ from the distribution with probability density function,

$$f_X(x) = f_X(x; \theta) = (\theta - 1)^2\, \frac{\ln x}{x^{\theta}}, \quad x > 1, \; \theta > 2.$$

a. Find the maximum likelihood estimator of $\theta$, $\hat{\theta}$.

$$L(\theta) = \prod_{i=1}^{n} (\theta - 1)^2\, \frac{\ln X_i}{X_i^{\theta}}$$

$$\ell(\theta) = 2n \ln(\theta - 1) + \sum_{i=1}^{n} \ln(\ln X_i) - \theta \sum_{i=1}^{n} \ln X_i$$

$$\frac{d\ell(\theta)}{d\theta} = \frac{2n}{\theta - 1} - \sum_{i=1}^{n} \ln X_i = 0 \implies \hat{\theta} = 1 + \frac{2n}{\sum_{i=1}^{n} \ln X_i}$$

b. Find the method of moments estimator of $\theta$, $\tilde{\theta}$.

We need to find $E(X)$ and equate the theoretical mean with the sample mean. Finding $E(X)$ requires the use of integration by parts and L'Hôpital's rule,

$$E(X) = \int_1^{\infty} x f(x)\,dx = \int_1^{\infty} x (\theta - 1)^2\, \frac{\ln x}{x^{\theta}}\,dx = (\theta - 1)^2 \left[ \left. -\frac{\ln x}{(\theta - 2)\, x^{\theta - 2}} \right|_1^{\infty} + \frac{1}{\theta - 2} \int_1^{\infty} x^{1 - \theta}\,dx \right] = \frac{(\theta - 1)^2}{(\theta - 2)^2}$$

Setting

$$\frac{1}{n} \sum_{i=1}^{n} X_i = \bar{X} = \frac{(\tilde{\theta} - 1)^2}{(\tilde{\theta} - 2)^2}$$

and taking the positive square root (valid since $\theta > 2$) gives

$$\tilde{\theta} = \frac{2\sqrt{\bar{X}} - 1}{\sqrt{\bar{X}} - 1}.$$
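The mle can be checked by simulation, using an observation not stated in the text: if $G \sim$ Gamma(shape 2, scale $1/(\theta-1)$), then $\ln X = G$ has density $(\theta - 1)^2\, y\, e^{-(\theta - 1) y}$, so $X = e^G$ has exactly the density above. The values of theta, n, and the replication count are illustrative.

```python
import random

random.seed(3)

theta, n, reps = 4.0, 50, 2000  # illustrative values
mle_vals = []
for _ in range(reps):
    # ln X_i ~ Gamma(shape 2, scale 1/(theta - 1)); no need to exponentiate,
    # since the mle depends on the data only through sum(ln X_i)
    log_xs = [random.gammavariate(2.0, 1.0 / (theta - 1)) for _ in range(n)]
    mle_vals.append(1 + 2 * n / sum(log_xs))  # theta-hat = 1 + 2n / sum(ln X_i)

mle_mean = sum(mle_vals) / reps  # near theta = 4 (slight upward bias for finite n)
```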
Example 6 (German Tank Problem) During the course of WWII, the Western Allies made sustained efforts to determine the extent of German tank production using statistical estimation. The Allies estimated the number of tanks being produced after observing the serial numbers on captured or destroyed tanks, which were numbered sequentially $1, 2, 3, \ldots, N$.

a. A single serial number $X$ has pmf,

$$f(k) = \begin{cases} \frac{1}{N}, & 1 \le k \le N \\ 0, & \text{otherwise} \end{cases}$$
b. Suppose you observe a random sample of serial numbers for $n$ tanks. Find $\tilde{N}$.

Let $\bar{X}_n$ denote the sample mean. The expected value of $X_i$ is,

$$E(X) = \sum_{k=1}^{N} \frac{k}{N} = \frac{N + 1}{2}.$$

Clearly, $\tilde{N} = 2\bar{X}_n - 1$.
c. What is the likelihood function for observing $X_1, \ldots, X_n$? (Hint: Sample without replacement where order matters)

$$L(N) = \frac{1}{{}_{N}P_{n}} = \frac{1}{\prod_{k=0}^{n-1} (N - k)}$$

d. What is the mle of $N$, $\hat{N}$?

The mle is defined as $\hat{N} = \max X_i$.
e. Suppose $N = 100$ and $n = 5$ tanks with serial numbers 76, 58, 34, 96, 61 are captured. Find $\hat{N}$ and $\tilde{N}$.

The mle is $\hat{N} = 96$. The sample mean is 65, so $\tilde{N} = 2(65) - 1 = 129$.
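Part (e) can be verified directly in code:

```python
serials = [76, 58, 34, 96, 61]  # observed serial numbers from Example 6(e)
n = len(serials)

N_mle = max(serials)     # mle: the largest observed serial number
xbar = sum(serials) / n  # sample mean
N_mom = 2 * xbar - 1     # method of moments: 2 * xbar - 1
```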
Example 7 If the random variable $X$ denotes an individual's income, Pareto's law claims $P(X \ge x) = (k/x)^{\theta}$, where $k$ is the entire population's minimum income. It follows that

$$f(x) = \theta k^{\theta} \left(\frac{1}{x}\right)^{\theta + 1}, \quad x \ge k; \; \theta > 1.$$
Assume $\theta$ is known. The income information has been collected on a random sample of $n$ individuals: $X_1, \ldots, X_n$.

a. Find the method of moments estimator $\tilde{k}$ of $k$.

$$E(X) = \int_k^{\infty} x f(x; \theta)\,dx = \int_k^{\infty} x \cdot \theta k^{\theta} \left(\frac{1}{x}\right)^{\theta + 1} dx = \theta k^{\theta} \int_k^{\infty} x^{-\theta}\,dx = \theta k^{\theta} \left. \frac{x^{1 - \theta}}{1 - \theta} \right|_k^{\infty} = \frac{\theta k}{\theta - 1}$$

Setting $\frac{\theta \tilde{k}}{\theta - 1} = \bar{X}$ gives $\tilde{k} = \dfrac{\theta - 1}{\theta}\, \bar{X}$.
b. Find the maximum likelihood estimator $\hat{k}$ of $k$.

$$L(k) = \prod_{i=1}^{n} f(x_i; \theta) = \begin{cases} \theta^n k^{n\theta} \left(\prod_{i=1}^{n} x_i\right)^{-(\theta + 1)}, & x_i \ge k, \; 1 \le i \le n \\ 0, & \text{otherwise} \end{cases}$$

$L(k)$ is increasing in $k$, so it is maximized at the largest admissible value,

$$\hat{k} = \min X_i.$$
c. Is $\tilde{k}$ an unbiased estimator of $k$?

$$E(\tilde{k}) = E\left(\frac{\theta - 1}{\theta}\, \bar{X}\right) = \frac{\theta - 1}{\theta}\, E(\bar{X}) = \frac{\theta - 1}{\theta} \cdot \frac{\theta k}{\theta - 1} = k,$$

so $\tilde{k}$ is unbiased.
d. Is $\hat{k}$ an unbiased estimator of $k$?

Note that $\hat{k} = \min X_i = Y_1$. We found the expected value to be $E(Y_1) = \frac{n\theta}{n\theta - 1}\, k$, so $\hat{k}$ is a biased estimator. Note that $\frac{n\theta - 1}{n\theta}\, \hat{k}$ is unbiased.
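A simulation makes the contrast visible (theta, k, n, and the replication count are illustrative, not from the text). By inverse-transform sampling, $X = k / U^{1/\theta}$ with $U \sim$ Uniform$(0,1)$ satisfies $P(X > x) = (k/x)^{\theta}$.

```python
import random

random.seed(4)

theta, k, n, reps = 3.0, 2.0, 10, 5000  # illustrative values
mom_vals, min_vals = [], []
for _ in range(reps):
    xs = [k / random.random() ** (1 / theta) for _ in range(n)]
    mom_vals.append((theta - 1) / theta * sum(xs) / n)  # k-tilde, unbiased
    min_vals.append(min(xs))                            # k-hat = min X_i, biased high

mom_mean = sum(mom_vals) / reps  # close to k = 2
min_mean = sum(min_vals) / reps  # close to n*theta*k / (n*theta - 1) = 60/29
```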
Definition: For an estimator $\hat{\theta}$ of $\theta$, define the mean square error of $\hat{\theta}$ by $MSE(\hat{\theta}) = E\left[(\hat{\theta} - \theta)^2\right]$. Note that

$$E[(\hat{\theta} - \theta)^2] = (E(\hat{\theta}) - \theta)^2 + Var(\hat{\theta}) = (\mathrm{bias}(\hat{\theta}))^2 + Var(\hat{\theta}).$$
f. Find $MSE\left[\frac{n\theta - 1}{n\theta}\, \hat{k}\right]$.

Recall that $Var(Y_1) = Var(\hat{k}) = \frac{n\theta k^2}{(n\theta - 2)(n\theta - 1)^2}$. Since $\frac{n\theta - 1}{n\theta}\, \hat{k}$ is unbiased, its MSE equals its variance,

$$MSE\left[\frac{n\theta - 1}{n\theta}\, \hat{k}\right] = Var\left(\frac{n\theta - 1}{n\theta}\, Y_1\right) = \left(\frac{n\theta - 1}{n\theta}\right)^2 \frac{n\theta k^2}{(n\theta - 2)(n\theta - 1)^2} = \frac{k^2}{n\theta(n\theta - 2)}$$
Example 8 (Revisiting the German Tank Problem) Recall the mom was $\tilde{N} = 2\bar{X}_n - 1$ where $\bar{X}_n = \frac{1}{n} \sum_{i=1}^{n} X_i$. Recall $E(X_1) = E(X_2) = \frac{N + 1}{2}$ and $Var(X) = \frac{N^2 - 1}{12}$.

a. Find $Var(\bar{X}_n)$.
Recall that, because the serial numbers are sampled without replacement, the observations are correlated, with covariance between each pair $Cov(X_i, X_j) = -\frac{N + 1}{12}$. The variance formula implies,

$$Var(\bar{X}_n) = Var\left(\sum_{i=1}^{n} \frac{1}{n} X_i\right) = \frac{1}{n^2} \sum_{i=1}^{n} Var(X_i) + \frac{2}{n^2} \sum_{i > j} Cov(X_i, X_j)$$

$$= \frac{1}{n^2}\left[n \cdot \frac{N^2 - 1}{12} - 2 \cdot \frac{n(n - 1)}{2} \cdot \frac{N + 1}{12}\right] = \frac{1}{n}\left[\frac{N^2 - 1}{12} - \frac{(n - 1)(N + 1)}{12}\right] = \frac{(N + 1)(N - n)}{12n}$$
b. Find $E(\tilde{N})$.

$$E(\tilde{N}) = E(2\bar{X}_n - 1) = 2 \cdot \frac{N + 1}{2} - 1 = N,$$

so $\tilde{N}$ is unbiased.
c. Find $E(\hat{N})$, where $\hat{N} = \max X_i = Y_n$.
Recall that $E(Y_n) = E(\hat{N}) = \frac{n(N + 1)}{n + 1}$. It follows that $\frac{n + 1}{n}\, \hat{N} - 1$ is unbiased.
d. Find the variance and mean squared error of $\tilde{N}$.

$$Var(\tilde{N}) = Var(2\bar{X}_n - 1) = 4\, Var(\bar{X}_n) = \frac{(N + 1)(N - n)}{3n}$$

$\tilde{N}$ is unbiased, so the mean squared error is $MSE(\tilde{N}) = \frac{(N + 1)(N - n)}{3n}$.
e. Find the mean squared error of the unbiased estimator $\frac{n + 1}{n}\, \hat{N} - 1$.

Recall that $Var(Y_n) = \frac{n(N + 1)(N - n)}{(n + 1)^2 (n + 2)}$, so

$$MSE\left(\frac{n + 1}{n}\, \hat{N} - 1\right) = \left(\frac{n + 1}{n}\right)^2 Var(Y_n) = \frac{(N + 1)(N - n)}{n(n + 2)},$$

which implies $MSE\left(\frac{n + 1}{n}\, \hat{N} - 1\right) < MSE(\tilde{N})$ for $n > 1$.
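A Monte Carlo comparison (using $N = 100$ and $n = 5$ from Example 6; the replication count is illustrative) agrees with the formulas: $MSE(\tilde{N}) = (N+1)(N-n)/(3n) \approx 639.7$ versus $(N+1)(N-n)/(n(n+2)) \approx 274.1$ for the adjusted mle.

```python
import random

random.seed(5)

N, n, reps = 100, 5, 4000  # N and n as in Example 6(e); reps illustrative
se_mom, se_adj = 0.0, 0.0
for _ in range(reps):
    xs = random.sample(range(1, N + 1), n)  # serials drawn without replacement
    mom = 2 * sum(xs) / n - 1               # N-tilde = 2*xbar - 1
    adj = (n + 1) / n * max(xs) - 1         # unbiased (n+1)/n * max - 1
    se_mom += (mom - N) ** 2
    se_adj += (adj - N) ** 2

mse_mom = se_mom / reps  # theory: (N+1)(N-n)/(3n) = 639.67
mse_adj = se_adj / reps  # theory: (N+1)(N-n)/(n(n+2)) = 274.14
```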
Example 9 Let $X_1, \ldots, X_n$ be a random sample from the Uniform$(0, \theta)$ distribution with pdf and cdf,

$$f(x) = \begin{cases} \frac{1}{\theta}, & 0 \le x \le \theta \\ 0, & \text{otherwise} \end{cases}, \qquad F(x; \theta) = \begin{cases} 0, & x < 0 \\ \frac{x}{\theta}, & 0 \le x \le \theta \\ 1, & x > \theta \end{cases}$$

Recall that $Var(X) = \frac{\theta^2}{12}$.
a. Find the method of moments estimator $\tilde{\theta}$.

$$E(X) = \frac{\theta}{2}, \quad \bar{X} = \frac{\tilde{\theta}}{2}, \quad \tilde{\theta} = 2\bar{X}$$

b. Find $E(\tilde{\theta})$.

$$E(\tilde{\theta}) = E(2\bar{X}) = 2\, E(\bar{X}) = 2 \cdot \frac{\theta}{2} = \theta,$$

so $\tilde{\theta}$ is unbiased.
c. What is $Var(\tilde{\theta})$ and $MSE(\tilde{\theta})$?

$$Var(\tilde{\theta}) = Var(2\bar{X}) = 4\, Var(\bar{X}) = 4 \cdot \frac{\theta^2}{12n} = \frac{\theta^2}{3n} = MSE(\tilde{\theta}),$$

since $\tilde{\theta}$ is unbiased.
d. Find the maximum likelihood estimator $\hat{\theta}$.

$$L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta}\, \mathbf{1}(0 \le x_i \le \theta) = \left(\frac{1}{\theta}\right)^{n} \quad \text{for } 0 < \max x_i \le \theta,$$

which is decreasing in $\theta$, so $\hat{\theta} = \max X_i$.
e. What is $E(\hat{\theta})$?

$\hat{\theta} = \max X_i = Y_n$. Recall that we showed $E(Y_n) = \frac{n\theta}{n + 1}$, so $\hat{\theta}$ is biased.
f. What must $c$ equal if $c\hat{\theta}$ is to be an unbiased estimator for $\theta$?

$$E\left[\frac{n + 1}{n}\, \hat{\theta}\right] = \frac{n + 1}{n} \cdot \frac{n\theta}{n + 1} = \theta, \quad \text{so } c = \frac{n + 1}{n}.$$
g. Find $Var\left(\frac{n + 1}{n}\, \hat{\theta}\right)$.

We showed $Var(\hat{\theta}) = Var(Y_n) = \frac{n\theta^2}{(n + 2)(n + 1)^2}$, so the variance of the adjusted mle is,

$$Var\left(\frac{n + 1}{n}\, \hat{\theta}\right) = \left(\frac{n + 1}{n}\right)^2 Var(\hat{\theta}) = \frac{\theta^2}{n(n + 2)}$$
h. Find $MSE(\hat{\theta})$.

$$\mathrm{Bias}(\hat{\theta}) = E(\hat{\theta}) - \theta = \frac{n\theta}{n + 1} - \theta = -\frac{\theta}{n + 1}, \qquad Var(\hat{\theta}) = \frac{n\theta^2}{(n + 1)^2 (n + 2)}$$

$$MSE(\hat{\theta}) = (\mathrm{Bias}(\hat{\theta}))^2 + Var(\hat{\theta}) = \frac{\theta^2}{(n + 1)^2} + \frac{n\theta^2}{(n + 1)^2 (n + 2)} = \frac{2\theta^2}{(n + 2)(n + 1)} \to 0 \text{ as } n \to \infty$$
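A closing simulation (theta, n, and the replication count are illustrative, not from the text) compares the two mean squared errors from parts (c) and (h).

```python
import random

random.seed(6)

theta, n, reps = 10.0, 8, 5000  # illustrative values
se_mom, se_mle = 0.0, 0.0
for _ in range(reps):
    xs = [random.uniform(0, theta) for _ in range(n)]
    se_mom += (2 * sum(xs) / n - theta) ** 2  # theta-tilde = 2 * xbar
    se_mle += (max(xs) - theta) ** 2          # theta-hat = max X_i

mse_mom = se_mom / reps  # theory: theta**2 / (3n) = 4.17
mse_mle = se_mle / reps  # theory: 2 * theta**2 / ((n+1)(n+2)) = 2.22
```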