Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
0
. When the sample comes from a continuous distribution, it would again be natural to
try to nd a value of for which the probability density f(x
1
, , x
n
|) is large, and to use
this value as an estimate of . For any given observed data x
1
, , x
n
, we are led by this
reasoning to consider a value of for which the likelihood function L() is a maximum and
to use this value as an estimate of .
1
2
The meaning of maximum likelihood is as follows. We choose the parameter that makes
the likelihood of having the obtained data at hand maximum. With discrete distributions,
the likelihood is the same as the probability. We choose the parameter for the density that
maximizes the probability of the data coming from it.
Theoretically, if we had no actual data, maximizing the likelihood function will give us
a function of n random variables X
1
, , X
n
, which we shall call maximum likelihood
estimate
. When there are actual data, the estimate takes a particular numerical value,
which will be the maximum likelihood estimator.
MLE requires us to maximum the likelihood function L() with respect to the unknown
parameter . From Eqn. 1, L() is dened as a product of n terms, which is not easy
to be maximized. Maximizing L() is equivalent to maximizing log L() because log is a
monotonic increasing function. We dene log L() as log likelihood function, we denote it as
l(), i.e.
l() = log L() = log
n
i=1
f(X
i
|) =
n
i=1
log f(X
i
|)
Maximizing l() with respect to will give us the MLE estimation.
2 Examples
Example 1: Suppose that X is a discrete random variable with the following probability
mass function: where 0 1 is a parameter. The following 10 independent observations
X 0 1 2 3
P(X) 2/3 /3 2(1 )/3 (1 )/3
were taken from such a distribution: (3,0,2,1,3,2,1,0,2,1). What is the maximum likelihood
estimate of .
Solution: Since the sample is (3,0,2,1,3,2,1,0,2,1), the likelihood is
L() = P(X = 3)P(X = 0)P(X = 2)P(X = 1)P(X = 3)
P(X = 2)P(X = 1)P(X = 0)P(X = 2)P(X = 1) (2)
Substituting from the probability distribution given above, we have
L() =
n
i=1
P(X
i
|) =
_
2
3
_
2
_
3
_
3
_
2(1 )
3
_
3
_
1
3
_
2
Clearly, the likelihood function L() is not easy to maximize.
3
Let us look at the log likelihood function
l() = log L() =
n
i=1
log P(X
i
|)
= 2
_
log
2
3
+ log
_
+ 3
_
log
1
3
+ log
_
+ 3
_
log
2
3
+ log(1 )
_
+ 2
_
log
1
3
+ log(1 )
_
= C + 5 log + 5 log(1 )
where C is a constant which does not depend on . It can be seen that the log likelihood
function is easier to maximize compared to the likelihood function.
Let the derivative of l() with respect to be zero:
dl()
d
=
5
5
1
= 0
and the solution gives us the MLE, which is
= 0.5. We remember that the method of
moment estimation is
= 5/12, which is dierent from MLE.
Example 2: Suppose X
1
, X
2
, , X
n
are i.i.d. random variables with density function
f(x|) =
1
2
exp
_
|x|
_
, please nd the maximum likelihood estimate of .
Solution: The log-likelihood function is
l() =
n
i=1
_
log 2 log
|X
i
|
_
Let the derivative with respect to be zero:
l
() =
n
i=1
_
+
|X
i
|
2
_
=
n
n
i=1
|X
i
|
2
= 0
and this gives us the MLE for as
=
n
i=1
|X
i
|
n
Again this is dierent from the method of moment estimation which is
=
n
i=1
X
2
i
2n
Example 3: Use the method of moment to estimate the parameters and for the normal
density
f(x|,
2
) =
1
2
exp
_
(x )
2
2
2
_
,
4
based on a random sample X
1
, , X
n
.
Solution: In this example, we have two unknown parameters, and , therefore the pa-
rameter = (, ) is a vector. We rst write out the log likelihood function as
l(, ) =
n
i=1
_
log
1
2
log 2
1
2
2
(X
i
)
2
_
= nlog
n
2
log 2
1
2
2
n
i=1
(X
i
)
2
Setting the partial derivative to be 0, we have
l(, )
=
1
2
n
i=1
(X
i
) = 0
l(, )
=
n
+
3
n
i=1
(X
i
)
2
= 0
Solving these equations will give us the MLE for and :
= X and =
_
1
n
n
i=1
(X
i
X)
2
This time the MLE is the same as the result of method of moment.
From these examples, we can see that the maximum likelihood result may or may not be the
same as the result of method of moment.
Example 4: The Pareto distribution has been used in economics as a model for a density
function with a slowly decaying tail:
f(x|x
0
, ) = x
0
x
1
, x x
0
, > 1
Assume that x
0
> 0 is given and that X
1
, X
2
, , X
n
is an i.i.d. sample. Find the MLE of
.
Solution: The log-likelihood function is
l() =
n
i=1
log f(X
i
|) =
n
i=1
(log + log x
0
( + 1) log X
i
)
= nlog +n log x
0
( + 1)
n
i=1
log X
i
Let the derivative with respect to be zero:
dl()
d
=
n
+nlog x
0
i=1
log X
i
= 0
5
Solving the equation yields the MLE of :
MLE
=
1
log X log x
0
Example 5: Suppose that X
1
, , X
n
form a random sample from a uniform distribution
on the interval (0, ), where of the parameter > 0 but is unknown. Please nd MLE of .
Solution: The pdf of each observation has the following form:
f(x|) =
_
1/, for 0 x
0, otherwise
(3)
Therefore, the likelihood function has the form
L() =
_
1
n
, for 0 x
i
(i = 1, , n)
0, otherwise
It can be seen that the MLE of must be a value of for which x
i
for i = 1, , n and
which maximizes 1/
n
among all such values. Since 1/
n
is a decreasing function of , the
estimate will be the smallest possible value of such that x
i
for i = 1, , n. This value
is = max(x
1
, , x
n
), it follows that the MLE of is
= max(X
1
, , X
n
).
It should be remarked that in this example, the MLE
does not seem to be a suitable
estimator of . We know that max(X
1
, , X
n
) < with probability 1, and therefore
x!
.
Please nd the MLE of the parameter .
Exercise 2: Let X
1
, , X
n
be an i.i.d. sample from an exponential distribution with the
density function
f(x|) =
1
, with 0 x < .
7
Please nd the MLE of the parameter .
Exercise 3: Gamma distribution has a density function as
f(x|, ) =
1
()
x
1
e
x
, with 0 x < .
Suppose the parameter is known, please nd the MLE of based on an i.i.d. sample
X
1
, , X
n
.
Exercise 4: Suppose that X
1
, , X
n
form a random sample from a distribution for which
the pdf f(x|) is as follows:
f(x|) =
_
x
1
, for 0 < x < 1
0, for x 0
Also suppose that the value of is unknown ( > 0). Find the MLE of .
Exercise 5: Suppose that X
1
, , X
n
form a random sample from a distribution for which
the pdf f(x|) is as follows:
f(x|) =
1
2
e
|x|
for < x <
Also suppose that the value of is unknown (< < ). Find the MLE of .
Exercise 6: Suppose that X
1
, , X
n
form a random sample from a distribution for which
the pdf f(x|) is as follows:
f(x|) =
_
e
x
, for x >
0, for x
Also suppose that the value of is unknown ( < < ). a) Show that the MLE of
does not exist. b) Determine another version of the pdf of this same distribution for which
the MLE of will exist, and nd this estimate.
Exercise 7: Suppose that X
1
, , X
n
form a random sample from a uniform distribution
on the interval (
1
,
2
), with the pdf as follows:
f(x|) =
_
1
1
, for
1
x
2
0, otherwise
Also suppose that the values of
1
and
2
are unknown ( <
1
<
2
< ). Find the
MLEs of
1
and
2
.