Eric B. Hall
Contents
1 Detection Problems
2 Estimation Problems
3 Detection Solutions
4 Estimation Solutions
www.MathGeek.com
1. Detection Problems
Problem 1.1. As an example in which randomization reduces the maximum risk, suppose
that a coin is known to be either standard (HT) or to have heads on both sides (HH). The
nature of the coin is to be decided on the basis of a single toss, the loss being one for an
incorrect decision and zero for a correct decision. Let the decision be HT when T is observed,
and let the decision be made at random if H is observed, with probability p for HT and 1 − p
for HH. Show that the maximum risk is minimized when p = 1/3.
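The risk calculation behind this problem (worked out in Solution 1.1 below) can be checked numerically; a minimal Python sketch, taking the two risk expressions from that solution as given:

```python
# Sketch: numerically minimize the maximum risk of the randomized rule.
def max_risk(p: float) -> float:
    # Risk given HH: we always see H, and wrongly announce HT with probability p.
    risk_hh = p
    # Risk given HT: we see H half the time, and then wrongly announce HH
    # with probability 1 - p.
    risk_ht = 0.5 * (1.0 - p)
    return max(risk_hh, risk_ht)

# Grid search over p in [0, 1].
grid = [i / 10000 for i in range(10001)]
best_p = min(grid, key=max_risk)
print(best_p, max_risk(best_p))  # best_p is close to 1/3
```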
Problem 1.2. Consider a zero mean Gaussian random variable X with positive variance
σ². This random variable is passed through one of two equally likely nonlinearities to obtain
either Y = exp(X) or Y = X².
Problem 1.3. Consider two independent, zero mean Gaussian random variables X1 and
X2 that under hypothesis H0 each have unit variance and under H1 each have a variance
equal to 2. Assign a unit cost to an incorrect decision and zero cost to a correct decision.
Find a minimax test for this problem based upon the two observations X1 and X2.
Problem 1.4. Consider a random variable N with probability density function

f(x) = (1/2) exp(−|x|)

for all x ∈ ℝ, and consider a hypothesis testing problem in which H0 states that Y = N and
H1 states that Y = N + s. Let s = 2, and assume that the prior probabilities for H0 and H1
are given by π0 = 0.1 and π1 = 0.9, respectively. Design a hypothesis test that minimizes
the probability of error. For what values of the observation Y does your test announce that
the signal s is present?
Problem 1.5. (A) Let Θ be a random variable with probability density function f(θ), and
assume that a probability density function for our observation X is given by pθ(x) when
Θ = θ. Show that δ is a Bayesian decision rule if for each x the decision δ(x) is chosen to
minimize ∫ℝ L(θ, δ(x)) g(θ|x) dθ, where

g(θ|x) = f(θ) pθ(x) / ∫ f(φ) pφ(x) dφ.

(B) Consider a two-decision problem in which ω0 and ω1 are, respectively, the sets of θ-values
for which d0 and d1 are the correct decisions. Assume that the loss is 0 when the correct
decision is made, and otherwise is given by L(θ, d0) = a if θ ∈ ω1 and L(θ, d1) = β if θ ∈ ω0.
Show that a Bayes solution consists in choosing decision d0 if

a ∫ω1 g(θ|x) dθ < β ∫ω0 g(θ|x) dθ

and decision d1 if the reverse inequality holds, where the choice of decision is immaterial in
case of equality.
Problem 1.6. Consider a Gaussian random variable X with mean θ and unit variance. Let
α0 = 0.05 and design a level-α0 Neyman-Pearson test for testing H0: θ = 0 against H1:
θ = 1000. If we observe x = 1.8, then will we accept or reject the hypothesis H0? Does this
make sense for an "optimal" test?
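The numbers here are easy to check; a short sketch using the standard library's normal distribution (the reduction of the test to a threshold comparison is the standard one for a Gaussian shift):

```python
from statistics import NormalDist

# Sketch: a level-0.05 Neyman-Pearson test of H0: theta = 0 vs H1: theta = 1000
# based on X ~ N(theta, 1) reduces to comparing X to a threshold T
# with 1 - Phi(T) = 0.05.
std_normal = NormalDist()
alpha0 = 0.05
T = std_normal.inv_cdf(1.0 - alpha0)   # approximately 1.645
x = 1.8
reject_h0 = x > T                      # the test rejects H0 even though
                                       # theta = 1000 is wildly inconsistent
                                       # with x = 1.8
print(T, reject_h0)
```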
Problem 1.7. Consider a random variable Y and two probability density functions f and
g. Let H0 denote the hypothesis that Y is distributed according to the density function f,
and let H1 denote the hypothesis that Y is distributed according to the density function g.
A hypothesis test is designed for this situation such that the false alarm probability α of the
test is equal to 0.01 and the power of the test β is equal to 0.0099. Is it possible that this
test is a Neyman-Pearson test for the level α0 = 0.01? Why or why not?
Problem 1.8. Consider a detection problem in which we desire to test for the presence
of a decaying exponential signal in zero mean, mutually independent Gaussian noise. We
will base our decision on k mutually independent samples of the received signal. Hence, we
model the situation as follows, where the ti's denote our sampling times:

H0: Yi = Ni for i = 1, ..., k
H1: Yi = exp(−ti) + Ni for i = 1, ..., k.
Assume that the Ni's are identically distributed and mutually independent with N1 having
a zero mean Gaussian distribution with a positive variance denoted by σ². Unfortunately,
in taking our samples we encounter a problem in synchronizing the clock at the receiver. In
particular, assume that the sampling times t1, ..., tk are modeled as ti = iA − θ where A > 0
and θ is a fixed yet unknown parameter that lies in the interval [−A/10, A/10].
Design and describe a level-α0 Neyman-Pearson detector for this situation. What is the
threshold in terms of α0, A, σ, k, and θ? Does the threshold depend on θ? Find an
expression for β in terms of α0, A, σ, k, and θ.
Problem 1.9. Consider the following decision problem: H0: Yi = Ni for i = 1, 2, ..., k
versus H1: Yi = s + Ni for i = 1, 2, ..., k, where s is a fixed positive constant and where the
Ni's are mutually independent zero mean Gaussian random variables each with variance 2.
(A) Let k = 9. What is the smallest positive signal s for which a Neyman-Pearson test with
false alarm probability 0.01 has a detection probability β ≥ 0.97?
(B) Let s = 1. What is the smallest number of observations k for which a Neyman-Pearson
test with false alarm probability 0.05 has a detection probability β > 0.99?
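Both parts can be computed numerically from the standard power formula for this Gaussian shift problem (the same formula that appears in Solution 1.9 below); a sketch:

```python
from math import sqrt
from statistics import NormalDist

# Sketch under the stated model (variance 2 per sample): for the
# Neyman-Pearson test based on the sum of k observations, the detection
# probability is beta = 1 - Phi(Phi^{-1}(1 - alpha) - sqrt(k) * s / sigma).
Phi = NormalDist().cdf
Phi_inv = NormalDist().inv_cdf
sigma = sqrt(2.0)

def beta(k: int, s: float, alpha: float) -> float:
    return 1.0 - Phi(Phi_inv(1.0 - alpha) - sqrt(k) * s / sigma)

# (A) k = 9, alpha = 0.01: smallest s with beta >= 0.97 (closed form).
s_min = sigma * (Phi_inv(0.99) + Phi_inv(0.97)) / 3.0  # 3 = sqrt(9)

# (B) s = 1, alpha = 0.05: smallest k with beta > 0.99 (search upward).
k_min = next(k for k in range(1, 1000) if beta(k, 1.0, 0.05) > 0.99)
print(s_min, k_min)  # s_min is roughly 1.98, k_min = 32
```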
Problem 1.10. Consider a collection {X1, ..., Xn} of mutually independent, identically
distributed random variables each with a N(θ, 1) distribution for some real number θ. Let
H0 denote the hypothesis that θ = 0 and let H1 denote the hypothesis that θ = 1/2. Let
α0 = 0.005.
(A) Design a Neyman-Pearson test for this problem as a function of n. What is the power
of the test if n = 9?
(B) Now, consider a different test. Flip a fair coin. If the outcome is heads then perform
the test in part (A) with n = 2. If, on the other hand, the outcome is tails then perform the
test in part (A) with n = 16. What is the power of this test? What can you conclude about
tests that allow random sample sizes?
Problem 1.11. Consider a coin that, when flipped, comes up heads with probability p and
tails with probability 1 − p. We flip the coin twice and then must decide whether p = ~
(which we will call hypothesis H0) or p = ~ (which we will call hypothesis H1). Determine
a procedure that is most powerful for testing H0 against H1 subject to the constraint that
α not exceed ~. What is the power of your test?
Problem 1.12. Consider the problem of detecting a continuous signal s(t) = exp(−t) for
t ≥ 0 on the basis of two samples. Assume that the observations are either of the form
Yi = Ni (which we will call hypothesis H0) or Yi = s(hi) + Ni (which we will call hypothesis
H1) where h is a positive constant and where i = 1, 2. Assume that N1 and N2 are zero
mean, jointly Gaussian random variables such that E[N1²] = E[N2²] = 1 and E[N1N2] = ~.
Let α0 = 0.1. Design a Neyman-Pearson test for this situation. What is the power of the
test if h = 0.1?
Problem 1.13. Consider a problem in which we wish to test for the presence of a known
positive signal s(t) = exp(−t) defined for t ≥ 0 on the basis of three samples corrupted
by additive Gaussian noise. In particular, let sk = s((k − 1)h) for k = 1, 2, 3 and h > 0,
let Z1, Z2, Z3 be mutually independent, identically distributed Gaussian random variables
each with mean zero and variance 1/10, and let X1, X2, X3 denote our observations. Further,
for k = 1, 2, 3, let H0 denote the hypothesis that Xk is equal to Zk and let H1 denote the
hypothesis that Xk is equal to Zk + sk. Design a Neyman-Pearson test with a level of
significance given by α0 = 0.05. What is the maximum sampling interval h such that the
power of the test is not less than 0.99?
Problem 1.14. Consider a binary channel such that if we transmit a 'zero' we receive a
'zero' at the other end with probability 1 − λ0 and we receive a 'one' with probability λ0.
Further, if we transmit a 'one' we receive a 'one' with probability 1 − λ1 and we receive
a 'zero' with probability λ1. (Assume that 0 < λ0 < 1, 0 < λ1 < 1, and λ0 + λ1 < 1.)
Consider the transmission and subsequent receipt of a single binary digit. Let H0 denote
the hypothesis that a 'zero' was transmitted and let H1 denote the hypothesis that a 'one'
was transmitted. Design a Neyman-Pearson test for this situation and find the power as a
function of the size α0 for 0 ≤ α0 ≤ 1. Sketch an ROC (Receiver Operating Characteristic)
curve for the special case when λ0 = λ1 = ~ and for the special case when λ0 = λ1 = 1/8.
(An ROC curve is a plot of the power of a test as a function of the size of the test.)
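One way to organize the answer is a sketch of the standard randomized Neyman-Pearson construction for this channel (the piecewise-linear form of the power function is an assumption of this sketch, not quoted from the text):

```python
# Sketch: power of the randomized Neyman-Pearson test as a function of its
# size alpha for the binary channel with crossover probabilities lam0
# (0 -> 1) and lam1 (1 -> 0).  The ROC curve is piecewise linear with a
# corner at (lam0, 1 - lam1).
def power(alpha: float, lam0: float, lam1: float) -> float:
    if alpha <= lam0:
        # Randomize only on the observation 'one' (the larger likelihood ratio).
        return alpha * (1.0 - lam1) / lam0
    # Always reject on 'one'; randomize on 'zero' as well.
    return (1.0 - lam1) + lam1 * (alpha - lam0) / (1.0 - lam0)

roc_corner = power(0.125, 0.125, 0.125)  # size lam0 gives power 1 - lam1
print(roc_corner)
```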
Problem 1.16. Consider a random variable X that under hypothesis H0 has density

p0(x) = (2/3)(x + 1) if 0 < x < 1, and 0 otherwise,

and that under hypothesis H1 has density

p1(x) = 1 if 0 < x < 1, and 0 otherwise.
(A) Find a minimax test of H0 versus H1 based upon the single observation X.
(B) Find a Neyman-Pearson test of H0 versus H1 with size 1/10 based upon the single
observation X.
Problem 1.17. Consider a random variable X that under hypothesis H0 has a uniform
distribution on (0, 1) and that under hypothesis H1 has a uniform distribution on (1/2, 3/2). Of
the following three tests, which one is or which ones are Neyman-Pearson tests of H0 versus
H1 based upon the single observation X and with a level of significance equal to 1/8?

φ1(x) = 1 if x > 7/8, and 0 otherwise;

φ2(x) = 1 if 1/2 < x < 5/8 or x > 1, and 0 otherwise;

φ3(x) = 1 if x > 1, 1/4 if 1/2 ≤ x ≤ 1, and 0 if x < 1/2.
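A quick numerical check, under the reconstruction used here (H0 uniform on (0, 1), H1 uniform on (1/2, 3/2), and the three tests as stated): each candidate's size and power can be computed by direct integration.

```python
from fractions import Fraction as F

# Sketch: each phi maps an observation x to the probability of announcing H1.
def phi1(x):
    return F(1) if x > F(7, 8) else F(0)

def phi2(x):
    return F(1) if (F(1, 2) < x < F(5, 8)) or x > 1 else F(0)

def phi3(x):
    if x > 1:
        return F(1)
    if F(1, 2) <= x <= 1:
        return F(1, 4)
    return F(0)

def expect(phi, lo, hi, n=4800):
    # E[phi(X)] for X uniform on (lo, hi), via an exact midpoint sum (valid
    # here because each phi is piecewise constant with breakpoints that land
    # on grid nodes, never on midpoints).
    h = (F(hi) - F(lo)) / n
    return sum(phi(F(lo) + (2 * i + 1) * h / 2) for i in range(n)) * h

sizes = [expect(p, 0, 1) for p in (phi1, phi2, phi3)]
powers = [expect(p, F(1, 2), F(3, 2)) for p in (phi1, phi2, phi3)]
print(sizes, powers)  # all sizes 1/8, all powers 5/8
```

Since all three tests have the same size and the same power, the check is consistent with all three being Neyman-Pearson tests at level 1/8.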
Problem 1.18. Consider three probability distributions P0, P1, and P2 on {0, 1, 2} defined
as follows:

P0({0}) = P0({1}) = P0({2}) = 1/3;

P1({1}) = 1 − P1({2}) = 1/3.
Consider the problem of testing H0: {P0} against H1: {P1, P2} on the basis of one sample.
(A) Does there exist a UMP test in this case that has a level of significance equal to 1/6? If
yes, then find the test. If no, then explain why not.
(B) Does there exist a UMP test in this case that has a level of significance equal to 1/2? If
yes, then find the test. If no, then explain why not.
Problem 1.19. Consider a collection {Y1, Y2, ..., Yk} of random variables such that Yi =
f(ti) + Ni for each i, where f: ℝ → ℝ and where the Ni's are mutually independent, zero
mean Gaussian random variables each with positive variance σ². Hypothesis H0 states that
f(t) = cos(t) and hypothesis H1 states that f(t) = sin(t). Find a Neyman-Pearson test of
H0 against H1 at size α0 based on {Y1, Y2, ..., Yk}. What is the power of your test? How
should one choose the sampling times in order to maximize the power?
Problem 1.20. Consider a random variable X with a probability density function given by

fa,b(x) = (1/b) exp(−(x − a)/b) if x ≥ a, and 0 if x < a,

where a ∈ ℝ and b > 0. (Assume that b is known.) Design a Neyman-Pearson test of size
α0 for testing H0: a = a0 against H1: a = a1 based upon X where a1 < a0. What is the
power of your test?
size α0 is also a minimax test of H0 versus H1 with respect to a loss function that assigns a
unit loss to an error and a zero loss otherwise.
p0(x) = 1 if 0 < x < 1, and 0 otherwise;

p1(x) = 1/4 if 0 < x < 1/2, 7/4 if 1/2 < x < 1, and 0 otherwise.
Further, consider a random variable X that under hypothesis H0 has density p0 and under
hypothesis H1 has density p1. On the basis of the single observation X:
(A) Find a Bayes test of H0 versus H1 assuming that H0 and H1 are equally likely.
(B) Find a minimax test of H0 versus H1 with respect to a loss function that assigns a unit
loss to an error and a zero loss otherwise.
(C) Find a Neyman-Pearson test of H0 versus H1 with size 1/60. What is the power of this
test?
f0(x) = exp(−x) if x > 0, and 0 otherwise;

f1(x) = exp(−(x − 1)) if x > 1, and 0 otherwise.

Consider a random variable X that under hypothesis H0 has density f0 and under hypothesis
H1 has density f1. On the basis of the single observation X:
Problem 1.24. Consider the following system in which the random variable N is Gaussian
with mean zero and positive variance σ² and the random variable M takes on the value 1
with probability 1/2 and the value −1 with probability 1/2. Thus, under H0, Y = N, and under
H1, Y = aM + N. Assume that a is nonzero and that M and N are independent. Find the
form of a Neyman-Pearson test of H0 versus H1 based on the single observation Y. (You do
not need to find the threshold.) Is your test uniformly most powerful over all nonzero a?
Problem 1.25. Consider a random variable X that has a Poisson distribution with parameter λ. That is, assume that

P(X = k) = (λ^k / k!) exp(−λ)

for k = 0, 1, 2, .... Assume that λ = λ0 under H0 and that λ = λ1 under H1 where
λ1 > λ0 > 0. For what values of α0 will there exist a nonrandomized Neyman-Pearson test
of H0 versus H1? Find a Neyman-Pearson test of H0 versus H1 if λ0 = 1 and α0 = 0.02.
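For the specific numbers λ0 = 1 and α0 = 0.02, the randomized test can be computed directly; a sketch (the "reject above a threshold, randomize at the threshold" form is the standard construction for a monotone likelihood ratio family):

```python
from math import exp, factorial

# Sketch: build the randomized Neyman-Pearson test "announce H1 if X > c;
# if X = c, announce H1 with probability gamma".  A nonrandomized test
# exists only when alpha0 equals one of the tail probabilities P(X > c)
# under lam0.
lam0, alpha0 = 1.0, 0.02

def pmf(k: int) -> float:
    return lam0 ** k * exp(-lam0) / factorial(k)

def tail(c: int) -> float:
    # P(X > c) under Poisson(lam0)
    return 1.0 - sum(pmf(k) for k in range(c + 1))

# Smallest c with P(X > c) <= alpha0, then randomize on {X = c}.
c = next(c for c in range(100) if tail(c) <= alpha0)
gamma = (alpha0 - tail(c)) / pmf(c)
print(c, gamma)  # c = 3, gamma roughly 0.0165
```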
Problem 1.26. Consider the following decision problem:

H0: Yi = Ni for i = 1, ..., k
H1: Yi = s + Ni for i = 1, ..., k,

where k is a positive integer, where s is a positive constant signal, and where N1, ..., Nk are
mutually independent random variables each with probability density function
(A) Find the form of a locally optimal test of H0 versus H1. (You do not need to find the
threshold.)
(B) Use the result of Part (A) to find the form of a Neyman-Pearson test of H0 versus H1.
(Again, you do not need to find the threshold.)
Problem 1.27. Consider the following decision problem: H0: Xi = Ni for i = 1, 2, 3 versus
H1: Xi = s + Ni for i = 1, 2, 3, where s is a fixed (unknown) positive constant and where
N1, N2, and N3 are mutually independent random variables each with a probability density
function given by

f(x) = (1/2) exp(−|x|)
for x ∈ ℝ.
(A) Design a locally optimal test for this problem subject to the constraint that the false
alarm probability not exceed 1/16.
(B) What is the smallest value of s for which the detection probability β is not less than
0.49?
(A) Find the form of a Neyman-Pearson test for this problem. (You do not need to find the
threshold.) Is it uniformly most powerful over all positive signals s? Will it always announce
that the signal is present whenever the observation is much larger than s?
(B) Find the form of a locally optimal test for this problem. (You do not need to find the
threshold.) Will it always announce that the signal is present whenever the observation is
much larger than s?
2. Estimation Problems
Problem 2.1. Consider a marathon with N participants. Further, assume that these N
runners are each wearing an identifying tag displaying a different number between 1 and
N. As you drive by you see a runner wearing the number 87. How might you use that
information to estimate N?
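One standard answer (an assumption here; the problem itself is open-ended) treats the observed tag M as uniform on {1, ..., N}; then 2M − 1 is an unbiased estimator of N, since E[M] = (N + 1)/2. A quick enumeration check:

```python
# Sketch: verify by direct enumeration that E[2M - 1] = N when M is
# uniform on {1, ..., N}.
def expected_estimate(N: int) -> float:
    return sum(2 * m - 1 for m in range(1, N + 1)) / N

print(expected_estimate(500))   # 500.0
print(2 * 87 - 1)               # estimate 173 from the observed tag 87
```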
Problem 2.2. Let X be a Gaussian random variable with mean θ and positive variance σ².
Find the Fisher information I(θ) that X contains about the parameter θ.

Problem 2.3. Let X be a Gaussian random variable with mean θ and positive variance σ².
Find the Fisher information I(σ²) that X contains about the parameter σ².
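As a numerical sanity check (not the requested closed-form derivation), the Fisher information is the second moment of the score; the values 1/σ² and 1/(2σ⁴) are the standard answers being checked below, with θ = 1.3 and σ² = 2 chosen arbitrarily:

```python
from math import exp, pi, sqrt

theta, v = 1.3, 2.0  # v = sigma^2

def density(x: float) -> float:
    return exp(-(x - theta) ** 2 / (2 * v)) / sqrt(2 * pi * v)

def second_moment(score, lo=-40.0, hi=40.0, n=200000) -> float:
    # Midpoint-rule approximation of E[score(X)^2].
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        total += score(x) ** 2 * density(x)
    return total * h

# Score in theta is (x - theta)/v; score in v is -1/(2v) + (x - theta)^2/(2v^2).
info_mean = second_moment(lambda x: (x - theta) / v)
info_var = second_moment(lambda x: -1.0 / (2 * v) + (x - theta) ** 2 / (2 * v ** 2))
print(info_mean, info_var)  # close to 1/v = 0.5 and 1/(2 v^2) = 0.125
```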
Problem 2.4. Let X1, X2, and X3 be mutually independent random variables with X1 and
X2 identically distributed. Further, let X1 take on the values 0 and 2 each with probability
1/2 and let X3 take on the values 1 and ~ each with probability 1/2. Now, consider the problem
of estimating which of these three distributions has the largest mean. A natural method
of proceeding is to take a random sample from each distribution and then to select the
distribution that produces the largest sample mean. Answer the following questions under
the assumption that such a procedure is used.
(A) What is the probability of correctly determining the distribution with the largest mean
if we take one sample from each distribution?
(B) What is the probability of correctly determining the distribution with the largest mean if
we take one sample from the distributions of X1 and X2 and two samples from the distribution
governing X3?
(C) Will an estimate based upon n samples always be at least as good as an estimate based
on m samples if n > m?
Problem 2.5. Let X and Y be random variables with probability density functions f1(x − θ)
and f2(x − θ), respectively, where θ is some fixed yet unknown real number. Assume that f1
and f2 are continuous and even.
(A) If f1(0) > f2(0) then show that P(|X − θ| ≤ ε) > P(|Y − θ| ≤ ε) for some positive value
of ε.
(B) Now, let k > 3 be a fixed integer, let θ be a fixed yet unknown real number, and let X1
and X2 be independent random variables each with a density given by f(x − θ) where

f(x) = (k − 1) / (2(1 + |x|)^k).

Also, consider a cost function Cε that assigns a cost of 1 to errors that are larger in magnitude
than some positive constant ε and that assigns a zero cost to smaller errors. (In particular,
an estimator θ̂ of θ is "good" if P(|θ̂ − θ| ≤ ε) is large.) Let Y denote the sample mean of X1
and X2. Show that there exists some positive ε for which P(|X1 − θ| ≤ ε) > P(|Y − θ| ≤ ε).
That is, show that a single observation may yield a better estimate of the mean than will a
sample mean of two observations.
Problem 2.6. Consider a random variable X that has a Poisson distribution with parameter
λ > 0; that is, assume that

P(X = x) = (λ^x / x!) exp(−λ)
for x = 0, 1, 2, .... Consider the problem of estimating the parameter exp(−3λ) based upon
one sample from the distribution for X. Show that T(X) = (−2)^X is an unbiased estimator
for exp(−3λ). Is T(X) a reasonable estimator for exp(−3λ)?
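The unbiasedness claim can be checked numerically, since E[(−2)^X] = Σ_k (−2λ)^k e^{−λ}/k! = e^{−λ} e^{−2λ} = e^{−3λ}; a short sketch (λ = 1.5 chosen arbitrarily):

```python
from math import exp

lam = 1.5

def expected_T(terms: int = 200) -> float:
    # Sum (-2 lam)^k e^{-lam} / k! iteratively to avoid huge factorials.
    total = 0.0
    term = exp(-lam)  # k = 0 term
    for k in range(terms):
        total += term
        term *= (-2.0 * lam) / (k + 1)
    return total

print(expected_T(), exp(-3 * lam))  # both approximately 0.0111
```

Note that although T(X) is unbiased, it takes the values ±2^X, far outside (0, 1) where exp(−3λ) must lie, which is the point of the "reasonable estimator" question.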
Problem 2.8. Let N be a fixed positive integer. Toss a coin N times and, for 1 ≤ i ≤ N,
let Xi be 1 or 0 according to whether the ith toss is a head or a tail. Let the probability of
tossing a head be given by some fixed yet unknown value θ from the interval [0, 1]. For what
functions g: [0, 1] → ℝ do there exist unbiased estimators of g(θ)?

for each nonnegative integer k and for some fixed positive value of λ. Let T(x) be any
nonconstant function of x such that T(X) provides an estimate of 1/λ. Show that T(X)
cannot be an unbiased estimator of 1/λ.
Problem 2.10. Consider a parameterized family {fθ : θ ∈ ℝ} of probability density functions in which fθ is a Gaussian density function with mean θ and unit variance. Let
X1, ..., Xn denote a collection of identically distributed, mutually independent random
variables each having a density function given by fθ for some fixed yet unknown value
of θ. Let T(X1, ..., Xn) be an unbiased estimator for θ. Find a positive lower bound for
VARθ(T(X1, ..., Xn)).
for k = 0, 1, 2, ... and for some fixed, but unknown, positive constant λ. (That is, let X have
a Poisson distribution with parameter λ. Note that E[X] = VAR[X] = λ.) Find the Fisher
information that X contains about the parameter θ = exp(−λ). Let

T(x) = 1 if x = 0, and 0 if x ≠ 0.

Is T(X) an unbiased estimator of θ? Is T(X) an efficient estimator of θ? Hint: Note that
exp(x) > x + 1 for all nonzero x.
Problem 2.12. Consider a random sample X1, ..., Xn from a distribution with a fixed yet
unknown finite mean a ∈ ℝ. If a maximum likelihood estimate for a exists, will it always be
given by the sample mean (1/n)(X1 + ... + Xn)? Why or why not?
Problem 2.13. Consider a random variable X for which P(X = 1) = p and P(X = 0) =
1 − p where p is a fixed yet unknown element from [~, ~]. Find a maximum likelihood
estimate for p. What is the mean square error for this estimate? Find a constant estimator
for p which always has a smaller mean square error than the maximum likelihood estimator
that you found.
Problem 2.14. Consider a coin with a probability of heads given by some fixed yet unknown
value of a from the interval (0, 1). Does there exist a maximum likelihood estimator of a
based upon a single flip of the coin? If not, why not, and if so, then find one. If you found
that a maximum likelihood estimator does not exist, then can you think of a simple remedy?
Does your remedy provide a reasonable estimator?
(B) Find a generalized likelihood ratio test for this example. What is the power of your test?
(C) Assume that a is known to be positive, and find a UMP test. What is the power of this
test? How does this power compare to your result in (B)?
Problem 2.16. Consider a family of probability density functions {fθ : θ ∈ ℝ} where
(Recall that I_A(x) equals 1 if x ∈ A and equals 0 otherwise.) For a fixed positive integer
n, let X1, ..., Xn be a collection of mutually independent, identically distributed random
variables each with a probability density function given by fθ for some fixed yet unknown
value of θ. Find a maximum likelihood estimator for θ as a function of X1, ..., Xn. Is this
maximum likelihood estimator unique?
Problem 2.17. Consider a family of probability density functions {fθ : θ ∈ (0, ∞)} where

For a fixed positive integer n, let X1, ..., Xn be a collection of mutually independent, identically distributed random variables each with a probability density function given by fθ for
some fixed yet unknown value of θ. Find a maximum likelihood estimator for θ as a function
of X1, ..., Xn. Is this maximum likelihood estimator unbiased?
Problem 2.18. An estimator θ̂ is said to be admissible with respect to the squared error
cost function if there exists no estimator θ̃ such that Eθ[(θ̃ − θ)²] ≤ Eθ[(θ̂ − θ)²] for all allowable
values of θ with the inequality being strict for some value of θ. Consider a random variable
X with a distribution given by P(X = 1) = θ and P(X = 0) = 1 − θ where 1/4 ≤ θ ≤ 3/4.
(A) Find a maximum likelihood estimator of θ based on one sample from the distribution of
X.
(B) Consider the collection of all estimators θ̂α that are of the form

θ̂α(x) = α if x = 0, and 1 − α if x = 1,

where 1/4 ≤ α ≤ 1/2. (For what value of α is θ̂α(X) equal to the estimator that you found in
part (A)?) Show that the maximum likelihood estimator you found in part (A) has a larger
mean square error than θ̂α for any α such that 1/4 < α ≤ 1/2.
(C) Is the maximum likelihood estimator you found in part (A) an admissible estimator?
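Under the reconstruction used here (θ restricted to [1/4, 3/4], so the MLE is θ̂_{1/4}), the mean square error comparison in part (B) can be checked numerically; a sketch:

```python
# Sketch: MSE of theta_hat_alpha(X), which estimates theta by alpha when
# X = 0 and by 1 - alpha when X = 1.
def mse(alpha: float, theta: float) -> float:
    return theta * (1 - alpha - theta) ** 2 + (1 - theta) * (alpha - theta) ** 2

# Compare the MLE (alpha = 1/4) against the constant estimator (alpha = 1/2)
# over a grid of theta values in [1/4, 3/4].
thetas = [0.25 + 0.5 * i / 1000 for i in range(1001)]
worse = all(mse(0.25, t) >= mse(0.5, t) for t in thetas)
print(worse)  # the MLE is never better on this range
```

In fact the MLE's mean square error works out to the constant 1/16 on this range, while the constant estimator has MSE (1/2 − θ)² ≤ 1/16.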
fθ(x) = 2x/(Aθ) if 0 ≤ x ≤ θ, and 2(A − x)/(A(A − θ)) if θ < x ≤ A.
Consider a maximum likelihood estimator θ̂ of θ based upon n samples X1, ..., Xn from the
distribution fθ, where we assume without loss of generality that the n observations have been
arranged so that 0 ≤ x1 ≤ x2 ≤ ... ≤ xn ≤ A.
(A) Show that a maximum of the likelihood function is attainable only when θ̂ is equal to
one of the n observations.
(B) Prove or disprove: For the jth sample to be a possible maximum likelihood estimator
for θ it must be true that

((j − 1)/n) A < x_j < (j/n) A.

(C) Must θ̂ be given by any particular order statistic of the sample? If so, which one? (Note:
Consider a collection X1, ..., Xn of random variables defined on some probability space
(Ω, F, P). For each ω ∈ Ω, let Z1(ω) take on the smallest value in the set {X1(ω), ..., Xn(ω)},
let Z2(ω) take on the next smallest value in that set, and so on until Zn(ω), which takes on
the largest value in that set. The random variable Zk is called the kth order statistic of the
set {X1, ..., Xn}.)
Problem 2.20. Consider independent random variables X1 and X2 each with a probability
density function given by

fθ(x) = 2θ²/(x + θ)³

for x > 0 and zero for x ≤ 0, where θ is some fixed yet unknown positive real number.
(A) Find a maximum likelihood estimate of θ based upon X1. (That is, find a solution
to the likelihood equation and show that your solution corresponds to a maximum.) Is
your estimate unbiased? Is your estimate admissible with respect to the squared error cost
function?
(B) Find a maximum likelihood estimate of θ based upon X1 and X2. (For this part of the
problem you need only find a solution to the likelihood equation. You do not need to prove
that your solution corresponds to a maximum.)
where θ1 and θ2 are positive constants. Let X1, ..., Xn be a collection of mutually independent, identically distributed random variables each with probability density function fθ1,θ2.
(A) Assume that θ1 is a known positive constant. Find a maximum likelihood estimate for θ2
as a function of X1, ..., Xn. Be sure to show that your solution corresponds to a maximum
of the likelihood function.
(B) Assume that θ2 is a known positive constant. Find a maximum likelihood estimate for
θ1 as a function of X1, ..., Xn. Again, be sure to show that your solution corresponds to a
maximum of the likelihood function.
Problem 2.23. Consider a random variable X that has a probability density function of
the form

fθ(x) = (1 + θx)/2

where −1 < x < 1 and −1 ≤ θ ≤ 1 and where, as usual, we assume that θ is fixed but
unknown. Find a maximum likelihood estimate of θ as a function of X. Is your estimate
unbiased? What is the mean square error of your estimate? Is your estimate admissible?
where θ is some fixed, but unknown, real number. Find a maximum likelihood estimate of
θ as a function of X1, ..., Xn. Is your estimate unbiased?
Problem 2.26. Consider a zero mean, wide sense stationary random process {X(t) : t ∈ ℝ}
with autocorrelation function R(τ) = E[X(t)X(t + τ)]. Find a minimum mean square linear
estimate of X(t + k) as a function of X(t), where k is any fixed positive real number.
Problem 2.27. Consider a zero mean, wide sense stationary random process {X(t) : t ∈ ℝ}
with autocorrelation function R(τ) = E[X(t)X(t + τ)]. Find a minimum mean square linear
estimate of X(t) in terms of X(0) and X(T), where T is any fixed positive real number.
What is your estimate when t = T/2?
Problem 2.28. Consider a zero mean, wide sense stationary random process {X(t) : t ∈ ℝ}
with autocorrelation function R(τ) = E[X(t)X(t + τ)]. Fix a positive real number T and
assume that R(t) is integrable over [0, T]. Assume that the integral ∫0^T X(t) dt exists for
each sample path of X(t). Find a minimum mean square linear estimate of ∫0^T X(t) dt in
terms of X(0) and X(T).
Problem 2.29. Consider a zero mean wide sense stationary random process {X(t) : t ∈ ℝ}
with an autocorrelation function given by R(τ) = E[X(t + τ)X(t)] = exp(−|τ|). We desire
to estimate X(t) via a linear combination of X(t − 1) and X(t − 2) so as to minimize
the mean square error. That is, we wish to estimate X(t) via f(X(t − 1), X(t − 2)) =
a1X(t − 1) + a2X(t − 2) so that E[(X(t) − f(X(t − 1), X(t − 2)))²] is minimized. Find the
constants a1 and a2.
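The coefficients solve the usual normal equations; a sketch that solves the 2×2 system directly for R(τ) = exp(−|τ|):

```python
from math import exp

# Sketch: normal equations for a1 X(t-1) + a2 X(t-2):
#   R(0) a1 + R(1) a2 = R(1)
#   R(1) a1 + R(0) a2 = R(2)
r0, r1, r2 = 1.0, exp(-1.0), exp(-2.0)
det = r0 * r0 - r1 * r1
a1 = (r1 * r0 - r1 * r2) / det  # Cramer's rule
a2 = (r0 * r2 - r1 * r1) / det
print(a1, a2)  # a1 close to exp(-1) (about 0.368), a2 close to 0
```

The fact that a2 vanishes reflects the wide-sense Markov character of a process with exponential autocorrelation: the most recent sample carries all the usable second-order information.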
Problem 2.30. Suppose that we want to build a filter that is modeled by convolution of
an input x(t) with h(t), where h(t) = 1 if 0 ≤ t < 1 and h(t) = 0 elsewhere. Since we cannot
build such a filter, we decide to construct an approximation to it using ĥ in place of h, where
ĥ(t) = α1 exp(−t) + α2 t exp(−t) for t ≥ 0 and ĥ(t) = 0 for t < 0. Our goal will be to
minimize ∫0^∞ |h(t) − ĥ(t)|² dt. Find α1 and α2.
Problem 2.31. Consider a random variable X that has a uniform distribution on the
interval [0,1]. Find a minimum mean square affine estimate of X 3 in terms of X 2 and X.
What is the mean square error of your estimate?
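The computation reduces to the orthogonality (normal) equations in the moments E[X^k] = 1/(k + 1); a sketch in exact rational arithmetic:

```python
from fractions import Fraction as F

# Sketch: best affine estimate a X^2 + b X + c of X^3 for X ~ uniform(0, 1),
# via the orthogonality conditions E[(X^3 - a X^2 - b X - c) W] = 0 for
# W = X^2, X, 1.
m = [F(1, k + 1) for k in range(7)]  # m[k] = E[X^k]

# Augmented 3x3 system, solved by Gauss-Jordan elimination over Fractions.
A = [[m[4], m[3], m[2], m[5]],
     [m[3], m[2], m[1], m[4]],
     [m[2], m[1], m[0], m[3]]]
for i in range(3):
    piv = A[i][i]
    A[i] = [x / piv for x in A[i]]
    for j in range(3):
        if j != i:
            A[j] = [x - A[j][i] * y for x, y in zip(A[j], A[i])]
a, b, c = A[0][3], A[1][3], A[2][3]
mse = m[6] - a * m[5] - b * m[4] - c * m[3]  # = E[X^3 (X^3 - estimate)]
print(a, b, c, mse)  # 3/2, -3/5, 1/20, mean square error 1/2800
```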
Problem 2.32. Let X denote the set of all continuous functions f: [0, 1] → ℝ such that
f(0) = 0. Let ‖f‖ denote the supremum (least upper bound) of the set {|f(t)| : t ∈ [0, 1]}.
Let K denote the collection of all functions g in X such that ∫0^1 g(t) dt = 1.
Problem 2.33. Consider a Hilbert space H and a closed proper subspace M of H. Consider
the function P_M(h) that maps a point h in H to the point in M that is nearest to h. That
is, P_M is the orthogonal projection of H onto M.
(B) Let N be another closed proper subspace of H. Show that the subspace M is orthogonal
to the subspace N if and only if P_M ∘ P_N = P_N ∘ P_M = 0.

for real numbers a_{i,j}. For what choice of the a_{i,j}'s are the Yi's orthogonal? Find the
minimum mean-square linear estimate of Y4 based upon Y1, Y2, and Y3. Find the minimum
mean-square nonlinear estimate of Y4 based upon Y1, Y2, and Y3. (Hint: E[X⁴] = 3.)
Problem 2.35. (A) Let M be a closed subspace of a Hilbert space H and, for x ∈ H, let
x = P_M(x) + Q_M(x), where P_M(x) ∈ M and Q_M(x) ∈ M⊥. Prove that
(B) Use the result from Part (A) to find the maximum value of

subject to

∫ g(x) dx = 0,  ∫ x g(x) dx = 0,  ∫ x² g(x) dx = 0,  ∫ g(x)² dx = 1.
3. Detection Solutions
Solution 1.1. Let the parameter set be given by Θ = {HH, HT}, where HH denotes the
coin with two heads and HT denotes the standard coin. The sample space for a single flip of
the coin is given by {H, T}, where H means that we observed a head and T means that we
observed a tail. Our decision set D is the same as Θ. Let X denote the observation from
our single coin flip. Our decision rule δ has the following distribution:

P(δ(X) = HT | X = T) = 1
P(δ(X) = HT | X = H) = p
P(δ(X) = HH | X = T) = 0
P(δ(X) = HH | X = H) = 1 − p.

The maximum risk is then given by max(p, (1 − p)/2), which is minimized when p = 1/3.
Solution 1.2. Since X is Gaussian with mean zero and variance σ², it follows that X/σ
has a standard Gaussian distribution. Thus, X²/σ² has a chi-square distribution with one
degree of freedom. That is, X²/σ² has a density function given by

f(x) = (1/√(2πx)) exp(−x/2)

for x > 0. Thus, X² has a density function given by

p0(x) = (1/σ²) f(x/σ²) = (1/(σ√(2πx))) exp(−x/(2σ²))

for x > 0.
Next, note that

P(exp(X) ≤ y) = P(X ≤ ln(y)) = ∫_{−∞}^{ln(y)} (1/(σ√(2π))) exp(−u²/(2σ²)) du

for y > 0, so that exp(X) has a density function given by

p1(y) = (1/(σ√(2π) y)) exp(−ln²(y)/(2σ²))

for y > 0. Since the nonlinearities are equally likely, it follows that π0 = π1 = 1/2. Thus, our test
consists in comparing Λ(y) to 1, where

Λ(y) = [(1/(σ√(2π) y)) exp(−ln²(y)/(2σ²))] / [(1/(σ√(2πy))) exp(−y/(2σ²))]
     = (1/√y) exp((1/(2σ²))(y − ln²(y))).
Solution 1.3. To begin, note that the likelihood ratio depends on the observations only
through X1² + X2². Let S = X1² + X2² and note that S is our test statistic. Under H0, S
has a chi-square distribution with 2 degrees of freedom. Thus, under H0, S has density
(1/2) exp(−s/2) for s > 0, and under H1, S has density (1/4) exp(−s/4) for s > 0. The
minimax threshold T equalizes the two conditional error probabilities, so that

(1/2) exp(−T/4) = 1/2 − (1/2) exp(−T/2).

Letting u = exp(−T/4) yields u² + u − 1 = 0, so that

T = −4 ln((−1 + √5)/2) ≈ 1.925.
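A quick numerical check of this threshold, under the equalization equation as reconstructed here:

```python
from math import exp, log, sqrt

# Sketch: the substitution u = exp(-T/4) turns the equalization equation
# into u^2 + u - 1 = 0, whose positive root is (sqrt(5) - 1)/2.
u = (sqrt(5.0) - 1.0) / 2.0
T = -4.0 * log(u)
print(T)  # approximately 1.925
# The two conditional error probabilities should agree at this threshold:
errs = abs(exp(-T / 2.0) - (1.0 - exp(-T / 4.0)))
```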
Solution 1.4. From class notes we know that the processor in this case is given by

g(y) = −2 if y ≤ 0, 2(y − 1) if 0 ≤ y ≤ 2, and 2 if y ≥ 2,

and the threshold is

T = ln(π0/π1) = ln(1/9) ≈ −2.197.

Since g(y) > T for all possible values of y, it follows that our test will always announce that
the signal is present regardless of our observation.
Solution 1.5. (A) Assume that δ is a decision rule such that for each x the decision δ(x)
minimizes

∫_{−∞}^{∞} L(θ, δ(x)) g(θ|x) dθ,

and let δ′ be any other decision rule. The average risk of δ′ satisfies

r(f, δ′) = ∫_{−∞}^{∞} Eθ[L(θ, δ′(X))] f(θ) dθ
        = ∫_{−∞}^{∞} ∫_{−∞}^{∞} L(θ, δ′(x)) pθ(x) f(θ) dx dθ
        = ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} L(θ, δ′(x)) (pθ(x) f(θ)/h(x)) dθ ] h(x) dx, where h(x) = ∫_{−∞}^{∞} pθ(x) f(θ) dθ,
        = ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} L(θ, δ′(x)) g(θ|x) dθ ] h(x) dx
        ≥ ∫_{−∞}^{∞} [ ∫_{−∞}^{∞} L(θ, δ(x)) g(θ|x) dθ ] h(x) dx = r(f, δ).

Thus, it follows that δ is a Bayes rule since its average risk is not greater than the average
risk of any other test.
(B) For the given loss function,

∫ L(θ, δ(x)) g(θ|x) dθ = a ∫_{ω1} g(θ|x) dθ if δ(x) = d0, and β ∫_{ω0} g(θ|x) dθ if δ(x) = d1.

From Part (A), we know that a rule that minimizes the previous expression is a Bayes rule.
Thus, a Bayes rule δ0 is given by choosing d0 when a ∫_{ω1} g(θ|x) dθ < β ∫_{ω0} g(θ|x) dθ and
d1 when the reverse inequality holds; in case of equality the choice of decision
is immaterial.
Solution 1.6. Finding the ratio of the densities and making the standard reductions yields
a test that consists of comparing the observation X to a threshold T. Since X has a standard
Gaussian distribution under H0, it follows that T must be such that 1 − Φ(T) = α0, where
α0 is the level of significance. For α0 = 0.05, it follows that T ≈ 1.645. Thus, if we observe
1.8 then our optimal test will reject H0 in favor of H1 even though H0 is virtually certain
to be the correct hypothesis in light of such an observation! Although seemingly surprising,
the trouble is due to the large value of α0 that we chose. We are in effect forcing our test to
be wrong 5% of the time. For this example, which is virtually singular, we could allow α0
to be much smaller without significantly lowering the power of the test.
Solution 1.7. Consider a randomized test that announces H1 with probability α0 no matter
what the observation x is. The size of this test is given by

α = E0[φ(X)] = E0[α0] = α0,

and the power of this test is given by

β = E1[φ(X)] = E1[α0] = α0.

Thus, since it is always possible to find a test for which the power is equal to the size, it is
impossible to have a Neyman-Pearson test for which the power is less than the size.
Solution 1.8. From class notes, we know that a Neyman-Pearson test in this situation has
the form

Σ_{j=1}^{k} Xⱼ sⱼ ≷_{H₀}^{H₁} T

for some threshold T, where sⱼ = exp(−tⱼ) = exp(−jA) exp(B). After absorbing the constant
exp(B) into the threshold, we obtain a test of the form

Z_k ≡ Σ_{j=1}^{k} Xⱼ exp(−jA) ≷_{H₀}^{H₁} T

for some threshold T. Under H₀, the test statistic Z_k is Gaussian with mean zero and
variance

σ₀² = σ² Σ_{j=1}^{k} exp(−2jA) = σ² (exp(−2A(k + 1)) − exp(−2A)) / (exp(−2A) − 1).

The Neyman-Pearson lemma states that the false alarm probability P₀(Z_k > T) must equal
α₀. Thus, it follows that T = σ₀ Φ⁻¹(1 − α₀). Note that the test does not depend upon B.
Under H₁, the test statistic Z_k is Gaussian with mean

m = Σ_{j=1}^{k} E[exp(−jA) exp(B) + Nⱼ] exp(−jA) = exp(B) Σ_{j=1}^{k} exp(−2jA),

and the power of the test is given by β = 1 − Φ(Φ⁻¹(1 − α₀) − m/σ₀).
Solution 1.9. From class notes, we know that the test in this situation consists of comparing
the sum of the observations to a threshold given by

T = √k σ Φ⁻¹(1 − α₀).

Further, the power of the test is given by

β = 1 − Φ(Φ⁻¹(1 − α₀) − √k s/σ),

so that here

β = 1 − Φ(Φ⁻¹(0.99) − √k s/σ) ≥ 0.97.

Solution 1.10. (A) From class notes, we know that the Neyman-Pearson test for this case
has the form of a comparison of the sum of the observations to a threshold, with power

β(n) = 1 − Φ(Φ⁻¹(0.995) − √n s/σ) = 1 − Φ(2.576 − √n/2).
(B) Note that the expected number of observations in this case is ½(2) + ½(16) = 9, which is
the same number of observations as were used in part (A). The power in this case, however,
is given by

β = ½β(2) + ½β(16) = ½(0.031) + ½(0.282) ≈ 0.157,
which is larger than the power of the Neyman-Pearson test considered in part (A). Thus, a
Neyman-Pearson test with a fixed number of observations can have a smaller power than a
test with a random number of observations even when the expected number of observations
for the second test is the same as the fixed number of observations used in the first test.
This does not violate the Neyman-Pearson lemma since that lemma was based on fixed
distributions and the distribution is not fixed when the number of observations is allowed to
vary.
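These power values can be reproduced numerically. The figures β(2) ≈ 0.031 and β(16) ≈ 0.282 are consistent with α₀ = 0.005 and s/σ = 1/2, so this sketch assumes those parameters.

```python
from statistics import NormalDist

nd = NormalDist()

def beta(n, alpha0=0.005, s_over_sigma=0.5):
    """Power of the Neyman-Pearson test based on n observations."""
    return 1 - nd.cdf(nd.inv_cdf(1 - alpha0) - s_over_sigma * n ** 0.5)

print(round(beta(2), 3), round(beta(16), 3))   # 0.031 0.282
avg = 0.5 * beta(2) + 0.5 * beta(16)
print(round(avg, 3))                           # 0.157
print(round(beta(9), 3))                       # 0.141 for the fixed-sample test
```

The averaged random-sample power 0.157 indeed exceeds the fixed-sample power β(9) ≈ 0.141, as the solution claims.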
Solution 1.11. Let Y denote the number of Heads that we observe after two flips of the
coin. Note that under H₀,

Y = { 0 wp 1/4 ≡ p₀,  1 wp 1/2 ≡ p₁,  2 wp 1/4 ≡ p₂ },

and that under H₁,

Y = { 0 wp 4/9 ≡ q₀,  1 wp 4/9 ≡ q₁,  2 wp 1/9 ≡ q₂ }.

Thus, if Λₙ = qₙ/pₙ, then Λ₀ = 16/9, Λ₁ = 8/9, and Λ₂ = 4/9. Using our analogy of a buyer
with a limited budget, the most "valuable" point is n = 0 since it has the largest ratio of
value (qₙ) to price (pₙ). Unfortunately, the price of item 0 (p₀) is 1/4, which exceeds our
budget (α₀) of 1/8. Thus, we can only purchase a piece of item 0. That is, we must use a
randomized test.
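The budget analogy can be sketched as a greedy algorithm: buy points in decreasing order of value/price (the likelihood ratio), randomizing on the first point that no longer fits in the budget. The variable names below are illustrative, not from the text.

```python
from fractions import Fraction as F

p = {0: F(1, 4), 1: F(1, 2), 2: F(1, 4)}   # null pmf (the "prices")
q = {0: F(4, 9), 1: F(4, 9), 2: F(1, 9)}   # alternative pmf (the "values")
budget = F(1, 8)                            # the level alpha_0

# Buy points in decreasing order of value/price, randomizing on the
# first point that no longer fits entirely within the budget.
phi = {}
for y in sorted(p, key=lambda y: q[y] / p[y], reverse=True):
    buy = min(F(1), budget / p[y])
    phi[y] = buy
    budget -= buy * p[y]

size = sum(phi[y] * p[y] for y in p)
power = sum(phi[y] * q[y] for y in q)
print(phi[0], size, power)   # 1/2 1/8 2/9
```

The budget of 1/8 buys exactly half of the point y = 0, reproducing the randomized test described above.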
Solution 1.12. Consider first the general situation in which we have the two hypotheses

H₀: Xⱼ = Nⱼ
H₁: Xⱼ = sⱼ + Nⱼ

for j = 1, ..., n, where the Nⱼ's have a zero mean multivariate Gaussian distribution with
covariance matrix Σ. The likelihood ratio is

Λ(x) = p₁(x)/p₀(x) = exp(−½(x − s)ᵀΣ⁻¹(x − s) + ½xᵀΣ⁻¹x)
= exp(−½(xᵀΣ⁻¹x − xᵀΣ⁻¹s − sᵀΣ⁻¹x + sᵀΣ⁻¹s − xᵀΣ⁻¹x)).

Note that xᵀΣ⁻¹s = sᵀ(Σ⁻¹)ᵀx = sᵀΣ⁻¹x since Σ⁻¹ is symmetric.
Thus, after taking the natural log of Λ(x) and cancelling constants, we obtain a test of the
form

xᵀΣ⁻¹s ≷_{H₀}^{H₁} T.

Under H₀, the test statistic XᵀΣ⁻¹s is Gaussian with mean zero and variance

sᵀΣ⁻¹E[XXᵀ]Σ⁻¹s = sᵀΣ⁻¹ΣΣ⁻¹s = sᵀΣ⁻¹s.
Under H₁, the test statistic XᵀΣ⁻¹s is Gaussian with mean sᵀΣ⁻¹s and variance sᵀΣ⁻¹s.
Thus, T = √(sᵀΣ⁻¹s) Φ⁻¹(1 − α₀) and β = 1 − Φ(Φ⁻¹(1 − α₀) − √(sᵀΣ⁻¹s)).
Now, for the particular problem under consideration, we have n = 2, s₁ = exp(−h), and
s₂ = exp(−2h). Further,

Σ = [ 1 1/2 ; 1/2 1 ],

and hence

Σ⁻¹ = [ 4/3 −2/3 ; −2/3 4/3 ] = [ σ₁₁ σ₁₂ ; σ₂₁ σ₂₂ ].

Further,

sᵀΣ⁻¹s = [ e⁻ʰ e⁻²ʰ ] [ 4/3 −2/3 ; −2/3 4/3 ] [ e⁻ʰ ; e⁻²ʰ ] = (4/3)(e⁻²ʰ − e⁻³ʰ + e⁻⁴ʰ).

Thus, our test is given by

((4/3)e⁻ʰ − (2/3)e⁻²ʰ) X₁ + ((4/3)e⁻²ʰ − (2/3)e⁻ʰ) X₂ ≷_{H₀}^{H₁} 1.28 √((4/3)(e⁻²ʰ − e⁻³ʰ + e⁻⁴ʰ)).

If h = 0.1, then sᵀΣ⁻¹s = 0.9976 and β = 1 − Φ(1.28 − √0.9976) = 0.389.
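The two numbers at the end can be checked directly (1.28 here is Φ⁻¹(0.9), i.e. α₀ = 0.1):

```python
import math
from statistics import NormalDist

h = 0.1
# s^T Sigma^{-1} s = (4/3)(e^{-2h} - e^{-3h} + e^{-4h}) for this problem
d2 = (4 / 3) * (math.exp(-2 * h) - math.exp(-3 * h) + math.exp(-4 * h))
print(round(d2, 4))    # 0.9976

# beta = 1 - Phi(1.28 - sqrt(d2))
beta = 1 - NormalDist().cdf(1.28 - math.sqrt(d2))
print(round(beta, 3))  # 0.389
```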
Solution 1.13. From class notes, we know that a Neyman-Pearson test for this situation
has the form
Setting y = exp(−2h), we are then seeking the largest value of h for which the required
inequality holds, and hence such that y² + y − 0.584 > 0. The roots of the corresponding
quadratic equation are y₁ = 0.413 and y₂ = −1.413. Since y must be positive, we conclude
that y must be greater than 0.413. If y > 0.413, then h < −(1/2) ln(0.413) ≈ 0.442.
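The roots and the resulting bound on h are quick to verify:

```python
import math

# Roots of y^2 + y - 0.584 = 0, where y = exp(-2h).
disc = math.sqrt(1 + 4 * 0.584)
y1 = (-1 + disc) / 2
y2 = (-1 - disc) / 2
print(round(y1, 3), round(y2, 3))   # 0.413 -1.413

# y = exp(-2h) > 0.413 forces h < -ln(0.413)/2.
h_max = -0.5 * math.log(y1)
print(round(h_max, 3))              # 0.442
```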
Solution 1.14. Note that

P₀(y) = { 1 − λ₀ if y = 0,  λ₀ if y = 1 }

and

P₁(y) = { λ₁ if y = 0,  1 − λ₁ if y = 1 }.

Thus,

Λ(y) = P₁(y)/P₀(y) = { λ₁/(1 − λ₀) if y = 0,  (1 − λ₁)/λ₀ if y = 1 }.
Note that

P₀(Λ(Y) > T) = { 1 if T < λ₁/(1 − λ₀),  λ₀ if λ₁/(1 − λ₀) ≤ T < (1 − λ₁)/λ₀,  0 if T ≥ (1 − λ₁)/λ₀ },

where we have noticed that since λ₀ + λ₁ < 1, it follows that λ₁ < 1 − λ₀ and 1 − λ₁ > λ₀,
and hence that

λ₁/(1 − λ₀) < 1 < (1 − λ₁)/λ₀.
The Neyman-Pearson Lemma implies that

T(α₀) = { (1 − λ₁)/λ₀ if 0 ≤ α₀ < λ₀,  λ₁/(1 − λ₀) if λ₀ ≤ α₀ < 1,  0 if α₀ = 1 }

and that

p(α₀) = { α₀/λ₀ if 0 ≤ α₀ < λ₀,  (α₀ − λ₀)/(1 − λ₀) if λ₀ ≤ α₀ < 1,  0 if α₀ = 1. }

(Note that T(1) may actually take on any value less than λ₁/(1 − λ₀), and that p(1) is
arbitrary.) Thus, our test is given by the following procedure:

0 ≤ α₀ < λ₀: Announce H₁ wp α₀/λ₀ if we observe '1'; else announce H₀.
λ₀ ≤ α₀ < 1: Announce H₁ if we observe '1' and wp (α₀ − λ₀)/(1 − λ₀) if we observe '0'; else announce H₀.
α₀ = 1: Always announce H₁.
(Figure: the resulting tests for the cases λ₀ = λ₁ = 1/8 and λ₀ = λ₁ = 3/8.)
Solution 1.15. From class notes, we know that the Neyman-Pearson test consists of com-
paring the sum of the observations to the threshold T = √n σ Φ⁻¹(1 − α₀). Further, the
power of the test is given by

β = 1 − Φ(Φ⁻¹(1 − α₀) − √n s/σ).

Note that Φ⁻¹(1 − α₀) = Φ⁻¹(0.95) ≈ 1.65.
For System A, we seek the smallest value of n for which 1 − Φ(1.65 − 2√n) > 0.95. Solving
for n implies that we must have n > 2.7. Thus, for System A, we choose n = 3. For
System B, we seek the smallest value of n for which 1 − Φ(1.65 − √n) > 0.95. Solving for
n implies that we must have n > 10.9. Thus, for System B, we choose n = 11. The cost
of System A is $1,000,000 + (100 × 3 × $1000) = $1,300,000 and the cost of System B is
$250,000 + (100 × 11 × $1000) = $1,350,000. Thus, we should choose System A.
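The sample sizes and costs can be confirmed by a direct search:

```python
from statistics import NormalDist

nd = NormalDist()

def smallest_n(s_over_sigma, alpha0=0.05, target=0.95):
    """Smallest n with power 1 - Phi(Phi^{-1}(1-alpha0) - (s/sigma)sqrt(n)) > target."""
    n = 1
    while 1 - nd.cdf(nd.inv_cdf(1 - alpha0) - s_over_sigma * n ** 0.5) <= target:
        n += 1
    return n

n_a = smallest_n(2.0)   # System A: s/sigma = 2
n_b = smallest_n(1.0)   # System B: s/sigma = 1
cost_a = 1_000_000 + 100 * n_a * 1000
cost_b = 250_000 + 100 * n_b * 1000
print(n_a, n_b, cost_a, cost_b)   # 3 11 1300000 1350000
```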
for 0 < x < 1. Reducing the test, we obtain a test of the form

x ≷_{H₁}^{H₀} T.

(Note the change in the inequalities.) For a minimax test, we choose T so that Q₀ = Q₁,
where Q₀ = P₀(X < T) and Q₁ = P₁(X > T). Thus,

Q₀ = Q₁ ⟺ ∫₀ᵀ (2/3)(x + 1) dx = ∫_T^1 dx
⟺ (2/3)(T²/2 + T) = 1 − T
⟺ (1/3)T² + (5/3)T − 1 = 0.

Thus, since 0 < T < 1, we conclude that

T = −5/2 + √37/2 ≈ 0.54138.
(B) For a Neyman-Pearson test, we will choose T so that Q₀ = α₀ where α₀ = 1/10. That
is,

1/10 = ∫₀ᵀ (2/3)(x + 1) dx = (1/3)T² + (2/3)T.

Again, since 0 < T < 1, we conclude that

T = √130/10 − 1 ≈ 0.140175.
Solution 1.17. Note that

Λ(x) = p₁(x)/p₀(x) = { 0 if 0 < x < 1/2,  1 if 1/2 < x < 1,  ∞ if 1 < x < 3/2 }.
Further, under H₀,

Λ(X) = { 0 wp 1/2,  1 wp 1/2,  ∞ wp 0 },

and under H₁,

Λ(X) = { 0 wp 0,  1 wp 1/2,  ∞ wp 1/2 }.
The Neyman-Pearson lemma states that the threshold T and the randomization constant p
must be chosen so that

1/8 = P₀(Λ(X) > T) + p P₀(Λ(X) = T). (3.1)

Note that P₀(Λ(X) > T) takes values only in {0, 1/2, 1}. Thus, we must choose a value of T
such that P₀(Λ(X) > T) = 0 and such that P₀(Λ(X) = T) ≠ 0. The only such value of T is
1. With T = 1, equation (3.1) implies that 1/8 = p(1/2), which thus implies that p = 1/4.
Thus, our test has the form:

Announce H₁ if x > 1
Announce H₁ wp 1/4 if 1/2 < x < 1
Announce H₀ wp 3/4 if 1/2 < x < 1
Announce H₀ if x < 1/2.

Note that this test corresponds to φ₃, which means that φ₃ is a Neyman-Pearson test. What
about tests φ₁ and φ₂?
Although the test φ₃ is constant on the threshold, nothing in the Neyman-Pearson lemma
says that a Neyman-Pearson test must be constant on the threshold. Actually, we can do
anything we want to do on the threshold so long as the size of the test is equal to α₀. Note
that φ₁ and φ₂ are identical to φ₃ off the threshold. Thus, φ₁ and φ₂ will be Neyman-Pearson
tests if they have a size equal to α₀. The size of φ₁ is given by

P₀(Announce H₁) = P₀(X > 7/8) = 1/8,

and the size of φ₂ is obtained by a similar computation.
Solution 1.18. Our strategy for this problem will be to first find a Neyman-Pearson test
of Ho : {Po} versus HI : {PI}, and then to find a Neyman-Pearson test of Ho : {Po}
versus HI : {P2 }. If the two tests are identical then a UMP test exists for Ho : {Po} versus
HI : {PI, P2 }.
n   P₀(n)   P₁(n)   Λ(n)
0   1/3     0       0
1   1/3     1/3     1
2   1/3     2/3     2
Note that we can 'purchase' only a piece of one item. Thus, we randomize on the point
n = 2, and we obtain a test that announces HI with probability 1/2 if we observe '2' and
we announce H₀ otherwise. For testing H₀ : {P₀} versus H₁ : {P₂} we have:

n   P₀(n)   P₂(n)   Λ(n)
0   1/3     1/3     1
1   1/3     0       0
2   1/3     2/3     2
Again, note that we can 'purchase' only a piece of one item. Thus, we randomize on the
point n = 2, and we obtain a test that announces H₁ with probability 1/2 if we observe '2'
and we announce Ho otherwise. Since these two tests are identical, we conclude that this
test is a UMP test for testing Ho : {Po} versus HI : {PI, P2 } at level 1/6.
n   P₀(n)   P₁(n)   Λ(n)
0   1/3     0       0
1   1/3     1/3     1
2   1/3     2/3     2
Note that we can 'purchase' one entire item and a piece of a second item. Thus, we purchase
item '2' and randomize on the point n = 1. We then obtain a test that announces H₁ if we
observe '2', that announces H₁ with probability 1/2 if we observe '1', and that announces
Ho otherwise. For testing Ho : {Po} versus HI : {P2 } we have:
n   P₀(n)   P₂(n)   Λ(n)
0   1/3     1/3     1
1   1/3     0       0
2   1/3     2/3     2
Again, note that we can 'purchase' one entire item and a piece of a second item. Thus, we
purchase item '2' and randomize on the point n = O. We then obtain a test that announces
HI if we observe '2', that announces HI with probability 1/2 if we observe '0', and that
announces H₀ otherwise. Since these two tests are not the same, we conclude that there does
not exist a UMP test for testing H₀ : {P₀} versus H₁ : {P₁, P₂} at level 1/2.
Solution 1.19. Let wᵢ = cos(tᵢ) and let sᵢ = sin(tᵢ) for i = 1, ..., k. Note that

p₀(y₁, ..., y_k) = (1/(σ√2π))^k exp(−(1/(2σ²))((y₁ − w₁)² + ··· + (y_k − w_k)²))

and that

p₁(y₁, ..., y_k) = (1/(σ√2π))^k exp(−(1/(2σ²))((y₁ − s₁)² + ··· + (y_k − s_k)²)).

Thus, the ratio p₁(y₁, ..., y_k)/p₀(y₁, ..., y_k) reduces to a threshold test on the statistic X_k = Σⱼ Yⱼ(sⱼ − wⱼ).
Under H₀, the test statistic X_k is Gaussian with mean μ₀ = Σⱼ wⱼ(sⱼ − wⱼ) and variance

v² = σ² Σ_{j=1}^{k} (sⱼ − wⱼ)²,

and under H₁ it is Gaussian with mean μ₁ = Σⱼ sⱼ(sⱼ − wⱼ) and variance v². We choose T
so that P₀(X_k > T) = α₀, which yields

T = σ √(Σ_{j=1}^{k} (sⱼ − wⱼ)²) Φ⁻¹(1 − α₀) + Σ_{j=1}^{k} wⱼ(sⱼ − wⱼ).

The power of the test is then 1 − Φ(Φ⁻¹(1 − α₀) − (μ₁ − μ₀)/v), where

(μ₁ − μ₀)/v = (Σⱼ sⱼ(sⱼ − wⱼ) − Σⱼ wⱼ(sⱼ − wⱼ)) / (σ √(Σⱼ (sⱼ − wⱼ)²)) = (1/σ) √(Σ_{j=1}^{k} (sⱼ − wⱼ)²).
To maximize the power, we want this latter term to be large. Thus, our goal is to choose
the tⱼ's so that (sⱼ − wⱼ)² is as large as possible. Note that

(sⱼ − wⱼ)² = (sin(tⱼ) − cos(tⱼ))²
= sin²(tⱼ) − 2 sin(tⱼ) cos(tⱼ) + cos²(tⱼ)
= 1 − 2 sin(tⱼ) cos(tⱼ)
= 1 − sin(2tⱼ).

This term is maximized when sin(2tⱼ) = −1. This occurs when

tⱼ = jπ − π/4

for j ∈ ℕ.
Solution 1.20. Reducing the likelihood ratio yields a test of the form

S(X) ≷_{H₀}^{H₁} T,

where

S(x) = { (a₁ − a₀)/b for x ≥ a₀,  ∞ for a₁ ≤ x < a₀ }.

To find the power of our test, we must find the distribution of S(X) under H₁. Note that

P₁(S(X) = (a₁ − a₀)/b) = P₁(X ≥ a₀) = 1 − P₁(X < a₀) = 1 − ∫_{a₁}^{a₀} (1/b) exp(−(x − a₁)/b) dx = exp((a₁ − a₀)/b).
Thus,

S(X) = { (a₁ − a₀)/b wp exp((a₁ − a₀)/b),  ∞ wp 1 − exp((a₁ − a₀)/b) },

and the power of the test is given by

β = 1 − (1 − α₀) exp((a₁ − a₀)/b).
Solution 1.21. Recall from the class notes that the test in this case consists of comparing
the sum of the observations to the threshold T = √k σ Φ⁻¹(1 − α₀), and that the power of
the test is

β = 1 − Φ(Φ⁻¹(1 − α₀) − s√k/σ).

This is a Neyman-Pearson test for any choice of α₀. Our goal is to find a value of α₀ for
which the corresponding Neyman-Pearson test is also a minimax test. From class notes, we
know that such a value of α₀ will be 1 − β. (That is, if α₀ = 1 − β then Q₀ = Q₁, which is
the minimax equation.) Note that

1 − β = Φ(Φ⁻¹(1 − α₀) − s√k/σ).

Thus, α₀ = 1 − β if

α₀ = 1 − Φ(s√k/(2σ)).
Solution 1.22. Reducing the likelihood ratio yields a test of the form

S(x) ≡ x I_{(0,1/2)}(x) + (1 − x) I_{(1/2,1)}(x) ≷_{H₀}^{H₁} T.

(A) For a minimax test, we choose T so that Q₀ = Q₁, where Q₀ = P₀(S(X) > T) and
Q₁ = P₁(S(X) < T). Note that, for 0 < T < 1/2, Q₀ = P₀(T < X < 1 − T) = 1 − 2T, and
Q₁ = ∫₀ᵀ 4x dx + ∫_{1−T}^{1} 4(1 − x) dx
= 2T² + 4(1 − (1 − T)) − 2(1 − (1 − T)²)
= 4T².
(C) For a Neyman-Pearson test at level 1/100, we choose T so that Qo = 1/100. That is,
we choose T so that 1 - 2T = 1/100. Solving implies that T = 0.495. The power of the test
is given by 1 - Ql, which in this case is 1 - 4T2 = 0.0199.
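Both parts can be checked numerically. The minimax threshold below is the positive root of 4T² + 2T − 1 = 0 implied by the equations Q₀ = 1 − 2T and Q₁ = 4T²; the root's value itself is not stated in the text.

```python
import math

# Minimax: Q0 = Q1 means 1 - 2T = 4T^2, i.e. 4T^2 + 2T - 1 = 0.
T_minimax = (-1 + math.sqrt(5)) / 4
print(round(T_minimax, 3))                                       # 0.309
print(round(1 - 2 * T_minimax, 4), round(4 * T_minimax**2, 4))   # equal risks

# Neyman-Pearson at level 1/100: 1 - 2T = 1/100 gives T = 0.495.
T_np = 0.495
print(round(1 - 4 * T_np**2, 4))                                 # power 0.0199
```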
Solution 1.23. (A) Under H₀, the threshold T and randomization constant p must satisfy

P₀(Λ(X) > T) + p P₀(Λ(X) = T) = 1/10.

Thus, it follows that T = e and p = e/10. The resulting test announces H₁ with probability
e/10 if x ≥ 1 and announces H₀ otherwise. The power of the test is given by

β = 0 + (e/10) × 1 = e/10.
(B) The critical function of any Neyman-Pearson test in this situation must have the form
Note that Λ(x) is never greater than e. Thus, the only requirement on φ is that it be
zero when Λ(x) < e; that is, when 0 ≤ x ≤ 1. If we require in addition that our test be
nonrandom, then we must require that φ take on only the values 0 or 1. A test that satisfies
both conditions is given by

φ₀(x) = { 1 if x > λ,  0 if x ≤ λ },

where λ > 1. The value of λ is determined by the requirement that the size of our test be
1/10. That is,

1/10 = P₀(Announce H₁) = P₀(X > λ) = ∫_λ^∞ e⁻ˣ dx = exp(−λ).

Thus, λ = ln(10), and our nonrandomized test announces H₁ when x > ln(10) and announces
Ho otherwise. Note that both the randomized test in (A) and the nonrandomized test in (B)
are Neyman-Pearson tests. However, the test in (A) is constant on the threshold, whereas
the test in (B) is not.
Solution 1.24. Let f(x) denote a Gaussian density function with mean 0 and variance σ².
Then, under H₀, it follows that Y has density f(y), and, under H₁, it follows that Y has
density

(1/2) f(y − θ) + (1/2) f(y + θ).

That is, under H₁, Y's distribution is an even mixture of two Gaussian distributions, one
centered at θ and one centered at −θ. Thus,
Λ(y) = p₁(y)/p₀(y)
= [ (1/(2√(2π)σ)) exp(−(y − θ)²/(2σ²)) + (1/(2√(2π)σ)) exp(−(y + θ)²/(2σ²)) ] / [ (1/(√(2π)σ)) exp(−y²/(2σ²)) ]
= (1/2) exp(−θ²/(2σ²)) (exp(yθ/σ²) + exp(−yθ/σ²))
= exp(−θ²/(2σ²)) cosh(yθ/σ²),
so that the test reduces to comparing cosh(yθ/σ²) to a threshold.
Note that this test is not UMP since it depends upon the parameter θ.
Solution 1.25. The likelihood ratio is

Λ(k) = P₁(X = k)/P₀(X = k) = exp(λ₀ − λ₁)(λ₁/λ₀)ᵏ.

Taking the log of the test and noting that ln(λ₁/λ₀) is positive, we obtain a test of the form

k ≷_{H₀}^{H₁} T,
where k is the realization of our test statistic X. As usual, we choose T and p so that
Let λ₀ = 1 and α₀ = 0.02. By trial and error we first seek the largest value of T for which
P₀(X > T) is not greater than 0.02. Note that

P₀(X > 3) = 1 − Σ_{k=0}^{3} (1/k!) e⁻¹ = 1 − e⁻¹(1 + 1 + 1/2 + 1/6) = 1 − 8/(3e) ≈ 0.019.

Further,

P₀(X = 3) = 1/(6e) ≈ 0.061.

Thus, we seek p so that

0.02 = 0.019 + p(0.061),

which implies that p ≈ 0.016. Thus, our test announces H₁ if X > 3, announces H₁ with
probability 0.016 if X = 3, and announces H₀ otherwise.
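The tail probability and the randomization constant can be computed directly from the Poisson pmf:

```python
import math

def pois0(k):
    """Poisson pmf with rate lambda_0 = 1."""
    return math.exp(-1) / math.factorial(k)

tail = 1 - sum(pois0(k) for k in range(4))   # P0(X > 3)
at3 = pois0(3)                               # P0(X = 3)
p = (0.02 - tail) / at3                      # randomization constant
print(round(tail, 3), round(at3, 3))         # 0.019 0.061
print(round(p, 4))                           # 0.0165 (the text rounds to 0.016)
```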
Solution 1.26. Note that

(d/dx) ln f_N(x) = −2x/(1 − x²)

for |x| < 1. Thus, the locally optimal processor is given by

g_lo(x) = −(d/dx) ln f_N(x) = 2x/(1 − x²).
Further,

g_NP(x) = ∫_{x−s}^{x} g_lo(u) du.
Solution 1.27. (A) From class notes, we know that the test statistic in this case is given
by
Under H₀,

Z = { 3 wp 1/8,  2 wp 3/8,  1 wp 3/8,  0 wp 1/8 }.

We choose T and p so that

1/16 = P₀(Z > T) + p P₀(Z = T).

Solving as usual we see that T = 3 and p = 1/2. Thus, our test announces H₁ with probability
1/2 if all of the observations are positive and announces H₀ otherwise.
Thus,

P₁(Z = 3) = (∫₀^∞ (1/2) exp(−|x − s|) dx)³ = (1/2 + ∫₀^s (1/2) exp(−|x|) dx)³,
and hence

β = P₁(Z > 3) + (1/2) P₁(Z = 3) = (1/2)(1/2 + ∫₀^s (1/2) exp(−|x|) dx)³.
(B) Let

p₁(x, s) = 1/(π(1 + (x − s)²))

and note that

(∂/∂s) p₁(x, s) = 2(x − s)/(π(1 + (x − s)²)²).
Thus, the locally optimal processor is given by

g_lo(x) = lim_{s↓0} [(∂/∂s) p₁(x, s)] / p₀(x) = [2x/(π(1 + x²)²)] π(1 + x²) = 2x/(1 + x²).
(Figure: the locally optimal processor 2x/(1 + x²), with H₁ announced where the statistic exceeds the threshold T and H₀ announced elsewhere.)
Note that this processor exhibits the same unusual behavior that we observed in part (A).
4. Estimation Solutions
Solution 2.1. If X denotes the number that we observe, then it would not seem unreason-
able to suppose that X = i with probability 1/N for 1 ≤ i ≤ N. Note that

E[X] = Σ_{i=1}^{N} i(1/N) = (1/N)(N(N + 1)/2) = (N + 1)/2,

so that 2X̄ − 1 is an unbiased estimator of N, where X̄ = (X₁ + ··· + X_k)/k denotes the
sample mean.
On the other hand, if we observe X₁, ..., X_k, then our goal might be to find a value N̂ for N
so that

∏_{i=1}^{k} P(Xᵢ = xᵢ)

is maximized. If N < xᵢ for some value of i then P(Xᵢ = xᵢ) = 0 and the entire product is
zero. If N ≥ max_{1≤i≤k} xᵢ, then the product is equal to N⁻ᵏ, which is maximized when N
is as small as possible. Thus, the product is maximized when N̂ = max_{1≤i≤k} xᵢ.
Thus, if we observe a runner with the number 87, then an unbiased estimate for the total
number of runners is 173 and a maximum likelihood estimate for the total number of runners
is 87.
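The two estimates for the single observation 87 can be sketched as follows; the function names are illustrative.

```python
# Two estimates of the number of runners N from observed race numbers.
def unbiased_estimate(x):
    """E[X] = (N+1)/2, so E[2X - 1] = N."""
    return 2 * x - 1

def ml_estimate(xs):
    """Likelihood N^{-k} on {N >= max(xs)} is maximized at max(xs)."""
    return max(xs)

print(unbiased_estimate(87), ml_estimate([87]))   # 173 87

# Exhaustive check of unbiasedness when N = 100:
N = 100
assert sum(unbiased_estimate(x) for x in range(1, N + 1)) / N == N
```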
Note that

p_θ(x) = (1/√(2πσ²)) exp(−(x − θ)²/(2σ²)),

and hence (∂²/∂θ²) ln p_θ(x) = −1/σ². Thus,

I(θ) = −E_θ[(∂²/∂θ²) ln p_θ(X)] = 1/σ².
Solution 2.4. (A) Note first that E[X₁] = E[X₂] = 1 and E[X₃] = 7/4. Hence, our estimate
will be correct when it states that X₃'s distribution has the largest mean. Thus, taking one
sample from each distribution, the probability that our estimator will correctly determine
the distribution with the largest mean is simply the probability that X₁ ≤ X₃ and X₂ ≤ X₃.
That is, the probability our estimate is correct is given by

P(X₁ ≤ X₃, X₂ ≤ X₃) = (1/2)(1/4) + (1/2)(1) = 5/8.
(B) Let X̄ denote the sample mean of the two samples from the third distribution, and note
that

X̄ = { 1 wp 1/4,  5/2 wp 1/4,  7/4 wp 1/2 }.

Thus, the probability our estimator is correct in this case is given by

P(X₁ ≤ X̄, X₂ ≤ X̄) = P(X̄ = 5/2) + P(X̄ = 7/4) P(X₁ = 0) P(X₂ = 0) + P(X̄ = 1) P(X₁ = 0) P(X₂ = 0)
= 1/4 + (1/2 × 1/2 × 1/2) + (1/4 × 1/2 × 1/2)
= 7/16.
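Both probabilities can be verified by exhaustive enumeration. This sketch assumes the laws implied by the arithmetic above: X₁ and X₂ each take the values 0 and 2 with probability 1/2, and X₃ takes the values 1 and 5/2 with probability 1/2.

```python
from fractions import Fraction as F
from itertools import product

x12 = [(0, F(1, 2)), (2, F(1, 2))]           # assumed law of X1 and X2
x3 = [(F(1), F(1, 2)), (F(5, 2), F(1, 2))]   # assumed law of X3

# (A) one sample of X3 as the comparison point
pA = sum(p1 * p2 * p3
         for (a, p1), (b, p2), (c, p3) in product(x12, x12, x3)
         if a <= c and b <= c)

# (B) the mean of two independent samples of X3
pB = sum(p1 * p2 * p3 * p4
         for (a, p1), (b, p2), (c, p3), (d, p4) in product(x12, x12, x3, x3)
         if a <= (c + d) / 2 and b <= (c + d) / 2)

print(pA, pB)   # 5/8 7/16 -- more data, yet a lower probability of being correct
```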
(C) No. The probability of being correct decreased in (B) even though the number of
observations increased.
Solution 2.5. (A) Since f₁(0) > f₂(0), it follows that f₁(0) − f₂(0) > 0. Thus, since f₁ and
f₂ are continuous, it follows that f₁(x) − f₂(x) > 0 for all x ∈ [0, ε] for some ε > 0. Further,
since f₁ and f₂ are even, it follows that f₁(x) − f₂(x) > 0 for all x ∈ [−ε, ε]. Thus, it follows
that

∫_{θ−ε}^{θ+ε} f₁(x − θ) dx > ∫_{θ−ε}^{θ+ε} f₂(x − θ) dx,

that is, P(|X − θ| ≤ ε) > P(|Y − θ| ≤ ε).
Using the result from (A), the desired result here will follow if we can show that f(0) > g(0).
Note that

f(0) = (k − 1)/2,

and¹

g(0) = 2 ∫_{−∞}^{∞} ((k − 1)²/4) (1/(1 + |s|)^{2k}) ds = (k − 1)²/(2k − 1).

Thus, f(0) > g(0) since we have assumed that k > 3.
¹See Table of Integrals, Series, and Products, Corrected and Enlarged Edition (Academic Press: Orlando,
1980) by I. S. Gradshteyn and I. M. Ryzhik, page 285, formula 3.194(3).
e^{−λ} Σ_{k=0}^{∞} (−2λ)ᵏ/k! = e^{−λ} e^{−2λ} = e^{−3λ}.

Thus, E[T²] = VAR[T] + θ², which implies that E[T²] = θ² if and only if VAR[T] = 0; that
is, if and only if T is constant with probability one.
E_θ[T(X₁, ..., X_N)] = Σ_{x₁=0}^{1} ··· Σ_{x_N=0}^{1} T(x₁, ..., x_N) θ^{Σᵢ xᵢ} (1 − θ)^{N − Σᵢ xᵢ}.

For each fixed x₁, ..., x_N, the summand is a polynomial in θ of degree ≤ N, and a sum of
such polynomials is also a polynomial in θ of degree ≤ N.
Thus, g(θ) must be a polynomial of degree not greater than N in order for T to possibly be
an unbiased estimator. As an application of this result, note that there exists no unbiased
estimate of the odds θ/(1 − θ) of 'heads' versus 'tails' for a coin that comes up 'heads' with
probability θ.
Solution 2.9. Assume that T(X) is an unbiased estimator of 1/λ. Then it follows that

E[T(X)] = Σ_{k=0}^{∞} T(k) e^{−λ} λᵏ/k! = 1/λ,
where

I(θ) = E_θ[−(∂²/∂θ²)(−(1/2)((X₁ − θ)² + ··· + (Xₙ − θ)²))] = n.
Solution 2.11. Note that 1/I(θ) = −θ² ln θ = λ exp(−2λ). Further,

E_θ[T(X)] = Σ_{k=0}^{∞} T(k) P(X = k) = T(0) P(X = 0) = 1 × θ(−ln θ)⁰/0! = θ,

which implies that T is unbiased. Also,

E_θ[T²(X)] = Σ_{k=0}^{∞} T²(k) P(X = k) = T²(0) P(X = 0) = 1² × θ(−ln θ)⁰/0! = θ.

Thus, VAR_θ[T(X)] = θ − θ² = exp(−λ) − exp(−2λ). Since exp(λ) > λ + 1 when λ > 0, it
follows that
or that

e^{−λ} − e^{−2λ} > λ e^{−2λ} = 1/I(θ),

which implies that T is not efficient.
Solution 2.12. No. A maximum likelihood estimate of a mean need not be the sample
mean. For example, if X₁, ..., Xₙ are mutually independent and uniform on (θ − 1/2, θ + 1/2),
then a maximum likelihood estimate of the mean is given by (1/2)(min_{1≤i≤n}{Xᵢ} + max_{1≤i≤n}{Xᵢ}),
which is not the sample mean if n > 2.
(See Problem 2.16.)
Solution 2.13. A maximum likelihood estimate of p is given by

T(x) = { 1/3 if x = 0,  2/3 if x = 1 }.

The mean square error of T is given by

(1/3 − p)²(1 − p) + (2/3 − p)² p = (3p² − 3p + 1)/9.

Let δ(x) = 1/2 for x = 0 or 1. The mean square error of δ is given by (1/2 − p)².
Note that the mean square error of δ is smaller than the mean square error of T for
p ∈ [1/2 − √6/12, 1/2 + √6/12] ≈ [0.296, 0.704].
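The comparison of the two mean square errors can be checked numerically; the crossover points are the roots of 9(1/2 − p)² = 3p² − 3p + 1.

```python
import math

def mse_T(p):
    """Mean square error of the maximum likelihood estimator T."""
    return (3 * p * p - 3 * p + 1) / 9

def mse_d(p):
    """Mean square error of the constant estimator delta(x) = 1/2."""
    return (0.5 - p) ** 2

edge = 0.5 + math.sqrt(6) / 12      # upper end of the crossover interval
print(round(edge, 3))               # 0.704
print(mse_d(0.5) < mse_T(0.5))      # True: delta wins near p = 1/2
print(mse_d(0.2) > mse_T(0.2))      # True: T wins outside the interval
```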
Solution 2.14. If

X = { 1 if the coin flip is a 'head',  0 if the coin flip is a 'tail' },

then

P(X = i) = { θ if i = 1,  1 − θ if i = 0 }.
A remedy to the problem is to let Θ = [0, 1]. In this case, a maximum likelihood estimate
of θ is given by

θ̂(x) = { 1 if x = 1,  0 if x = 0 }.

That is, if we include physically impossible values in Θ, then a maximum likelihood estimator
exists and it returns as its estimate the physically impossible values that we included!
Solution 2.15. (A) From class notes, we know that a maximum likelihood estimate in this
case is given by the sample mean, θ̂(X₁, ..., X_k) = (1/k) Σ_{j=1}^{k} Xⱼ. Since

p_θ(x₁, ..., x_k) = (1/(σ√2π))^k exp(−(1/(2σ²)) Σ_{j=1}^{k} (xⱼ − θ)²),

the estimator is unbiased, and further,

VAR[θ̂(X₁, ..., X_k)] = (1/k²) Σ_{j=1}^{k} VAR[Xⱼ] = σ²/k.
1 − ∫_{(−T−kθ)/(σ√k)}^{(T−kθ)/(σ√k)} (1/√(2π)) exp(−u²/2) du = 1 + Φ((−T − kθ)/(σ√k)) − Φ((T − kθ)/(σ√k)).
(C) From class notes we know that a Neyman-Pearson test for this situation has the form
of a comparison of the sum of the observations to a threshold.
and note that L(θ, x₁, ..., xₙ) = 1 if and only if xᵢ ∈ [θ − 1/2, θ + 1/2] for i = 1, ..., n, and
equals zero otherwise. That is, L(θ, x₁, ..., xₙ) = 1 if and only if

max_{1≤i≤n}{xᵢ} − 1/2 ≤ θ ≤ min_{1≤i≤n}{xᵢ} + 1/2,

and equals zero otherwise. Thus, any estimator T : ℝⁿ → ℝ such that

T(x₁, ..., xₙ) = λ(max_{1≤i≤n}{xᵢ} − 1/2) + (1 − λ)(min_{1≤i≤n}{xᵢ} + 1/2)

is a maximum likelihood estimate of θ for any value of λ in [0, 1]. Note that in this case,
not only is a maximum likelihood estimate not unique, but there exist uncountably many
distinct maximum likelihood estimates of θ!
Note that Z = max_{1≤i≤n} Xᵢ has density

f_Z(z) = n z^{n−1}/θⁿ

for 0 < z < θ. Thus,

E_θ[Z] = ∫₀^θ z f_Z(z) dz = ∫₀^θ z (n z^{n−1}/θⁿ) dz = (n/θⁿ)(z^{n+1}/(n + 1)) |₀^θ = nθ/(n + 1).

Thus, Z is not unbiased, and we conclude that in general a maximum likelihood estimator
need not be an unbiased estimator.
Note that

L(θ, x) = { 1 − θ if x = 0,  θ if x = 1 }.

Thus, L(θ, 1) is maximized when θ = 2/3 and L(θ, 0) is maximized when θ = 1/3. That is, a
maximum likelihood estimate of θ is given by

θ̂(x) = { 1/3 if x = 0,  2/3 if x = 1 }.

The mean square error of such an estimator with values α and 1 − α is

(θ − α)²(1 − θ) + (θ − (1 − α))² θ.

Thus, the maximum likelihood estimator θ̂₁/₃(x) from (A) has a mean square error given by

E[(θ − θ̂₁/₃(X))²] = (1/3)θ² − (1/3)θ + 1/9.
In addition, the log-likelihood ln L(a, x₁, ..., xₙ) is continuous in a and differentiable in a
between the xⱼ's. If xⱼ < a < xⱼ₊₁, then

(∂/∂a) ln L(a, x₁, ..., xₙ) = −j/a + (n − j)/(A − a),

and

(∂²/∂a²) ln L(a, x₁, ..., xₙ) = j/a² + (n − j)/(A − a)² > 0.

Since the second derivative is positive, any critical point between the xᵢ's must correspond
to a minimum (not a maximum) of L. (That is, any critical point a such that xⱼ < a < xⱼ₊₁
must be a local minimum of L.) If 0 ≤ a < x₁, then ln L is strictly increasing in a. Thus,
no value of a in [0, x₁) can correspond to a maximum of L. Similarly, no value of a in
(xₙ, A] can correspond to a maximum of L. Thus, we conclude that a maximum of L is
attainable only when a is equal to one of the observations since no other value of a could
possibly correspond to a maximum of L.
(B) The strict positivity of the second derivative within the intervals between the observations
implies that any local maximum of L that exists at Xj must correspond to a cusp of L; that
is,

lim_{a↓xⱼ} (−j/a + (n − j)/(A − a)) ≤ 0 ≤ lim_{a↑xⱼ} (−(j − 1)/a + (n − (j − 1))/(A − a)).

Thus, if the interval (0, A) is divided into n intervals of the form [(j − 1)A/n, jA/n], then the jth
observation xⱼ cannot possibly correspond to a maximum of L unless it is contained within
the jth such interval.
Note that

(∂²/∂θ²) ln f_θ(x) = −2/θ² + 3/(x + θ)² < 0

if and only if θ² − 4θx − 2x² < 0, which (since θ is positive) implies that

(∂²/∂θ²) ln f_θ(x) < 0 when θ < (2 + √6)x.

Since 2x < (2 + √6)x, it follows that θ̂ = 2x corresponds to a local maximum. Further, note
that ln f_θ(x) is monotonically decreasing to the right of (2 + √6)x since (∂/∂θ) ln f_θ(x) =
2/θ − 3/(x + θ) < 0 if and only if θ > 2x. Thus, θ̂ must correspond to a global maximum,
and hence is a maximum likelihood estimate for θ.
Recall that the estimator θ̂ here has infinite mean square error since E[X₁²] = ∞. However,
if δ(x) = A for any real number A, then E[(δ(X) − θ)²] is finite. Thus, we conclude that θ̂
is not admissible.
Thus,

θ̂(x, y) = { 0 if x = 0 and y = 0,  1/√3 if x = 1 and y = 0,  1/√3 if x = 0 and y = 1,  … if x = 1 and y = 1 }.
Thus, to maximize ln f_{θ₁,θ₂}(x₁, ..., xₙ) we want to make θ₂ as small as possible. However,
f_{θ₁,θ₂}(x₁, ..., xₙ) = 0 if xᵢ > θ₂ for any i. That is, θ₂ cannot be less than any of the
observations. Thus, a maximum likelihood estimate for θ₂ is given by θ̂₂ = max_{1≤i≤n} xᵢ.
Since

(∂²/∂θ₁²) ln f_{θ₁,θ₂}(x₁, ..., xₙ) = −n/θ₁² < 0,

we conclude that this critical point corresponds to a maximum point. Thus, a maximum
likelihood estimate for θ₁ is given by the corresponding critical point.
Solution 2.23. If x > 0 then f_θ(x) is a strictly increasing function of θ and is thus max-
imized when θ = 1. If x < 0 then f_θ(x) is a strictly decreasing function of θ and is thus
maximized when θ = −1. If x = 0 then f_θ(x) is a constant function of θ and is thus
maximized for any choice of θ. Thus a maximum likelihood estimate of θ is given by

T(x) = { 1 if x ≥ 0,  −1 if x < 0 }.
Note that

E_θ[T(X)] = ∫_{−1}^{1} T(x) f_θ(x) dx
= ∫_{−1}^{0} (−1)(1 + θx)/2 dx + ∫_{0}^{1} (1 + θx)/2 dx
= −[x/2 + θx²/4]_{−1}^{0} + [x/2 + θx²/4]_{0}^{1}
= −(1/2 − θ/4) + (1/2 + θ/4)
= θ/2.

Thus, we see that T is not unbiased. The mean square error of T is given by

E_θ[T²(X)] − 2θ E_θ[T(X)] + θ² = 1 − 2θ(θ/2) + θ² = 1.

However, the trivial estimator S(X) = 0 has a mean square error given by θ²,
which is less than or equal to 1 for all θ! Thus, we conclude that T is not admissible.
−n/θ + (1/θ²)((Σᵢ Xᵢ) − nθ).
Solving for θ and selecting the positive root yields a maximum likelihood candidate. Note
that the likelihood equals zero if Xᵢ < θ for any value of i. If Xᵢ ≥ θ for all i, the likelihood
is maximized by θ̂ = min_{1≤i≤n} Xᵢ.
Is θ̂ unbiased? No. An easy way to see this is to note that θ̂ is never less than θ and hence
could not be unbiased. For a more complete explanation, note that if x > θ then

P(θ̂ ≤ x) = 1 − (∫ₓ^∞ e^{θ−t} dt)ⁿ = 1 − eⁿ⁽θ⁻ˣ⁾,
so that

f_θ̂(x) = (d/dx)(1 − eⁿ⁽θ⁻ˣ⁾) = n eⁿ⁽θ⁻ˣ⁾

for x ≥ θ, and hence

E_θ[θ̂] = ∫_θ^∞ x f_θ̂(x) dx = ∫_θ^∞ x n eⁿ⁽θ⁻ˣ⁾ dx = θ + 1/n,
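The bias of 1/n is easy to check by simulation; the values θ = 2 and n = 5 below are illustrative, not from the text.

```python
import random

random.seed(0)
theta, n, trials = 2.0, 5, 200_000

# theta_hat = min of n shifted unit-rate exponentials; E[theta_hat] = theta + 1/n.
est = sum(min(theta + random.expovariate(1.0) for _ in range(n))
          for _ in range(trials)) / trials
print(round(est, 2))   # about 2.2, i.e. theta + 1/n: biased high by 1/n
```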
Solving the normal equations yields

α₁ = α₂ = R(T/2)/(R(0) + R(T)).
Similarly, for the next estimation problem, solving the corresponding normal equations
yields

α₁ = α₂ = (∫₀ᵀ R(t) dt)/(R(0) + R(T)).
Solving we see that α₁ = e⁻¹ and α₂ = 0, and we estimate X(t) via e⁻¹X(t − 1). (Note that
X(t) is a Markov process and hence, as expected, our estimate depends only on the most
recent observation.)
Solution 2.30. Consider the subspace M = span(e⁻ᵗ, te⁻ᵗ) of L₂(0, ∞). We seek the point ĥ
in M that is closest to h, i.e. such that h − ĥ ⊥ M. Thus, we must have

∫₀^∞ (h(t) − ĥ(t)) e⁻ᵗ dt = 0 and ∫₀^∞ (h(t) − ĥ(t)) t e⁻ᵗ dt = 0.

Writing ĥ(t) = α₁e⁻ᵗ + α₂te⁻ᵗ and using

∫₀^∞ e⁻²ᵗ dt = 1/2,  ∫₀^∞ te⁻²ᵗ dt = 1/4,  ∫₀^∞ t²e⁻²ᵗ dt = 1/4,
∫₀^1 e⁻ᵗ dt = 1 − 1/e,  ∫₀^1 te⁻ᵗ dt = 1 − 2/e,

we obtain the equations

(1/2)α₁ + (1/4)α₂ = 1 − 1/e and (1/4)α₁ + (1/4)α₂ = 1 − 2/e.

Solving yields α₁ = 4/e and α₂ = 4 − 12/e.
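The coefficients can be verified by solving the 2×2 normal-equation system directly, assuming (as the integrals above indicate) that h is the indicator of [0, 1].

```python
import math

# Gram matrix of {e^-t, t e^-t} on (0, inf), and inner products with h = 1_[0,1]
g11, g12, g22 = 0.5, 0.25, 0.25   # closed-form values of the three integrals
b1 = 1 - 1 / math.e               # int_0^1 e^-t dt
b2 = 1 - 2 / math.e               # int_0^1 t e^-t dt

det = g11 * g22 - g12 * g12
a1 = (g22 * b1 - g12 * b2) / det
a2 = (g11 * b2 - g12 * b1) / det
print(round(a1, 4), round(a2, 4))   # 1.4715 -0.4146, i.e. 4/e and 4 - 12/e
```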
Solving we find that a = 3/2, b = −3/5, and c = 1/20. Thus, we estimate X³ via

(3/2)X² − (3/5)X + 1/20.

The mean square error of our estimate is given by

∫₀¹ (x³ − (3/2)x² + (3/5)x − 1/20)² dx = 1/2800 ≈ 3.5714 × 10⁻⁴.
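The value 1/2800 can be confirmed by numerical integration (Simpson's rule is exact enough here since the integrand is a degree-6 polynomial):

```python
def err2(x):
    """Squared error of the quadratic estimate of x^3 on [0, 1]."""
    return (x**3 - 1.5 * x**2 + 0.6 * x - 0.05) ** 2

# Composite Simpson's rule on [0, 1] with an even number of panels.
n = 1000
h = 1.0 / n
s = err2(0) + err2(1)
for i in range(1, n):
    s += (4 if i % 2 else 2) * err2(i * h)
integral = s * h / 3
print(round(integral, 10))   # 0.0003571429, i.e. 1/2800
```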
Solution 2.32. (A) Clearly ‖g‖ ≥ 0 for all g. Further, if g = 0 then ‖g‖ = 0, and if ‖g‖ = 0
then |g(t)| = 0 for all t ∈ [0, 1], i.e. g = 0. In addition, ‖αg‖ = |α| ‖g‖.
Finally, ‖g + h‖ ≤ ‖g‖ + ‖h‖ since |g(t) + h(t)| ≤ |g(t)| + |h(t)|.
(B) Let f and g be points in K and let 0 < λ < 1. Note that λf + (1 − λ)g is continuous,
vanishes at 0, and satisfies ∫₀¹ (λf(t) + (1 − λ)g(t)) dt = λ + (1 − λ) = 1, so that
λf + (1 − λ)g ∈ K.
www.MathGeek.com
www.MathGeek.com
(C) Let g ∈ K. Since g(0) = 0 and g is continuous, it follows that ‖g‖ > 1 since

∫₀¹ g(t) dt = 1.

Clearly, however, if ε > 0 then there exists some f ∈ K such that ‖f‖ < 1 + ε.
Solution 2.33. (A) P_M(h) is the point in M that is nearest to h with respect to the norm
induced by the inner product on H. If x ∈ M then clearly P_M(x) = x. Since P_M(h) ∈ M
for any h ∈ H, it follows that P_M(P_M(h)) = P_M(h) for all h ∈ H.
(B) Let x and y be points in H. The Hilbert Space Projection Theorem implies that y =
P_M(y) + Q_M(y) where P_M(y) ∈ M and Q_M(y) ∈ M⊥. Note that

⟨P_M(x), y⟩ = ⟨P_M(x), P_M(y) + Q_M(y)⟩
= ⟨P_M(x), P_M(y)⟩ + ⟨P_M(x), Q_M(y)⟩
= ⟨P_M(x), P_M(y)⟩ since P_M(x) ∈ M and Q_M(y) ∈ M⊥
= ⟨P_M(x), P_M(y)⟩ + ⟨Q_M(x), P_M(y)⟩ since P_M(y) ∈ M and Q_M(x) ∈ M⊥
= ⟨P_M(x) + Q_M(x), P_M(y)⟩
= ⟨x, P_M(y)⟩.

If M ⊥ N then ⟨P_M(x), P_N(y)⟩ = 0 for all x and y. That is,

⟨y, P_N(P_M(x))⟩ = 0

for all x and y in H, which implies that P_N(P_M(x)) = 0 for all x ∈ H. Similarly,
Thus, M ⊥ N.
and

E[(−1 + X²)(α₄,₁ − 3X + α₄,₃X² + X³)]
= E[α₄,₁X² − 3X³ + α₄,₃X⁴ + X⁵ − α₄,₁ + 3X − α₄,₃X² − X³]
= E[α₄,₁X² + α₄,₃X⁴ − α₄,₁ − α₄,₃X²]
= α₄,₁ + 3α₄,₃ − α₄,₁ − α₄,₃
= 2α₄,₃,
which are both equal to zero if α₄,₃ = α₄,₁ = 0. Combining these results, we obtain

Y₁ = 1
Y₂ = X
Y₃ = X² − 1
Y₄ = X³ − 3X.

To obtain the best estimate of Y₄ that is of the form α₁Y₁ + α₂Y₂ + α₃Y₃, we seek values of
α₁, α₂, and α₃ satisfying the corresponding orthogonality conditions.
Solution 2.35. (A) To begin, note that if x ∈ H and y ∈ M⊥ with ‖y‖ = 1, then

|⟨x, y⟩| = |⟨P_M(x) + Q_M(x), y⟩|
= |⟨P_M(x), y⟩ + ⟨Q_M(x), y⟩|
= |⟨Q_M(x), y⟩| since y ∈ M⊥
≤ ‖Q_M(x)‖ ‖y‖
= ‖Q_M(x)‖ since ‖y‖ = 1.

Further, note that equality occurs if y = Q_M(x)/‖Q_M(x)‖.
(B) Let H denote the Hilbert space consisting of all square integrable functions on [−1, 1]
with

⟨f, g⟩ = ∫_{−1}^{1} f(x)g(x) dx.

Let M denote the subspace consisting of all polynomials on [−1, 1] of degree not greater
than 2. Since M is finite dimensional, it follows that M is closed. Note that if h ∈ M⊥,
then ⟨1, h(x)⟩ = ⟨x, h(x)⟩ = ⟨x², h(x)⟩ = 0. Our goal is to find

max_{g ∈ M⊥, ‖g‖ = 1} ⟨x³, g(x)⟩.
The Hilbert Space Projection Theorem suggests that we can express x³ as P_M(x³) + Q_M(x³)
where P_M(x³) ∈ M and Q_M(x³) ∈ M⊥. Note that since P_M(x³) ∈ M, it follows that
P_M(x³) = a + bx + cx², where a, b, and c must satisfy

∫_{−1}^{1} (x³ − a − bx − cx²) dx = 0
∫_{−1}^{1} (x³ − a − bx − cx²) x dx = 0
∫_{−1}^{1} (x³ − a − bx − cx²) x² dx = 0.

Simplifying we obtain:

∫_{−1}^{1} x³ dx − a ∫_{−1}^{1} dx − b ∫_{−1}^{1} x dx − c ∫_{−1}^{1} x² dx = 0
∫_{−1}^{1} x⁴ dx − a ∫_{−1}^{1} x dx − b ∫_{−1}^{1} x² dx − c ∫_{−1}^{1} x³ dx = 0
∫_{−1}^{1} x⁵ dx − a ∫_{−1}^{1} x² dx − b ∫_{−1}^{1} x³ dx − c ∫_{−1}^{1} x⁴ dx = 0,

which becomes

−2a − (2/3)c = 0
2/5 − (2/3)b = 0
−(2/3)a − (2/5)c = 0.

Solving we find that a = 0, b = 3/5, and c = 0. Thus,

P_M(x³) = (3/5)x

and

x³ − P_M(x³) = x³ − (3/5)x.
Thus,

max_{g ∈ M⊥, ‖g‖ = 1} ⟨x³, g(x)⟩ = ‖x³ − (3/5)x‖ = (∫_{−1}^{1} (x³ − (3/5)x)² dx)^{1/2} = (2/35)√14 ≈ 0.21381.
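The final norm can be confirmed numerically (again via Simpson's rule, which is essentially exact for this degree-6 polynomial integrand):

```python
import math

def q2(x):
    """(x^3 - (3/5)x)^2, the squared residual of the projection."""
    return (x**3 - 0.6 * x) ** 2

# Composite Simpson's rule on [-1, 1] with an even number of panels.
n = 2000
h = 2.0 / n
s = q2(-1) + q2(1)
for i in range(1, n):
    s += (4 if i % 2 else 2) * q2(-1 + i * h)
norm = math.sqrt(s * h / 3)
print(round(norm, 5))   # 0.21381, i.e. 2*sqrt(14)/35
```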