
www.MathGeek.com

Communication Theory: Problems and Solutions

Eric B. Hall

Contents

1 Detection Problems
  1.1 Bayesian and Minimax Tests
  1.2 Neyman-Pearson Tests
  1.3 Locally Optimal Tests

2 Estimation Problems
  2.1 General Problems
  2.2 Unbiased Estimators
  2.3 Maximum Likelihood Estimation
  2.4 Minimum Mean Square Estimation
  2.5 Hilbert Spaces

3 Detection Solutions
  3.1 Bayesian and Minimax Tests
  3.2 Neyman-Pearson Tests
  3.3 Locally Optimal Tests

4 Estimation Solutions
  4.1 General Problems
  4.2 Unbiased Estimators
  4.3 Maximum Likelihood Estimation
  4.4 Minimum Mean Square Estimation
  4.5 Hilbert Spaces

1. Detection Problems

1.1. Bayesian and Minimax Tests

Problem 1.1. As an example in which randomization reduces the maximum risk, suppose
that a coin is known to be either standard (HT) or to have heads on both sides (HH). The
nature of the coin is to be decided on the basis of a single toss, the loss being one for an
incorrect decision and zero for a correct decision. Let the decision be HT when T is observed,
and let the decision be made at random if H is observed, with probability p for HT and 1 − p
for HH. Show that the maximum risk is minimized when p = 1/3.

Problem 1.2. Consider a zero mean Gaussian random variable X with positive variance
σ². This random variable is passed through one of two equally likely nonlinearities to obtain
an output Y. The null hypothesis H0 is that Y = X² and the alternative hypothesis H1 is
that Y = exp(X). Design a test that on the basis of one observation will choose one of the
above hypotheses while minimizing the probability of error.

Problem 1.3. Consider two independent, zero mean Gaussian random variables X1 and
X2 that under hypothesis H0 each have unit variance and under H1 each have a variance
equal to 2. Assign a unit cost to an incorrect decision and zero cost to a correct decision.
Find a minimax test for this problem based upon the two observations X1 and X2.

Problem 1.4. Let N be a random variable with a density function given by

    f(x) = (1/2) exp(−|x|)

for all x ∈ ℝ, and consider a hypothesis testing problem in which H0 states that Y = N and
H1 states that Y = N + s. Let s = 2, and assume that the prior probabilities for H0 and H1
are given by π0 = 0.1 and π1 = 0.9, respectively. Design a hypothesis test that minimizes
the probability of error. For what values of the observation Y does your test announce that
the signal s is present?


Problem 1.5. (A) Let Θ be a random variable with probability density function f(θ), and
assume that a probability density function for our observation X is given by p_θ(x) when
Θ = θ. Show that δ is a Bayesian decision rule if for each x the decision δ(x) is chosen to
minimize ∫_ℝ L(θ, δ(x)) g(θ|x) dθ, where

    g(θ|x) = f(θ) p_θ(x) / ∫ f(φ) p_φ(x) dφ

is a conditional probability density function of Θ given X = x.

(B) Consider a two-decision problem in which ω0 and ω1 are, respectively, the sets of θ-values
for which d0 and d1 are the correct decisions. Assume that the loss is 0 when the correct
decision is made, and otherwise is given by L(θ, d0) = α if θ ∈ ω1 and L(θ, d1) = β if θ ∈ ω0.
Show that a Bayes solution consists in choosing decision d0 if

    α P(Θ ∈ ω1 | X = x) < β P(Θ ∈ ω0 | X = x)

and decision d1 if the reverse inequality holds, where the choice of decision is immaterial in
case of equality.

1.2. Neyman-Pearson Tests

Problem 1.6. Consider a Gaussian random variable X with mean θ and unit variance. Let
α0 = 0.05 and design a level-α0 Neyman-Pearson test for testing H0: θ = 0 against H1:
θ = 1000. If we observe x = 1.8, then will we accept or reject the hypothesis H0? Does this
make sense for an "optimal" test?

Problem 1.7. Consider a random variable Y and two probability density functions f and
g. Let H0 denote the hypothesis that Y is distributed according to the density function f
and let H1 denote the hypothesis that Y is distributed according to the density function g.
A hypothesis test is designed for this situation such that the false alarm probability α of the
test is equal to 0.01 and the power of the test β is equal to 0.0099. Is it possible that this
test is a Neyman-Pearson test for the level α0 = 0.01? Why or why not?

Problem 1.8. Consider a detection problem in which we desire to test for the presence
of a decaying exponential signal in zero mean, mutually independent Gaussian noise. We
will base our decision on k mutually independent samples of the received signal. Hence, we
model the situation as follows, where the t_i's denote our sampling times:

    H0: X_i = N_i for i = 1, ..., k
    H1: X_i = exp(−t_i) + N_i for i = 1, ..., k.

Assume that the N_i's are identically distributed and mutually independent with N1 having
a zero mean Gaussian distribution with a positive variance denoted by σ². Unfortunately,
in taking our samples we encounter a problem in synchronizing the clock at the receiver. In
particular, assume that the sampling times t1, ..., tk are modeled as t_i = iΔ − θ where Δ > 0
and θ is a fixed yet unknown parameter that lies in the interval

    [−Δ/10, Δ/10].

Design and describe a level-α0 Neyman-Pearson detector for this situation. What is the
threshold in terms of α0, Δ, σ, k, and θ? Does the threshold depend on θ? Find an
expression for β in terms of α0, Δ, σ, k, and θ.

Problem 1.9. Consider the following decision problem: H0: Y_i = N_i for i = 1, 2, ..., k
versus H1: Y_i = s + N_i for i = 1, 2, ..., k, where s is a fixed positive constant and where the
N_i's are mutually independent zero mean Gaussian random variables each with variance 2.

(A) Let k = 9. What is the smallest positive signal s for which a Neyman-Pearson test with
false alarm probability 0.01 has a detection probability β ≥ 0.97?

(B) Let s = 1. What is the smallest number of observations k for which a Neyman-Pearson
test with false alarm probability 0.05 has a detection probability β ≥ 0.99?

Problem 1.10. Consider a collection {X1, ..., Xn} of mutually independent, identically
distributed random variables each with a N(θ, 1) distribution for some real number θ. Let
H0 denote the hypothesis that θ = 0 and let H1 denote the hypothesis that θ = 1/2. Let
α0 = 0.005.

(A) Design a Neyman-Pearson test for this problem as a function of n. What is the power
of the test if n = 9?

(B) Now, consider a different test. Flip a fair coin. If the outcome is heads then perform
the test in part (A) with n = 2. If, on the other hand, the outcome is tails then perform the
test in part (A) with n = 16. What is the power of this test? What can you conclude about
tests that allow random sample sizes?

Problem 1.11. Consider a coin that, when flipped, comes up heads with probability p and
tails with probability 1 − p. We flip the coin twice and then must decide whether p = 1/2
(which we will call hypothesis H0) or p = 1/3 (which we will call hypothesis H1). Determine
a procedure that is most powerful for testing H0 against H1 subject to the constraint that
α not exceed 1/8. What is the power of your test?

Problem 1.12. Consider the problem of detecting a continuous signal s(t) = exp(−t) for
t ≥ 0 on the basis of two samples. Assume that the observations are either of the form
Y_i = N_i (which we will call hypothesis H0) or Y_i = s(h·i) + N_i (which we will call hypothesis
H1) where h is a positive constant and where i = 1, 2. Assume that N1 and N2 are zero
mean, jointly Gaussian random variables such that E[N1²] = E[N2²] = 1 and E[N1N2] = 1/2.
Let α0 = 0.1. Design a Neyman-Pearson test for this situation. What is the power of the
test if h = 0.1?

Problem 1.13. Consider a problem in which we wish to test for the presence of a known
positive signal s(t) = exp(−t) defined for t ≥ 0 on the basis of three samples corrupted
by additive Gaussian noise. In particular, let s_k = s((k − 1)h) for k = 1, 2, 3 and h > 0,
let Z1, Z2, Z3 be mutually independent, identically distributed Gaussian random variables
each with mean zero and variance 1/10, and let X1, X2, X3 denote our observations. Further,
for k = 1, 2, 3, let H0 denote the hypothesis that X_k is equal to Z_k and let H1 denote the
hypothesis that X_k is equal to Z_k + s_k. Design a Neyman-Pearson test with a level of
significance given by α0 = 0.05. What is the maximum sampling interval h such that the
power of the test is not less than 0.99?

Problem 1.14. Consider a binary channel such that if we transmit a 'zero' we receive a
'zero' at the other end with probability 1 − λ0 and we receive a 'one' with probability λ0.
Further, if we transmit a 'one' we receive a 'one' with probability 1 − λ1 and we receive
a 'zero' with probability λ1. (Assume that 0 < λ0 < 1, 0 < λ1 < 1, and λ0 + λ1 < 1.)
Consider the transmission and subsequent receipt of a single binary digit. Let H0 denote
the hypothesis that a 'zero' was transmitted and let H1 denote the hypothesis that a 'one'
was transmitted. Design a Neyman-Pearson test for this situation and find the power as a
function of the size α0 for 0 ≤ α0 ≤ 1. Sketch an ROC (Receiver Operating Characteristic)
curve for the special case when λ0 = λ1 = 3/8 and for the special case when λ0 = λ1 = 1/8.
(An ROC curve is a plot of the power of a test as a function of the size of the test.)

Problem 1.15. A test to determine whether a communications satellite is in working order
might be run as follows: A very strong signal is sent from Earth. The satellite responds by
sending a signal v > 0 for n seconds if it is working or does not respond if it is not working.
After an appropriate delay, we take n samples (one each second) x1, ..., xn and we assume
that our observations are random samples from a Gaussian distribution with positive known
variance σ² and mean μ. Hypothesis H0 is that μ = 0 and hypothesis H1 is that μ = v.
Two systems are available for testing the satellite. System A has a signal to noise ratio of
v/σ = 2 and System B is such that v/σ = 1. System A costs $1,000,000 and System B costs
$250,000. One second of transmission with either system costs $1000. You need to test your
satellite 100 times each year and decide to do so each time with a Neyman-Pearson test
designed for a level of significance of 0.05. Assume that each time you test the satellite, you
want the number of seconds of response to be sufficient to ensure that the power of your test
exceeds 0.95. What is the cost to purchase and operate each system for one year?


Problem 1.16. Consider a random variable X that under hypothesis H0 has density

    p0(x) = (2/3)(x + 1) if 0 < x < 1, and 0 otherwise,

and under hypothesis H1 has density

    p1(x) = 1 if 0 < x < 1, and 0 otherwise.

(A) Find a minimax test of H0 versus H1 based upon the single observation X.

(B) Find a Neyman-Pearson test of H0 versus H1 with size 1/10 based upon the single
observation X.

Problem 1.17. Consider a random variable X that under hypothesis H0 has a uniform
distribution on (0, 1) and that under hypothesis H1 has a uniform distribution on (1/2, 3/2). Of
the following three tests, which one is or which ones are Neyman-Pearson tests of H0 versus
H1 based upon the single observation X and with a level of significance equal to 1/8?

    φ1(x) = 1 if x > 7/8, and 0 otherwise

    φ2(x) = 1 if 1/2 < x < 5/8 or x > 1, and 0 otherwise

    φ3(x) = 1 if x > 1; 1/4 if 1/2 ≤ x ≤ 1; and 0 if x < 1/2.

Problem 1.18. Consider three probability distributions P0, P1, and P2 on {0, 1, 2} defined
as follows:

    P0({0}) = P0({1}) = P0({2}) = 1/3;

    P1({1}) = 1 − P1({2}) = 1/3 and P1({0}) = 0;

    P2({0}) = 1 − P2({2}) = 1/3 and P2({1}) = 0.

Consider the problem of testing H0: {P0} against H1: {P1, P2} on the basis of one sample.

(A) Does there exist a UMP test in this case that has a level of significance equal to 1/6? If
yes, then find the test. If no, then explain why not.

(B) Does there exist a UMP test in this case that has a level of significance equal to 1/2? If
yes, then find the test. If no, then explain why not.

Problem 1.19. Consider a collection {Y1, Y2, ..., Yk} of random variables such that Y_i =
f(t_i) + N_i for each i, where f: ℝ → ℝ, where t1, ..., tk are the sampling times, and where
the N_i's are mutually independent, zero mean Gaussian random variables each
with positive variance σ². Hypothesis H0 states that f(t) = cos(t) and hypothesis H1 states
that f(t) = sin(t). Find a Neyman-Pearson test of H0 against H1 at size α0 based on
{Y1, Y2, ..., Yk}. What is the power of your test? How should one choose the sampling times
in order to maximize the power?

Problem 1.20. Consider a random variable X with a probability density function given by

    f_{a,b}(x) = (1/b) exp(−(x − a)/b) if x ≥ a, and 0 if x < a,

where a ∈ ℝ and b > 0. (Assume that b is known.) Design a Neyman-Pearson test of size
α0 for testing H0: a = a0 against H1: a = a1 based upon X, where a1 < a0. What is the
power of your test?

Problem 1.21. Consider the detection problem

    H0: X_i = N_i for i = 1, ..., 100
    H1: X_i = 2 + N_i for i = 1, ..., 100

where N1, ..., N100 are mutually independent, Gaussian random variables each with mean
zero and variance 9. Find a value of α0 so that a Neyman-Pearson test of H0 versus H1 with
size α0 is also a minimax test of H0 versus H1 with respect to a loss function that assigns a
unit loss to an error and a zero loss otherwise.

Problem 1.22. Consider the following probability density functions:

    p0(x) = 1 if 0 < x < 1, and 0 otherwise;

    p1(x) = 4x if 0 < x ≤ 1/2; 4(1 − x) if 1/2 < x < 1; and 0 otherwise.

Further, consider a random variable X that under hypothesis H0 has density p0 and under
hypothesis H1 has density p1. On the basis of the single observation X:

(A) Find a Bayes test of H0 versus H1 assuming that H0 and H1 are equally likely.

(B) Find a minimax test of H0 versus H1 with respect to a loss function that assigns a unit
loss to an error and a zero loss otherwise.

(C) Find a Neyman-Pearson test of H0 versus H1 with size 1/100. What is the power of this
test?

Problem 1.23. Consider the following probability density functions:

    f0(x) = exp(−x) if x > 0, and 0 if x ≤ 0;

    f1(x) = exp(−(x − 1)) if x > 1, and 0 if x ≤ 1.

Consider a random variable X that under hypothesis H0 has density f0 and under hypothesis
H1 has density f1. On the basis of the single observation X:

(A) Find a Neyman-Pearson test of H0 versus H1 with size 1/10. What is the power of the
test?

(B) Find a non-randomized Neyman-Pearson test of H0 versus H1 with size 1/10.

Problem 1.24. Consider the following system in which the random variable N is Gaussian
with mean zero and positive variance σ² and the random variable M takes on the value 1
with probability 1/2 and the value −1 with probability 1/2. Thus, under H0, Y = N, and under
H1, Y = aM + N. Assume that a is nonzero and that M and N are independent. Find the
form of a Neyman-Pearson test of H0 versus H1 based on the single observation Y. (You do
not need to find the threshold.) Is your test uniformly most powerful over all nonzero a?


(Figure: block diagram of the system in which M and N combine to produce the observation Y.)

Problem 1.25. Consider a random variable X that has a Poisson distribution with parameter
λ. That is, assume that

    P(X = k) = (λ^k / k!) exp(−λ)

for k = 0, 1, 2, .... Assume that λ = λ0 under H0 and that λ = λ1 under H1 where
λ1 > λ0 > 0. For what values of α0 will there exist a nonrandomized Neyman-Pearson test
of H0 versus H1? Find a Neyman-Pearson test of H0 versus H1 if λ0 = 1 and α0 = 0.02.

1.3. Locally Optimal Tests

Problem 1.26. Consider the detection problem

    H0: Y_i = N_i for i = 1, ..., k
    H1: Y_i = s + N_i for i = 1, ..., k

where k is a positive integer, where s is a positive constant signal, and where N1, ..., Nk are
mutually independent random variables with a common probability density function.

(A) Find the form of a locally optimal test of H0 versus H1. (You do not need to find the
threshold.)

(B) Use the result of Part (A) to find the form of a Neyman-Pearson test of H0 versus H1.
(Again, you do not need to find the threshold.)

Problem 1.27. Consider the following decision problem: H0: X_i = N_i for i = 1, 2, 3 versus
H1: X_i = s + N_i for i = 1, 2, 3, where s is a fixed (unknown) positive constant and where
N1, N2, and N3 are mutually independent random variables each with a probability density
function given by

    f(x) = (1/2) exp(−|x|)

for x ∈ ℝ.

(A) Design a locally optimal test for this problem subject to the constraint that the false
alarm probability not exceed 1/16.

(B) What is the smallest value of s for which the detection probability β is not less than
0.49?

(C) Under what circumstances will β exceed 1/2?


Problem 1.28. Consider a random variable N that possesses a probability density function
of the form

    f_N(x) = 1 / (π(1 + x²)).

Let X be a random variable that under H0 is equal to N and under H1 is equal to N + s
where s is positive. We desire to determine the true hypothesis H0 or H1 based on the single
observation X.

(A) Find the form of a Neyman-Pearson test for this problem. (You do not need to find the
threshold.) Is it uniformly most powerful over all positive signals s? Will it always announce
that the signal is present whenever the observation is much larger than s?

(B) Find the form of a locally optimal test for this problem. (You do not need to find the
threshold.) Will it always announce that the signal is present whenever the observation is
much larger than s?

2. Estimation Problems

2.1. General Problems

Problem 2.1. Consider a marathon with N participants. Further, assume that these N
runners are each wearing an identifying tag displaying a different number between 1 and
N. As you drive by you see a runner wearing the number 87. How might you use that
information to estimate N?

Problem 2.2. Let X be a Gaussian random variable with mean θ and positive variance σ².
Find the Fisher information I(θ) that X contains about the parameter θ.

Problem 2.3. Let X be a Gaussian random variable with mean θ and positive variance σ².
Find the Fisher information I(σ²) that X contains about the parameter σ².


Problem 2.4. Let X1, X2, and X3 be mutually independent random variables with X1 and
X2 identically distributed. Further, let X1 take on the values 0 and 2 each with probability
1/2 and let X3 take on the values 1 and 5/2 each with probability 1/2. Now, consider the problem
of estimating which of these three distributions has the largest mean. A natural method
of proceeding is to take a random sample from each distribution and then to select the
distribution that produces the largest sample mean. Answer the following questions under
the assumption that such a procedure is used.

(A) What is the probability of correctly determining the distribution with the largest mean
if we take one sample from each distribution?

(B) What is the probability of correctly determining the distribution with the largest mean if
we take one sample from the distributions of X1 and X2 and two samples from the distribution
governing X3?

(C) Will an estimate based upon n samples always be at least as good as an estimate based
on m samples if n > m?

Problem 2.5. Let X and Y be random variables with probability density functions f1(x − θ)
and f2(x − θ), respectively, where θ is some fixed yet unknown real number. Assume that f1
and f2 are continuous and even.

(A) If f1(0) > f2(0) then show that P(|X − θ| ≤ ε) > P(|Y − θ| ≤ ε) for some positive value
of ε.

(B) Now, let k > 3 be a fixed integer, let θ be a fixed yet unknown real number, and let X1
and X2 be independent random variables each with a density given by f(x − θ) where

    f(x) = (k − 1) / (2(1 + |x|)^k).

Also, consider a cost function C_ε that assigns a cost of 1 to errors that are larger in magnitude
than some positive constant ε and that assigns a zero cost to smaller errors. (In particular,
an estimator θ̂ of θ is "good" if P(|θ̂ − θ| ≤ ε) is large.) Let Y denote the sample mean of X1
and X2. Show that there exists some positive ε for which P(|X1 − θ| ≤ ε) > P(|Y − θ| ≤ ε).
That is, show that a single observation may yield a better estimate of the mean than will a
sample mean of two observations.

2.2. Unbiased Estimators

Problem 2.6. Consider a random variable X that has a Poisson distribution with parameter
λ > 0; that is, assume that

    P(X = x) = (λ^x / x!) exp(−λ)

for x = 0, 1, 2, .... Consider the problem of estimating the parameter exp(−3λ) based upon
one sample from the distribution for X. Show that T(X) = (−2)^X is an unbiased estimator
for exp(−3λ). Is T(X) a reasonable estimator for exp(−3λ)?

Problem 2.7. Consider an unbiased estimator T of a parameter θ. Find a condition that
is both necessary and sufficient for T² to be an unbiased estimator of θ².

Problem 2.8. Let N be a fixed positive integer. Toss a coin N times and, for 1 ≤ i ≤ N,
let X_i be 1 or 0 according to whether the ith toss is a head or a tail. Let the probability of
tossing a head be given by some fixed yet unknown value θ from the interval [0, 1]. For what
functions g: [0, 1] → ℝ do there exist unbiased estimators of g(θ)?

Problem 2.9. Consider a random variable X with a Poisson distribution given by

    P(X = k) = (λ^k / k!) exp(−λ)

for each nonnegative integer k and for some fixed positive value of λ. Let T(x) be any
nonconstant function of x such that T(X) provides an estimate of 1/λ. Show that T(X)
cannot be an unbiased estimator of 1/λ.

Problem 2.10. Consider a parameterized family {f_θ : θ ∈ ℝ} of probability density func-
tions in which f_θ is a Gaussian density function with mean θ and unit variance. Let
X1, ..., Xn denote a collection of identically distributed, mutually independent random
variables each having a density function given by f_θ for some fixed yet unknown value
of θ. Let T(X1, ..., Xn) be an unbiased estimator for θ. Find a positive lower bound for
Var_θ(T(X1, ..., Xn)).

Problem 2.11. Consider a random variable X such that

    P(X = k) = (λ^k / k!) exp(−λ)

for k = 0, 1, 2, ... and for some fixed, but unknown, positive constant λ. (That is, let X have
a Poisson distribution with parameter λ. Note that E[X] = Var[X] = λ.) Find the Fisher
information that X contains about the parameter θ = exp(−λ). Let

    T(x) = 1 if x = 0, and 0 if x ≠ 0.

Is T(X) an unbiased estimator of θ? Is T(X) an efficient estimator of θ? Hint: Note that
exp(x) > x + 1 for all nonzero x.


2.3. Maximum Likelihood Estimation

Problem 2.12. Consider a random sample X1, ..., Xn from a distribution with a fixed yet
unknown finite mean a ∈ ℝ. If a maximum likelihood estimate for a exists, will it always be
given by the sample mean (1/n)(X1 + ... + Xn)? Why or why not?

Problem 2.13. Consider a random variable X for which P(X = 1) = p and P(X = 0) =
1 − p where p is a fixed yet unknown element from [1/3, 2/3]. Find a maximum likelihood
estimate for p. What is the mean square error for this estimate? Find a constant estimator
for p which always has a smaller mean square error than the maximum likelihood estimator
that you found.

Problem 2.14. Consider a coin with a probability of heads given by some fixed yet unknown
value of a from the interval (0, 1). Does there exist a maximum likelihood estimator of a
based upon a single flip of the coin? If not, why not, and if so then find one. If you found
that a maximum likelihood estimator does not exist then can you think of a simple remedy?
Does your remedy provide a reasonable estimator?

Problem 2.15. Consider a detection problem in which H0 states that X_i = N_i for i =
1, ..., k and H1 states that X_i = a + N_i for i = 1, ..., k and |a| > 0, where the N_i's are iden-
tically distributed and mutually independent each with a zero mean Gaussian distribution
with a positive variance σ². When (as in this case) a UMP test does not exist we some-
times use a generalized likelihood ratio test. A generalized likelihood ratio test is simply a
threshold test in which the "processor" is given by

    Λ_â(x1, ..., xk) = f1(â(x1, ..., xk), x1, ..., xk) / f0(x1, ..., xk),

where f0(x1, ..., xk) is a joint probability density for the X_i's under hypothesis H0, where
f1(a, x1, ..., xk) is a joint probability density for the X_i's under hypothesis H1 assuming
that a is the true parameter, and where â(x1, ..., xk) is a maximum likelihood estimator of
a based upon the X_i's under hypothesis H1.

(A) Find â. Is it unbiased? Is it efficient?

(B) Find a generalized likelihood ratio test for this example. What is the power of your test?

(C) Assume that a is known to be positive, and find a UMP test. What is the power of this
test? How does this power compare to your result in (B)?

Problem 2.16. Consider a family of probability density functions {f_θ : θ ∈ ℝ}.
(Recall that I_A(x) equals 1 if x ∈ A and equals 0 otherwise.) For a fixed positive integer
n, let X1, ..., Xn be a collection of mutually independent, identically distributed random
variables each with a probability density function given by f_θ for some fixed yet unknown
value of θ. Find a maximum likelihood estimator for θ as a function of X1, ..., Xn. Is this
maximum likelihood estimator unique?

Problem 2.17. Consider a family of probability density functions {f_θ : θ ∈ (0, ∞)}.
For a fixed positive integer n, let X1, ..., Xn be a collection of mutually independent, iden-
tically distributed random variables each with a probability density function given by f_θ for
some fixed yet unknown value of θ. Find a maximum likelihood estimator for θ as a function
of X1, ..., Xn. Is this maximum likelihood estimator unbiased?

Problem 2.18. An estimator θ̂ is said to be admissible with respect to the squared error
cost function if there exists no estimator θ̃ such that E_θ[(θ̃ − θ)²] ≤ E_θ[(θ̂ − θ)²] for all allowable
values of θ with the inequality being strict for some value of θ. Consider a random variable
X with a distribution given by P(X = 1) = θ and P(X = 0) = 1 − θ where 1/4 ≤ θ ≤ 3/4.

(A) Find a maximum likelihood estimator of θ based on one sample from the distribution of
X.

(B) Consider the collection of all estimators θ̂_α that are of the form

    θ̂_α(x) = α if x = 0, and 1 − α if x = 1,

where 1/4 ≤ α ≤ 3/4. (For what value of α is θ̂_α(X) equal to the estimator that you found in
part (A)?) Show that the maximum likelihood estimator you found in part (A) has a larger
mean square error than θ̂_α for any α such that 1/4 < α ≤ 1/2.

(C) Is the maximum likelihood estimator you found in part (A) an admissible estimator?

Problem 2.19. Let f_θ be a density function of a triangular distribution on a fixed interval
[0, A] where A > 0 and where the peak of the triangle is at x = θ where θ is some fixed yet
unknown element from (0, A). That is,

    f_θ(x) = 2x / (Aθ) if 0 ≤ x ≤ θ, and 2(A − x) / (A(A − θ)) if θ < x ≤ A.

Consider a maximum likelihood estimator θ̂ of θ based upon n samples X1, ..., Xn from the
distribution f_θ where we assume without loss of generality that the n observations have been
arranged so that 0 ≤ x1 ≤ x2 ≤ ... ≤ xn ≤ A.

(A) Show that a maximum of the likelihood function is attainable only when θ̂ is equal to
one of the n observations.

(B) Prove or Disprove: For the jth sample to be a possible maximum likelihood estimator
for θ it must be true that

    ((j − 1)/n) A < x_j < (j/n) A.

(C) Must θ̂ be given by any particular order statistic of the sample? If so, which one? (Note:
Consider a collection X1, ..., Xn of random variables defined on some probability space
(Ω, F, P). For each ω ∈ Ω, let Z1(ω) take on the smallest value in the set {X1(ω), ..., Xn(ω)},
let Z2(ω) take on the next smallest value in that set, and so on until Zn(ω) which takes on
the largest value in that set. The random variable Zk is called the kth order statistic of the
set {X1, ..., Xn}.)

Problem 2.20. Consider independent random variables X1 and X2 each with a probability
density function given by

    f_θ(x) = 2θ² / (x + θ)³

for x > 0 and zero for x ≤ 0 where θ is some fixed yet unknown positive real number.

(A) Find a maximum likelihood estimate of θ based upon X1. (That is, find a solution
to the likelihood equation and show that your solution corresponds to a maximum.) Is
your estimate unbiased? Is your estimate admissible with respect to the squared error cost
function?

(B) Find a maximum likelihood estimate of θ based upon X1 and X2. (For this part of the
problem you need only find a solution to the likelihood equation. You do not need to prove
that your solution corresponds to a maximum.)

Problem 2.21. Let X1, X2, X3 be mutually independent, identically distributed random
variables such that P(X_i = 1) = 1 − P(X_i = 0) = θ for some fixed yet unknown value of θ
from the interval [0, 1]. Let Y = X2·X3. Find a maximum likelihood estimate of θ based on
X1 and Y.

Problem 2.22. Consider a probability density function f_{θ1,θ2}(x) that is positive for
0 < x ≤ θ2 and zero otherwise, where θ1 and θ2 are positive constants. Let X1, ..., Xn be a
collection of mutually independent, identically distributed random variables each with
probability density function f_{θ1,θ2}.

(A) Assume that θ1 is a known positive constant. Find a maximum likelihood estimate for θ2
as a function of X1, ..., Xn. Be sure to show that your solution corresponds to a maximum
of the likelihood function.

(B) Assume that θ2 is a known positive constant. Find a maximum likelihood estimate for
θ1 as a function of X1, ..., Xn. Again, be sure to show that your solution corresponds to a
maximum of the likelihood function.

Problem 2.23. Consider a random variable X that has a probability density function of
the form

    f_θ(x) = (1 + θx) / 2

where −1 < x < 1 and −1 ≤ θ ≤ 1 and where, as usual, we assume that θ is fixed but
unknown. Find a maximum likelihood estimate of θ as a function of X. Is your estimate
unbiased? What is the mean square error of your estimate? Is your estimate admissible?

Problem 2.24. Consider a collection X1, ..., Xn of mutually independent, Gaussian ran-
dom variables each with mean θ and variance θ² where θ is some fixed, but unknown,
positive real number. Find a candidate for a maximum likelihood estimate of θ as a function
of X1, ..., Xn. (You do not need to prove that your candidate corresponds to a maximum.)

Problem 2.25. Consider mutually independent, identically distributed random variables
X1, ..., Xn such that each has a probability density function of the form

    f_θ(x) = exp(−(x − θ)) if x ≥ θ, and 0 otherwise,

where θ is some fixed, but unknown, real number. Find a maximum likelihood estimate of
θ as a function of X1, ..., Xn. Is your estimate unbiased?

2.4. Minimum Mean Square Estimation

Problem 2.26. Consider a zero mean, wide sense stationary random process {X(t) : t ∈ ℝ}
with autocorrelation function R(τ) = E[X(t)X(t + τ)]. Find a minimum mean square linear
estimate of X(t + k) as a function of X(t) where k is any fixed positive real number.

Problem 2.27. Consider a zero mean, wide sense stationary random process {X(t) : t ∈ ℝ}
with autocorrelation function R(τ) = E[X(t)X(t + τ)]. Find a minimum mean square linear
estimate of X(t) in terms of X(0) and X(T) where T is any fixed positive real number.
What is your estimate when t = T/2?

Problem 2.28. Consider a zero mean, wide sense stationary random process {X(t) : t ∈ ℝ}
with autocorrelation function R(τ) = E[X(t)X(t + τ)]. Fix a positive real number T and
assume that R(t) is integrable over [0, T]. Assume that the integral ∫₀ᵀ X(t) dt exists for
each sample path of X(t). Find a minimum mean square linear estimate of ∫₀ᵀ X(t) dt in
terms of X(0) and X(T).

Problem 2.29. Consider a zero mean wide sense stationary random process {X(t) : t ∈ ℝ}
with an autocorrelation function given by R(τ) = E[X(t + τ)X(t)] = exp(−|τ|). We desire
to estimate X(t) via a linear combination of X(t − 1) and X(t − 2) so as to minimize
the mean square error. That is, we wish to estimate X(t) via f(X(t − 1), X(t − 2)) =
a1 X(t − 1) + a2 X(t − 2) so that E[(X(t) − f(X(t − 1), X(t − 2)))²] is minimized. Find the
constants a1 and a2.

Problem 2.30. Suppose that we want to build a filter that is modeled by convolution of
an input x(t) with h(t) where h(t) = 1 if 0 ≤ t < 1 and h(t) = 0 elsewhere. Since we cannot
build such a filter we decide to construct an approximation to it using ĥ in place of h where
ĥ(t) = a1 exp(−t) + a2 t exp(−t) for t ≥ 0 and ĥ(t) = 0 for t < 0. Our goal will be to
minimize ∫₀^∞ |h(t) − ĥ(t)|² dt. Find a1 and a2.

Problem 2.31. Consider a random variable X that has a uniform distribution on the
interval [0, 1]. Find a minimum mean square affine estimate of X³ in terms of X² and X.
What is the mean square error of your estimate?

2.5. Hilbert Spaces

Problem 2.32. Let X denote the set of all continuous functions f: [0, 1] → ℝ such that
f(0) = 0. Let ‖f‖ denote the supremum (least upper bound) of the set {|f(t)| : t ∈ [0, 1]}.
Let K denote the collection of all functions g in X such that ∫₀¹ g(t) dt = 1.

(A) Show that ‖·‖ is a norm on X.

(B) Show that K is convex.

(C) Show that there is no point in K with minimum norm.

Problem 2.33. Consider a Hilbert space H and a closed proper subspace M of H. Consider
the function P_M(h) that maps a point h in H to the point in M that is nearest to h. That
is, P_M is the orthogonal projection of H onto M.

(B) Let N be another closed proper subspace of H. Show that the subspace M is orthogonal
to the subspace N if and only if P_M ∘ P_N = P_N ∘ P_M = 0.

(C) Show that P_M ∘ P_N = 0 if and only if P_N ∘ P_M = 0.


Problem 2.34. For this problem consider the Hilbert space H of all second order random
variables defined on some probability space where ⟨S, T⟩ = E[ST]. Let X be a zero mean,
unit variance Gaussian random variable. Let

    Y1 = 1
    Y2 = a_{2,1} + X
    Y3 = a_{3,1} + a_{3,2} X + X²
    Y4 = a_{4,1} + a_{4,2} X + a_{4,3} X² + X³

for real numbers a_{i,j}. For what choice of the a_{i,j}'s are the Y_i's orthogonal? Find the
minimum mean-square linear estimate of Y4 based upon Y1, Y2, and Y3. Find the minimum
mean-square nonlinear estimate of Y4 based upon Y1, Y2, and Y3. (Hint: E[X⁴] = 3.)

Problem 2.35. (A) Let M be a closed subspace of a Hilbert space H and, for x ∈ H, let
x = P_M(x) + Q_M(x) where P_M(x) ∈ M and Q_M(x) ∈ M⊥. Prove that

    max{ |⟨x, y⟩| : y ∈ M⊥, ‖y‖ = 1 } = ‖Q_M(x)‖.

(B) Use the result from Part (A) to find the maximum value of the inner product of g with
a given function, subject to the following constraints on g:

    ∫ g(x) dx = 0
    ∫ x g(x) dx = 0
    ∫ x² g(x) dx = 0
    ∫ g²(x) dx = 1.


3. Detection Solutions

3.1. Bayesian and Minimax Tests

Solution 1.1. Let the parameter set be given by Θ = {HH, HT}, where HH denotes the
coin with two heads and HT denotes the standard coin. The sample space for a single flip of
the coin is given by {H, T}, where H means that we observed a Head and T means that we
observed a Tail. Our decision set D is the same as Θ. Let X denote the observation from
our single coin flip. Our decision rule δ has the following distribution:

    P(δ(X) = HT | X = T) = 1
    P(δ(X) = HT | X = H) = p
    P(δ(X) = HH | X = T) = 0
    P(δ(X) = HH | X = H) = 1 − p.

Thus, the risk function R is given by

    R(HH, δ) = E_HH[L(HH, δ(X))]
             = P_HH(δ(X) = HT) L(HH, HT) + P_HH(δ(X) = HH) L(HH, HH)
             = p × 1 + (1 − p) × 0
             = p

and

    R(HT, δ) = E_HT[L(HT, δ(X))]
             = P_HT(δ(X) = HT) L(HT, HT) + P_HT(δ(X) = HH) L(HT, HH)
             = (1/2 + p/2) × 0 + (1/2)(1 − p) × 1
             = (1/2)(1 − p).

The maximum risk is then given by max(p, (1/2)(1 − p)), which is minimized when p = 1/3.
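This is easy to confirm numerically. The following short Python sketch (ours, not part of
the original solution) scans a grid of randomization probabilities and reports the minimizer
of the worst-case risk:

    import numpy as np

    p = np.linspace(0.0, 1.0, 10001)        # candidate randomization probabilities
    risk_hh = p                             # risk when the coin is two-headed
    risk_ht = 0.5 * (1.0 - p)               # risk when the coin is standard
    worst = np.maximum(risk_hh, risk_ht)    # worst-case risk for each p

    print(p[np.argmin(worst)], worst.min()) # both approximately 1/3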

Solution 1.2. Since X is Gaussian with mean zero and variance σ², it follows that X/σ
has a standard Gaussian distribution. Thus, X²/σ² has a chi-square distribution with one
degree of freedom. That is, X²/σ² has a density function given by

    f(x) = (1/√(2πx)) exp(−x/2)

for x > 0. Thus, X² has a density function given by

    p0(x) = (1/σ²) f(x/σ²) = (1/(σ√(2πx))) exp(−x/(2σ²))

for x > 0.

Next, note that

    P(exp(X) ≤ x) = P(X ≤ ln(x)) = ∫_{−∞}^{ln(x)} (1/(σ√(2π))) exp(−y²/(2σ²)) dy

for x > 0. Further, it follows from Leibniz's rule that exp(X) possesses a density function
given by

    p1(x) = (1/(σ√(2π))) (1/x) exp(−ln²(x)/(2σ²))

for x > 0.

From class notes, we know that a test that minimizes the probability of error is given by

    Λ(y) = p1(y)/p0(y) ≷ π0/π1,

announcing H1 when the ratio exceeds the threshold and H0 otherwise. Since the
nonlinearities are equally likely, it follows that π0 = π1 = 1/2. Thus, our test consists in
comparing Λ(y) to 1, where

    Λ(y) = [(1/(σ√(2π) y)) exp(−ln²(y)/(2σ²))] / [(1/(σ√(2πy))) exp(−y/(2σ²))]
         = (1/√y) exp((y − ln²(y))/(2σ²)).
Solution 1.3. To begin, note that

    Λ(x1, x2) = p1(x1, x2)/p0(x1, x2)
              = [(1/(4π)) exp(−(x1² + x2²)/4)] / [(1/(2π)) exp(−(x1² + x2²)/2)]
              = (1/2) exp((x1² + x2²)/4),

which after reducing yields a test of the form

    x1² + x2² ≷ τ.

Let S = X1² + X2² and note that S is our test statistic. Under H0, S has a chi-square
distribution with 2 degrees of freedom. Thus, under H0, S has density

    p0(x) = (1/2) exp(−x/2)

for x ≥ 0. Under H1, S/2 is chi-square with 2 degrees of freedom. Thus, under H1, S has
density

    p1(x) = (1/2) p0(x/2) = (1/4) exp(−x/4).

For a minimax test, we choose τ so that the probability of error of the first kind (Q0) is
equal to the probability of error of the second kind (Q1). Note that

    Q0 = P0(S > τ) = exp(−τ/2)

and that

    Q1 = P1(S < τ) = 1 − exp(−τ/4).

Thus, we seek τ such that

    exp(−τ/2) = 1 − exp(−τ/4),

or that y² + y − 1 = 0, where y = exp(−τ/4). Solving yields

    y = (−1 ± √5)/2,

which implies (since y > 0 is required) that

    τ = −4 ln((−1 + √5)/2) ≈ 1.925.
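As a numerical sanity check (our sketch, not part of the original solution), the equation
Q0 = Q1 can also be solved directly:

    import numpy as np
    from scipy.optimize import brentq

    # Q0(t) = exp(-t/2) is the false alarm probability; Q1(t) = 1 - exp(-t/4) the miss probability
    t = brentq(lambda t: np.exp(-t / 2) - (1 - np.exp(-t / 4)), 0.0, 10.0)
    print(t)                                    # ~1.925
    print(-4 * np.log((np.sqrt(5) - 1) / 2))    # same value in closed form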

Solution 1.4. From class notes we know that the processor in this case is given by

    g(y) = −2 if y ≤ 0;  2(y − 1) if 0 ≤ y ≤ 2;  2 if y ≥ 2,

where the threshold is given by

    τ = ln(π0/π1) = ln(1/9) ≈ −2.197.

Since g(y) > τ for all possible values of y, it follows that our test will always announce that
the signal is present regardless of our observation.
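Consequently the test errs exactly when H0 is true, so its probability of error equals π0 = 0.1.
A quick Monte Carlo sketch (ours; the seed and sample size are arbitrary) confirms this:

    import numpy as np

    rng = np.random.default_rng(0)
    n = 200_000
    h1_true = rng.random(n) < 0.9                  # prior P(H1) = 0.9
    y = rng.laplace(0.0, 1.0, n) + 2.0 * h1_true   # Laplacian noise, signal s = 2
    g = np.where(y <= 0, -2.0, np.where(y >= 2, 2.0, 2 * (y - 1)))
    announce_h1 = g > np.log(1 / 9)                # always true, as shown above
    print(np.mean(announce_h1 != h1_true))         # ~0.1 = pi_0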

Solution 1.5. (A) Assume that there exists a decision rule δ that minimizes

    ∫_{−∞}^{∞} L(θ, δ(x)) g(θ|x) dθ

for each x. Let δ′ be another rule, and note that

    r(f, δ′) = ∫_{−∞}^{∞} R(θ, δ′) f(θ) dθ
             = ∫_{−∞}^{∞} E_θ[L(θ, δ′(X))] f(θ) dθ
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} L(θ, δ′(x)) p_θ(x) dx f(θ) dθ
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} L(θ, δ′(x)) (p_θ(x) f(θ)/h(x)) dθ h(x) dx,  where h(x) = ∫_{−∞}^{∞} p_θ(x) f(θ) dθ,
             = ∫_{−∞}^{∞} ∫_{−∞}^{∞} L(θ, δ′(x)) g(θ|x) dθ h(x) dx
             ≥ ∫_{−∞}^{∞} ∫_{−∞}^{∞} L(θ, δ(x)) g(θ|x) dθ h(x) dx    (by our assumption)
             = r(f, δ).

Thus, it follows that δ is a Bayes rule since its average risk is not greater than the average
risk of any other rule.


(B) Note that

    ∫_{−∞}^{∞} L(θ, δ(x)) g(θ|x) dθ = α ∫_{ω1} g(θ|x) dθ if δ(x) = d0, and β ∫_{ω0} g(θ|x) dθ if δ(x) = d1
                                    = α P(Θ ∈ ω1 | X = x) if δ(x) = d0, and β P(Θ ∈ ω0 | X = x) if δ(x) = d1.

From Part (A), we know that a rule that minimizes the previous expression is a Bayes rule.
Thus, a Bayes rule δ0 is given by

    δ0(x) = d0 if α P(Θ ∈ ω1 | X = x) < β P(Θ ∈ ω0 | X = x), and
            d1 if α P(Θ ∈ ω1 | X = x) ≥ β P(Θ ∈ ω0 | X = x).

Since the decision rule δ0′ given by

    δ0′(x) = d0 if α P(Θ ∈ ω1 | X = x) ≤ β P(Θ ∈ ω0 | X = x), and
             d1 if α P(Θ ∈ ω1 | X = x) > β P(Θ ∈ ω0 | X = x)

is also a Bayes rule, it follows that the decision taken when

    α P(Θ ∈ ω1 | X = x) = β P(Θ ∈ ω0 | X = x)

is immaterial.

3.2. Neyman-Pearson Tests

Solution 1.6. Finding the ratio of the densities and making the standard reductions yields
a test that consists of comparing the observation X to a threshold T. Since X has a standard
Gaussian distribution under H0, it follows that T must be such that 1 − Φ(T) = α0, where
α0 is the level of significance. For α0 = 0.05, it follows that T ≈ 1.65. Thus, if we observe
1.8 then our optimal test will reject H0 in favor of H1 even though H0 is virtually certain
to be the correct hypothesis in light of such an observation! Although seemingly surprising,
the trouble is due to the large value of α0 that we chose. We are in effect forcing our test to
be wrong 5% of the time. For this example, which is virtually singular, we could allow α0
to be much smaller without significantly lowering the power of the test.

Solution 1.7. It is not possible for this test to be a Neyman-Pearson test. To demonstrate
this, all we need to do is find another test with a larger power and the same size. (If the
test were Neyman-Pearson then there would exist no other test with a larger power and the
same size.) Consider the test given by φ(x) = α0. That is, consider a completely randomized
test that announces H1 with probability α0 no matter what the observation x is. The size
of this test is given by

    α = E0[φ(X)] = E0[α0] = α0,

and the power of this test is given by

    β = E1[φ(X)] = E1[α0] = α0.

Thus, since it is always possible to find a test for which the power is equal to the size, it is
impossible to have a Neyman-Pearson test for which the power is less than the size.

Solution 1.8. From class notes, we know that a Neyman-Pearson test in this situation has
the form

    Σ_{j=1}^{k} s_j X_j ≷ T

for some threshold T where s_j = exp(−t_j) = exp(−jΔ) exp(θ). After absorbing the constant
exp(θ) into the threshold, we obtain a test of the form

    Z_k := Σ_{j=1}^{k} X_j exp(−jΔ) ≷ T

for some threshold T. Under H0, the test statistic Z_k is Gaussian with mean zero and
variance

    σ0² = σ² Σ_{j=1}^{k} exp(−2jΔ) = σ² (exp(−2Δ) − exp(−2Δ(k + 1))) / (1 − exp(−2Δ)).

The Neyman-Pearson lemma states that the false alarm probability P0(Z_k > T) must equal
α0. Thus, it follows that T = σ0 Φ⁻¹(1 − α0). Note that the test does not depend upon θ.
Under H1, the test statistic Z_k is Gaussian with mean

    m = Σ_{j=1}^{k} E[exp(−jΔ) exp(θ) + N_j] exp(−jΔ) = exp(θ) Σ_{j=1}^{k} exp(−2jΔ)

and variance σ0². Thus, the power of the test is given by

    β = P1(Z_k > T)
      = P1(Z_k − m > T − m)
      = 1 − Φ((T − m)/σ0)
      = 1 − Φ(Φ⁻¹(1 − α0) − m/σ0).


Solution 1.9. From class notes, we know that the test in this situation consists of comparing
the sum of the observations to a threshold given by

    T = √k σ Φ⁻¹(1 − α0).

Further, the power of the test is given by

    β = 1 − Φ(Φ⁻¹(1 − α0) − √k s/σ),

where in this case σ = √2.

(A) Let k = 9 and α0 = 0.01. We seek a value of s such that

    β = 1 − Φ(Φ⁻¹(0.99) − 3s/√2) ≥ 0.97.

Thus, we require that

    s ≥ (√2/3)(Φ⁻¹(0.99) − Φ⁻¹(0.03)) ≈ 1.98.

(B) Let s = 1 and α0 = 0.05. We seek a value of k such that

    β = 1 − Φ(Φ⁻¹(0.95) − √k/√2) ≥ 0.99.

Thus, we require that

    k ≥ 2(Φ⁻¹(0.95) − Φ⁻¹(0.01))² ≈ 31.6,

which implies that we should choose k = 32.
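Both values are quickly reproduced in Python (our sketch):

    import math
    from scipy.stats import norm

    sigma = math.sqrt(2.0)

    # (A) k = 9, alpha0 = 0.01: smallest s with beta >= 0.97
    print(sigma / 3 * (norm.ppf(0.99) - norm.ppf(0.03)))    # ~1.98

    # (B) s = 1, alpha0 = 0.05: smallest k with beta >= 0.99
    k_min = 2 * (norm.ppf(0.95) - norm.ppf(0.01))**2
    print(k_min, math.ceil(k_min))                          # ~31.5, so k = 32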

Solution 1.10. (A) From class notes, we know that the Neyman-Pearson test for this case
has the form

    Z_n := Σ_{i=1}^{n} X_i ≷ T,

where the threshold T is given by

    T = √n Φ⁻¹(0.995) = 2.576 √n.

Further, the test statistic Z_n has a Gaussian distribution with mean 0 and variance n under
H0 and a Gaussian distribution with mean n/2 and variance n under H1. Finally, recall from
the class notes that the power of the test (written as a function of n) is given by

    β(n) = 1 − Φ(Φ⁻¹(0.995) − √n/2) = 1 − Φ(2.576 − √n/2).

Note that β(9) = 1 − Φ(1.076) = 0.141.

(B) Note that the expected number of observations in this case is (1/2)(2) + (1/2)(16) = 9, which is
the same number of observations as were used in part (A). The power in this case, however,
is given by

    β = (1/2)β(2) + (1/2)β(16) = (1/2)(0.031) + (1/2)(0.282) = 0.157,

which is larger than the power of the Neyman-Pearson test considered in part (A). Thus, a
Neyman-Pearson test with a fixed number of observations can have a smaller power than a
test with a random number of observations even when the expected number of observations
for the second test is the same as the fixed number of observations used in the first test.
This does not violate the Neyman-Pearson lemma since that lemma was based on fixed
distributions and the distribution is not fixed when the number of observations is allowed to
vary.
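A two-line check (our sketch):

    from math import sqrt
    from scipy.stats import norm

    beta = lambda n: 1 - norm.cdf(norm.ppf(0.995) - sqrt(n) / 2)
    print(beta(9))                         # ~0.141, fixed n = 9
    print(0.5 * beta(2) + 0.5 * beta(16))  # ~0.157, random sample size with mean 9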

Solution 1.11. Let Y denote the number of Heads that we observe after two flips of the
coin. Note that under H0

    Y = 0 wp 1/4 =: p0;  1 wp 1/2 =: p1;  2 wp 1/4 =: p2,

and that under H1

    Y = 0 wp 4/9 =: q0;  1 wp 4/9 =: q1;  2 wp 1/9 =: q2.

Thus, if Λ_n = q_n/p_n, then Λ0 = 16/9, Λ1 = 8/9, and Λ2 = 4/9. Using our analogy of a buyer
with a limited budget, the most "valuable" point is n = 0 since it has the largest ratio of
value (q_n) to price (p_n). Unfortunately, the price of item 0 (p0) is 1/4, which exceeds our
budget (α0) of 1/8. Thus, we can only purchase a piece of item 0. That is, we must use a
randomized test.

Our test then will be to announce H1 with probability ρ if Y = 0, and to announce H0
otherwise. The size of our test is given by

    α = P0(Announce H1) = (1/4)ρ.

Since the Neyman-Pearson test requires that α = α0, it follows that ρ = 1/2. The power of
the test is given by

    β = P1(Y = 0) ρ = (4/9)(1/2) = 2/9.

Solution 1.12. Consider first the general situation in which we have the two hypotheses

    H0: X_j = N_j
    H1: X_j = s_j + N_j

for j = 1, ..., n, where the N_j's have a zero mean multivariate Gaussian distribution with
covariance matrix Σ. Under H0, the X_j's have joint density

    p0(x) = (2π)^{−n/2} (det Σ)^{−1/2} exp(−(1/2) xᵀΣ⁻¹x),

and under H1, the X_j's have joint density

    p1(x) = (2π)^{−n/2} (det Σ)^{−1/2} exp(−(1/2)(x − s)ᵀΣ⁻¹(x − s)).

Thus,

    Λ(x) = p1(x)/p0(x)
         = exp(−(1/2)(x − s)ᵀΣ⁻¹(x − s) + (1/2) xᵀΣ⁻¹x)
         = exp(−(1/2)(xᵀΣ⁻¹x − xᵀΣ⁻¹s − sᵀΣ⁻¹x + sᵀΣ⁻¹s − xᵀΣ⁻¹x))
         = exp(−(1/2)(−xᵀΣ⁻¹s − sᵀΣ⁻¹x + sᵀΣ⁻¹s)).

Note that

    xᵀΣ⁻¹s = sᵀ(Σ⁻¹)ᵀx = sᵀΣ⁻¹x,

since Σ is symmetric and sᵀΣ⁻¹x is a real scalar. Thus, after taking the natural log of Λ(x)
and cancelling constants, we obtain a test of the form

    xᵀΣ⁻¹s ≷ T.

Under H0, the test statistic XᵀΣ⁻¹s is Gaussian with mean zero and variance

    sᵀΣ⁻¹E[XXᵀ]Σ⁻¹s = sᵀΣ⁻¹ΣΣ⁻¹s = sᵀΣ⁻¹s.

Under H1, the test statistic XᵀΣ⁻¹s is Gaussian with mean sᵀΣ⁻¹s and variance sᵀΣ⁻¹s.
Thus,

    T = √(sᵀΣ⁻¹s) Φ⁻¹(1 − α0)

and

    β = 1 − Φ(Φ⁻¹(1 − α0) − √(sᵀΣ⁻¹s)).

Now, for the particular problem under consideration, we have n = 2, s1 = exp(−h), and
s2 = exp(−2h). Further,

    Σ = [1 1/2; 1/2 1],

and hence

    Σ⁻¹ = [4/3 −2/3; −2/3 4/3].

The processor is given by

    xᵀΣ⁻¹s = ((4/3)e^{−h} − (2/3)e^{−2h}) x1 + ((4/3)e^{−2h} − (2/3)e^{−h}) x2.

Further,

    sᵀΣ⁻¹s = [e^{−h} e^{−2h}] [4/3 −2/3; −2/3 4/3] [e^{−h}; e^{−2h}]
           = (4/3)(e^{−2h} − e^{−3h} + e^{−4h}).

Thus, with Φ⁻¹(1 − α0) = Φ⁻¹(0.9) ≈ 1.28, our test is given by

    ((4/3)e^{−h} − (2/3)e^{−2h}) x1 + ((4/3)e^{−2h} − (2/3)e^{−h}) x2 ≷ 1.28 √((4/3)(e^{−2h} − e^{−3h} + e^{−4h})).

If h = 0.1, then sᵀΣ⁻¹s = 0.9976 and β = 1 − Φ(1.28 − √0.9976) = 0.389.
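In matrix form the computation is immediate (our sketch):

    import numpy as np
    from scipy.stats import norm

    h = 0.1
    s = np.array([np.exp(-h), np.exp(-2 * h)])
    Sigma = np.array([[1.0, 0.5], [0.5, 1.0]])
    w = np.linalg.solve(Sigma, s)        # Sigma^{-1} s, the processor weights
    d2 = s @ w                           # s' Sigma^{-1} s
    T = norm.ppf(0.9) * np.sqrt(d2)      # threshold for alpha0 = 0.1
    print(d2)                            # ~0.9976
    print(1 - norm.cdf(norm.ppf(0.9) - np.sqrt(d2)))  # power ~0.389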

Solution 1.13. From class notes, we know that a Neyman-Pearson test for this situation
has the form

    Σ_{j=1}^{3} s_j X_j ≷ T,

where s_j = exp(−(j − 1)h) and

    T = σ √(Σ_{j=1}^{3} s_j²) Φ⁻¹(1 − α0).

Further, the power of the test is given by

    β = 1 − Φ(Φ⁻¹(0.95) − √(Σ_{j=1}^{3} s_j²)/σ),

where, since σ² = 1/10,

    √(Σ_{j=1}^{3} s_j²)/σ = √10 √(1 + exp(−2h) + exp(−4h)).

Thus, we seek the largest value of h for which

    β = 1 − Φ(1.65 − √10 √(1 + exp(−2h) + exp(−4h))) ≥ 0.99.

Setting y = exp(−2h), we are then seeking the largest value of h for which

    1.65 − √10 √(y² + y + 1) ≤ Φ⁻¹(0.01) = −2.33,

and hence such that y² + y − 0.584 ≥ 0. The roots of the corresponding quadratic equation
are y1 = 0.413 and y2 = −1.413. Since y must be positive, we conclude that y must be
greater than 0.413. If y > 0.413, then h < −(1/2) ln(0.413) ≈ 0.442.
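The same answer falls out of a direct numerical solve (our sketch):

    import numpy as np
    from scipy.stats import norm
    from scipy.optimize import brentq

    sigma = np.sqrt(0.1)               # variance 1/10
    alpha0, target = 0.05, 0.99

    def power(h):
        s = np.exp(-np.arange(3) * h)  # s_k = exp(-(k-1)h), k = 1, 2, 3
        return 1 - norm.cdf(norm.ppf(1 - alpha0) - np.linalg.norm(s) / sigma)

    print(brentq(lambda h: power(h) - target, 1e-6, 5.0))   # ~0.44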

Solution 1.14. To begin, note that

    p0(y) = 1 − λ0 if y = 0, and λ0 if y = 1,

and

    p1(y) = λ1 if y = 0, and 1 − λ1 if y = 1.

Thus,

    Λ(y) = p1(y)/p0(y) = λ1/(1 − λ0) if y = 0, and (1 − λ1)/λ0 if y = 1.

Note that

    P0(Λ(Y) > T) = 1 if T < λ1/(1 − λ0);
                 = λ0 if λ1/(1 − λ0) ≤ T < (1 − λ1)/λ0;
                 = 0 if T ≥ (1 − λ1)/λ0,

where we have noticed that since λ0 + λ1 < 1, it follows that λ1 < 1 − λ0 and 1 − λ1 > λ0,
and hence that

    λ1/(1 − λ0) < 1 < (1 − λ1)/λ0.

The Neyman-Pearson Lemma implies that

    P0(Λ(Y) > T) + p P0(Λ(Y) = T) = α0.

Solving for T and p, we see that

    T(α0) = (1 − λ1)/λ0 if 0 ≤ α0 < λ0;  λ1/(1 − λ0) if λ0 ≤ α0 < 1;  0 if α0 = 1,

and that

    p(α0) = α0/λ0 if 0 ≤ α0 < λ0;  (α0 − λ0)/(1 − λ0) if λ0 ≤ α0 < 1;  0 if α0 = 1.

(Note that T(1) may actually take on any value less than λ1/(1 − λ0), and that p(1) is
arbitrary.) Thus, our test is given by the following procedure:

    0 ≤ α0 < λ0:  Announce H1 with probability α0/λ0 if we observe '1'; else announce H0.
    λ0 ≤ α0 < 1:  Announce H1 if we observe '1' and with probability (α0 − λ0)/(1 − λ0) if we
                  observe '0'; else announce H0.
    α0 = 1:       Always announce H1.

The power of the test is given by

    β = P1(Λ(Y) > T) + p P1(Λ(Y) = T).

Thus, we see that

    β = (α0/λ0)(1 − λ1) if 0 ≤ α0 < λ0;
      = (1 − λ1) + ((α0 − λ0)/(1 − λ0)) λ1 if λ0 ≤ α0 < 1;
      = 1 if α0 = 1.

An ROC plot is given in Figure 3.1.

(Figure 3.1: Receiver Operating Characteristic curves for Problem 1.14, for λ0 = λ1 = 1/8
and λ0 = λ1 = 3/8.)
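The ROC curves are straightforward to tabulate or plot from the expression for β (our
sketch):

    import numpy as np

    def power(alpha, lam0, lam1):
        # power of the NP test as a function of its size alpha
        return np.where(alpha < lam0,
                        (alpha / lam0) * (1 - lam1),
                        (1 - lam1) + (alpha - lam0) / (1 - lam0) * lam1)

    alpha = np.linspace(0.0, 1.0, 11)
    print(power(alpha, 1/8, 1/8))
    print(power(alpha, 3/8, 3/8))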

Solution 1.15. From class notes, we know that the Neyman-Pearson test consists of com-
paring the sum of the observations to the threshold T = √n σ Φ⁻¹(1 − α0). Further, the
power of the test is given by

    β = 1 − Φ(Φ⁻¹(1 − α0) − √n v/σ).

Note that Φ⁻¹(1 − α0) = Φ⁻¹(0.95) = 1.65.

For System A, we seek the smallest value of n for which 1 − Φ(1.65 − 2√n) > 0.95. Solving
for n implies that we must have n > 2.7. Thus, for System A, we choose n = 3. For
System B, we seek the smallest value of n for which 1 − Φ(1.65 − √n) > 0.95. Solving for
n implies that we must have n > 10.9. Thus, for System B, we choose n = 11. The cost
of System A is $1,000,000 + (100 × 3 × $1000) = $1,300,000 and the cost of System B is
$250,000 + (100 × 11 × $1000) = $1,350,000. Thus, we should choose System A.
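The same bookkeeping in Python (our sketch):

    import math
    from scipy.stats import norm

    z = norm.ppf(0.95)

    def seconds_needed(snr):
        # smallest n with power 1 - Phi(z - snr*sqrt(n)) > 0.95
        n = 1
        while 1 - norm.cdf(z - snr * math.sqrt(n)) <= 0.95:
            n += 1
        return n

    for name, snr, purchase in (("A", 2.0, 1_000_000), ("B", 1.0, 250_000)):
        n = seconds_needed(snr)
        print(name, n, purchase + 100 * n * 1000)  # A: 3, $1,300,000; B: 11, $1,350,000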

Solution 1.16. (A) To begin, note that

    Λ(x) = p1(x)/p0(x) = (3/2) · 1/(x + 1)

for 0 < x < 1. Since Λ is decreasing in x, reducing the test yields a test of the form

    x ≶ T,

announcing H1 for small values of x. (Note the change in the inequalities.) For a minimax
test, we choose T so that Q0 = Q1 where Q0 = P0(X < T) and Q1 = P1(X > T). Thus,

    Q0 = Q1 ⟺ ∫₀ᵀ (2/3)(x + 1) dx = ∫_T¹ dx
            ⟺ (2/3)(T²/2 + T) = 1 − T
            ⟺ (1/3)T² + (5/3)T − 1 = 0.

Thus, since 0 < T < 1, we conclude that

    T = (−5 + √37)/2 ≈ 0.54138.

(B) For a Neyman-Pearson test, we will choose T so that Q0 = α0 where α0 = 1/10. That
is,

    1/10 = ∫₀ᵀ (2/3)(x + 1) dx = (1/3)T² + (2/3)T.

Again, since 0 < T < 1, we conclude that

    T = √130/10 − 1 ≈ 0.140175.

Note that the power of the test is given by

    β = P1(X < T) = ∫₀^{0.140175} dx = 0.140175.
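Both thresholds are roots of simple quadratics (our sketch):

    import numpy as np

    r = np.roots([1/3, 5/3, -1])      # minimax: Q0 = Q1
    print(r[r > 0])                   # ~0.54138

    r = np.roots([1/3, 2/3, -0.1])    # Neyman-Pearson: Q0 = 1/10
    print(r[r > 0])                   # ~0.140175, which also equals the power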

Solution 1.17. To begin, note that

    Λ(x) = p1(x)/p0(x) = 0 if 0 < x < 1/2;  1 if 1/2 < x < 1;  ∞ if 1 < x < 3/2.

Further, under H0,

    Λ(X) = 0 wp 1/2;  1 wp 1/2;  ∞ wp 0,

and under H1,

    Λ(X) = 0 wp 0;  1 wp 1/2;  ∞ wp 1/2.

The Neyman-Pearson lemma states that the threshold T and the randomization constant p
must be chosen so that

    1/8 = P0(Λ(X) > T) + p P0(Λ(X) = T).    (3.1)

Note that P0(Λ(X) > T) takes values only in {0, 1/2, 1}. Thus, we must choose a value of T
such that P0(Λ(X) > T) = 0 and such that P0(Λ(X) = T) ≠ 0. The only such value of T is
1. With T = 1, equation (3.1) implies that

    1/8 = p(1/2),

which thus implies that p = 1/4. Thus, our test has the form:

    Announce H1 if Λ(x) > 1
    Announce H1 wp 1/4 if Λ(x) = 1
    Announce H0 wp 3/4 if Λ(x) = 1
    Announce H0 if Λ(x) < 1.

In terms of x, the test has the form:

    Announce H1 if x > 1
    Announce H1 wp 1/4 if 1/2 < x < 1
    Announce H0 wp 3/4 if 1/2 < x < 1
    Announce H0 if x < 1/2.

Note that this test corresponds to φ3, which means that φ3 is a Neyman-Pearson test. What
about tests φ1 and φ2?

Although the test φ3 is constant on the threshold, nothing in the Neyman-Pearson lemma
says that a Neyman-Pearson test must be constant on the threshold. Actually, we can do
anything we want to do on the threshold so long as the size of the test is equal to α0. Note
that φ1 and φ2 are identical to φ3 off the threshold. Thus, φ1 and φ2 will be Neyman-Pearson
tests if they have a size equal to α0. The size of φ1 is given by

    P0(Announce H1) = P0(X > 7/8) = 1/8,

and the size of φ2 is given by

    P0(Announce H1) = P0(1/2 < X < 5/8) + P0(X > 1) = 1/8 + 0 = 1/8.

Thus, φ1 and φ2 are Neyman-Pearson tests, also.

Solution 1.18. Our strategy for this problem will be to first find a Neyman-Pearson test
of H0: {P0} versus H1: {P1}, and then to find a Neyman-Pearson test of H0: {P0}
versus H1: {P2}. If the two tests are identical then a UMP test exists for H0: {P0} versus
H1: {P1, P2}.

(A) For testing H0: {P0} versus H1: {P1} we have:

    n    P0({n})    P1({n})    Λ(n)
    0      1/3         0         0
    1      1/3        1/3        1
    2      1/3        2/3        2

Note that we can 'purchase' only a piece of one item. Thus, we randomize on the point
n = 2, and we obtain a test that announces H1 with probability 1/2 if we observe '2' and
we announce H0 otherwise. For testing H0: {P0} versus H1: {P2} we have:

    n    P0({n})    P2({n})    Λ(n)
    0      1/3        1/3        1
    1      1/3         0         0
    2      1/3        2/3        2

Again, note that we can 'purchase' only a piece of one item. Thus, we randomize on the
point n = 2, and we obtain a test that announces H1 with probability 1/2 if we observe '2'
and we announce H0 otherwise. Since these two tests are identical, we conclude that this
test is a UMP test for testing H0: {P0} versus H1: {P1, P2} at level 1/6.

(B) For testing H0: {P0} versus H1: {P1} we have:

    n    P0({n})    P1({n})    Λ(n)
    0      1/3         0         0
    1      1/3        1/3        1
    2      1/3        2/3        2

Note that we can 'purchase' one entire item and a piece of a second item. Thus, we purchase
item '2' and randomize on the point n = 1. We then obtain a test that announces H1 if we
observe '2', that announces H1 with probability 1/2 if we observe '1', and that announces
H0 otherwise. For testing H0: {P0} versus H1: {P2} we have:

    n    P0({n})    P2({n})    Λ(n)
    0      1/3        1/3        1
    1      1/3         0         0
    2      1/3        2/3        2

Again, note that we can 'purchase' one entire item and a piece of a second item. Thus, we
purchase item '2' and randomize on the point n = 0. We then obtain a test that announces
H1 if we observe '2', that announces H1 with probability 1/2 if we observe '0', and that
announces H0 otherwise. Since these two tests are not the same, we conclude that there does
not exist a UMP test for testing H0: {P0} versus H1: {P1, P2} at level 1/2.

Solution 1.19. Let w_i = cos(t_i) and let s_i = sin(t_i) for i = 1, ..., k. Note that

    p0(y1, ..., yk) = (1/(σ√(2π)))^k exp(−(1/(2σ²)) Σ_{j=1}^{k} (y_j − w_j)²)

and that

    p1(y1, ..., yk) = (1/(σ√(2π)))^k exp(−(1/(2σ²)) Σ_{j=1}^{k} (y_j − s_j)²).

Thus,

    p1(y1, ..., yk)/p0(y1, ..., yk) = exp(−(1/(2σ²)) Σ_{j=1}^{k} ((y_j − s_j)² − (y_j − w_j)²)).

Reducing this test further by taking the natural log of each side and cancelling constants
yields a test of the form

    X_k := Σ_{j=1}^{k} Y_j (s_j − w_j) ≷ T.

Under H0, X_k is Gaussian with mean

    μ0 = Σ_{j=1}^{k} w_j (s_j − w_j)

and variance

    ν² = σ² Σ_{j=1}^{k} (s_j − w_j)².

Under H1, X_k is Gaussian with mean

    μ1 = Σ_{j=1}^{k} s_j (s_j − w_j)

and variance ν². We choose T so that P0(X_k > T) = α0. Since

    P0(X_k > T) = P0((X_k − μ0)/ν > (T − μ0)/ν) = 1 − Φ((T − μ0)/ν),

it follows that

    T = σ √(Σ_{j=1}^{k} (s_j − w_j)²) Φ⁻¹(1 − α0) + Σ_{j=1}^{k} w_j (s_j − w_j).

The power β of the test is given by

    β = 1 − Φ(Φ⁻¹(1 − α0) − (μ1 − μ0)/ν),

where

    (μ1 − μ0)/ν = (Σ_{j=1}^{k} s_j(s_j − w_j) − Σ_{j=1}^{k} w_j(s_j − w_j)) / (σ √(Σ_{j=1}^{k} (s_j − w_j)²))
                = (1/σ) √(Σ_{j=1}^{k} (s_j − w_j)²).

To maximize the power, we want this latter term to be large. Thus, our goal is to choose
the t_j's so that (s_j − w_j)² is as large as possible. Note that

    (s_j − w_j)² = (sin(t_j) − cos(t_j))²
                 = sin²(t_j) − 2 sin(t_j) cos(t_j) + cos²(t_j)
                 = 1 − 2 sin(t_j) cos(t_j)
                 = 1 − sin(2t_j).

This term is maximized when sin(2t_j) = −1. This occurs when

    t_j = jπ − π/4

for j ∈ ℕ.

Solution 1.20. To begin, note that

    Λ(x) = p1(x)/p0(x) = exp(−((x − a1) − (x − a0))/b) = exp((a1 − a0)/b) for x ≥ a0,
           and ∞ for a1 ≤ x < a0.

Thus, taking the natural log and cancelling constants, we obtain a test of the form

    S(x) ≷ T

where

    S(x) = (a1 − a0) I_{[a0,∞)}(x) + ∞ · I_{[a1,a0)}(x).

As usual, we choose T and p so that

    P0(S(X) > T) + p P0(S(X) = T) = α0.

Since, under H0, S(X) = a1 − a0 with probability one, it follows that T must be a1 − a0.
For this choice of T, it follows that p = α0. Our test then is to announce H1 if S(x) = ∞
(i.e. if a1 ≤ x < a0), to announce H1 with probability α0 if S(x) = a1 − a0 (i.e. if x ≥ a0),
and to announce H0 otherwise.

To find the power of our test, we must find the distribution of S(X) under H1. Note that

    P1(S(X) = a1 − a0) = P1(X ≥ a0)
                       = 1 − P1(X < a0)
                       = 1 − ∫_{a1}^{a0} (1/b) exp(−(x − a1)/b) dx
                       = exp((a1 − a0)/b).

Thus, under H1,

    S(X) = a1 − a0 wp exp((a1 − a0)/b), and ∞ wp 1 − exp((a1 − a0)/b).

Thus,

    β = P1(S(X) > T) + p P1(S(X) = T)
      = P1(S(X) > a1 − a0) + α0 P1(S(X) = a1 − a0)
      = 1 − exp((a1 − a0)/b) + α0 exp((a1 − a0)/b)
      = 1 − (1 − α0) exp((a1 − a0)/b).

Solution 1.21. Recall from the class notes that the test in this case consists of comparing
the sum of the observations to the threshold

τ = σ√k Φ^{-1}(1 - α0).

Further, the power of the test is given by

β = 1 - Φ( Φ^{-1}(1 - α0) - s√k/σ ).

This is a Neyman-Pearson test for any choice of α0. Our goal is to find a value of α0 for
which the corresponding Neyman-Pearson test is also a minimax test. From class notes, we
know that such a value of α0 will be 1 - β. (That is, if α0 = 1 - β then Q0 = Q1, which is
the minimax equation.) Note that

1 - β = Φ( Φ^{-1}(1 - α0) - s√k/σ ).

Thus, α0 = 1 - β if

α0 = Φ( Φ^{-1}(1 - α0) - s√k/σ ).

Solving this equation for α0 (using Φ^{-1}(α0) = -Φ^{-1}(1 - α0)) implies that

α0 = 1 - Φ( s√k/(2σ) ).

When s = 2, k = 100, and σ² = 9, it follows that α0 ≈ 0.0004.
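A one-line computation (ours) reproduces this value:

import math
from statistics import NormalDist

def minimax_level(s, k, sigma):
    # alpha0 = 1 - Phi(s*sqrt(k)/(2*sigma)) equates the two error
    # probabilities, so the resulting Neyman-Pearson test is minimax.
    return 1 - NormalDist().cdf(s * math.sqrt(k) / (2 * sigma))

print(minimax_level(s=2, k=100, sigma=3))   # ~ 0.0004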

Solution 1.22. To begin, note that

Λ(x) = p_1(x)/p_0(x) = { 4x         if 0 < x < 1/2
                       { 4 - 4x     if 1/2 ≤ x < 1
                     = 4x I_(0,1/2)(x) + 4(1 - x) I_(1/2,1)(x).

(A) A Bayes test with equal prior probabilities is given by

Λ(x) ≷ 1   (announce H1 if >, announce H0 if <).

Reducing, we obtain the test

S(x) = x I_(0,1/2)(x) + (1 - x) I_(1/2,1)(x) ≷ 1/4.

Thus, we announce H1 if 1/4 < x < 3/4, and we announce H0 otherwise.

(B) For a minimax test given by

S(X) ≷ τ   (announce H1 if >, announce H0 if <),

we choose τ so that Q0 = Q1, where Q0 = P0(S(X) > τ) and Q1 = P1(S(X) < τ). Note
that, for 0 < τ < 1/2,

Q0 = P0(X > τ, 0 < X < 1/2) + P0(1 - X > τ, 1/2 < X < 1)
   = P0(τ < X < 1/2) + P0(1/2 < X < 1 - τ)
   = (1/2 - τ) + (1 - τ - 1/2)
   = 1 - 2τ,

and

Q1 = P1(X < τ, 0 < X < 1/2) + P1(1 - X < τ, 1/2 < X < 1)
   = P1(0 < X < τ) + P1(1 - τ < X < 1)
   = ∫_0^τ 4x dx + ∫_{1-τ}^1 4(1 - x) dx
   = 2τ² + 4(1 - (1 - τ)) - 2(1 - (1 - τ)²)
   = 4τ².

Thus, Q0 = Q1 if 1 - 2τ = 4τ², which holds if τ = (√5 - 1)/4 ≈ 0.309, where we have
discarded the negative root.

(C) For a Neyman-Pearson test at level 1/100, we choose τ so that Q0 = 1/100. That is,
we choose τ so that 1 - 2τ = 1/100. Solving implies that τ = 0.495. The power of the test
is given by 1 - Q1, which in this case is 1 - 4τ² = 0.0199.
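The thresholds in (B) and (C) are quick to verify numerically; this check is ours.

import math

T = (-2 + math.sqrt(4 + 16)) / 8      # positive root of 4T^2 + 2T - 1 = 0
print(T)                              # ~ 0.309
print(1 - 2 * T, 4 * T * T)           # Q0 and Q1 agree at the minimax T

T_np = 0.495                          # level-1/100 threshold from (C)
print(1 - 4 * T_np ** 2)              # power ~ 0.0199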

Solution 1.23. (A) To begin, note that

Λ(x) = { exp( -((x - 1) - x) ) = e   if x ≥ 1
       { 0                           if 0 ≤ x < 1.

Under H0,

Λ(X) = { e   with probability 1/e
       { 0   with probability 1 - 1/e,

and under H1,

Λ(X) = { e   with probability 1
       { 0   with probability 0.

We choose τ and p so that

P0(Λ(X) > τ) + p P0(Λ(X) = τ) = 1/10.

Thus, it follows that τ = e and p = e/10. The resulting test announces H1 with probability
e/10 if x ≥ 1 and announces H0 otherwise. The power of the test is given by

β = 0 + (e/10) × 1 = e/10.

(B) The critical function of any Neyman-Pearson test in this situation must have the form

φ(x) = { 1   if Λ(x) > e
       { 0   if Λ(x) < e.

Note that Λ(x) is never greater than e. Thus, the only requirement on φ is that it be
zero when Λ(x) < e; that is, when 0 ≤ x < 1. If we require in addition that our test be
nonrandom, then we must require that φ take on only the values 0 or 1. A test that satisfies
both conditions is given by

φ_0(x) = { 1   if x > λ
         { 0   if x ≤ λ,

where λ > 1. The value of λ is determined by the requirement that the size of our test be
1/10. That is,

1/10 = P0(Announce H1) = P0(X > λ) = ∫_λ^∞ e^{-x} dx = exp(-λ).

Thus, λ = ln(10), and our nonrandomized test announces H1 when x > ln(10) and announces
H0 otherwise. Note that both the randomized test in (A) and the nonrandomized test in (B)
are Neyman-Pearson tests. However, the test in (A) is constant on the threshold, whereas
the test in (B) is not.
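Both tests can be checked by simulation; the sketch below is ours and assumes, as above, that X is unit exponential under H0.

import math, random

random.seed(2)
trials = 500_000
rand_rejects = nonrand_rejects = 0
for _ in range(trials):
    x = random.expovariate(1.0)                       # a sample under H0
    if x >= 1 and random.random() < math.e / 10:      # randomized test of (A)
        rand_rejects += 1
    if x > math.log(10):                              # nonrandom test of (B)
        nonrand_rejects += 1

print(rand_rejects / trials, nonrand_rejects / trials)   # both ~ 0.1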

Solution 1.24. Let f(x) denote a Gaussian density function with mean 0 and variance σ².
Then, under H0, it follows that Y has density f(y), and, under H1, it follows that Y has
density

(1/2) f(y - θ) + (1/2) f(y + θ).

That is, under H1, Y's distribution is an even mixture of two Gaussian distributions: one
centered at θ and one centered at -θ. Thus,

Λ(y) = p_1(y)/p_0(y)
     = [ (1/(2σ√(2π))) exp( -(y - θ)²/(2σ²) ) + (1/(2σ√(2π))) exp( -(y + θ)²/(2σ²) ) ]
       / [ (1/(σ√(2π))) exp( -y²/(2σ²) ) ]
     = (1/2) exp( -θ²/(2σ²) ) [ exp( yθ/σ² ) + exp( -yθ/σ² ) ]
     = exp( -θ²/(2σ²) ) cosh( yθ/σ² ),

and our reduced test has the form

cosh( Yθ/σ² ) ≷ τ   (announce H1 if >, announce H0 if <).

Note that this test is not UMP since it depends upon the parameter θ.

Solution 1.25. Note that

Λ(k) = P1(X = k)/P0(X = k) = (λ1/λ0)^k exp( -(λ1 - λ0) ).

Taking the log of the test and noting that ln(λ1/λ0) is positive, we obtain a test of the form

k ≷ τ   (announce H1 if >, announce H0 if <),

where k is the realization of our test statistic X. As usual, we choose τ and p so that

α0 = P0(X > τ) + p P0(X = τ).

The test will be nonrandom if α0 = P0(X > τ) for some choice of τ. Let ⌈τ⌉ denote the
smallest integer not less than τ. Then, for noninteger τ,

P0(X > τ) = Σ_{k=⌈τ⌉}^∞ e^{-λ0} λ0^k / k!.

Thus, the test is nonrandom if α0 may be expressed in the form

α0 = Σ_{k=m}^∞ e^{-λ0} λ0^k / k!

for some integer m.


Let λ0 = 1 and α0 = 0.02. By trial and error we first seek the largest value of τ for which
P0(X > τ) is not greater than 0.02. Note that

P0(X > 3) = 1 - Σ_{k=0}^3 (1/k!) e^{-1}
          = 1 - (1/e)(1 + 1 + 1/2 + 1/6)
          = 1 - 8/(3e)
          ≈ 0.019.

Further,

P0(X = 3) = 1/(6e) ≈ 0.061.

Thus, we seek p so that

0.02 = P0(X > 3) + p P0(X = 3) ≈ 0.019 + 0.061p,

which implies that p ≈ 0.016. Thus, our test announces H1 if X > 3, announces H1 with
probability 0.016 if X = 3, and announces H0 otherwise.
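The trial-and-error search is easy to automate. The following sketch is ours (the function name is hypothetical); it returns the threshold and randomization probability for any λ0 and α0.

import math

def poisson_np_test(lam0, alpha0):
    pmf = lambda k: math.exp(-lam0) * lam0 ** k / math.factorial(k)
    T, tail = 0, 1 - pmf(0)          # tail = P0(X > T)
    while tail > alpha0:             # largest T with P0(X > T) <= alpha0
        T += 1
        tail -= pmf(T)
    p = (alpha0 - tail) / pmf(T)     # randomize on the boundary point
    return T, p

print(poisson_np_test(1.0, 0.02))    # (3, ~0.016)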

3.3. Locally Optimal Tests

Solution 1.26. (A) The density of the noise is given by

f_N(x) = (3/4)(1 - x²)

for |x| < 1. Thus,

ln f_N(x) = ln(3/4) + ln(1 - x²),

and

(d/dx) ln f_N(x) = -2x/(1 - x²)

for |x| < 1. Thus, the locally optimal processor is given by

g_lo(x) = -(d/dx) ln f_N(x) = 2x/(1 - x²).


(B) From class notes, we know that

g_NP(x) = ∫_{x-s}^x g_lo(u) du
        = -ln(1 - u²) |_{u=x-s}^{u=x}
        = -ln(1 - x²) + ln(1 - (x - s)²)
        = ln( (1 - (x - s)²)/(1 - x²) ).
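As a check (ours), a crude numerical integration of g_lo over (x - s, x) reproduces the closed form for g_NP.

import math

g_lo = lambda u: 2 * u / (1 - u ** 2)

def g_np_closed(x, s):
    return math.log((1 - (x - s) ** 2) / (1 - x ** 2))

def g_np_numeric(x, s, n=10_000):
    h = s / n                        # midpoint rule on (x - s, x)
    return sum(g_lo(x - s + (i + 0.5) * h) for i in range(n)) * h

x, s = 0.5, 0.3
print(g_np_closed(x, s), g_np_numeric(x, s))   # should agree closely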

Solution 1.27. (A) From class notes, we know that the test statistic in this case is given
by

Z = Σ_{i=1}^3 sgn(X_i).

Under H0,

Z = {  3   with probability 1/8
    {  1   with probability 3/8
    { -1   with probability 3/8
    { -3   with probability 1/8.

We choose τ and p so that

1/16 = P0(Z > τ) + p P0(Z = τ).

Solving as usual we see that τ = 3 and p = 1/2. Thus, our test announces H1 with probability
1/2 if all of the observations are positive and announces H0 otherwise.

(B) Note that under H1,

sgn(X_i) = {  1   if N_i > -s
           { -1   if N_i < -s.

Thus,

P1(Z = 3) = P(N_1 > -s) P(N_2 > -s) P(N_3 > -s)
          = ( ∫_{-s}^∞ (1/2) exp(-|x|) dx )³
          = ( 1/2 + ∫_0^s (1/2) exp(-x) dx )³
          = ( 1 - (1/2) exp(-s) )³,

and hence

β = P1(Z > 3) + (1/2) P1(Z = 3)
  = 0 + (1/2) ( 1 - (1/2) exp(-s) )³
  = (1/2) ( 1 - (1/2) exp(-s) )³.

We seek the smallest value of s for which β ≥ 0.49. Solving numerically, we find that
β ≥ 0.49 once s exceeds approximately 4.3.

(C) The power β is never greater than 1/2, since β = (1/2)(1 - (1/2)exp(-s))³ < 1/2 for
every s.
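A short numerical search (ours) recovers the value of s found in (B).

import math

beta = lambda s: 0.5 * (1 - 0.5 * math.exp(-s)) ** 3

s = 0.0
while beta(s) < 0.49:
    s += 0.01
print(s, beta(s))    # s ~ 4.3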

Solution 1.28. (A) To begin, note that

Λ(x) = p_1(x)/p_0(x) = (1 + x²)/(1 + (x - s)²).

Also, note that Λ(x) → 1 as x → ±∞ and that Λ(s/2) = 1. Further, Λ(x) > 1 when
x > s/2 and Λ(x) < 1 when x < s/2. Consider Figure 3.2. For the threshold T shown in the
figure, it is clear that our test will announce that no signal is present when the observation
is sufficiently large! This phenomenon is due to the very heavy tails of the Cauchy noise
density. Finally, note that the test is not UMP since it depends upon the signal s.

[Figure 3.2: Neyman-Pearson processor for Problem 1.28. The likelihood ratio Λ(x) exceeds
the threshold T only on a bounded interval; the test announces H1 there and announces H0
elsewhere.]

(B) Let

p_1(x, s) = 1/( π(1 + (x - s)²) )

and note that

(∂/∂s) p_1(x, s) = 2(x - s)/( π(1 + (x - s)²)² ).

Thus, the locally optimal processor is given by

S(x) = [ lim_{s↓0} (∂/∂s) p_1(x, s) ] / p_0(x)
     = [ 2x/( π(1 + x²)² ) ] · π(1 + x²)
     = 2x/(1 + x²).

Note that this processor exhibits the same unusual behavior that we observed in part (A).

4. Estimation Solutions

4.1. General Problems

Solution 2.1. If X denotes the number that we observe, then it would not seem unreason-
able to suppose that X = i with probability 1/N for 1 ≤ i ≤ N. Note that

E[X] = Σ_{i=1}^N i (1/N) = (1/N) · N(N + 1)/2 = (N + 1)/2.

Thus, E[2X - 1] = N, which means that 2X - 1 is an unbiased estimate of N. In general,
if we observe k numbers X_1, ..., X_k, then 2X̄ - 1 is an unbiased estimate of N, where

X̄ = (X_1 + ... + X_k)/k.

On the other hand, if we observe x_1, ..., x_k, then our goal might be to find a value N̂ for N
so that

Π_{i=1}^k P(X_i = x_i)

is maximized. If N < x_i for some value of i then P(X_i = x_i) = 0 and the entire product is
zero. If N ≥ max_{1≤i≤k}{x_i}, then the product is equal to N^{-k}, which is maximized when N
is as small as possible. Thus, the product is maximized when N̂ = max_{1≤i≤k}{x_i}.

Thus, if we observe a runner with the number 87, then an unbiased estimate for the total
number of runners is 173 and a maximum likelihood estimate for the total number of runners
is 87.
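A simulation sketch (ours; the values of N and k are illustrative) contrasts the two estimates: the first is unbiased but can exceed any observation, while the maximum likelihood estimate never overshoots yet is biased low.

import random

random.seed(3)
N, k, trials = 200, 5, 20_000
err_unbiased = err_ml = 0.0
for _ in range(trials):
    xs = [random.randint(1, N) for _ in range(k)]
    err_unbiased += (2 * sum(xs) / k - 1) - N   # unbiased: 2*mean - 1
    err_ml += max(xs) - N                       # maximum likelihood

print(err_unbiased / trials, err_ml / trials)   # ~ 0 and roughly -N/(k+1)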

Solution 2.2. Note that

p_θ(x) = (1/√(2πσ²)) exp( -(x - θ)²/(2σ²) ),

and

(∂²/∂θ²) ln p_θ(x) = -1/σ².

Thus,

I(θ) = -E_θ[ (∂²/∂θ²) ln p_θ(X) ] = 1/σ².

Solution 2.3. Note that


and

Thus,

Solution 2.4. (A) Note first that E[X_1] = E[X_2] = 1 and E[X_3] = 7/4. Hence, our estimate
will be correct when it states that X_3's distribution has the largest mean. Thus, taking one
sample from each distribution, the probability that our estimator will correctly determine
the distribution with the largest mean is simply the probability that X_1 ≤ X_3 and X_2 ≤ X_3.
That is, the probability our estimate is correct is given by

P(X_1 ≤ X_3, X_2 ≤ X_3) = P(X_3 = 5/2) + P(X_3 = 1) P(X_1 = 0) P(X_2 = 0)
                        = 1/2 + (1/2 × 1/2 × 1/2)
                        = 5/8.

(B) Let X̄ denote the sample mean of the two samples from the third distribution, and note
that

X̄ = { 1     with probability 1/4
    { 5/2   with probability 1/4
    { 7/4   with probability 1/2.

Thus, the probability our estimator is correct in this case is given by

P(X_1 ≤ X̄, X_2 ≤ X̄) = P(X̄ = 5/2)
                        + P(X̄ = 7/4) P(X_1 = 0) P(X_2 = 0)
                        + P(X̄ = 1) P(X_1 = 0) P(X_2 = 0)
                      = 1/4 + (1/2 × 1/2 × 1/2) + (1/4 × 1/2 × 1/2)
                      = 7/16.

(C) No. The probability of being correct decreased in (B) even though the number of
observations increased.

Solution 2.5. (A) Since f_1(0) > f_2(0), it follows that f_1(0) - f_2(0) > 0. Thus, since f_1 and
f_2 are continuous, it follows that f_1(x) - f_2(x) > 0 for all x ∈ [0, ε] for some ε > 0. Further,
since f_1 and f_2 are even, it follows that f_1(x) - f_2(x) > 0 for all x ∈ [-ε, ε]. Thus, since

∫_{-ε}^{ε} [ f_1(x) - f_2(x) ] dx > 0,

it follows that

∫_{θ-ε}^{θ+ε} f_1(x - θ) dx > ∫_{θ-ε}^{θ+ε} f_2(x - θ) dx,

where the left side is P(|X - θ| ≤ ε) and the right side is P(|Y - θ| ≤ ε).

(B) Note that (1/2)(X_1 + X_2) has a density function given by

g(x - θ) = 2 ∫_{-∞}^{∞} f(2x - t - θ) f(t - θ) dt.

Using the result from (A), the desired result here will follow if we can show that f(0) > g(0).
Note that

f(0) = (k - 1)/2,

and¹

g(0) = 2 ∫_{-∞}^{∞} f²(s) ds   (since f is even)
     = 2 ∫_{-∞}^{∞} ((k - 1)²/4) (1 + |s|)^{-2k} ds
     = (k - 1)²/(2k - 1).

Thus, f(0) > g(0) since we have assumed that k > 3.

¹ See I. S. Gradshteyn and I. M. Ryzhik, Table of Integrals, Series, and Products, Corrected
and Enlarged Edition (Academic Press: Orlando, 1980), p. 285, 3.194(3).

4.2. Unbiased Estimators

Solution 2.6. Note that T is unbiased since

E[T(X)] = Σ_{k=0}^∞ (-2)^k e^{-λ} λ^k/k! = e^{-λ} Σ_{k=0}^∞ (-2λ)^k/k! = e^{-λ} e^{-2λ} = e^{-3λ}.

However, T is not a reasonable estimator since it is negative when X is an odd integer
yet e^{-3λ} is positive. Indeed, the estimate T oscillates wildly between positive and negative
values.
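A direct summation (ours) verifies the unbiasedness claim for one value of λ while making the oscillation evident.

import math

lam = 0.5
# E[(-2)^X] computed by summing the Poisson series:
mean = sum((-2.0) ** k * math.exp(-lam) * lam ** k / math.factorial(k)
           for k in range(60))
print(mean, math.exp(-3 * lam))   # both ~ 0.22313
# Yet the individual estimates are 1, -2, 4, -8, ...: unbiased but absurd.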

Solution 2.7. Note that

VAR[T] = E[T²] - E²[T] = E[T²] - θ².

Thus, E[T²] = VAR[T] + θ², which implies that E[T²] = θ² if and only if VAR[T] = 0; that
is, if and only if T is constant with probability one.

Solution 2.8. If T(X_1, ..., X_N) is an unbiased estimator of g(θ) then

g(θ) = E_θ[T(X_1, ..., X_N)]
     = Σ_{x_1=0}^1 Σ_{x_2=0}^1 ... Σ_{x_N=0}^1 T(x_1, ..., x_N) P(X_1 = x_1, ..., X_N = x_N)
     = Σ_{x_1=0}^1 Σ_{x_2=0}^1 ... Σ_{x_N=0}^1 T(x_1, ..., x_N) θ^{Σ_{i=1}^N x_i} (1 - θ)^{N - Σ_{i=1}^N x_i}.

For each fixed x_1, ..., x_N the summand is a polynomial in θ of degree at most N, and a sum
of such polynomials is again a polynomial in θ of degree at most N. Thus, g(θ) must be a
polynomial of degree not greater than N in order for T to possibly be an unbiased estimator.
As an application of this result, note that there exists no unbiased estimate of the odds
θ/(1 - θ) of 'heads' versus 'tails' for a coin that comes up 'heads' with probability θ.


Solution 2.9. Assume that T(X) is an unbiased estimator of 1/λ. Then it follows that

E[T(X)] = Σ_{k=0}^∞ T(k) e^{-λ} λ^k/k! = 1/λ.

Rearranging terms, we see that

Σ_{k=0}^∞ (T(k)/k!) λ^k = e^λ/λ.

Thus, if such an estimator T exists then there must exist a Taylor series expansion for e^λ/λ
about the origin. However, no such Taylor series expansion exists since e^λ/λ is not defined
at the origin. Thus, we conclude that no such estimator T exists.

Solution 2.10. Since T is unbiased, it follows that

VAR_θ[T] ≥ 1/I(θ),

where

I(θ) = E_θ[ -(∂²/∂θ²) ln( (1/√(2π))^n exp( -(1/2)((X_1 - θ)² + ... + (X_n - θ)²) ) ) ]
     = E_θ[ -(∂²/∂θ²) ( -(1/2)((X_1 - θ)² + ... + (X_n - θ)²) ) ]
     = E_θ[ -(∂/∂θ) ((X_1 - θ) + ... + (X_n - θ)) ]
     = E_θ[ -(∂/∂θ) (X_1 + ... + X_n - nθ) ]
     = E_θ[n]
     = n.

Thus, the variance of T is lower bounded by 1/n.

Solution 2.11. In terms of λ, X has a discrete probability density function given by

P(X = x) = e^{-λ} λ^x/x! ≡ P_λ(x)

for x = 0, 1, 2, .... In terms of θ = exp(-λ), X has density

P_θ(x) = θ (-ln θ)^x / x!

for x = 0, 1, 2, .... Note that

ln P_θ(x) = ln θ + x ln(-ln θ) - ln x!

and thus that

(∂/∂θ) ln P_θ(x) = 1/θ + x (∂/∂θ) ln(-ln θ) = 1/θ + x/(θ ln θ).

Further,

I(θ) = -1/(θ² ln θ) = exp(2λ)/λ.

In addition, note that

E_θ[T(X)] = Σ_{k=0}^∞ T(k) P(X = k) = T(0) P(X = 0) = 1 × θ(-ln θ)^0/0! = θ,

which implies that T is unbiased. Also,

E_θ[T²(X)] = Σ_{k=0}^∞ T²(k) P(X = k) = T²(0) P(X = 0) = 1² × θ(-ln θ)^0/0! = θ.

Thus, VAR_θ[T(X)] = θ - θ² = exp(-λ) - exp(-2λ). Since exp(λ) > λ + 1 when λ > 0, it
follows that

exp(-λ) - exp(-2λ) = exp(-2λ)(exp(λ) - 1) > λ exp(-2λ),

or that

e^{-λ} - e^{-2λ} > λ e^{-2λ} = 1/I(θ),

which implies that T is not efficient.

4.3. Maximum Likelihood Estimation

Solution 2.12. No. A maximum likelihood estimate of a mean need not be the sample
mean. For example, if X_1, ..., X_n are mutually independent and uniform on (θ - 1/2, θ + 1/2),
then a maximum likelihood estimate of the mean θ is given by

(1/2) ( min_{1≤i≤n}{X_i} + max_{1≤i≤n}{X_i} ),

which is not the sample mean if n > 2. (See Problem 2.16.)
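A simulation sketch (ours) illustrates the point: for uniform data the midrange estimate typically beats the sample mean in mean square error.

import random

random.seed(5)
theta, n, trials = 3.0, 20, 20_000
se_mid = se_mean = 0.0
for _ in range(trials):
    xs = [random.uniform(theta - 0.5, theta + 0.5) for _ in range(n)]
    midrange = (min(xs) + max(xs)) / 2    # a maximum likelihood estimate
    mean = sum(xs) / n                    # the sample mean
    se_mid += (midrange - theta) ** 2
    se_mean += (mean - theta) ** 2

print(se_mid / trials, se_mean / trials)  # midrange has the smaller MSE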
Solution 2.13. A maximum likelihood estimate of p is given by

T(x) = { 1/3   if x = 0
       { 2/3   if x = 1.

Thus, the mean square error of T is given by

(T(0) - p)² P(X = 0) + (T(1) - p)² P(X = 1) = (1/3 - p)²(1 - p) + (2/3 - p)² p
                                            = (3p² - 3p + 1)/9.

Let S(x) = 1/2 for x = 0 or 1. The mean square error of S is given by

(S(0) - p)² P(X = 0) + (S(1) - p)² P(X = 1) = (1/2 - p)²(1 - p) + (1/2 - p)² p
                                            = (4p² - 4p + 1)/4.

Note that the mean square error of S is smaller than the mean square error of T when
24p² - 24p + 5 < 0; that is, for p ∈ (1/2 - √6/12, 1/2 + √6/12) ≈ (0.30, 0.70).

Solution 2.14. If

X = { 1   if the coin flip is a 'head'
    { 0   if the coin flip is a 'tail',

then

P(X = i) = { θ       if i = 1
           { 1 - θ   if i = 0,

where θ ∈ Θ = (0, 1). No maximum likelihood estimate of θ exists since the likelihood
possesses no maximum on Θ.

A remedy to the problem is to let Θ = [0, 1]. In this case, a maximum likelihood estimate
of θ is given by

θ̂(x) = { 1   if x = 1
       { 0   if x = 0.

That is, if we include physically impossible values in Θ, then a maximum likelihood estimator
exists and it returns as its estimate the physically impossible values that we included!

Solution 2.15. (A) From class notes, we know that a maximum likelihood estimate in this
case is given by the sample mean,

θ̂(X_1, ..., X_k) = (1/k) Σ_{j=1}^k X_j.

Since

E[θ̂(X_1, ..., X_k)] = (1/k) Σ_{j=1}^k E[X_j] = θ,

it follows that θ̂ is unbiased. Note that if

p_θ(x_1, ..., x_k) = (1/(σ√(2π)))^k exp( -(1/(2σ²)) Σ_{j=1}^k (x_j - θ)² ),

then

(∂/∂θ) ln p_θ(x_1, ..., x_k) = (1/σ²) Σ_{j=1}^k (x_j - θ)

and

(∂²/∂θ²) ln p_θ(x_1, ..., x_k) = -k/σ².

Thus,

I(θ) = k/σ².

Further,

VAR[θ̂(X_1, ..., X_k)] = (1/k²) Σ_{j=1}^k VAR[X_j] = σ²/k.

Since VAR[θ̂(X_1, ..., X_k)] = 1/I(θ), we see that θ̂ is efficient.

(B) Note that the likelihood p_θ(x_1, ..., x_k) is maximized over θ at the sample mean θ̂, and
thus that the resulting generalized likelihood ratio is a monotone function of (Σ_{j=1}^k x_j)².
Thus, our test has the form

( Σ_{j=1}^k X_j )² ≷ τ²   (announce H1 if >, announce H0 if <),

where we choose τ ≥ 0 so that

α0 = P0( (Σ_{j=1}^k X_j)² > τ² )
   = 1 - P0( -τ ≤ Σ_{j=1}^k X_j ≤ τ )
   = 2( 1 - Φ( τ/(σ√k) ) ),

since

Σ_{j=1}^k X_j ~ N(0, kσ²)

under H0. Thus,

τ = -√k σ Φ^{-1}(α0/2).
Note that

Σ_{j=1}^k X_j ~ N(kθ, kσ²)

under H1. Thus, the power β_G of the test is given by

β_G = 1 - ∫_{-τ}^{τ} (1/(σ√(2πk))) exp( -(u - kθ)²/(2kσ²) ) du
    = 1 + Φ( (-τ - kθ)/(σ√k) ) - Φ( (τ - kθ)/(σ√k) )
    = 1 + Φ( Φ^{-1}(α0/2) - θ√k/σ ) - Φ( -Φ^{-1}(α0/2) - θ√k/σ ).

(C) From class notes we know that a Neyman-Pearson test for this situation (with θ known
and positive) has the form

Σ_{j=1}^k X_j ≷ τ_NP   (announce H1 if >, announce H0 if <),

where

τ_NP = σ√k Φ^{-1}(1 - α0)

and

β_NP = 1 - Φ( Φ^{-1}(1 - α0) - θ√k/σ ).

If α0 = 10^{-2} and √kθ/σ = 5, then β_G and β_NP both exceed 0.99.

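Evaluating both powers for these values is immediate; the check below is ours.

from statistics import NormalDist

Phi, Phinv = NormalDist().cdf, NormalDist().inv_cdf
alpha0, d = 1e-2, 5.0                     # d = sqrt(k)*theta/sigma

beta_G = 1 + Phi(Phinv(alpha0 / 2) - d) - Phi(-Phinv(alpha0 / 2) - d)
beta_NP = 1 - Phi(Phinv(1 - alpha0) - d)
print(beta_G, beta_NP)                    # ~ 0.992 and ~ 0.996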

Solution 2.16. Let

L(θ, x_1, ..., x_n) = Π_{i=1}^n f_θ(x_i)

and note that L(θ, x_1, ..., x_n) = 1 if and only if x_i ∈ [θ - 1/2, θ + 1/2] for i = 1, ..., n, and
equals zero otherwise. That is, L(θ, x_1, ..., x_n) = 1 if and only if

max_{1≤i≤n}{x_i} - 1/2 ≤ θ ≤ min_{1≤i≤n}{x_i} + 1/2,

and equals zero otherwise. Thus, any estimator T : ℝ^n → ℝ such that

max_{1≤i≤n}{x_i} - 1/2 ≤ T(x_1, ..., x_n) ≤ min_{1≤i≤n}{x_i} + 1/2

is a maximum likelihood estimate of θ. In particular,

T_λ(x_1, ..., x_n) = λ ( max_{1≤i≤n}{x_i} - 1/2 ) + (1 - λ) ( min_{1≤i≤n}{x_i} + 1/2 )

is a maximum likelihood estimate of θ for any value of λ in [0, 1]. Note that in this case,
not only is a maximum likelihood estimate not unique, but there exist uncountably many
distinct maximum likelihood estimates of θ!

Solution 2.17. Note that

L(θ, x_1, ..., x_n) = Π_{i=1}^n f_θ(x_i) = 1/θ^n

when θ ≥ max_{1≤i≤n}{x_i} and equals zero otherwise. Further, θ^{-n} decreases as θ increases.
Thus, a maximum likelihood estimate Z of θ is given by

Z = max_{1≤i≤n}{X_i}.

Note that

P(Z ≤ z) = P(X_1 ≤ z, ..., X_n ≤ z) = P(X_1 ≤ z) ⋯ P(X_n ≤ z) = (z/θ) ⋯ (z/θ) = (z/θ)^n

for 0 < z ≤ θ. Thus, Z possesses a probability density function given by

f_Z(z) = (d/dz) P(Z ≤ z) = { n z^{n-1}/θ^n   if 0 < z ≤ θ
                           { 0               elsewhere.

Thus,

E_θ[Z] = ∫_0^θ z f_Z(z) dz
       = ∫_0^θ z (n z^{n-1}/θ^n) dz
       = (n/θ^n) ( z^{n+1}/(n + 1) ) |_0^θ
       = nθ/(n + 1).

Thus, Z is not unbiased, and we conclude that in general a maximum likelihood estimator
need not be an unbiased estimator.

Solution 2.18. (A) Note that

L(θ, x) = { 1 - θ   if x = 0
          { θ       if x = 1.

Thus, L(θ, 1) is maximized when θ = 2/3 and L(θ, 0) is maximized when θ = 1/3. That is, a
maximum likelihood estimate of θ is given by

θ̂(x) = { 1/3   if x = 0
       { 2/3   if x = 1.

(B) Note that θ̂(x) = θ̂_{1/3}(x). Further, note that

E[(θ - θ̂_α(X))²] = (θ - θ̂_α(0))² P(X = 0) + (θ - θ̂_α(1))² P(X = 1)
                 = (θ - α)²(1 - θ) + (θ - (1 - α))² θ.

Thus, the maximum likelihood estimator θ̂(x) from (A) has a mean square error given by

E[(θ - θ̂_{1/3}(X))²] = θ²/3 - θ/3 + 1/9.

In addition,

E[(θ - θ̂_α(X))²] < E[(θ - θ̂_{1/3}(X))²]

for all θ ∈ [1/3, 2/3] and all α ∈ (1/3, 1/2].


(C) No, since the estimator from (B) has a smaller mean square error than the maximum
likelihood estimator from (A) for all possible values of the parameter θ. Thus, we conclude
that in general a maximum likelihood estimator need not be admissible.

Solution 2.19. (A) Note that

L(θ, x_1, ..., x_n) = Π_{i=1}^n f_θ(x_i),

which is continuous in θ and differentiable in θ between the x_i's (where x_1 ≤ ⋯ ≤ x_n denote
the ordered observations). If x_j < θ < x_{j+1}, then

L(θ, x_1, ..., x_n) = (2/A)^n θ^{-j} (A - θ)^{-(n-j)} Π_{i=1}^j x_i Π_{i=j+1}^n (A - x_i),

(∂/∂θ) ln L(θ, x_1, ..., x_n) = -j/θ + (n - j)/(A - θ),

and

(∂²/∂θ²) ln L(θ, x_1, ..., x_n) = j/θ² + (n - j)/(A - θ)² > 0.

Since the second derivative is positive, any critical point between the x_i's must correspond
to a minimum (not a maximum) of L. (That is, any critical point θ such that x_j < θ < x_{j+1}
must be a local minimum of L.) If 0 ≤ θ < x_1, then

L(θ, x_1, ..., x_n) = (2/A)^n (A - θ)^{-n} Π_{i=1}^n (A - x_i),

which is strictly increasing in θ. Thus, no value of θ in [0, x_1) can correspond to a maximum
of L. Similarly, no value of θ in (x_n, A] can correspond to a maximum of L. Thus, we
conclude that a maximum of L is attainable only when θ is equal to one of the observations,
since no other value of θ could possibly correspond to a maximum of L.

(B) The strict positivity of the second derivative within the intervals between the observations
implies that any local maximum of L that exists at x_j must correspond to a cusp of L; that
is, (∂/∂θ) ln L > 0 to the left of x_j and (∂/∂θ) ln L < 0 to the right of x_j. Thus, for x_j to
correspond to a local maximum it must be true that

lim_{θ↓x_j} ( -j/θ + (n - j)/(A - θ) ) ≤ 0 ≤ lim_{θ↑x_j} ( -(j - 1)/θ + (n - j + 1)/(A - θ) ),

which implies that

-j/x_j + (n - j)/(A - x_j) ≤ 0 ≤ -(j - 1)/x_j + (n - j + 1)/(A - x_j),

or that

(j - 1)A/n ≤ x_j ≤ jA/n.

Thus, if the interval (0, A) is divided into n intervals of the form [(j - 1)A/n, jA/n], then the
jth observation x_j cannot possibly correspond to a maximum of L unless it is contained within
the jth such interval.

(C) Let A = 10 and n = 6. If we observe 1, 2, 5, 6, 7, 9 then a maximum likelihood estimate
turns out to be the fourth order statistic. If we observe 1, 2, 4, 7, 8, 9 then a maximum
likelihood estimate turns out to be the sixth order statistic. Thus, we know that a maximum
likelihood estimate will be given by some order statistic, but we cannot tell ahead of time
which order statistic it will be!

Solution 2.20. (A) Note that

ln f_θ(x) = ln 2 + 2 ln θ - 3 ln(x + θ)

and hence that

(∂/∂θ) ln f_θ(x) = 2/θ - 3/(x + θ) = 0

if θ = 2x. Thus, a candidate for a maximum likelihood estimator is θ̂(X) = 2X. Note that

(∂²/∂θ²) ln f_θ(x) = -2/θ² + 3/(x + θ)² < 0

if and only if θ² - 4θx - 2x² < 0, which (since θ is positive) implies that

(∂²/∂θ²) ln f_θ(x) < 0 when θ < (2 + √6)x, and (∂²/∂θ²) ln f_θ(x) > 0 when θ > (2 + √6)x.

Since 2x < (2 + √6)x, it follows that θ̂ corresponds to a local maximum. Further, note
that ln f_θ(x) is monotonically decreasing to the right of (2 + √6)x since 2/θ - 3/(x + θ) < 0
if and only if θ > 2x. Thus, θ̂ must correspond to a global maximum, and hence is a maximum
likelihood estimate for θ.

Recall that E[X_1] = θ and E[X_1²] = ∞. Since

E[θ̂(X)] = 2E[X_1] = 2θ,

it follows that θ̂ is not unbiased. Further, note that

E[(θ̂(X) - θ)²] = E[4X_1²] - 4θE[X_1] + θ² = 4E[X_1²] - 3θ² = ∞,

since E[X_1²] = ∞. However, if θ̂_0(x) = A for any real number A, then E[(θ̂_0(X) - θ)²] is finite.
Thus, we conclude that θ̂ is not admissible.

(B) Note that

L(θ, x_1, x_2) = f_θ(x_1) f_θ(x_2) = 4θ⁴ / ( (x_1 + θ)³ (x_2 + θ)³ )

and hence that

ln L(θ, x_1, x_2) = 2 ln 2 + 4 ln θ - 3 ln(x_1 + θ) - 3 ln(x_2 + θ).

Thus,

(∂/∂θ) ln L(θ, x_1, x_2) = 4/θ - 3/(x_1 + θ) - 3/(x_2 + θ),

which is equal to zero if 2θ² - (x_1 + x_2)θ - 4x_1x_2 = 0; that is,

θ̂ = ( (x_1 + x_2) + √( (x_1 + x_2)² + 32 x_1 x_2 ) ) / 4

is a candidate for a maximum likelihood estimate in this case.


Solution 2.21. Note that Y = 1 if and only if X_2 = 1 and X_3 = 1. Thus,

P_θ(Y = 1) = θ²   and   P_θ(Y = 0) = 1 - θ².

Thus,

L(θ, x, y) = P_θ(X_1 = x) P_θ(Y = y)
           = { (1 - θ²)(1 - θ)   if x = 0 and y = 0
             { θ(1 - θ²)         if x = 1 and y = 0
             { θ²(1 - θ)         if x = 0 and y = 1
             { θ³                if x = 1 and y = 1
           = { 1 - θ - θ² + θ³   if x = 0 and y = 0
             { θ - θ³            if x = 1 and y = 0
             { θ² - θ³           if x = 0 and y = 1
             { θ³                if x = 1 and y = 1.

Note that 1 - θ - θ² + θ³ is maximized when θ = 0 and that θ³ is maximized when θ = 1.
Further, note that θ - θ³ is maximized when θ = 1/√3 ≈ 0.57735 and that θ² - θ³ is
maximized when θ = 2/3. (See Figure 4.1.) Thus,

θ̂(x, y) = { 0      if x = 0 and y = 0
          { 1/√3   if x = 1 and y = 0
          { 2/3    if x = 0 and y = 1
          { 1      if x = 1 and y = 1.

[Figure 4.1: L(θ, x, y) as a function of θ for x = 0, 1 and y = 0, 1.]

Solution 2.22. (A) Note that

ln f_{θ1,θ2}(x_1, ..., x_n) = -nθ1 ln θ2 + n ln θ1 + (θ1 - 1) Σ_{i=1}^n ln x_i.

Thus, to maximize ln f_{θ1,θ2}(x_1, ..., x_n) we want to make θ2 as small as possible. However,
f_{θ1,θ2}(x_1, ..., x_n) = 0 if x_i > θ2 for any i. That is, θ2 cannot be less than any of the
observations. Thus, a maximum likelihood estimate for θ2 is given by

θ̂2 = max_{1≤i≤n}{x_i}.

(B) Note that

(∂/∂θ1) ln f_{θ1,θ2}(x_1, ..., x_n) = -n ln θ2 + n/θ1 + Σ_{i=1}^n ln x_i,

which is equal to zero when

θ1 = n / ( n ln θ2 - Σ_{i=1}^n ln x_i ).

Since

(∂²/∂θ1²) ln f_{θ1,θ2}(x_1, ..., x_n) = -n/θ1² < 0,

we conclude that this critical point corresponds to a maximum point. Thus, a maximum
likelihood estimate for θ1 is given by

θ̂1 = n / ( n ln θ̂2 - Σ_{i=1}^n ln x_i ).
Solution 2.23. If x > 0 then f_θ(x) is a strictly increasing function of θ and is thus max-
imized when θ = 1. If x < 0 then f_θ(x) is a strictly decreasing function of θ and is thus
maximized when θ = -1. If x = 0 then f_θ(x) is a constant function of θ and is thus
maximized for any choice of θ. Thus a maximum likelihood estimate of θ is given by

T(x) = {  1   if x ≥ 0
       { -1   if x < 0.

Note that

E_θ[T(X)] = ∫_{-1}^1 T(x) f_θ(x) dx
          = ∫_{-1}^0 (-1) (1 + θx)/2 dx + ∫_0^1 (1) (1 + θx)/2 dx
          = -[ x/2 + θx²/4 ]_{-1}^0 + [ x/2 + θx²/4 ]_0^1
          = -(1/2 - θ/4) + (1/2 + θ/4)
          = θ/2.

Thus, we see that T is not unbiased. The mean square error of T is given by

E_θ[(T(X) - θ)²] = E_θ[T²(X)] - 2θE_θ[T(X)] + θ² = 1 - 2θ(θ/2) + θ² = 1.

However, the trivial estimator S(X) = 0 has a mean square error given by

E_θ[(S(X) - θ)²] = θ²,

which is less than or equal to 1 for all θ ∈ [-1, 1]! Thus, we conclude that T is not admissible.

Solution 2.24. Note that

ln L(θ, x_1, ..., x_n) = -n ln √(2π) - n ln θ - (1/(2θ²)) Σ_{i=1}^n (x_i - θ)²

and hence that

(∂/∂θ) ln L(θ, x_1, ..., x_n)
  = -n/θ + (1/θ²) Σ_{i=1}^n (x_i - θ) + (1/θ³) Σ_{i=1}^n (x_i - θ)²
  = -n/θ + (1/θ²) ( (Σ_{i=1}^n x_i) - nθ ) + (1/θ³) ( (Σ_{i=1}^n x_i²) - 2θ(Σ_{i=1}^n x_i) + nθ² )
  = -n/θ - (1/θ²) (Σ_{i=1}^n x_i) + (1/θ³) (Σ_{i=1}^n x_i²),

which is equal to zero when

nθ² + θ (Σ_{i=1}^n x_i) - (Σ_{i=1}^n x_i²) = 0.

Solving for θ and selecting the positive root yields a maximum likelihood candidate given by

θ̂ = ( -Σ_{i=1}^n x_i + √( (Σ_{i=1}^n x_i)² + 4n Σ_{i=1}^n x_i² ) ) / (2n).

Solution 2.25. Note that

L(θ, x_1, ..., x_n) = Π_{i=1}^n exp( -(x_i - θ) ) I_[θ,∞)(x_i),

which equals zero if x_i < θ for any value of i. If x_i ≥ θ for all i then

L(θ, x_1, ..., x_n) = e^{nθ} Π_{i=1}^n e^{-x_i},

which is maximized when θ is as large as possible. Thus, a maximum likelihood estimator
for θ is given by θ̂(x_1, ..., x_n) = min_{1≤i≤n}{x_i}, which is the largest value of θ for which L
remains nonzero.

Is θ̂ unbiased? No. An easy way to see this is to note that θ̂ is never less than θ and hence
could not be unbiased. For a more complete explanation, note that if x > θ then

P(θ̂ ≤ x) = P( min_{1≤i≤n}{X_i} ≤ x )
          = 1 - P( min_{1≤i≤n}{X_i} > x )
          = 1 - P(X_1 > x, ..., X_n > x)
          = 1 - P(X_1 > x) ⋯ P(X_n > x)
          = 1 - ( ∫_x^∞ e^{-(t-θ)} dt )^n
          = 1 - e^{n(θ-x)},

which implies that θ̂ possesses a probability density function given by

f_θ̂(x) = (d/dx) ( 1 - e^{n(θ-x)} ) = n e^{n(θ-x)}

for x > θ. Thus,

E_θ[θ̂] = ∫_θ^∞ x f_θ̂(x) dx = ∫_θ^∞ x n e^{n(θ-x)} dx = θ + 1/n,

from which it is again clear that θ̂ is not unbiased.
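A simulation sketch (ours) verifies that the bias of θ̂ is exactly 1/n.

import random

random.seed(6)
theta, n, trials = 2.0, 5, 100_000
total = sum(min(theta + random.expovariate(1.0) for _ in range(n))
            for _ in range(trials))
print(total / trials, theta + 1 / n)   # both ~ 2.2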

4.4. Minimum Mean Square Estimation

Solution 2.26. We choose a so that E[(X(t + k) - aX(t))X(t)] = 0, which implies that
E[X(t + k)X(t)] = aE[X²(t)], or that R(k) = aR(0). Solving, we see that a = R(k)/R(0).
Thus, we estimate X(t + k) via R(k)X(t)/R(0).

Solution 2.27. We choose α and β so that they satisfy

E[(X(t) - αX(0) - βX(T))X(0)] = 0
E[(X(t) - αX(0) - βX(T))X(T)] = 0,

which become

E[X(t)X(0)] = αE[X²(0)] + βE[X(T)X(0)]
E[X(t)X(T)] = αE[X(0)X(T)] + βE[X²(T)],

or

R(t) = αR(0) + βR(T)
R(T - t) = αR(T) + βR(0).

Solving we obtain

α = ( R(t)R(0) - R(T)R(T - t) ) / ( R²(0) - R²(T) ),
β = ( R(T - t)R(0) - R(t)R(T) ) / ( R²(0) - R²(T) ),

and our estimate of X(t) is αX(0) + βX(T). If t = T/2 then

α = β = R(T/2) / ( R(0) + R(T) ).

Solution 2.28. We choose α and β so that they satisfy

E[ ( ∫_0^T X(t) dt - αX(0) - βX(T) ) X(0) ] = 0
E[ ( ∫_0^T X(t) dt - αX(0) - βX(T) ) X(T) ] = 0,

which become

E[ ∫_0^T X(t)X(0) dt ] = αE[X²(0)] + βE[X(T)X(0)]
E[ ∫_0^T X(t)X(T) dt ] = αE[X(0)X(T)] + βE[X²(T)],

or

∫_0^T R(t) dt = αR(0) + βR(T)
∫_0^T R(T - t) dt = αR(T) + βR(0).

Solving and noticing (via a change of variable) that

∫_0^T R(T - t) dt = ∫_T^0 R(s)(-ds) = ∫_0^T R(t) dt,

we find that

α = β = ( ∫_0^T R(t) dt ) / ( R(0) + R(T) ),

and our estimate is αX(0) + βX(T).


Solution 2.29. We choose α1 and α2 so that they satisfy

E[(X(t) - α1X(t - 1) - α2X(t - 2))X(t - 1)] = 0
E[(X(t) - α1X(t - 1) - α2X(t - 2))X(t - 2)] = 0,

which become

E[X(t)X(t - 1)] = α1E[X²(t - 1)] + α2E[X(t - 1)X(t - 2)]
E[X(t)X(t - 2)] = α1E[X(t - 1)X(t - 2)] + α2E[X²(t - 2)],

or

R(1) = α1R(0) + α2R(1)
R(2) = α1R(1) + α2R(0).

Since R(τ) = e^{-|τ|} here, these become e^{-1} = α1 + α2e^{-1} and e^{-2} = α1e^{-1} + α2.
Solving we see that α1 = e^{-1} and α2 = 0, and we estimate X(t) via e^{-1}X(t - 1). (Note that
X(t) is a Markov process and hence, as expected, our estimate depends only on the most
recent observation.)

Solution 2.30. Consider the subspace M = span(e^{-t}, te^{-t}) of L_2[0, ∞). We seek the point ĥ
in M that is closest to h, i.e. such that h - ĥ ⊥ M. Thus, we must have

∫_0^∞ (h(t) - ĥ(t)) e^{-t} dt = 0
∫_0^∞ (h(t) - ĥ(t)) te^{-t} dt = 0,

where

ĥ(t) = α1 e^{-t} + α2 te^{-t}.

Substituting for ĥ we obtain

∫_0^∞ h(t) e^{-t} dt = α1 ∫_0^∞ e^{-2t} dt + α2 ∫_0^∞ te^{-2t} dt
∫_0^∞ h(t) te^{-t} dt = α1 ∫_0^∞ te^{-2t} dt + α2 ∫_0^∞ t² e^{-2t} dt,

where we note that

∫_0^∞ e^{-2t} dt = 1/2,   ∫_0^∞ te^{-2t} dt = 1/4,   ∫_0^∞ t² e^{-2t} dt = 1/4,

and, with h(t) = I_[0,1](t),

∫_0^∞ h(t) e^{-t} dt = ∫_0^1 e^{-t} dt = 1 - 1/e,   ∫_0^∞ h(t) te^{-t} dt = ∫_0^1 te^{-t} dt = 1 - 2/e.

Thus, we obtain the system of equations given by

(1/2)α1 + (1/4)α2 = 1 - 1/e
(1/4)α1 + (1/4)α2 = 1 - 2/e,

and solving we see that

α1 = 4/e   and   α2 = 4 - 12/e.

Solution 2.31. We estimate X³ via aX² + bX + c. We choose a, b, and c so that

E[(X³ - (aX² + bX + c))X²] = 0
E[(X³ - (aX² + bX + c))X] = 0
E[X³ - (aX² + bX + c)] = 0.

Note that E[X^k] = 1/(k + 1) for X uniform on [0, 1]. Thus our system of equations simplifies
to

(1/5)a + (1/4)b + (1/3)c = 1/6
(1/4)a + (1/3)b + (1/2)c = 1/5
(1/3)a + (1/2)b + c = 1/4.

Solving we find that a = 3/2, b = -3/5, and c = 1/20. Thus, we estimate X³ via

(3/2)X² - (3/5)X + 1/20.

The mean square error of our estimate is given by

∫_0^1 ( x³ - (3/2)x² + (3/5)x - 1/20 )² dx = 1/2800 ≈ 3.5714 × 10^{-4}.
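The normal equations are small enough to solve exactly; the sketch below (ours) uses rational arithmetic and reproduces a = 3/2, b = -3/5, c = 1/20.

from fractions import Fraction as F

M = [[F(1, 5), F(1, 4), F(1, 3)],      # coefficient matrix of the system
     [F(1, 4), F(1, 3), F(1, 2)],
     [F(1, 3), F(1, 2), F(1, 1)]]
rhs = [F(1, 6), F(1, 5), F(1, 4)]

for i in range(3):                     # Gauss-Jordan elimination
    piv = M[i][i]
    M[i] = [m / piv for m in M[i]]
    rhs[i] /= piv
    for r in range(3):
        if r != i:
            f = M[r][i]
            M[r] = [mr - f * mi for mr, mi in zip(M[r], M[i])]
            rhs[r] -= f * rhs[i]

print(rhs)                             # [3/2, -3/5, 1/20]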

4.5. Hilbert Spaces

Solution 2.32. (A) Clearly ||g|| ≥ 0 for all g. Further, if g = 0 then ||g|| = 0, and if ||g|| = 0
then |g(t)| = 0 for all t ∈ [0, 1], i.e. g = 0. In addition,

||αg|| = sup{ |αg(t)| : t ∈ [0, 1] } = |α| sup{ |g(t)| : t ∈ [0, 1] } = |α| ||g||.

Finally, ||g + h|| ≤ ||g|| + ||h|| since |g(t) + h(t)| ≤ |g(t)| + |h(t)|.

(B) Let f and g be points in K and let 0 < λ < 1. Note that

∫_0^1 [ (1 - λ)f(t) + λg(t) ] dt = (1 - λ) ∫_0^1 f(t) dt + λ ∫_0^1 g(t) dt = (1 - λ) + λ = 1.

Thus, since (1 - λ)f + λg ∈ K, we see that K is convex.

(C) Let g ∈ K. Since g(0) = 0 and g is continuous, it follows that ||g|| > 1 since

∫_0^1 g(t) dt = 1.

Clearly, however, if ε > 0 then there exists some f ∈ K such that ||f|| < 1 + ε.
Solution 2.33. (A) P_M(h) is the point in M that is nearest to h with respect to the norm
induced by the inner product on H. If x ∈ M then clearly P_M(x) = x. Since P_M(h) ∈ M
for any h ∈ H, it follows that P_M(P_M(h)) = P_M(h) for all h ∈ H.

(B) Let x and y be points in H. The Hilbert Space Projection Theorem implies that y =
P_M(y) + Q_M(y) where P_M(y) ∈ M and Q_M(y) ∈ M⊥. Note that

(P_M(x), y) = (P_M(x), P_M(y) + Q_M(y))
            = (P_M(x), P_M(y)) + (P_M(x), Q_M(y))
            = (P_M(x), P_M(y))                        since P_M(x) ∈ M and Q_M(y) ∈ M⊥
            = (P_M(x), P_M(y)) + (Q_M(x), P_M(y))     since P_M(y) ∈ M and Q_M(x) ∈ M⊥
            = (P_M(x) + Q_M(x), P_M(y))
            = (x, P_M(y)).

If M ⊥ N then

(P_M(x), P_N(y)) = 0   since P_N(y) ∈ N and P_M(x) ∈ M,

and

(P_M(x), P_N(y)) = (y, P_N(P_M(x)))   via the previous result.

That is,

(y, P_N(P_M(x))) = 0

for all x and y in H, which implies that P_N(P_M(x)) = 0 for all x ∈ H. Similarly,

(x, P_M(P_N(y))) = 0 for all x and y in H,

which implies that P_M(P_N(y)) = 0 for all y in H.

Now assume that P_M(P_N(x)) = P_N(P_M(x)) = 0 for all x in H. Let m ∈ M and n ∈ N.
Then

(m, n) = (P_M(m) + Q_M(m), P_N(n) + Q_N(n))
       = (P_M(m), P_N(n))          since m ∈ M and n ∈ N
       = (m, P_M(P_N(n)))          using the previous result
       = (m, 0)
       = 0.

Thus, M ⊥ N.

(C) Note that

P_M(P_N(x)) = 0 ∀x ∈ H ⟺ (y, P_M(P_N(x))) = 0 ∀x, y ∈ H
                        ⟺ (P_M(y), P_N(x)) = 0 ∀x, y ∈ H
                        ⟺ (P_N(P_M(y)), x) = 0 ∀x, y ∈ H
                        ⟺ P_N(P_M(y)) = 0 ∀y ∈ H.

Solution 2.34. To begin, note that

E[Y_1 Y_2] = E[α_{2,1} + X] = α_{2,1} + E[X] = α_{2,1}   since E[X] = 0,

which is equal to zero if α_{2,1} = 0. Further,

E[Y_1 Y_3] = E[α_{3,1} + α_{3,2}X + X²] = α_{3,1} + α_{3,2}E[X] + E[X²] = α_{3,1} + 1   since E[X²] = 1,

which is equal to zero if α_{3,1} = -1. In addition,

E[Y_2 Y_3] = E[X(-1 + α_{3,2}X + X²)] = E[X³] - E[X] + α_{3,2}E[X²] = α_{3,2}   since E[X³] = 0,

which is equal to zero if α_{3,2} = 0. Further,

E[X Y_4] = α_{4,1}E[X] + α_{4,2}E[X²] + α_{4,3}E[X³] + E[X⁴] = α_{4,2} + 3   since E[X⁴] = 3,

which is equal to zero if α_{4,2} = -3. Finally,

E[Y_1 Y_4] = E[α_{4,1} - 3X + α_{4,3}X² + X³] = α_{4,1} + α_{4,3}

and

E[Y_3 Y_4] = E[(-1 + X²)(α_{4,1} - 3X + α_{4,3}X² + X³)]
           = E[α_{4,1}X² - 3X³ + α_{4,3}X⁴ + X⁵ - α_{4,1} + 3X - α_{4,3}X² - X³]
           = E[α_{4,1}X² + α_{4,3}X⁴ - α_{4,1} - α_{4,3}X²]
           = α_{4,1} + 3α_{4,3} - α_{4,1} - α_{4,3}
           = 2α_{4,3},

which are both equal to zero if α_{4,3} = α_{4,1} = 0. Combining these results, we obtain

Y_1 = 1
Y_2 = X
Y_3 = X² - 1
Y_4 = X³ - 3X.

To obtain the best estimate of Y_4 that is of the form α_1Y_1 + α_2Y_2 + α_3Y_3, we seek values of
α_1, α_2, and α_3 such that

E[(Y_4 - (α_1Y_1 + α_2Y_2 + α_3Y_3))Y_i] = 0

for i = 1, 2, 3. Since E[Y_iY_j] = 0 when i ≠ j, it follows that α_1 = α_2 = α_3 = 0. That is, the
best linear estimate of Y_4 in terms of Y_1, Y_2, and Y_3 is zero! The best nonlinear estimate of
Y_4 in terms of Y_1, Y_2, and Y_3, however, is Y_2³ - 3Y_2, which has a zero mean square error.

Solution 2.35. (A) To begin, note that if x ∈ H and y ∈ M⊥ with ||y|| = 1, then

|(x, y)| = |(P_M(x) + Q_M(x), y)|
         = |(P_M(x), y) + (Q_M(x), y)|
         = |(Q_M(x), y)|         since y ∈ M⊥
         ≤ ||Q_M(x)|| ||y||
         = ||Q_M(x)||            since ||y|| = 1.

Further, note that equality occurs if

y = Q_M(x)/||Q_M(x)||.

Thus, since ||Q_M(x)|| is an attainable upper bound, we conclude that

max_{y ∈ M⊥, ||y||=1} |(x, y)| = ||Q_M(x)||.

(B) Let H denote the Hilbert space consisting of all square integrable functions on [-1, 1]
with

(f, g) = ∫_{-1}^1 f(x)g(x) dx.

Let M denote the subspace consisting of all polynomials on [-1, 1] of degree not greater
than 2. Since M is finite dimensional, it follows that M is closed. Note that if h ∈ M⊥,
then (1, h(x)) = (x, h(x)) = (x², h(x)) = 0. Our goal is to find

max_{g ∈ M⊥, ||g||=1} (x³, g(x)).

The Hilbert Space Projection Theorem suggests that we can express x³ as P_M(x³) + Q_M(x³)
where P_M(x³) ∈ M and Q_M(x³) ∈ M⊥. Note that since P_M(x³) ∈ M, it follows that
P_M(x³) = a + bx + cx² where a, b, and c must satisfy

∫_{-1}^1 (x³ - a - bx - cx²) dx = 0
∫_{-1}^1 (x³ - a - bx - cx²) x dx = 0
∫_{-1}^1 (x³ - a - bx - cx²) x² dx = 0.

Simplifying we obtain:

∫_{-1}^1 x³ dx - a ∫_{-1}^1 dx - b ∫_{-1}^1 x dx - c ∫_{-1}^1 x² dx = 0
∫_{-1}^1 x⁴ dx - a ∫_{-1}^1 x dx - b ∫_{-1}^1 x² dx - c ∫_{-1}^1 x³ dx = 0
∫_{-1}^1 x⁵ dx - a ∫_{-1}^1 x² dx - b ∫_{-1}^1 x³ dx - c ∫_{-1}^1 x⁴ dx = 0,

which becomes

-2a - (2/3)c = 0
(2/5) - (2/3)b = 0
-(2/3)a - (2/5)c = 0.

Solving we find that a = 0, b = 3/5, and c = 0. Thus,

P_M(x³) = (3/5)x

and

x³ - P_M(x³) = x³ - (3/5)x.

From the result in Part (A), it follows that

max_{g ∈ M⊥, ||g||=1} (x³, g(x)) = ||x³ - (3/5)x||
                                 = ( ∫_{-1}^1 ( x³ - (3/5)x )² dx )^{1/2}
                                 = (2/35)√14
                                 ≈ 0.21381.
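As a final check (ours), the residual x³ - (3/5)x is orthogonal to 1, x, and x² on [-1, 1], and its norm matches the value above.

import math
from fractions import Fraction as F

def mono(n):                         # integral of x^n over [-1, 1]
    return F(0) if n % 2 else F(2, n + 1)

for j in range(3):                   # <x^3 - (3/5)x, x^j> = 0 for j = 0, 1, 2
    print(mono(3 + j) - F(3, 5) * mono(1 + j))

norm2 = mono(6) - 2 * F(3, 5) * mono(4) + F(3, 5) ** 2 * mono(2)
print(math.sqrt(norm2), 2 * math.sqrt(14) / 35)   # both ~ 0.21381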
