Sei sulla pagina 1di 149

Mathematical Statistics with

Applications
Students Solutions Manual

Kandethody M.Ramachandran
Department of Mathematics and Statistics
University of South Florida
Tampa,FL

Chris P.Tsokos
Department of Mathematics and Statistics
University of South Florida
Tampa,FL

AMSTERDAM BOSTON HEIDELBERG LONDON


NEW YORK OXFORD PARIS SAN DIEGO
SAN FRANCISCO SINGAPORE SYDNEY TOKYO

Academic Press is an imprint of Elsevier

Elsevier Academic Press


30 Corporate Drive, Suite 400, Burlington, MA 01803, USA
525 B Street, Suite 1900, San Diego, California 92101-4495, USA
84 Theobalds Road, London WC1X 8RR, UK
Copyright 2009, Elsevier Inc. All rights reserved.
No part of this publication may be reproduced or transmitted in any form or by any means, electronic or
mechanical, including photocopy, recording, or any information storage and retrieval system, without
permission in writing from the publisher.
Permissions may be sought directly from Elseviers Science & Technology Rights Department in Oxford, UK:
phone: (+44) 1865 843830, fax: (+44) 1865 853333, E-mail: permissions@elsevier.co.uk. You may also
complete your request on-line via the Elsevier homepage (http://elsevier.com), by selecting Customer
Support and then Obtaining Permissions.
Library of Congress Cataloging-in-Publication Data
Applications submitted
ISBN 13: 978-0-08-096443-0

For all information on all Elsevier Academic Press publications


visit our Web site at www.elsevierdirect.com
Typeset by: diacriTech, India
09 10

9 8 7 6 5 4 3 2 1

Contents
CHAPTER 1 Descriptive Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

CHAPTER 2 Basic Concepts from Probability Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

11

CHAPTER 3 Additional Topics in Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

21

CHAPTER 4 Sampling Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37


CHAPTER 5 Point Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
CHAPTER 6 Interval Estimation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
CHAPTER 7 Hypothesis Testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
CHAPTER 8 Linear Regression Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
CHAPTER 9 Design of Experiments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
CHAPTER 10 Analysis of Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

91

CHAPTER 11 Bayesian Estimation and Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101


CHAPTER 12 Nonparametric Tests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
CHAPTER 13 Empirical Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
CHAPTER 14 Some Issues in Statistical Applications: An Overview . . . . . . . . . . . . . . . . . . 127

iii

This page intentionally left blank

Chapter

Descriptive Statistics
EXERCISES 1.2
1.2.1.

The suggested solutions:


For qualitative data we can have color, sex, race, Zip code and so on. For quantitative data
we can have age, temperature, time, height, weight and so on. For cross section data we can
have school funding for each department in 2000. For time series data we can have the crude
oil price from 1995 to 2008.

1.2.3.

The suggested questions can be


1. What types of data the amount is?
2. Are these Federal Agency get same amount of money? If not, why?
3. Which Federal Agency should get more money? Why?
The suggested inferences we can make is
1. These Federal Agency get different amount of money.
2. The differences of money between the Agencies are kind of big.

EXERCISES 1.3
1.3.1.

For stratied sample, we can say suppose we decide to sample 100 college students from the
population of 1000 (that is 10% of the population). We know these 1000 students come
from three different major, Math, Computer Science and Social Science. We have Math 200,
CS 400 and SS 400 students. Then we choose 10% of each of them Math 20, CS 40 and SS
40 by using random sampling within each major.
For cluster sample, we can say suppose we decide to sample some college students from the
population of 2000. We know these 2000 students come from 20 different countries and
we choose 3 out of the 20 countries by random sampling. Then we get all the individual
information from each of the 3 countries.

2 CHAPTER 1 Descriptive Statistics

EXERCISES 1.4
1.4.1.

By minitab
(a) Bar graph
Bar graph for the percent of road mileage
35.00%
30.00%

C2

25.00%
20.00%
15.00%
10.00%
5.00%
0.00%
Poor

Mediocre

Fair

Good

Very good

C1

(b) Pie chart


Pie chart of the percent of road mileage
Category
Poor
Very good
Good
Fair
Mediocre

(a) Bar graph


Bar graph
40.00%
30.00%
C2

1.4.3.

20.00%
10.00%
0.00%
Coal

Natural
Nyclear Petrolium Renewable
Gas Electric Power
Energy
C1

Students Solutions Manual 3

(b) Pareto chart


100

0.8

80

0.6

60

0.4

40

0.2

20
0

0.0
C1

Petrolium Natural
Gas

Percentage
Percent
Cum %

0.40
40.0
40.0

Coal

0.23
23.0
63.0

0.22
22.0
85.0

Nyclear Renewable
Electric
Energy
Power
0.08
0.07
8.0
7.0
93.0
100.0

(c) Pie chart


Pie chart of species
species
Category
Coal
Natural Gas
Nyclear Electric Power
Petrolium
Renewable Energy

(a) Bar graph


Bar graph
6
5
4
Count

1.4.5.

3
2
1
0
A

C
C1

Percent

Percentage

Pareto graph
1.0

4 CHAPTER 1 Descriptive Statistics

(b) Pie chart


Pie chart
species
Category
A
B
C
D
F

(a) Pie chart


Pie chart
species
Category
Mining
Construction
Manufacturing
Transportation
Wholesale
Retail
Finance
Services

(b) Bar graph


Bar graph

C2

1.4.7.

8000
7000
6000
5000
4000
3000
2000
1000
0

in

in

on

ru
st

ct

io

an

uf

t
ac

ur

in

Tr

an

r
po

ta

tio

n
W

l
ho

es

C1

al

e
R

et

ai

l
n
Fi

an

ce
r
Se

vi

ce

Students Solutions Manual 5

1.4.9.

Bar chart
Bar graph
80
70
60
C2

50
40
30
20
10
0
1900

1980
C1

1990

2000

(a) Bar graph


Bar graph
300
250

C2

200
150
100
50

ea

te

ia

K rt
P n idn
e u ey
m
on
ia
St
ro
k
Su e
ic
id
e

r
be

ce

an
C

hr
C

Ac

ci

de

nt
s
on
ic

C1

(b) Pareto graph


Pareto graph
100

700
600

80

500
400

60

300

40

200

20

100

0
C1

H
Percentage
Percent
Cum %

ea

rt
C

an

ce

r
St

ro

ke

C
Pn

um

on

ia
Ac

d
ci

en

ts
D

e
ab

te

s
O

th

er

268.0 119.4 58.5 42.3 35.1 34.5 23.9 30.2


4.4
38.7 28.8 8.5 6.1 5.1 5.0 3.5
38.7 67.6 76.0 82.1 87.2 92.2 95.6 100.0

Percent

Percentage

1.4.11.

1960

6 CHAPTER 1 Descriptive Statistics

1.4.13.
Histogram
9
8

Frequency

7
6
5
4
3
2
1
0
60

75

80

90

C1

(a) Stem and leaf


Stem-and-leaf of C1 N = 20
Leaf Unit = 10
1
4
7
3
4
99
8
5
00011
10
5
22
10
5
4455
6
5
6667
2
5
9
1
6
0
(b) Histogram
Histogram
5
4
Frequency

1.4.15.

3
2
1
0
480

500

520

540
C1

560

580

600

Students Solutions Manual 7

(c) Pie chart


Pie chart
species

Category
475
493
499
502
503
506
510
517
525
526
542
546
553
558
565
568
572
595
605

EXERCISES 1.5
1.5.1.

Mean is 165.6667 and standard deviation is 63.15397

1.5.3.

Data is 3,3,5,13 and standard deviation is 4.760952

1.5.5.

(a) lower quantiles is 80, median is 95, upper quantiles is 115 and inter quantile range
is 35. The lower limit of outliers is 27.5 and upper limit of outliers is 167.5.
(b) The box plot is

(c) Therefore there are no outliers.

8 CHAPTER 1 Descriptive Statistics

1.5.7.
l


fi (mi x ) =

i=1

1.5.9.

l


fi (mi )

i=1

l


x = nx nx = 0

i=1

(a) Mean is 33.105, variance is 177.0430 and range is 48.19.


(b) Lower quantile is 24.9225, median is 32 and upper quantiles is 42.985. The inter
quantile range is 18.0625. The lower limit of outliers is 2.17125 and upper limit of
outliers is 70.07875. Therefore there are no outliers.
(c)

(d)
Histogram of y
8

Frequency

0
0

10

20

30
y

40

50

60

1.5.11.

(a) Mean is 110, standard deviation is 83.4847.


(b) 68%, 95%, 99.7%.

1.5.13.

(a) Mean is 3.7433, variance is 3.501 and standard deviation is 1.871323.

Students Solutions Manual 9

(b) Frequency table


Class
1
2
3
4
5

Interval
01.6
1.73.3
3.45
5.16.7
6.88.4

Frequency
4
10
9
5
2

mi
.8
2.5
4.2
5.9
7.6

Mi
3.2
25
37.8
29.5
15.2

(c) By grouped data, Mean is 3.69, variance is 3.62 and standard deviation is 1.9.
The results are similar to the none grouped data.
L = 25

1.5.15.

fm = 139615
w=4
Fb = 178859
n = 514661
w
M=L+
(.5n Fb ) = 27.24822.
fm

1.5.17.

(a) Mean is 44.27, variance is 536.15 and standard deviation is 23.15.


(b)
L = 40
fm = 59
w = 19
Fb = 69
n = 180
M=L+

w
(.5n Fb ) = 46.763.
fm

EXERCISES 1.8
(a)

Histogram of y
20
15
Frequency

1.8.1.

10
5
0
66

68

70

72

74

76

78

80

10 CHAPTER 1 Descriptive Statistics

(b) Mean is 74.0625, median is 74, variance is 7.223892 and standard deviation is
2.68773.
(c)

The lower limit of outliers is 66 and upper limit of outliers is 82. Therefore we have no
outlier.

Chapter

Basic Concepts from Probability Theory


EXERCISES 2.2
2.2.1.

(a) S = {(R, R, R), (R, R, L), (R, L, R), (L, R, R), (R, L, L), (L, R, L), (L, L, R), (L, L, L)}
(b) P =
(c) P =
(d) P =
(e) P =

2.2.3.

(a) P =
(b) P =
(c) P =

2.2.5.
2.2.7.

(a) Probability space is


S = {(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (2, 1), (2, 2), (2, 3), (2, 4), (2, 5), (2, 6),
(3, 1), (3, 2), (3, 3), (3, 4), (3, 5), (3, 6), (4, 1), (4, 2), (4, 3), (4, 4), (4, 5), (4, 6),
(5, 1), (5, 2), (5, 3), (5, 4), (5, 5), (5, 6), (6, 1), (6, 2), (6, 3), (6, 4), (6, 5), (6, 6)}
(c) P =

2.2.11.

5
36
5
36
4
8
3
8

(d) P =
  
P A B = {(H, H), (H, T ), (T , H)} .

(b) P =
2.2.9.

7
8
4
8
3
8
2
8

6
36
1
36

(a) Probability space is S = {N, N, N, S, S}


N stands for normal and S stands for spoiled.
6
(b) P = 35 24 = 20
(c) No more than one means no or just one.
P = 35 24 + 2 25 34 = .9
P = p + 2q

11

12 CHAPTER 2 Basic Concepts from Probability Theory

2.2.13.

(a) Since A B then let A Ac = B, we know P(A) + P(Ac ) = P(B) by the axiom 3. Since
P(Ac ) 0 by axiom 1 we know P(A) + P(Ac ) P(A) then P(A) P(B).
(b) Let C = A B, A = Aa C and B = Ba C, by axiom 3 we know P(A B) =
P(Aa ) + P(Ba ) + P(C).
Since P(A) = P(Aa ) + P(C) and P(B) = P(Ba ) + P(C) by axiom 3 again, we know
P(Aa ) = P(A) P(C) and P(Ba ) = P(B) P(C). Plug them back in previous equation P(A B) = P(Aa ) + P(Ba ) + P(C) and we get the following equation P(A B) =
P(A) + P(B) P(C) = P(A) + P(B) P(A B). If A B = then by axiom 1 we know
P(A B) = 0 and just plug in we complete the proof.

2.2.15.

(a) From 2.2.13 we know P(A B) = P(A) + P(B) P(A B) and from axiom 2 we know
P(A B) 1, we can see that P(A) + P(B) P(A B) 1 and that complete the proof
P(A) + P(B) 1 P(A B).
(b) From 2.2.13 we know P(A1 A2 ) = P(A1 ) + P(A2 ) P(A1 A2 ). From axiom 1
we know P(A1 A2 ) 0 it means P(A1 A2 ) 0 and we can get the following
inequality P(A1 A2 ) = P(A1 ) + P(A2 ) P(A1 A2 ) P(A1 ) + P(A2 ).
P = .24 + .67 .09 = .82
P = 1 .82 = .18
P = 1 .09 = .91
P = 1 .09 = .91
P = 1 .82 = .18

2.2.17.

(a)
(b)
(c)
(d)
(e)

2.2.19.

(a) P = .55
(b) P = .3
(c) P = .7

2.2.21.

(a) P =
(b) P
(c) P
(d) P

2.2.23.

3
5
3
5

2
2
1
4 + 5 4 = .4
= 24 + 35 24 = .6
= 2 35 24 + 25 14 =
= 35 24 = .3

.7

Without loss of generality let us assume An is increasing sequence then A1 A2 . . .

An . . .. We know that if A1 A2 . . . An . . . then A1 A2 . . . An . . . = Ai =

i=1

limn An . From the condition we know lim An = Ai and if we take probability on


i=1

 n


both sides then lim P(An ) = P Ai = P lim An


n

EXERCISES 2.3
2.3.1.

(a) 45
(b) 1

i=1

Students Solutions Manual 13

(c) 10
(d) 5400
(e) 2520
2.3.3.

1024

2.3.5.

53130

2.3.7.

155117520

2.3.9.

440

2.3.11.

(a) p = .4313 .44425 .46798 .5263 1 = .04719


(b) p = .0001189
(c) p = .4313 .21419 .10344 1 1 = .009557

2.3.13.

180

2.3.15.

(a) p = 1

365 364 ... (365 20 + 1)


36520

= .4114

(b) p = 1 .2936 = .7063


(c) If n = 23 then p = .4927
2.3.17.

p = 1 .27778 .16667 = .5556

2.3.19.

(a)
(b)
(c)
(d)

2.3.21.

The question is asking when the cell does the splitting to produce a child. There will be a
cell with half of the chromosomes. According to this understanding we have
(a) 223

7776
3.954 1021
5.36447 1028
3.022285 1012

(b)

23
9
223

= .097416

EXERCISES 2.4
2.4.1.

(a) .999
(b) 13

2.4.3.

(a) P(A|B) + P(Ac |B) =

P(AB)
P(B)
P(A|Bc ) =

P(Ac B)
P(B)

P(B|A)P(A)+P(B|Ac )P(Ac )
= P(B)
P(B)
P(B) = 1
c
know P(A|B ) = 1 P(A|B) = P(Ac |B)

(b) (i) if P(A|B) +


1 then we
means A and B are symmetric in probability. But it is not always true.

that

(ii) if P(A|B) + P(Ac |Bc ) = 1 then we know P(Ac |Bc ) = 1 P(A|B) = P(Ac |B)
That means B and Bc s conditional probability are same, which is same as A and B are
independent. But that is not always true.

14 CHAPTER 2 Basic Concepts from Probability Theory

2.4.5.

If A and B are independent then P(A B) = P(A) P(B)


(i) P(Ac B) = P(B)P(AB) = P(B)P(A)P(B) = (1P(A))(P(B)) = P(Ac )P(B)
then we know Ac and B are independent.
(ii) According to (i) just switch A and B and we can prove it.
(iii) P(Ac Bc ) = P(Bc ) P(A Bc ) = P(Bc ) P(A) P(Bc ) = (1 P(A))(P(Bc )) =
P(Ac ) P(Bc ).

2.4.7.

P(E|F ) =

2.4.9.

.1948

1
13

4
52

= P(E) then E and F are independent

2.4.11.

(a) P = .031125
(b) P = .06

2.4.13.

.8

2.4.15.

(a) P(a dime is selected) =


=
=

12

P(a dime is selected|box i is selected)P(box i is selected)

i=2
12

i
12 P(the sum of dies = i)
i=2
2(1)+3(2)+4(3)+5(4)+6(5)+7(6)+8(5)+9(4)+10(3)+11(2)+12(1)
12(36)

= .583333
(b) P(box 4 is selected|a penny is selected)
=

P(box 4 is selected & a penny is selected)


(3/36)(8/12)
=
P(a penny is selected)
1 P(a dime is selected)

= .13333

2.4.17.

P = .6575

2.4.19.

P = .60976

2.4.21.

(a) P(Accident rate) = .25 .086 + .257 .044 + .347 .056 + .146 .098 = .066548
(b) P(gourp4|Accident) =

.146.098
.0665

= .215

2.4.23.

P = .16667

2.4.25.

P(Working) = P(B, C) + P(A, B, notC) + P(A, notB, C)


= [1 P(notB) P(notC) P(notB, notC)] + P(A)P(B|notC)P(notC)
+ P(A) [P(notB) P(notB, notC)]
= [1 0.1 0.05 0.75 0.05] + 0.85 0.25 0.05 + 0.85 [0.1 0.75 0.05]
= 0.8875 + 0.010625 + 0.053125 = 0.95

2.4.27.

(b) P(type is O|type is B) =


2.4.29.

18
17
16
40 39 + 40
18
39 = .4615

(a) P(same type of blood) =

15
39

4
40

3
39

2
40

Let E denote A ends up with all the money when he starts with i.

1
39

= .358974

Students Solutions Manual 15

Let F denote A ends up with all the money when he starts with N i. For A starts with
N i means B starts with i because N is the total money A and B has so if we got P(E) then
P(F ) = 1 P(E).
Let H denote the event that the rst ip lands heads and p denote the probability to have
H on the rst ip.
P(E) = P(E|H)P(H) + P(E|H C )P(H C )
This probability represents that A gets a head and combined with the probability if B win
the rst coin.
Now we let P(E) = P(E|H)p + P(E|H C )(1 p) = Pi and dene this as the rst round.
New, given that the rst ip lands heads, the situation after the rst bet is that A has i + 1
units and B has N (i + 1) units.
Since the successive ips are assumed to be independent with a common probability p of
heads, it follows that, from that point on, As probability of winning all the money is exactly
the same as if the game were just starting with A having an initial fortune of i + 1 and B
having an initial fortune of N (i + 1)
Therefore, P(E|H) = Pi+1 and P(E|H C ) = Pi+1 . Let q to be 1 p,
Pi = pPi+1 + qPi1

i = 1, 2, . . . , N 1

By applying the condition that P0 = 0 and PN = 1


Pi = 1Pi = (p + q)Pi = pPi+1 + qPi1
Pi+1 Pi =

q
(Pi Pi1 )
p

i = 1, 2, . . . , N 1

After plug in i
We got
q
q
(P1 P0 ) = P1
p
p
 2
q
q
P3 P2 = (P2 P1 ) =
P1
p
p
P2 P1 =

q
PN PN1 = (PN1 PN2 ) =
p

If pq = 1
Then P2 P1 = P1 and P2 = 2P1
P3 = 3P1
PN = NP1
PN = 1

 N1
q
P1
p

16 CHAPTER 2 Basic Concepts from Probability Theory

Which means P1 =
Therefore Pi = Ni
If

q
p

1
N

 q N1
 q N
q

q 1 p
p
= P1 p

1, then PN P1 = 1 P1 = P1
p
1 pq
1 pq

 N
 N
q
1 pq
pq
p

1 = P1
+ 1 = P1
1 pq
1 pq

1 pq
P1 =
 N
1 pq

Add all equations, we got



PN P1 = P1

q
+
p

 2
 N1 
q
q
+ +
p
p

Add rst i 1 of all equations, we got



Pi P 1 = P1

 i1
q
 2
 i1 
1

p
q
q
q
q

+
+ +
= P1

p
p
p
p
1 pq

 i
 i
q
q
q
1

p
p
p

Pi =
=
 N
 N
1 pq
q
q
1 p
1 p

Then if we start with N i, just replace p by q and i by N i.

 Ni
1 pq

p
Qi =
 N if 1
q
p
1 q
Qi =

N i p
if = 1
N
q

Pi + Qi = 1

EXERCISES 2.5
2.5.1.

(a) c = e
(b) P = e
(c) P = 1 e e

2 e
2

Students Solutions Manual 17

2.5.3.

2.5.5.

2.5.7.

0,

.2,

F (x) =

.3,

.7,

1,

0,

.2,
p(x) =

.6,

.2,
(a) c =

where

x 5

where

5 x 0

where

0x3

where

3x6

where

66

where

x 1

where

1 x 3

where

3x9

where

x9

1
9

(b) P = .7037

0,
(c) F (x) =

2.5.9.
2.5.11.

2.5.13.

f (x) =

1 3
x ,
27

1,

0,
2x
,
(1+x)2

where

x0

where

0x3

where

x 3

x0
x 0

where
where

p = .7013 .55809 = .1432


p = .7364

0,
where
f (t) = (t)1 (t)
e , where

t0
0t

EXERCISES 2.6
2.6.1.



m(t) = 16 et + e2t + e3t + e4t + e5t + e6t
Var(x) = 2.9167

2.6.3.

(a) E(Y ) = 3.6


E(Y 2 ) = 17.2
E(Y 3 ) = 95.3
VAR(Y ) = 4.24
(b) My (t) = et .1 + .05 + e2t .25 + e5t .4 + e6t .2

2.6.5.

E(X) =

xP(X = x) =

2.6.7.

a = 12
b=1

2.6.9.

(a) E(c) =

n=1

2n P(X = 2n ) =

n=1

2n 21n =

cf (x) = c

(b) E(cg(x)) = cg(x)f (x) = c g(x)f (x) = cE(g(x))

n=1

1=

18 CHAPTER 2 Basic Concepts from Probability Theory

(c) E

gi (x) =
gi (x)fi (x) =
g(x)f (x) = E(gi (x))

(d) V (ax + b) = E((ax + b) E(ax + b))2 = E(ax + b aE(x) b)2 = E(a(x E(x))2
= a2 V (x)
Plug in b = 0 get another one.
2.6.11.

E(X) = c 1 + 0 = c
V (X) = Ex2 (E(x))2 = c2 c2 = 0
CDF is F (X) = (x c) where is indicator function

2.6.13.

Mx (t) = e(e 1)

t
E(X) = Mx (0) = et e(e 1) |t=0 =

t
t
E(X2 ) = Mx (0) = et e(e 1) + (et )2 e(e 1) |t=0 = + 2
V (X) = E(X2 ) (E(x))2 = + 2 2 =

2.6.15.

1
x
(a) p = x+1
let q = 1 p = x+1
where x start from 0 to innity means number of failures
st
before the 1 success. Therefore the total number of trails is x + 1.



x+1
E(x) =
xp(1 p)x = p
xp(q)x q 1q = qp
(q)x = qpp2 = pq by negax
x=0
x=0
x=0
tive binomial

Another way to prove is

E(x) ==

xp(1 p)x =p

x=0

E(x2 ) =

(q)x

dq

x2 p(1 p)x =pq

x2 (q)x1 = pq

x=0

= pq


d p1
(xpqx )
x=0

= pq

dq

dq
q

x=0


d(xq x )
dq

x=0

(xqx )


d(q)x
dq



1
d 1q
1
pq
q
= pq
= pq
= 2 =
2
dq
p
(1 q)
p

x=0



1 E(x)
d 1q
= pq
dq

1 q
d (1q)2
d 1q
p
1 2q + q2 + 2q 2q2 pq(1 q2 ) pq pq3
= pq
= pq
=
=
dq
dq
(1 q)4
p4
p4

V (x) = E(x2 ) (E(x))2 =


=

x=0

x=0

= pq

x(q)x1 q = pq

x=0


= pq

pq pq3
q2
pq pq3 p2 q2
pq pq2 (q + p)
2 =
=
4
4
p
p
p
p4

pq(1 q)
p2 q
q
= 4 = 2
4
p
p
p

Students Solutions Manual 19

2.6.17.
2.6.19.
2.6.21.

p
p
(b) Mx (t) =
ext p(1 p)x = p
(et q)x = 1e
t q = 1(1p)et
x=0
x=0
 
 1 
When et q 1 and that is et q 1 take ln both sides give us ln(et ) ln 1p
That is when t ln(1 p).
2 
1
3
2
2 3 
1
3
E(x) = 0 x(x2 )dx + 1 x 6x2x
dx + 2 x (x3)
2
2 dx = 8 + 1 + 8 = 1.5

0
1
Mx (t) = 0 ext 12 ex dx + ext 21 ex dx = (t+1)(t1)


 (t)y
1
1
My (t) = 0 ety ey dy = 0 etyy dy = (t)
d(t )y = (t)
e(t)y |
0 =
0 e
1

(t) = (t) t and 0

and MGF uniquely dene the PDF therefore we know that x


Since My (t) = Mx (t) = (t)
has the
same distribution as y and
ey , 0 and 0 x
g(x) =
0,
otherwise

This page intentionally left blank

Chapter

Additional Topics in Probability


EXERCISES 3.2
3.2.1.

(a) P(X = 7) =

10
7
107
7 (0.5) (0.5)

= 120(0.5)7 (0.5)3
= 0.117
(b) P(X 7) = 1 P(8) P(9) P(10)
= 1 0.044 0.010 0.001
= 0.945
(c) P(X > 0) = 1 P(0)
= 1 0.001
= 0.999
(d)

E(X) = 10(0.5) = 5
Var(X) = 10(0.5)(1 0.5) = 2.5

3.2.3.

(a) P(Z > 1.645) = 0.05, so z0 = 1.645.


(b) P(Z < 1.175) = 0.88, so z0 = 1.175.
(c) P(Z < 1.28) = 0.10, so z0 = 1.28.

3.2.5.

(d) P(Z > 1.645) = 0.95, so z0 = 1.645.




20 10
(a) P(X 20) = P Z
= P(Z 2) = 0.9772
5


5 10
(b) P(X > 5) = P Z >
= P(Z > 1) = 0.8413
5


12 10
15 10
(c) P(12 X 15) = P
Z
5
5
= P(0.4 Z 1) = P(Z 1) P(Z < 0.4)
= 0.8413 0.6554 = 0.1859

21
Mathematical Statistics with Applications
Copyright 2009 by Academic Press, Inc. All rights of reproduction in any form reserved.

22 CHAPTER 3 Additional Topics in Probability

(d) P(|X 12| 15) = P(15 X 12 15) = P(3 X 27)




3 10
27 10
=P
Z
5
5
= P(2.6 Z 3.4) = P(Z 3.4) P(Z < 2.6)
= 0.9997 0.0047 = 0.9950
3.2.7.

Let X = the number of people satised with their health coverage, then n = 15 and p = 0.7.
 
10
1510
(a) P(X = 10) = 15
10 (0.7) (0.3)
= 3003(0.7)10 (0.3)5
= 0.206
There is a 20.6% chance that exactly 10 people are satised with their health coverage.
(b) P(X 10) = 1 P(X = 11) P(X = 12) P(X = 13) P(X = 14) P(X = 15)
= 0.515
There is a 51.5% chance that no more than 10 people are satised with their health
coverage.
(c) E(X) = np = 15(0.7) = 10.5

3.2.9.

Let X = the number of defective tubes in a certain box of 400, then n = 400 and
p = 3/100 = 0.03.


400
(a) P(X = r) =
(0.03)r (0.97)400r
r



400
(b) P(X k) = 400
(0.03)i (0.97)400i
i=k
i
(c) P(X 1) = P(X = 0) + p(X = 1) = 0.0000684
(d) Part (c) shows that the probability that at most one defective is 0.0000684, which is
very small.

3.2.11.

p(x) =

e x
x!

0 since > 0 and x 0.

Since each p(x) 0, then


p(x)

x p(x) =



1
2

e x

1 + + +
=
e
=
e
x
x
x!
x!
1!
2!

= e (e ) = 1 here we apply Taylors expansion on e .


This shows that p(x) 0 and
3.2.13.

x p(x)

= 1.

The probability density function is given by

1
,
f (x) = 10
0,

0 x 10
otherwise

Students Solutions Manual 23

Hence,
9
P(5 X 9) =

1
dx = 0.4.
10

Hence, there is a 40% chance that a piece chosen at random will be suitable for kitchen use.
3.2.15.

The probability density function is given by

1 ,
f (x) = 100
0,

0 x 100
otherwise

 80 1
60 100 dx = 0.2.
 100 1
(b) P(X > 90) = 90
dx = 0.1.
100
(c) There is a 20% chance that the efciency is between 60 and 80 units; there is 10% chance
that the efciency is greater than 90 units.
(a) P(60 X 80) =

3.2.17.

Let X = the failure time of the component. And X follows exponential distribution with rate
0.05. Then the p.d.f. of X is given by
f (x) = 0.05e0.05x , x > 0.

Hence,
10


R(10) = 1 F (10) = 1 0.05e0.05x dx = 1 1 e0.5 = e0.5 = 0.607.
0

3.2.19.

The uniform probability density function is given by


f (x) = 1,

0 x 1.

Hence,
P(0.5 X 0.65 and X 0.75)
P(X 0.75)
 0.65
1dx
P(0.5 X 0.65)
0.15
= 0.5
=
= 0.2
=
0.75
P(X 0.75)
0.75
1dx

P(0.5 X 0.65|X 0.75) =

3.2.21.

First, nd z0 such that P(Z > z0 ) = 0.15.

P(Z > 1.036) = 0.15, so z0 = 1.036.


x0 = 72 + 1.036 6 = 78.22
The minimum score that a student has to get to an A grade is 78.22.

24 CHAPTER 3 Additional Topics in Probability




3.2.23.

3.2.25.

3.2.27.


1.9 1.96
2.02 1.96
Z
= P(1.5 Z 1.5) = 0.866
0.04
0.04
P(X < 1.9 or X > 2.02) = 1 P(1.9 X 2.02) = 0.134
13.4% of the balls manufactured by the company are defective.


125 115
(a) P(X > 125) = P Z >
= P(Z > 1) = 0.16
10


95 115
= P(Z < 2) = 0.023
(b) P(X < 95) = P Z <
10
(c) First, nd z0 such that P(Z < z0 ) = 0.95.
P(Z < 1.645) = 0.95, so z0 = 1.645.
x0 = 115 + 1.645 10 = 131.45
(d) There is a 16% chance that a child chosen at random will have a systolic pressure
greater than 125 mm Hg. There is a 2.3% chance that a child will have a systolic pressure less than 95 mm Hg. 95% of this population have a systolic blood pressure below
131.45.
P(1.9 X 2.02) = P

First nd z1 , z2 and z3 such that


P(Z > z1 ) = 0.2, P(Z > z2 ) = 0.5 and P(Z > z3 ) = 0.8

Using standard normal table, we can nd that z1 = 0.842, z2 = 0 and z3 = 0.842.


Then
y1 = 0 + (0.842) 0.65 = 0.5473 x1 = exp(y1 ) = 0.58, similarly we can obtain
x2 = 1 and x3 = 1.73.
For the probability of surviving 0.2, 0.5 and 0.8 the experimenter should choose doses 0.58,
1 and 1.73, respectively.

3.2.29.

(a) MX (t) = E(etX ) =

etx
0

1
=
()

1
x1 ex/ dx
()



1 t
x1 exp
x dx

1
()

 
0

1
=
()

u
1 t

1 t

1

eu

1 t
du by letting u =
x with 1 t > 0
1 t

 
u1 exp(u)du
0

note that the integrand is the kernel density of (, 1)





1
() 1
=
() 1 t
= (1 t) when t < 1 .

Students Solutions Manual 25



(0) = (1 t)1 ()
(b) E(X) = MX
= , and
t=0

(2)
E(X2 ) = M (0) = d [ (1 t)1 ()]
X

dt

t=0


= ( + 1)(1 t)2 ()2 t=0 = ( + 1)2
Then Var(X) = E(X2 ) E(X)2 = ( + 1)2 ()2 = 2 .
3.2.31.

(a) First consider the following product



()() =

u1 eu du


=

x
0

2(1) x2

2xdx

y2(1) ey 2ydv by letting u = x2 and v = y2


2

= 2

v1 ev dv


|x|

21 x2

dx 2

x2

|x|21 e

dx

 
=


|y|

21 y2

dy

noting that the integrands are


even functions

y2

|y|21 e

dy

|x|21 |y|21 e(x

2 +y2 )

dxdy

Transforming to polar coordinates with x = r cos and y = r sin


()() =

2
2
|r cos |21 |r sin |21 er rdrd
0 0


2

2


r 2+22 er rdr (cos )21 (sin )21  d
0

1
2

/2


s+1 es ds 4
(cos )21 (sin )21 d by letting s = r 2


0

/2
= ( + )2
(cos )21 (sin )21 d
0

1
= ( + )2
0

1
t 1/2 (1 t)1/2
dt by letting t = cos2
2 t(1 t)

26 CHAPTER 3 Additional Topics in Probability

1
t 1 (1 t)1 dt

= ( + )
0

= ( + )B(, )

Hence, we have shown that


B(, ) =

1
(b) E(X) =
B(, )

1
xx

(1 x)

()()
.
( + )

B( + 1, )
dx =
B(, )

1
0

x(+1)1 (1 x)1
dx
B( + 1, )

B( + 1, )
( + 1)() ( + )
=
1=
B(, )
( + + 1) ()()
=

E(X2 )

()()
( + )

=
, and
( + )( + ) ()()
+

1
=
B(, )

1
2

x x

(1 x)

B( + 2, )
dx =
B(, )

1
0

x(+2)1 (1 x)1
dx
B( + 2, )

B( + 2, )
( + 2)() ( + )
=
1=
B(, )
( + + 2) ()()
( + 1)()()
( + )
( + 1)
=
.
( + )( + + 1)( + ) ()()
( + )( + + 1)

2
( + 1)

=
Then Var(X) = E(X2 )E(X)2 =
+
.
( + )( + + 1)
( + )2 ( + + 1)
=

3.2.33.

In this case, the number of breakdowns per month can be assumed to have Poisson
distribution with mean 3.
3 1

(a) P(X = 1) = e 1!3 = 0.1494. There is a 14.94% chance that there will be just one
network breakdown during December.
(b) P(X 4) = 1 P(0) P(1) P(2) P(3) = 0.3528. There is a 35.28% chance that
there will be at least 4 network breakdowns during December.
7 3 x

e 3
(c) P(X 7) =
x! = 0.9881. There is a 98.81% chance that there will be at most 7
x=0

network breakdowns during December.


4
3.2.35.

(a) P(1 < X < 4) =


1

1
x21 ex/1 dx =
(2) 12

4

= xex 1
= 0.6442

4
1

4

xex dx

(ex )dx by integration by parts

Students Solutions Manual 27

The probability that an acid solution made by this procedure will satisfactorily etch a
tray is 0.6442.
4
4

4
1
1
11 x/2
(b) P(1 < X < 4) =
x e
dx =
ex/2 dx = ex/2 1 = 0.4712.
1
(1) 2
2
1

The probability that an acid solution made by this procedure will satisfactorily etch a
tray is 0.4712.

EXERCISES 3.3
3.3.1.

(a) The joint probability function is


  

8
x

P(X = x, Y = y) =

6
y

10
4xy
 
,
24
4

where 0 x 4, 0 y 4, and 0 x + y 4.
  

8
3

6
0

10
430
 
(b) P(X = 3, Y = 0) =
= 0.053.
24
4

8
6
10

2
2
x 1 4x1

(c) P(X < 3, Y = 1) =


P(X = x, Y = 1) =
= 0.429.
24
x=0
x=0

4
(d)

y
x

Sum

0.020

0.068

0.064

0.019

0.001

0.172

0.090

0.203

0.113

0.015

0.119

0.158

0.040

0.053

0.032

0.007

Sum

0.289

1 1

1 1
f (x, y)dxdy =

3.3.3.
1 1

0.317
0.085
0.007

0.461

0.217

0.034

c(1 x)(1 y)dxdy = c

1 1

=c x

0.421

1
x2 
2 1


y

1
y2 
2 1

1

= 4c

0.001

(1 x)dx

1

1.00

(1 y)dy

28 CHAPTER 3 Additional Topics in Probability


1 1
Thus, if c = 1/4, then 1 1 f (x, y)dxdy = 1. And we also see that f (x, y) 0 for all x
and y. Hence, f (x, y) is a joint probability density function.
3.3.5.

3.3.7.

By denition, the marginal pdf of X is given by the row sums, and the marginal pdf of Y is
obtained by the column sums. Hence,
xi

otherwise

fX (xi )

0.6

0.3

0.1

yi

otherwise

fY (yi )

0.4

0.3

0.1

0.2

From Exercise 3.3.5 we can calculate the following.


P(X = 1|Y = 0) =

3.3.9.

0.1
P(X = 1, Y = 0)
=
= 0.33.
fY (0)
0.3

(a) The marginal of X is


2
fX (x) =

2
f (x, y)dy =

4
8
xydy = (4x x3 ), 1 x 2.
9
9

1.75



8
4

xydy dx =
4x x3 dx
9
9

1.75


(b) P(1.5 < X < 1.75, Y > 1) =

1.5

1.5


1.75
x4 
4
2x2
= 0.2426.
=
9
4 1.5
3.3.11.

Using the joint density in Exercise 3.3.9 we can obtain the joint mgf of (X, Y ) as
2 2
8
M(X,Y ) (t1 , t2 ) = E(et1 X+t2 Y ) =
et1 x+t2 y xydydx
9
1 x
2



2

2
1
8 t1 x
8 t1 x
x
t2 y
=
xe
e ydy dx =
xe
K et2 x + 2 et2 x dx
9
9
t2
t2
x

where K

e2t2
t22

8
= K
9

(2t2 1)
2
t1 x

xe
1

8
dx
9t2

2

2 (t1 +t2 )x

x e
1


2

x t1 x
1
8
e 2 et1 x 
= K
9
t1
t1
1

8
dx + 2
9t2

2
1

xe(t1 +t2 )x dx

Students Solutions Manual 29


2

x2 (t1 +t2 )x
2x
2
(t
+t
)x
(t
+t
)x

1
2
1
2
e

e
+
e

t1 + t2
(t1 + t2 )2
(t1 + t2 )3
1

2

8
1
x
+ 2
e(t1 +t2 )x
e(t1 +t2 )x 
(t1 + t2 )2
9t2 t1 + t2
1
8

9t2

After simplication we then have


t1 + 3t2 t12 3t22 4t1 t2 + t12 t2 + 2t1 t22 + t23 t1 +t2 (2t2 1)(1 t1 ) t1 +2t2
e
+
e
t22 (t1 + t2 )3
t12 t22

t1 3t2 + 2t12 + 6t22 + 8t1 t2 4t12 t2 + 8t1 t22 4t23
+
t22 (t1 + t2 )3

(2t2 1)(2t1 1) 2t1 +2t2
+
e
t12 t22

M(X,Y ) (t1 , t2 ) =

3.3.13.

(a) fX (x) =

f (x, y) =

y=0

y=0

6xy
n(n + 1)(2n + 1)

2
=

36x2
2

[n(n + 1)(2n + 1)]

y2

y=0

6x2
, x = 1, 2, . . . , n.
n(n + 1)(2n + 1)

2
n
n
n

6xy
36y2
f (x, y) =
=
x2
fY (y) =
2
[n(n + 1)(2n + 1)] x=0
x=0
x=0 n(n + 1)(2n + 1)
=

6y2
, y = 1, 2, . . . , n.
n(n + 1)(2n + 1)

Given y = 1, 2, . . . , n, we have


f (x|y) =

f (x, y)
=
fY (y)

2
6xy
6x2
n(n + 1)(2n + 1)
=
, x = 1, 2, . . . , n.
n(n + 1)(2n + 1)
6y2
n(n + 1)(2n + 1)

(b) Given x = 1, 2, . . . , n, we have




f (y|x) =

3.3.15.

(a) E(XY ) =
(b) E(X) =
E(Y ) =

x,y

x,y

x,y

f (x, y)
=
fX (x)

xy f (x, y) =

x f (x, y) =

2
6xy
6y2
n(n + 1)(2n + 1)
=
, y = 1, 2, . . . , n.
n(n + 1)(2n + 1)
6x2
n(n + 1)(2n + 1)
3

xy f (x, y) =

x=1 y=1
3

x f (x, y) =

5
, and
3

y f (x, y) =

11
.
6

x=1 y=1

y f (x, y) =

x=1 y=1

35
.
12

30 CHAPTER 3 Additional Topics in Probability

35 5 11
5

= .
12 3 6
36
3

3 
2

5
(c) Var(X) = [x E(X)]2 f (x, y) =
x 53 f (x, y) = , and
9
x,y
x=1 y=1
Then, Cov(X, Y ) = E(XY ) E(X)E(Y ) =

Var(Y ) =

x,y

[y E(Y )]2 f (x, y) =

Then, XY =
3.3.17.

3 

y
x=1 y=1


11 2
6

f (x, y) =

23
.
36

Cov(X, Y )
5/36
=
= 0.233.
(5/9)(23/36)
Var(X)Var(Y )

Assume that a and c are nonzero.


Cov(U, V ) = Cov(aX + b, cY + d) = acCov(X, Y ),
Var(U) = Var(aX + b) = a2 Var(X), and Var(V ) = Var(cY + d) = c2 Var(Y ).
Cov(U, V )
acCov(X, Y )
ac
XY
= 
= 
2
2
Var(U)Var(V )
a Var(X)c Var(Y )
(ac)2

ac
,
if ac > 0
.
=
XY = XY
XY , otherwise
|ac|

Then, UV =

3.3.19.

We st state the famous CauchySchwarz inequality:



|E(XY )| E(X2 )E(Y 2 ) and the equality holds if and only if there exists some constant
and , not both zero, such that P( |X|2 = |Y |2 ) = 1.
Now, consider




 Cov(X, Y ) 
 = 1 |Cov(X, Y )| = Var(X)Var(Y )
|XY | = 1 

Var(X)Var(Y )

|E(X X )(Y Y )| = E(X X )2 E(Y Y )2

By the CauchySchwarz inequality we have




P |X X |2 = |Y Y |2 = 1

P(X X = K(Y Y )) = 1 for some constant K


P(X = aY + b) = 1 for some constants a and b.
3.3.21.

(a) First, we compute the marginal densities.


fX (x) =



f (x, y)dy = ey dy = ex , x 0, and
x

y
fY (y) =

y
f (x, y)dx =

ey dx = yey , y 0.

Students Solutions Manual 31

For given y 0, we have the conditional density as


f (x|Y = y) =

f (x, y)
ey
1
=
= ,
1 y
fY (y)
y
e
y

0 x y.

y
Then, (X|Y = y) follows Uniform(0, y). Thus, E(X|Y = y) = .
2


 y

 
y dx dy = 1 y3 ey dy = 3,
(b) E(XY ) = y x xy f (x, y)dxdy = 0
xye
0
0 2

 x
E(X) = x x fX (x)dx = 0 xe dx = 1, and


E(Y ) = y y fY (y)dy = 0 y2 ey dy = 2.
Then, Cov(X, Y ) = E(XY ) E(X)E(Y ) = 3 1 2 = 1.
(c) To check for independence of X and Y
fX (1)fY (1) = e2 = e1 = f (1, 1).

Hence, X and Y are not independent.


3.3.23.

Let 2 = Var(X) = Var(Y ). Since X and Y are independent, we have Cov(X, Y ) = E(XY )
E(X)E(Y ) = E(X)E(Y ) E(X)(Y ) = 0. Then, Cov(X, aX + Y ) = aCov(X, X) + Cov(X, Y ) =
aVar(X) = a 2 , and Var(aX + Y ) = a2 Var(X) + Var(Y ) = (a2 + 1) 2 . Thus, X, aX+Y =
a
Cov(X, aX + Y )
a 2
= 
.

2
2
2
2
Var(X)Var(aX + Y )
a +1
(a + 1)

EXERCISES 3.4
3.4.1.

The pdf of X is fX (x) =

1
a

if 0 < x < a and zero otherwise.


yd



c
yd
FY (y) = P(Y < y) = P(cX + d < y) = P X <
=
fX (x)dx
c

1,

= yd,

ac
0,

Then, fY (y) =
3.4.3.

yd
1,

if
a

c
yd
yd
,
if 0 <
<a =

c
ac
0,
otherwise

if y ac + d
if d < y < ac + d
otherwise

dFY (y)
1
=
if d < y < ac + d and zero otherwise.
dy
ac

Let U = XY and V = Y .
Then X = U/V and Y = V , and

 x



J =  u
 y

u

x   1 u 

 
v  =  v v2  = 1 .
 

y  

v

 0 1
v

32 CHAPTER 3 Additional Topics in Probability

Then the joint pdf of U and V is given by



 u  1 
fU,V (u, v) = fX,Y
, v |J| = fX,Y
, v  
v
v
v
u

Then the pdf of U is given by





fU (u) =

fU,V (u, v)dv =

 1 
, v   dv.
v
v

u

3.4.5.

fX,Y

The joint pdf of (X, Y ) is




1
fX,Y (x, y) =
e
21 2

2
x2
+ y
212 222

, < x < , < y < ; 1 , 2 > 0.

We can easily show that the marginal densities are



fX (x) =

f (x, y)dy =

1
21


fY (y) =

x2
1

, < x < , and

f (x, y)dx =

1
22

y2

, < y < .





This implies that X N 0, 12 and Y N 0, 22 .
Also, notice that fX,Y (x, y) = fX (x)fY (y) for all x and y, thus X and Y are independent.
By the denition of Chi-Square distribution and the independency of X and Y , we know
that 12 X2 + 12 Y 2 2 (2). Therefore, the pdf of U = 12 X2 + 12 Y 2 is
1

fU (u) =

3.4.7.

1 u
e 2,
2

u > 0.

Let U = X + Y and V = Y .
Then X = U V and Y = V , and

 x

 u
J = 
 y

u


x  

 

v  1 1
=
 = 1.

y  0 1 

v

Then the joint pdf of U and V is given by


fU,V (u, v) = fX,Y (u v, v) |J| = fX,Y (u v, v).

Thus, the pdf of U is given by fU (u) =

fX,Y (u v, v)dv.

Students Solutions Manual 33

3.4.9.

(a) Here let g(x) =

x
d 1
, and hence, g1 (z) = z + . Thus,
g (z) = . Also,

dz
 

2
1 x
1
fX (x) =
e 2 , < x < . Therefore, the pdf of Z is fZ (z) = fX (g1 (z))
2


 d 1  1 12 z2
, < z < , which is the pdf of N(0, 1).
 dz g (z) = 2 e

(b) The cdf of U is given by










(X )2
X
FU (u) = P(U u) = P

u
=
P

u
=P uZ u

u
=

1 2
e 2 z dz

u
=2
0

1 2
1
e 2 z dz, since the integrand is an even function.
2

d
2
1
1
u
1
u
FU (u) = e 2 = u 2 e 2 , u > 0 and
du
2
u
2
2
zero otherwise, which is the pdf of 2 (1).
Hence, the pdf of U is fU (u) =

3.4.11.

Since the support of the pdf of V is v > 0, then g(v) = 12 mv2 is a one-to-one function on


d 1
1
m 2
the support. Hence, g1 (y) = 2y
m . Thus, de g (y) = 2 2y m . Therefore, the pdf of E is
given by



d

2y
2y
1
c 2y 2y
f (y) = fV (g1 (y))  g1 (y) = c e m
=
e m ,
dy
m
2my
m3

3.4.13.

y > 0.

Y 
Let U = X2 + Y 2 and V = tan1 X
. Here U is considered to be the radius and V is the
angle. Hence, this is a polar transformation and hence is one-to-one.
Then X = U cos V and Y = U sin V , and

 x

 u
J = 
 y

u


x  

 

v  cos v u sin v
=
 = u cos2 v + u sin2 v = u.

y  sin v u cos v 

v

Then the joint pdf of U and V is given by


fU,V (u, v) = fX,Y (uv, v) |J| =
=

3.4.15.



1
1 
exp
2 u2 cos2 v + u2 sin2 v u.
2
2
2

u u22
e 2 , u > 0, 0 v < 2.
2 2

The joint pdf of (X, Y ) is


fX,Y (x, y) = fX (x)fY (y) =

1 x+y
e 2 , x, y > 0.
4

34 CHAPTER 3 Additional Topics in Probability

Apply the result in Exercise 3.4.14 with = 2. We have the joint pdf of U =
as fU,V (u, v) = 12 e(u+v) , v > 2u, v > 0. Thus, the pdf of U is given by

XY
2

and V = Y


fU (u) =

fU,V (u, v)dv


v

1 (u+v)
1

e
dv = eu , u 0

2
2

1 (u+v)
1

e
dv = eu , u < 0.

2
2
2u

EXERCISES 3.5
3.5.1.

(a) Note that X follows (5, 5), then apply the result in Exercise 3.2.31 we have = 1/2
and 2 = 1/44. From the Chebyshevs theorem
P( K < X < + K) 1

3.5.3.

1
.
K2

Equating K to 0.2 and + K to 0.8 with = 1/2 and =


1
obtain K = 2. Hence, P(0.2 < X < 0.8) 1 2 = 0.75.
2
 0.8
(b) P(0.2 < X < 0.8) = 0.2 630x4 (1 x)4 dx = 0.961.

2 = E(X )2 = (x )2 f (x)

xK

(x )2 f (x) +

xK

K<x<+K

(x )2 f (x) +

(x )2 f (x) +

1/44 = 0.15, we

(x )2 f (x)

x+K

(x )2 f (x)

x+K

Note that (x )2 K2 2 for x K or x + K. Then the above inequality can


be written as

2 K2 2

f (x) +

xK

f (x)

x+K

= K2 2 [P(X K) + P(X + K)]


= K2 2 P(|X | K)

This implies that P(|X | K)

1
K2

or P(|X | < K) 1

1
.
K2

Students Solutions Manual 35

3.5.5.

Apply Chebyshevs theorem we have






 Xn
E(Xn np)2
P 
p < 0.1 = P(|Xn np| < 0.1n) 1
n
(0.1n)2
=1

Var(Xn )
np(1 p)
=1
0.01n2
0.01n2

= 1 100

p(1 p)
100 1
1
1
, since p(1 p) .
n
n
4
4

This implies that we want to nd n such that


1

3.5.7.

25
100 1
= 0.9
= 0.1 n = 250.
n
4
n

Let X1 , . . . , Xn denote each toss of coin with value 1 if head occurs and 0 otherwise. Then,
X1 , . . . , Xn are independent variables which follow Bernoulli distribution with p = 1/2.
Thus, E(Xi ) = 1/2, and Var(Xi ) = 1/4. For any > 0, from the law of large numbers
we have



 Sn
Sn
1
1
P 
 < 1 as n , i.e.
will be near to for large n.
n
2
n
2

If the coin is not fair, then the fraction of heads,


getting head for large n.
3.5.9.

Sn
n ,

will be near to the true probability of

Note that E(Xi ) = 0, and Var(Xi ) = i. Hence, X1 , . . . , Xn are not identically distributed.
Thus, the conditions of the law of large numbers stated in the text are not satised. There
is a weaker version of the weak law of large numbers which requires only E(Xi ) = , and
Var(X) 0 as n . However, in this case
n

n
n

Xi
Var(Xi )
i
n(n+1)

1
i=1 i=1
i=1
2
Var(X) = Var
=
=

as n .
=
2
2
2
n
2
n
n
n

Therefore, the conditions of the weaker version are not satised, either.
3.5.11.

First note that E(Xi ) = 2 and Var(Xi ) = 2. From the CLT,

X100 2
follows approximately

2/ 100

N(0, 1). Hence, we have




X100 2
22
P(X100 > 2) = P
>

2/ 100
2/ 100

3.5.13.


P(Z > 0) = 0.5, where Z N(0, 1).

First note that E(Xi ) = 1/2 and Var(Xi ) = 1/12. Then, by CLT we know that Zn =
X1/2

1/12/ n

approximately follows N(0, 1) for large n.

S
n n/2
n/12

36 CHAPTER 3 Additional Topics in Probability

3.5.15.

From the Chebyshevs theorem


P( K < X < + K) 1

1
.
K2

Equating K to 104 and + K to 140 with = 122 and = 2, we obtain K = 9.


Hence,
1
P(104 < X < 140) 1 2 = 0.988.
9

3.5.17.

Let Xi = 1 if the ith person in the sample is color blind and 0 otherwise. Then each
Xi follows Bernoulli distribution with estimated probability 0.02, and E(Xi ) = 0.02 and

Var(Xi ) = 0.0196. Let Sn = ni=1 Xi .


Sn /n0.02
We want P(Sn 1) = 0.99. By the CLT,
follows approximately N(0, 1). Then,
0.0196/n


1/n 0.02
0.99 = P(Sn 1) P Z
.
0.0196/n

3.5.19.

Using the normal table, 1/n0.02


= 2.33. Solving this equation, we have n = 359.05.
0.0196/n
Thus, the sample size must be at least 360.
1
100
Let X1 , . . . , X100 be iid with = 1 and 2 = 0.04. Let X = 100
i=1 Xi .
By the CLT,

X1
0.04/100

follows approximately N(0, 1). Then,



11
0.99 1
P(0.99 X 1) P
Z
= P(0.5 Z 0) = 0.1915.
0.04/100
0.04/100

3.5.21.

Let Xi = 1 if the ith dropper in the sample is defective and 0 otherwise. Then each Xi follows
500
Bernoulli distribution with estimated probability 10000
= 0.05, and E(Xi ) = 0.05 and

125
125 1250.05
follows approximately
Var(Xi ) = 0.0475. Let S125 = i=1 Xi .From the CLT, S
0.0475125
N(0, 1). Hence, we have


2 125 0.05
P(S125 2) P Z
= P(Z 1.744) = 0.041.
0.0475 125

Chapter

Sampling Distributions
EXERCISES 4.1
4.1.1.

 
(a) There are 53 = 10 equally likely possible samples of size 3, so the probability for each
is 1 10 without replacement:
X
1
2 3
1 3
1 3
0

(2, 1, 0)
(2, 1, 1)
(2, 1, 2)
(2, 0, 1)
(2, 0, 2)
(2, 1, 2)
(1, 0, 1)
(1, 0, 2)
(1, 1, 2)
(0, 1, 2)

13

0
13
23

M
1
1
1
0
0
1
0
0
1
1

S
1
21 3
41 3
21 3
4
41 3
1
21 3
21 3
1

(i)
X
p(X)

2/3

1/3

1/3

2/3

1/10

1/10

2/10

2/10

2/10

1/10

1/10

M
p(M)

1
3/10

0
4/10

1
3/10

(ii)

(iii)
S
p(S)

1
3/10

73

4/10

2
1/10

13 3

2/10

37

38 CHAPTER 4 Sampling Distributions

(iv)








 

 
E X = (1) 1 10 + 2 3 1 10 + 1 3 1 10 + 0 1 10 + 1 3 2 10



 
+ 2 3 1 10 + (1) 1 10 = 0


2 

2 



 2
2 2
1
1
1
+

+
0
E X = (1)2 1 10 + 2 3
10
3
10
10
 2 



 2 
2 1
2
2
1
1
+ 13
10 + 3
10 + 1
10 = 3
 
 2   2
1
1
Var X = E X EX = 02 =
3
3

(b) We can get 53 = 125 samples of size 3 with replacement


4.1.3.

Population: {1, 2, 3}. p(x) = 1 3, for x in {1, 2, 3}


N
N

(a) = N1
ci = 2, 2 = N1
(ci )2 = 2 3
i=1

i=1

(b)
Sample

Sample

Sample

(1, 1, 1)

(2, 1, 1)

11 3

(3, 1, 1)

12 3

(1, 1, 2)

11 3

(2, 1, 2)

12 3

(3, 1, 2)

(1, 1, 3)

12 3

(2, 1, 3)

(3, 1, 3)

21 3

(1, 2, 1)

11 3

(2, 2, 1)

12 3

(3, 2, 1)

(1, 2, 2)

12 3

(2, 2, 2)

(3, 2, 2)

21 3

(1, 2, 3)

(2, 2, 3)

21 3

(3, 2, 3)

22 3

(1, 3, 1)

12 3

(2, 3, 1)

(3, 3, 1)

21 3

(1, 3, 2)

(2, 3, 2)

21 3

(3, 3, 2)

22 3

(1, 3, 3)

21 3

(2, 3, 3)

22 3

(3, 3, 3)

11 3

12 3

21 3

22 3

1/27

1/9

2/9

7/27

2/9

1/9

1/27

X
p(X)

 
 

2
(c) E X = X x p(x) = 2, E X = X x2 p(x) = 42 9, then Var X = 2 9
4.1.5.

Since

(xi x)2 =

xi2 nx2 , we have E(S )2 =

1
n

EXi2

nEX
n

Assuming the sampling from a population with mean and variance 2 , we have

 2
1 
E S = n 2 + 2
n

2
+ 2
n

Students Solutions Manual 39


2
= 2
n


 
n1
=
2 < 2 = E S2
n

4.1.7.

Let X be the weight of sugar X N( = 5 lb, = 0.2 lb)


n

Then X = 1n
Xi is the mean weight, where n = 15.
i=1

2
n .

Then X N(5, 0.22 15), and X5


=Z
0.22 15

2
N(0,
1 ). Therefore, the probability requested is P(0.2 < X 5 < 0.2) = P( 15 < Z <
15) = 0.9999
By Corollary 4.2.2, E(X) = , and Var(X) =

4.1.9.

Let X be the height. X N( = 66, 2 = 22 ), and

170) = P(Z > 2 26) = 0

X66

2 26

= Z N(0, 12 ). Then, P(X >

4.1.11.

2
Let X be the time. X N( = 95, 2 = 102 ). Then X95
10 = Z N(0, 1 ). Therefore,
P(X < 85) = P(Z < 1) = 0.8413, or 84.13% of measurement times will fall below
85 seconds.

4.1.13.

According the information, = 215 and = 35.






= 0.0007
(a) If n = 55, we can assume X N(, ), then P X > 230 = P Z > 230215
35 55




(b) If n = 200, we can assume X N(, ), then P X > 230 = P Z > 230215
=0
35 200




(c) If n = 35, we can assume X N(, ), then P X > 230 = P Z > 230215
= 0.0056
35 35
(d) Increasing the sample size, decrement the probability

4.1.15.

Let T be the temperature.


Since n = 60, we assume T N(98.6, 0.952 ). Then T N(98.6, 0.952 60). Therefore, P(T
99.1) = 0

EXERCISES 4.2
4.2.1.

2
We have that Y (15)
(a) We can see, for example in a table, that P(Y 6.26) = 0.025. Then y0 = 6.26
(b) Choosing upper and lower tail area to 0.025, and since P(Y 27.5) = 0.975, and
2
= 27.5, a =
P(Y 6.26) = 0.025, then P(a < Y < b) = 0.95, then b = 0.975,15
2
0.025,15 = 6.26
(c) P(Y 22.307) = 1 P(Y < 22.3) = 0.10

4.2.3.

2 (k = v/2, = 2). In our case T (1, 2), then T =


If X (k, ), then X (v)

(n, 2/n)
(a) With n = 3, T (3, 2/3)
(b) P(T > 2) = 0.4232

1
n

i=1

40 CHAPTER 4 Sampling Distributions

4.2.5.

Since X1 , X2 , . . . , X5 are i.i.d. N(55, 223), then Y =



2
n X55

i=1


2
n X55

(Xi 55)2
223

2
(5)

2 , Z is Chi-square distributed with 4 degrees


(a) Since Z = Y 223 , and 223 (1)
of freedom and Y is Chi-square distributed with 5 degrees of freedom

(b) Yes
(c) (i) P(0.62 Y 0.76) = 0.0075 (ii) P(0.77 Z 0.95) = 0.0251
2

4.2.7.

2
(n1)
.
Since the random sample comes from a normal distribution, (n1)S
2
Setting the upper and lower tail area equal to 0.05, even this is not the only choices, and using
2
= 0.95,14
= 23.68,
a Chi-square table with n 1 = 14 degrees of freedom, we have (n1)b
2
(n1)a
2
and 2 = 0.05,14 = 6.57. Then, with = 1.41, b = 3.36, and a = 0.93

4.2.9.

Since T t8
(a) P(T 2.896) = 0.99
(b) P(T 1.860) = 0.05
(c) Since t-distribution is symmetric, we nd a such that P(T > a) =

4.2.11.

4.2.13.
4.2.15.

0.01
2 .

Then a = 3.35

According with the information, = 11.4, n = 20, y = 11.5, and s = 2, then t = sy


=
n
0.224. The degrees of freedom are n 1 = 19, so the critic value is 1.328 at = 0.05-level.
Then, the data tend to agree with the psychologist claims.


 
 
2 , then X  = v , = 2 , then E(X) = v (2) = v and Var(X) = v
If X (v)
2
2
2
(2)2 = 2v
If X1 , X2 , . . . , Xn is from N(, 2 )
(n1)S 2
2

2
is from (n1)
&
2
then, by Exercise 4.2.13, Var (n1)S
= 2(n 1)
2

then, by Theorem 4.2.8,

Since Var(aX) = a2 Var(X),

(n1)2
Var(S 2 )
4

Simplifying after multiplying by

4
,
(n1)2

= 2(n 1)

we obtain Var(S 2 ) =

2 4
n1

4.2.17.

If X and Y are independent random variables from an exponential distribution with com2 and 2Y 2 then
mon parameter = 1, then using 4.2.16 with n = 1, 2X (2)
(2)
X
2X
=

F
(2,
2)
Y
2Y

4.2.19.

If X F (9, 12)
(a) P(X 3.87) = 0.9838
(b) P(X 0.196) = 0.01006
(c) F0.975 (9, 12) = 0.025 then F0.975 = 3.4358.

1
, where
0.025 = P(X < F0.975 ) = P X1 > F0.975
Then

1
F0.975

1
X

F (12, 9)

= 3.8682 and F0.975 = 0.258518. Thus, a = 0.2585, b = 3.4358

Students Solutions Manual 41

4.2.21.

If X F (n1 , n2 ) the PDF is given by

 
 n1 1 


2
n1
n1 (n1 +n2 )/2

[(n1 + n2 )/2] n1
x
1+
x
,
n2
n2
f (x) = (n1 /2)(n2 /2) n2

0,

0<x<
otherwise

Then
 
 n1 1 


2
n1 (n1 +n2 )/2
[(n1 + n2 )/2] n1
n1
EX = x
x
1+
x
dx
(n1 /2)(n2 /2) n2
n2
n2
0

   n1 1 


n1
[(n1 + n2 )/2] n1
n1 2
n1 (n1 +n2 )/2
x 2 1+
x
dx
(n1 /2)(n2 /2) n2
n2
n2
0


1
 1
 1
Let y = 1 1 + nn12 x
then x = nn12
y(1 y)1 and dx = nn12
(1 y)2 dy and


1
lim 1 1 + nn12 x
= 1, then
x

[(n1 +n2 )/2]


EX = (n
1 /2)(n2 /2)

n1
n2

n2 
2

n2
n1

n2 +1 1
2

n2
2

(1 y)n1 2 dy, which converges for n1 > 2.

1
 1 x
For > 0, > 0 y1 (1 y)1 dy = ()()
e dx with the
0 x
(+) , where () =
0

property () = ( 1)( 1).


Then EX =

[(n1 + n2 )/2]
(n1 /2)(n2 /2)

n1
n2

 n2 
2

n2
n1

 n2 +1
2

[(n1 + n2 )/2]
n2
n1 (n1 /2)(n2 /2 1)(n2 /2 1)
n2
, n2 > 2
=
n2 2
=

(n1 /2 + 1)(n2 /2 1)
[(n1 + n2 )/2]
 n (n /2)(n /2 1)
1

[(n1 + n2 )/2]

Similarly,
[(n1 + n2 )/2]
EX2 =
(n1 /2)(n2 /2)

n1
n2

 n1 
2

n2
n1

 n2 +2 1
2

n1
2

(1 y)

n2
2 3 dy,

which converges for n2 > 4

 2
[(n1 + n2 )/2]
n2
(n1 /2 + 1)(n1 /2)(n1 /2)(n2 /2 2)
(n1 /2)(n2 /2 1)(n2 /2 2)(n2 /2 2) n1
[(n1 + n2 )/2]
 2
n1 (n1 + 2)
n2
=
, n2 > 4.
n1
(n2 2)(n2 4)
=

Now, Var(X) = EX2 (EX)2 . Therefore,


EX =

n2
, n2 > 2
n2 2

and

VarX =

n22 (2n1 + 2n2 4)


n1 (n2 2)2 (n2 4)

42 CHAPTER 4 Sampling Distributions

4.2.23.

If X1 , X2 , . . . , Xn1 is a random sample from a normal population with mean 1 and variance 2 and if Y1 , Y2 , . . . , Yn2 is a random sample from an independent normal population



(n 1)S 2
2
2
1
1
with mean 2 and variance 2 , then X N 1 , n1 , Y N 2 , n2 ,

2
2
(n2 1)S2
2
2
(n
, and
(n
.
1 1)
2 1)
2


2
2
(n 1)S 2
(n 1)S 2
2
Then X Y N 1 2 , n1 + n2 and 1 2 1 + 2 2 2 (n
1 +n2 2)
then

XY
' (1 2 )
2 2
n +n
1

N(0, 12 ) and

(n1 1)S12
2

(n2 1)S22
2

2
(n
1 +n2 2)

Then, since the samples are independent, we have by denition that


X Y (1 2 )
(
2
2
+
n1
n2
T(n1 +n2 2)
),
*
(n2 1)S22
* (n1 1)S12
+
+
2
2
[n1 + n2 2]

This after simplication becomes:


(

4.2.25.

X Y (1 2 )

 T(n1 +n2 2)
(n1 1)S12 + (n2 1)S22 1
1
+
n1 + n2 2
n1
n2

Q.E.D.

2 with v > 0, then the pdf of X is given by


If X (v)

f (x) =

1
ex/2 xv/21 ,
(v/2)2v/2

0,

0<x<
x0

Then, by denition of MGF,


1
MX (t) =
(v/2)2v/2



ex(t1/2) xv/21 dx =
0

= (1 2t)v/2 ,

v/2  w v/21
1
e w
dw
1 2t
(v/2)
0

t<

1
2


(t) = (1 2t)v/21 , then EX = M (t) |
Since (v/2) = 0 ew wv/21 dw and MX
t=0 = v
X


v/22
, then EX2 = MX (t) |t=0 = v2 + 2v
MX (t) = (v + 2)(1 2t)
Therefore, VarX = EX2 (EX)2 = 2v

Students Solutions Manual 43

4.2.27.

Let X be a random variable with PDF


f (x) =

2
,
(1 + x2 )

0,

0<x<
otherwise

X does not follows T1 distribution (i.e. X is not t(1) distributed), even X2 F (1, 1).
Therefore, X2 F (1, n) does not (necessarily) imply X t(n)

EXERCISES 4.3
4.3.1.

f (x) =

1 x/10
,x
10 e

> 0, then the cumulative distribution of 1 is


t
F1 (t) =

t
1 x/10

e
dx = ex/10  = 1 et/10 .
0
10

Let Y represent the life length of the system, then Y = min(1 , 2 ) and FY (y) = 1
[1 Fi (y)]2 , then the pdf of Y is of the form fY (y) = 2fi (y)[1 Fi (y)] and is given by

ey/5 ,
fy (y) = 5

0,

0<y<
otherwise

4.3.3.

X1 , X2 take values 0, 1; X3 take values 1, 2, 3, and Y1 = min {X1 , X2 , X3 }


Since the values of X1 , X2 are less or equal to the values for X3 , Y1 take values 0, 1
Since the values of X3 are greater than the values for X1 , X2 , then Y3 = max {X1 , X2 , X3 }
take values 1, 2, 3
Since Y1 Y2 Y3 , Y2 take values 0, 1

4.3.5.

Let X1 , X2 , . . . , Xn be a random sample from exponential distribution with mean , then


x
the common pdf is given by f (x) = 1 e , if x > 0
Using Theorem 4.3.2, the pdf of the k-th order statistic is given by
fk (y) = fYk (y) =

y
n!
f (y)(F (y))k1 (1 F (y))nk , where F (y) = 1 e
(k 1)!(n k)!

&
% y &nk
y k1
1 e
e
% y &n1
ny
Then, the pdf of Y1 is f1 (y) = nf (y) e
= n e , which is the pdf of an exponential
distribution with mean n , and the pdf of Yn is
then fk (y) =

n!
(k1)!(nk)! f (y)

fn (y) = nf (y)[F (y)]n1


ny &n1
n y%
= e 1 e

44 CHAPTER 4 Sampling Distributions

4.3.7.

X1 , . . . , Xn a random sample are i.i.d with pdf f (x) = 12 , 0 x 2


x
then F (x) = 0 21 dx = 2x , if 0 x 2

1, x > 2

x
then F (x) =
, 0x2

2
0, x < 0
Then, using Theorem 4.3.3 the joint pdf of Y1 and Yn is given by

% x &11  1
n!
1 n11 %
y &nn 1 1
fY1 ,Yn (x, y) =
y x
1

(1 1)!(n 1 1)!(n n)! 2
2
2
2
2 2
=

n(n 1)
(y x)n2 , if 0 x < y 2 and fY1 ,Yn (x, y) = 0, otherwise.
2n

Now, let R = Yn Y1 and Z = Yn , and consider the functions r = yn y1 , z = yn , then


their inverses are y1 = z r, yn = z.
Then the corresponding Jacobian of the one-to-one transformation is

 y1

 r
J = 
 yn

r


y1  

z  1
=
yn  0

z


1
 = 1
1

Then the joint pdf of R and Z is


g(r, z) = |1|fY1 ,Yn (z r, z)
=

n(n 1) n2
r
, if 0 r z 2, and g(w, z) = 0, otherwise.
2n

Then, the pdf of the range R = Yn Y1 is


2
fR (r) =

g(r, z)dz =
r

4.3.9.

n(n 1)(2 r)r


, if 0 r 2 and fR (r) = 0, otherwise.
2n

X1 , . . . , Xn a random sample from N(10, 4)


P(Yn > 10) = 1 P(Yn 10)

The CDF Fn (y) of Yn is [F (y)]n , where F (y) is the cdf of X evaluated in y


Then, P(Yn y) = Fn (y) = [F (y)]n = [P(X y)]n


and P(X y) = P Z y10
2
% 
&n
Then P(Yn y) = P Z y10
2
Then P(Yn > y) = 1 P(Y
% n < y)
=1 P Z

y10
2

n &

Students Solutions Manual 45

Therefore P(Yn > 10) = 1 [P(Z 0)]n


= 1 (0.5)n
4.3.11.

X1 , . . . , Xn is a random sample from Beta(x = 2, = 3)


The joint pdf of Y1 and Yn , according Theorem 4.3.3, is given by
fY1 ,Yn (x, y) =

n!
[F (x)]11 [F (y) F (x)]n11 [1 F (y)]nn f (x)f (y)
(1 1)!(n 1 1)!(n n)!

= n(n 1)[F (y) F (x)]n2 f (x)f (y),

if

x<y

Since Xi Beta(X = 21 = 3) for i = 1, 2, . . . , n, the pdf is


f (x) =

( + ) 1

(1 x)1 ,
()()

x [0, 1]

and, the DF is

0,

( + ) 
x 1
(1 t)1 dt,
F (x) =
0 t
()()

1,

x0
0x1
x1

In our case,
f (x) =

(5)
4!
x21 (1 x)31 =
x(1 x)2 = 12x(1 x)2 ,
(2)(3)
1!2!

if

0x1

and
x
F (x) =

,
12t(1 t)2 dt = 12

2x3
x4
x2

+
,
2
3
4

if

0x1

Then, the joint pdf fY1 ,Yn (x, y), using the 4.3.3, is given by
,



-n2
2y3
y4
2x3
x4
y2
x2
fY1 ,Yn (x, y) = n(n 1) 12

+
10

+
12x(1 x)2 12y(1 y)2
2
3
4
2
3
4


 2
 1
 n2
1 2
= 12n n(n 1)
y x2 y 3 x3 + y 4 x4
xy(1 x)2 (1 y)2 ,
2
3
4

if 0 x y 1, and fY1 ,Yn (x, y) = 0, otherwise

EXERCISES 4.4
4.4.1.

2 = 4, then = 8 and 2 = 4
X1 , X2 , . . . , Xn , where n =150, = 8, 
x
x

X x
2
z
By Theorem 4.4.1: lim P Z
= 1 e 2 du
2k
n
x


108
= 0.44
then P(7.5 < X < 10) = P 758
2 <Z< 2

46 CHAPTER 4 Sampling Distributions

4.4.3.

Let T be the time spent by a customer coming to certain gas station to ll up gas
Suppose T1 , T2 , . . . , Tn are independent random variables, with t = 3 minutes, t2 = 1.5
minutes, and n = 75
n

Ti is the total time spent by the n customers


Then Y =
i=1


Since, Y = 3 hours = 180 minutes, P(Y < 180) = P Y < 180
= P(Y < 2.4)
n


1.5
and by Theorem 4.4.1, Y N 3,
75
For practical purposes n = 60 is large enough


2.4 3
Therefore, P(Y < 180) = P Z <
= 0, where Z N(0, 12 )
0.02
There is 0% chance that the total time spent by the customers is less than 3 hours.

4.4.5.

1250 students took it, and = 69%, = 5.4%, and n = 60 students (is large enough).
Then, x = 69 and x = 5.4 = 0.6997137
60


Then P(X 75.08) = P Z 75.0869
0.697137 = 1
There is almost 100% chance that the average score is less than 75.08%.

4.4.7.

Xi N(1 , 12 ) for i {1, 2, . . . , n}, then X N(1 , 12 /n)


Yj N(2 , 22 ) for j {1, 2, . . . , m}, then Y N(2 , 22 /m)
Therefore,


2
2
X Y N 1 2 , 1 + 2
n
m

4.4.9.

X Binomial (n = 20, p = 0.2)


P(X 10) =

10




20 0.2x (1 0.2)20x = 0.9944
x

x=0

Using normal approximation:




10 np + 0.5
P(X 10) = P Z
= 0.99986
np(1 p)

4.4.11.

q = 6% of person making reservations will not show up each day


Rental company reserves for n =215 persons
200 automobiles available.
p = 0.94 is the probability of the person making reservation will show up each day
Let X be the number of the persons making reservation will show up.
Then X Binomial(n = 215, p = 0.94)


The probability requested is P(X 200) = P Z

200
np + 0.5
np(1 p)

= 0.3228

Students Solutions Manual 47

4.4.13.

SIDS occurs between the ages 28 days and one year


Rate of death due to SIDS is 0.0013 per year.
Randon sample of 5000 infants between the ages 28 day and are your
Let X be the number of SIDS related deaths
p = 0.00103, n = 5000
Then X Bin(n = 5000, p = 0.00103)
The probability requested is


10 np 0.5
P(X > 10) P Z >
= 0.0274
np(1 p)

This page intentionally left blank

Chapter

Point Estimation
EXERCISES 5.2
5.2.1.

The pdf of a geometric distribution is:


f (x) = p(1 p)x1

for x = 1, 2also, = p1 .

By the methods of moments E(X) = x =

1
p

p
=x

(b) For the given data

5.2.3.

x = n i = 2+4++22+12
= 16.11
18
p
= 16.11

Xi U( 1, + 1) By denition:
f (x) =

1
+1+1 ,

0,

if 1 < x < + 1

otherwise.

(a) By denition
E(x) = x =

1++1
2

x=

This implies the moment of estimator for = x (b) = x


11.72+12.81+12.09+13.47+12.37
4

= 53.18
5.2.5.

The pdf of the exponential function is given as:


f (x) =

5.2.9.

e(x) ,
0,

if x 0

otherwise.

Similar as above

49

50 CHAPTER 5 Point Estimation

5.2.11.

We have
E(X) = = x

2


Variance( 2 ) = 1n
Xi X
.
2
/
2 = 1n
xi x2
( 

2 
Xi
1

2
= n
Xi
n
The method of moments
(


2 
Xi
1

2
X
i
n
n
5.2.13.

estimator

for

is

given

by

T (X1 , . . . . . . .Xn ) =

The method of moments estimator for = T (X1 , . . . . . . .Xn ) = E(X) = X


T (X1 , . . . . . . .Xn ) = X
2


Xi X
Since and 2 both are unknown. 2 = 1n
2


Xi X ,
This implies 2 = (n1).1
n.(n1)
2
2 = n1
n s
n1
Let s 2 = n s2
method of moments estimator for 2 is given by:
2


s 2 = 1n ni=1 Xi X

EXERCISES 5.3
5.3.1.

n x
nx
x p (1 p)

f (x) =

L(p, x1 , x2 , . . . .xn ) = log

 n  1
0 n
i=1

xi



Xi log p + n
Xi log(1 p)

Xi
n Xi
+
(1)
p
1p

Xi
n Xi

log L(p, X1 , X2 , . . . .Xn ) =

p
p
1p

log L(p, X1 , X2 , . . . .Xn ) =


p

For maximum likelihood estimator of p

log L(p, X1 , X2 , . . . .Xn ) = 0


p

n Xi
Xi

=0
p
1p

(1

p)


2

3
Xi p n Xi = 0. This implies Xi = p
Xi + n Xi

p = n i,
p = X.
Hence, MLE of p = p = X.
By invariance property q = 1 p

is MLE of q.

Students Solutions Manual 51

5.3.3.

f (x) = 1 e implies
L() =
x

1
n e

Xi

Xi

ln L() = n ln

Now taking the derivatives with respect to and and setting both equal to zero, we have

i
ln L = n
+ 2 = 0

n + Xi = 0.

= X

From the given data:


1+2++7+2
.
14

MLE of is given by = X =
5.3.5.

= 6.07.

Here pdf of X is given by


f (x) =

x2

2x e 3
2

0,

2
4
x
L(, X1 , . . . . . . . .Xn ) = (22 )2 ni=1 Xi e 2

ln L = ln 2 2 ln + ni=1 ln Xi ni=1

2n
2
n
2
i=1 Xi
ln L = + 3

= 0 implies,
2
n
2
2n
i=1 Xi
+ 3

n2 + X2 = 0

2
X
2 = n i

2
Xi
=
n .

if, x > 0

otherwise

Xi 2
2

=0

5.3.7.

 x 

x1 e
if, x 0
f (x) =

0, otherwise.

L(, , X) =

n
n

1
e
i=1 nXi


Xi


2
Xi
ln Xi

n
Xi
X

n
L(,
,
x)
=
+
ln
X

n
ln

ln i
i
i=1



n

Xi (a)1
L(, , x) =



(1)

L(, , x) = n
Xi
+
L(, , x) = n ln ln + ( 1)

(5.1)

52 CHAPTER 5 Point Estimation

For maximum likelihood estimator of :


ln L = 0. This implies;


X
X
n
n
i
ln i = 0
i=1 ln Xi n ln
+




(1)
similarly, n
Xi = 0
+


(1)
n
Xi
+


5
6
Xi
Xi +1
n

=
0
solving
for

we
get

Hence,

 1 
X  +1
n 
i
+
Xi
Xi
2
n
,




Xi +1
ln
Xi ln
n

There is no closed form solution for and . In this case, one can use numerical methods
such as Newton-Raphson method to solve for , and then with this value to solve for .
5.3.9.

(2)
[x(1 x)](1),
()2

0
.
/

ln L(, x) = n ln (2) ln ()2 + ( 1) ni=1 ln(Xi 1)


%
&

2 (2)
2 ()2

ln
L(,
x)
=
n

+ ni=1 ln Xi (Xi 1).


2

(2)
(0)

f (x) =

5.3.13.
f (x) =

This implies L(, x) =

1
(3+2)2

3+2

0,

if, 0 x 3 + 2

otherwise.

for 0 x 3 + 2
1
. Which is positive and decreasing function
(3+2)2
max(Xi )2
, the likelihood drops to 0, creating discontinuity
3

When 3 +2 max(Xi ), the likelihood is


of (for xed n). However, for <
i )2
at point max(X
.
3

Hence we will not be able to nd the derivative. The MLE is the largest order statistic =
max(Xi )2
= Xn .
3
5.3.15.

Here X N(, 2 )
 (xi )2
n
n
ln L(, , x) = ln 2 ln 2
2
2
2 2
n

i=1

n
L 
=
(Xi )

i=1

L
n
1
2
Similarly
2 = 2 2 + 2 2
i=1 (Xi )

For maximum likelihood estimates of and ; (Xi ) = 0 implies


=X

Students Solutions Manual 53

Similarly, for 2 ,
n
n
1 
+
(Xi )2 = 0
2 2
2 2
i=1
(

(Xi X)
=
.
n

5.3.17.

f (x) = 1 e x

It is given that the reliability R(x) = 1 F (x). This implies F (x) = f (x). Hence F (x) =
x
12 e .
Thus, R(x) = 1 F (x) and
L(x, ) =

n 
0
Xi
1
1 + 2 e
.

i=1

EXERCISES 5.4
5.4.1.

5.4.3.

i.e E(X) = 1n
E(Xi ). Where, E(X) = xe(x) dx. By integration by



parts E(X) = (1 + ). Thus E X = 1n


(1 + ). This implies E(X) = 1 + .

1

Sample standard deviation s = n1


(xi X)2
E(X) = E

Xi
n

'
E(s) = E
'
E(s) = E
'
E(s) =

1 
(Xi + X)2
n1

1
&
1 %
2
2
(xi ) (X )
n

1
&
1 %
2
2
E(xi ) E(X )
n

-1
' ,
1
2
2

E(s) =
n
n

5.4.5.

Let Y = C1 X1 + C2 X2 + + Cn Xn .
For an unbiased estimate, we need to have E(C1 E(X1 ) + + Cn E(Xn )) =
That is, C1 + Cn =
Which is possible if and only if Ci s =
1
1
n + + n =
n
n = i.e = . Veried.

1
n

for all i = 1, 2 . . . . . . . . . n. This implies

1
n

54 CHAPTER 5 Point Estimation

5.4.7.

Xi U(0.).
Yn = max X1 , . . . . . Xn
= Yn is the MLE of .
(a) By method of moment estimator E(X) = X =
Hence the method of moment estimator = 2X
(b) E( ) = E(Yn )
E( ) = E{max(X1 , . . . Xn )}

+0
2 .

This implies = 2X.

n
E( ) = n+1

E( ) = E(2X).
E( ) = 2.E(/2). That is, E( ) = . Hence, 2 ia an unbiased estimate of .

(c) E(3 ) = n+1


n E( ) This implies E(3 ) = .

is an unbiased estimate of .
5.4.9.

Here, Xi N(, 2 )


(x )2
f (x) =
exp
2 2
2 2
1

We have E()
= E(X) =
is an unbiased estimate for . By denition, the unbiased estimate
that minimizes the
mean square error is called the minimum variance unbiased estimate (MVUE) of .
MSE()
= E(
)2

That is MSE()
= var().
Minimizing the MSE implies that the minimization of Var().

is the MVUE for .


5.4.11.

E(M) = E(X) = . Thus, sample median is an unbiased estimate of population mean .

Now, Var(X) = 1n (X )2

Var(M) = 1n (M )2
Where Var(X) VarM

5.4.13.
f (X) =



|X|
1
exp
for
2

<x<

The likelihood function is given by:




1 1
|Xi |
f (X1 , X2 , . . . . . . . . Xn , ) = n n exp
.
2

6



|X |
Take g
|Xi | , = 1n exp i and h(X1 , X2 . . . . . . . Xn ) =

|Xi | is sufcient for .

1
2n

Students Solutions Manual 55

5.4.15.

(a) The likelihood function is given by:




1
Xi
f (x1 , x2 . . . . . . x n , ) = n exp

5
6



X
Take g
Xi , = 1n exp i and h(x1 , . . . . . xn ) = 1 for if x < 0 and h(x1 , . . . . . xn ) = 0
if x 0.

Xi is sufcient for .
5
6
Similarly, f (x1 , x2 . . . . . . x n , ) = 1n exp nX

5
6


Take g X, = 1n exp nX
and h(x1 , . . . . . xn ) = 1 for x < 0 and h(x1 , . . . . . xn ) = 0 if

x 0.
X is sufcient for .


Sample mean is an unbiased estimate for population mean EX = .
5.4.17.

The likelihood function is given by:


f (x1 , x2 . . . xn ) =

1n

if, 2 xi 2 , i = 1, 2, . . ..

0,

otherwise.

Let us write


f (x1 , x2 . . . . . . . . . x n ) = h(x1 , x2 . . . . . . . . x n) g x(1) , x(n)

where

1

 n
g min x(i) , max (i) =
0,

if, 2 (x(1) , x(n) ) 2 i = 1, 2, . . ..


otherwise.

Hence, (min Xi , max Xi ), 1 i n, is sufcient for .


5.4.19.

The likelihood function is given by:




n
f (x1 , x2 , . . . . . . . . xn , ) = (2) 2 exp

(xi )2
2

The above expression can be written as:


 n
1
n
5
6


2
2
2
f (x1 , X2 , . . . . . . . . xn , ) = exp xi 2x1 + n exp
xi 2
xi
i=1

i=2

2
3
Let g(X1 , ) = exp x12 2x1 + n2 This implies that we can not take the other function
which is only the function of Xi s. Hence by factorization theorem, X1 is not sufcient
statistics for .

56 CHAPTER 5 Point Estimation

5.4.21.

The likelihood function is given by

n 4n (x )1
i=1 i
f (x1 , . . . . . . x n ) =
0 otherwise

for 0 < x < 1, > 0

Let U = (X1 , X2 , . . . . . . . . Xn ) then


4
g(x1 , . . . . . . . Xn , ) = n ni=1 xi1 and h(x1 , . . . . . . x n ) = 1. Therefore U is sufcient for .
5.4.23.

The likelihood function is given by


2n
f (x1 , . . . . . . . . xn ) = n

 n
0

1
Xi

2
xi

i=1


2  2n xi2
4
Let g
x , = n e and h(x1 , . . . . . . . . xn ) = ni=1 xi

i 2
Hence, xi is sufcient for the parameter .

EXERCISES 5.5
5.5.1.

54  

n
n
ln L(p, x) = ln
xi ln p ni=1 (n xi ) ln(1 p) For MLE (1 p) xi
i=1 xi +

x
x
p (n xi ) = 0. This implies p = n2 i . Suppose Yn = n i . Thus p = Ynn . Where

E Ynn = 1n E(Yn ) = n12 nnp = p. p is an unbiased estimate of p. Similarly Var p = Var Ynn =

Var n2 i
This implies Var p =
5.5.3.

n
np(1 p).
n4

Thus Var p 0 as

E(Xi ) = i , E(Xi2 ) = 2 and E(Xi ) = 4 .


2


E Xi + X
E(s2 ) = 1n

This implies E(s2 ) = n n . That is E(s2 ) =


S 2 is an biased estimator of 2
2
Here S 2 = n1
n S
n1 2
2
Var(S ) = n2 0 as n .
2

n .

Yn
n

is consistent.

(n1) 2
.
n

(n1)s 2
2
(n1)
.

 2

2
(n1)s 2
Thus, Var
= 2(n 1). This implies (n1)
var(S 2 ) = 2(n 1).
2
4
2
2.
Finally, Var(S 2 ) = n1
2
Moreover, Bias(s2 ) = E(s2 ) 2 . This implies n 0 as n .
Thus s2 is an unbiased estimate of 2 .

Where

5.5.5.
5.5.7.

Here E(x) = and var(x) = 2 (For exponential distribution). Now E(X = E) and
2
Var() = n 0 as n . X is an unbiased estimate of .

n
Here, ln(, X) = n ln + 1
i=1 ln(Xi ).

Differentiating above equation and equating to zero we get =

n
i=1 ln Xi

Students Solutions Manual 57

5.5.9.

5.5.11.

1
1
Here, var( 1 ) = 12n
and Var( 2 ) = 12
.
Thus the efciency of 2 relative to 1 is (2 , 1 ) = 1n < 1 for n 2.
Here the efciency depends on n. Thus 1 is more efcient than 2 for n 2.

By denition E(x) = and var(X) = 2 . Here, ln f (x) = ln() x , E(X2 ) = 2 . Moreover


2
3


X
Var
ln f (x) = var 1
+ 2 . Hence,
1

ln f (x)
nVar

6 =

2
+ var(X).
n

X is efcient for .

5.5.13.

Using given pdf ln f (X) = c


Thus

x
2 2

where c =


n ln(2 2 )
.
2

1
2
1
ln f (x) = 2 .
2

This implies


2

1
E
ln f (x) = 2

Hence,

1
2

2
E
ln f (x) = E 2 ln f (x) .

5.5.15.

Because each Xi has a uniform distribution on the interval (0.), = E(Xi ) = /2 and
2
var(Yi ) = 12
.
2
E(2 ) = and var(2 ) = 3n
.

To nd the variance of 1 and 3 we must have density of Xn .


n1

fn (X) = n[FX (x)]n 1fX (x). That is fn (X) = nxn for 0 x .


This implies E( ) = n . Thus 1 is not an unbiased estimate of .
n+1

Similarly. E(3 ) = . 3 is an unbiased estimator of .


2
Now var 1 = (n+1)n2 (n+2) and var 3 = n(n+2)
.

(b) The efciency of 1 relative to 2 is given by:



(n + 1)3 (n + 1)
e 1 , 2 =
.
3n2
3

Hence 1 is more efcient than 1 if (n+1)3n(n+1)


>1
2


Similarly, e 2 , 3 = n+2
>
1
if
n
>
1.
3
2 is efcient than 3 if n > 1.

58 CHAPTER 5 Point Estimation

5.5.17.

 
 
 
It can be easily veried that E 1 = , E 2 = and E 3 =
 
  31 2
6n17 2
, Var 2 = 25(n3)
.
Similarly, Var 1 = 81
  2

Var 3 = n .





Now the corresponding efciencies are given by e 2 , 1 = 31775(n3)
81(6n17) , e 3 , 1 =


e 3 , 2 = (6n17)n .

31n
81 .

25(n3)

5.5.21.

The ratio of the joint density function of two sample points is given by;
,
 n

 n
1n
n




L(x1 , . . . . xn )
1
2
2
= exp
Xi
Yi 2
Xi
Yi
.
L(y1 , . . . yn )
2 2
i=1

i=1

i=1

i=1

For this ratio to be free of and 2 , we must have i=1 Xi = i=1 Yi and ni=1 Xi2 =

n
Yi2 .
i=1

Thus ni=1 Xi and ni=1 Xi2 are jointly minimal sufcient statistics for and 2 . Since X is
unbiased estimate for and s2 is an unbiased estimate for 2 . The estimators are functions
of the minimal sufcient statistics. This implies the X and s2 are MVUES for and 2 .
5.5.23.

The ratio of joint density function at two sample points, we have


4n
(Yi )
Xi
Yi
L(x1 , . . . . xn )
= 4ni=1

.
L(y1 , . . . yn )
i=1 (Xi )!

For the ratio to be free of we must have


sufcient statistics for .

Xi

Yi = 0. Thus

Xi , form the minimal

5.5.25.

L(x1 , . . . . xn )
= e Xi Yi
L(y1 , . . . yn )

For the ratio to be independent of , we need to have Xi = Yi . Thus Xi is minimal




sufcient for . Now E
Xi = nX is UMVUE, by RaoBlackwells theorem.
5.5.27.
4n
(Yi )
X2
Y 2
L(x1 , . . . . xn )
i
i .
4
= ni=1
e
L(y1 , . . . yn )
i=1 (Yi )!

The ratio to be free of , we must have Xi2 Yi2 = 0. Therefore Xi2 is MVUE for .
Moreover s2 is an unbiased estimator for 2 .

Chapter

Interval Estimation
EXERCISES 6.1

6.1.3.

(a) We are 99% condent that the estimate value of the parameter lies in the condence
interval.
(b) 99% condence interval is wider
(c) When is known but 2 is unknown we use t-distribution for the sample size n 30. If
the distribution is binomial and there are enough number of samples such that np 5
and np(1 p) 5, then we use normal approximation.
(d) More the information higher the condence interval. So the sample size is inversely
proportional to the width of the condence interval.


x
2.75 = k
(a) p 2.81 /
n


p 2.81 n x 2.75 n = k


p 2.81 n x 2.75 n + x = k


p x 2.75 n x + 2.81 n = k


(b) Condence interval for is given by x 2.75 n , x + 2.81 n
(c) Condence level = k

6.1.5.

(a) Here xi N(, 2 )

6.1.1.

(n1)s2
2

2
(n1)
2

Pivot = (n1)s
; where only 2 is unknown.
2


2
p a < (n1)s
<b =1
2


2
2
2
< (n1)s
<

p 1/2
/2 = 1
2

59

60 CHAPTER 6 Interval Estimation



p

2
1/2

<

2
(n1)s2


2
p (n1)s
< 2 <
2
/2


<

=1

2
/2

(n1)s2
2
1/2


=1

So (1 ) 100% condence interval for 2 is given by


(n1)s2
2
/2

< 2 <

(n1)s2
2
1/2

(b) n = 21, x = 44.3, s = 3.96


= 0.1, 1 = 0.9, /2 = 0.05
2
2
= 0.95,20
= 10.851
1/2,20
2 = 2
/2
0.05,20 = 31.410

90% condence interval is given by


(n1)s2
2
/2

6.1.9.

< 2 <

(n1)s2
2
1/2

20(3.96)2
31.41

< 2 <

20(3.96)2
10.851

We are 90% condent that 2 lies in the interval (9.985, 28.903).




x
b =1
(a) p a /
n


x
< z/2 = 1
p z/2 < /
n


p z/2 n < x < z/2 n = 1


p z/2 n x < < z/2 n x = 1


p x z/2 n < < x + z/2 n = 1


So the condence interval is x z/2 n , x + z/2 n .
(b) If 2 is unknown, we use sample variance for the estimation of condence interval
a sampling distribution. Thus the condence interval is
 and t-distribution as

x t/2 n , x + t/2 n .

EXERCISES 6.2
6.2.1.

(a) n = 1200
p = 0.35
z/2 = z0.25 = 1.96
95% condence interval is given by



p)

0.35(10.35)
=
0.35

1.96
= (0.323, 0.377)
p z/2 p(1
n
1200

Students Solutions Manual 61

(b) p = 0.6
p z/2

p(1

p)


= 0.6 1.96 0.6(10.6)
= (0.572, 0.628)
1200

(c) p = 0.15

6.2.3.


0.15 1.96 0.15(10.15)
= (0.13, 0.17)
1200
(d) We are 95% condent that the percentage of people who nd political advertising to
be true is (0.323, 0.377), the percentage of people who nd political advertising to be
untrue is (0.572, 0.628), the percentage of people who nd falsehood in commercial
is (0.13, 0.17).
5
6
2
(a) Here f (x) = 1 exp (x)
2
2
2
5

6
n
(x )2
2
L(x1 , x2 , . . . , xn , , ) = (2 2 ) 2 exp 2i 2
ln L(x1 , x2 , . . . , xn , , 2 ) = 2n ln(2 2 )

ln L(x1 , x2 , . . . , xn , , 2 ) =

(xi )2
2 2

(xi )2
2 2

For MLE estimator

ln L(x1 , x2 , . . . , xn , , 2 ) = 0
(xi ) = 0

So
=x

(b) (1 )100% condence interval for is given by x z/2


p x z/2 n < < x + z/2 n = 1


p x 2 n < < x + 2 n = 0.954
(1 ) = 0.954
= 0.046
/2 = 0.023
z/2 = 2.0 from z table
Thus veried.

(c) p x k n < < x + k
(1 ) = 0.90
= 0.10
/2 = 0.05
z/2 = 1.645
k = 1.645
6.2.5.

n = 50
p =

18
50

9
25

= 0.90

, x + z/2
n

62 CHAPTER 6 Interval Estimation

np = 18 > 5
n(1 p)
= 32 > 5
So the given data can be approximated as a normal distribution.
Here 1 = 0.98
= 0.02
/2 = 0.01
z/2 = z0.01 = 2.325
Thus the 98% condence interval is given by

 
 9 16 


p)

9
25 25
p z/2 p(1
=

2.325
= (0.202, 0.518)
n
25
50
6.2.7.

n = 50
x = 11.4
= 4.5
1 = 0.95
= 0.05
z/2 = 1.96
95% condence interval is




x z/2 n = 11.4 1.96 4.5
= (10.153, 12.647)
50

6.2.9.

n = 400
p = 0.3
np = 120 > 5
n(1 p)
= 280 > 5
95% condence interval is given by

 




p)

0.30.7
p z/2 p(1
=
0.3

1.96
= (0.255, 0.345)
n
400

6.2.11.

Proportion of defection p =

40
500

1 = 0.9
/2 = 0.05
z/2 = 1.645
90% condence interval is given by

 2 23 
2
25 25

1.645
= (0.06, 0.1)
25
500

2
25

Students Solutions Manual 63

6.2.13.

x N(, 16)
p(x 2 < < x + 2) = 0.95
z/2 n = 2
n = (z/2 2 )2 =

6.2.15.

 1.964 2
2

= 15.37 16

n = 425
p = 0.45
np > 425 0.45 > 5
n(1 p)
= 425 0.55 > 5
95% condence interval is given by

 




p)

0.450.55
=
0.45

1.96
= (0.403, 0.497)
p z/2 p(1
n
425
For 98% condence interval
1 = 0.98
= 0.02
z/2 = 2.335



= (0.394, 0.506)
0.45 2.335 0.450.55
425

6.2.19.

p =

52
60

1 = 0.95
/2 = 0.025
z/2 = 1.96
The 95% condence interval is given by

 
 52 8 


p)

52
60 60
p z/2 p(1
=

1.96
= (0.781, 0.953)
n
60
60
6.2.21.

= 35
E = 15
E = z/2 n
 z 2  1.9635 2
n = /2
=
= 20.92 21
E
15

6.2.23.

x = 12.07
= 12.91
1 = 0.98
/2 = 0.01
z/2 = 2.335

64 CHAPTER 6 Interval Estimation

98% condence interval for mean is given by





1.91
x z/2 n = 12.07 2.335
= (11.32, 12.82)
35

EXERCISES 6.3
6.3.1.

(a) When standard deviation is not given and there is not enough sample size, we use
t-distribution.
(b) As differences decreases the sample size n increases which means we are closing in on
the true parameter value of .
(c) The data are normally distributed, and the values of x and the sample standard deviation
are known.

6.3.3.

x = 20
s=4
1 = 0.95

(a) x t/2,4

s
n



= 20 t0.025,4 4



(b) 20 t0.025,9 4

10



(c) 20 t0.025,19 4
20

6.3.5.

x = 2.22
s = 0.005
n = 26
98% condence interval for is



1.67
x t/2,25 sn = 2.22 2.485
= (3.03, 1.41)
26

6.3.7.

x = 0.905
s = 1.67
1 = 0.98
n = 10
98% condence interval for is


0.905 t0.025,9 0.005


10

6.3.9.
6.3.11.

Similar to 6.3.8
x = 410.93
s = 312.87

Students Solutions Manual 65

95% condence interval for is





x t/2,14 sn = 410.93 2.145 312.87


= (237.65, 584.21)
15

6.3.13.

x = 3.12

s = 1.04
n = 17
99% condence interval for is



x t/2,4 sn = 3.12 2.921 1.04 = (2.40, 3.84)


17

6.3.15.

x = 3.85

s = 4.55
n = 20
98% condence interval for is



x t/2,19 sn = 3.85 2.539 4.55 = (2.64, 5.06)


20

6.3.17.

x = 148.18

s = 1.91
n = 10
95% condence interval for is



x t/2,9 sn = 148.18 2.262 1.91 = (147.19, 149.17)


10

EXERCISES 6.4
6.4.1.

x = 2.2
s = 1.42
1 = 0.90
= 0.10
n = 20
90% condence interval for 2 is given by

 

2 19(1.42)2
(n1)s2 (n1)s2
,
= 19(1.42)
= (1.1663, 4.3015)
32.85 , 8.90655
2
2
/2,19

6.4.3.

1/2,19

x = 60.908
s2 = 12.66
1 = 0.99
= 0.01
n = 10

66 CHAPTER 6 Interval Estimation

99% condence interval for 2 is given by






(n1)s2 (n1)s2
912.66
,
= 912.66
= (4.8321, 65.8613)
23.58 , 1.73
2
2
/2,9

6.4.5.

1/2,9

x = 2.27
s2 = 1.02
1 = 0.99
= 0.01
n = 18
99% condence interval for 2 is given by




(n1)s2 (n1)s2
171.02
,
= 171.02
= (0.6287, 3.0475)
27.58 , 5.69
2
2
/2,9

6.4.9.

1/2,9

From excel or by calculation, sample variance s2 = 148.44, sample mean x = 97.24, n = 25


99% condence interval for population variance is given by




(n1)s2 (n1)s2
24148.44
,
= 24148.44
= (97.8456, 360.3642)
36.41 , 9.886
2
2
/2,24

6.4.11.

1/2,24

x = 13.95
s2 = 495.085
1 = 0.98
= 0.02
n = 25
98% condence interval for 2 is given by




(n1)s2 (n1)s2
24495.085
, 2
= 24495.085
= (276.5194, 1095.1189)
42.97 ,
10.85
2
/2,24

1/2,24

EXERCISES 6.5
6.5.1.

For procedure I, x1 = 98.4, s12 = 235.6, n1 = 10


For procedure II, x2 = 95.4, s22 = 87.15, n2 = 10
= 0.02
/2 = 0.01
z/2 = 2.985
98% condence interval for difference of mean is
 

'
x1 x2 z/2

s12
n1

s22
n2


= 98.4 95.4 2.985 235.6
10 +
= (13.9580, 19.9580)


87.15
10

Students Solutions Manual 67

6.5.3.

x1 = 16.0,

s1 = 5.6,

n1 = 42

x2 = 10.6,

s2 = 7.9,

n2 = 45

= 0.01
/2 = 0.005
z/2 = 2.575
99% condence interval for difference of mean is


'
x1 x2 z/2

6.5.5.


s12
n1

s22
n2



2
= 16.0 10.6 2.575 (5.6)
42 +

(7.9)2
45


= (1.6388, 9.1612)

x1 = 58, 550, s1 = 4, 000, n1 = 25


x2 = 53, 700, s2 = 3, 200, n2 = 23
Since 12 = 22 but unknown, we can use pooled estimator
Sp2 =

(n1 1)s12 +(n2 1)s22


(n1 +n2 2)

Sp2 =

24(4,000)2 +22(3,200)2
46

Sp = 3639.398
The 90% condence interval is


x1 x2 t/2,(n1 +n2 2) Sp n11 +

1
n2



1
+
= 58, 550 53, 700 2.326 3639.398 25
6.5.7.


1
23

= (2404, 7296)

x1 = 28.4, s1 = 4.1, n1 = 40
x2 = 25.6, s2 = 4.5, n2 = 32
(a) MLE of 1 2 is given by (x1 x2 )
(b) 99% condence interval for 1 2 is
 

'
x1 x2 z/2

s12
n1

s22
n2

= 28.4 25.6 2.565


= (0.1678, 5.4322)

6.5.9.

x1 = 148, 822, s1 = 21, 000, n1 = 100


x2 = 155, 908, s2 = 23, 000, n2 = 150
1 = 0.98
= 0.02

(4.1)2
40

(4.5)2
32

68 CHAPTER 6 Interval Estimation

/2 = 0.01
z/2 = 2.575
98% condence interval for difference of mean is given by


'


2
s12
s22
x1 x2 z/2 n1 + n2 = 148, 822 155, 908 2.2 (21,000)
+
100

(23,000)2
150

= (508, 13, 664)


6.5.11.

x1 = 35.18, s12 = 19.76, n1 = 11


x2 = 38.76, s22 = 12.69, n2 = 13
1 = 0.9
= 0.1
90% condence interval for


12
22

is given by

s12
s12
1
1
,
2
F
s2 n1 1,n2 1,1/2 s22 Fn1 1,n2 1,/2

=
=

6.5.13.

19.76
1
19.76
1
12.69 F10,12,0.95 , 12.69 F10,12,0.05

 19.76

1
19.76
12.69 2.75 , 12.69


2.91 = (0.5662, 4.5313)

x1 = 68.91, s12 = 287.17, n1 = 12


x2 = 80.66, s22 = 117.87, n2 = 12
1 = 0.95
= 0.05
/2 = 0.025
(a) 95% condence interval of difference of mean is

 
'
x1 x2 z/2

s12
n1

s22
n2


= 68.91 80.66 1.96 281.17
12 +


117.87
12

= (23.0525, 0.4475)
(b) 95% condence interval for


12
22

s12
s2
1
1
, 1
s22 Fn1 1,n2 1,1/2 s22 Fn1 1,n2 1,/2

is given by

=
=

281.17
1
281.17
1
117.87 F11,11,0.975 , 117.87 F11,12,0.025

 281.17
118.87

1
281.17
3.48 , 118.87


3.48 = (0.6855, 8.3055)

Chapter

Hypothesis Testing
EXERCISES 7.1
: = 0
: > 0
: = 0
: > 1.20

7.1.1.

(a) H0
H1
(b) H0
H1

7.1.3.

H0 : p = 0.5
H1 : p > 0.5
n = 15
(a) = probability of type I error = p(reject H0 |H0 is true)
= p(y 10|p = 0.5) = 1 p(y 10|p = 0.5)
=1

15


c(15, y)(0.5)y (0.5)15y

y=10

= 1 0.941
= 0.059
(b) = p(accept H0 |H0 is false)
= p(y 9|p = 0.7)
=

9

c(15, y)(0.7)y (0.3)15y
y=0

= 0.278
(c) = p(y 9|p = 0.6)
=

c(15, y)(0.6)y (0.4)15y

y=0

= 0.597

69

70 CHAPTER 7 Hypothesis Testing

(d) For = 0.01


0.01 = p(y k|p = 0.5)
From binomial table = 0.01 falls between k = 2 and k = 3. However, for k = 3, =
0.018 which exceeds 0.01. If we want to limit to be no more than 0.01, we will take
k = 2. That is we reject H0 if k 2.
For = 0.01
0.03 = p(y k|p = 0.5)
From binomial table = 0.03 falls between k = 3 and k = 4. However, for k = 4, =
0.059 which exceeds 0.05. If we want to limit to be no more than 0.05, we will take
k = 3. That is we reject H0 if k 3.
(e) When = 0.01. From part d, rejection region is of the form (y 2).
For p = 0.7
= p(y 2|p = 0.7)
= 1 p(y < 1|p = 0.7)
= 1 0.000
=1
7.1.5.

n = 25
=4
H0 : = 10
H1 : > 10
(a) = probability of type I error = p(reject H0 |H0 is true)
= p(x > 11.2| = 10)


x
> 11.210

= P /
| = 10
n
= P(z > 1)

4/ 25

= 0.1587
(b) = p(accept H0 |H0 is false)
= p(x 11.2| = 11)


x
11.211

= P /
| = 11
n
4/ 25

= P(z 0.25)

= 0.5787
(c) 0 = 10
a = 11
z = z0.01 = 2.33
z = z0.8 = 0.525
n=
=

(z +z )2 2
(a 0 )2
(2.330.525)2 42
(1110)2

= 52.13 rounded up to 53

Students Solutions Manual 71

7.1.9.

2 = 16
H0 : = 25
H1 :  = 25
n=
=

(z +z )2 2
(a 0 )2
(1.645+1.645)2 16
(1)2

= 173.19 rounded up to 174

EXERCISES 7.2
7.2.1.

H0 : = 0
H1 : = 1
L() =  1n
2

L(0 ) =  1n
2

L(1 ) =  1n
2

L(0 )
L(1 )

5
ln

6
(x )2
exp 2i 2

n
n

6
(xi 0 )2
exp 2
2
5

6
(xi 1 )2
exp 2
2

(xi 0 )2
(xi 1 )2
= exp 2
+
2
2 2
5

(xi 1 )2 (xi 0 )2
= exp
2 2

L(0 )
L(1 )

=
=
=
=

(xi 1 )2 (xi 0 )2
2
2

xi 2nx1 +21 xi2 2nx0 20


2
2
2nx(0 1 )(0 1 )(0 +1 )
2 2

(2 2 )
(0 1 ) xi
02 2 1
2

Therefore, the most powerful test is to reject H0 if

 2

0 21

ln k
2 2
 2


0 21
(0 1 ) xi
ln k +
2
2 2

 2 2

2 ln k + 0 1


2
xi
=C

(0 1 )

(0 1 )
2

xi

Assume 0 < 1 , rejection region for = 1 is given by




Where C =


2
2
0 1

2 ln k+
2
(0 1 )

xi C

72 CHAPTER 7 Hypothesis Testing

Rejection region for = 0 is given by


7.2.5.

xi C

H0 : = 0
H1 : < 0
2y y2 /2
e
x>0
2
n
2 2
n 
4
2yi
L() =
e yi /
2
i=1
n
2 2
n 
4
2yi
L(0 ) =
e yi /0
2
0
i=1
n
2 2
n 
4
2yi
L(1 ) =
e yi /1
2
1
i=1

f (y) =


n 
4
2yi
2

2 2
2 2
L(0 )
i=1
= n  n e yi /1 yi /0
4 2yi
L(1 )
21

i=1 2n

=
6

1
0

yi2 /21

yi2 /20

5


1
0)
ln L(
=
2n
ln
yi2 /21 yi2 /20 ln k
L(1 )
0 +

2
yi C
5
 6 2 2
0 1
Where C = ln k 2n ln 10
2 2
0

7.2.7.

H0 : p = p 0
H1 : p = p1 where p1 > p0
f (p) = px (1 p)1x

xi (1 p)n xi

x
L(p0 ) = p0 i (1 p0 )n xi

x
L(p1 ) = p1 i (1 p1 )n xi

xi 
n
xi
L(p0 )
p0
1p0
L(p1 ) = p1
1p1

L(p) = p

Taking natural logarithm, we have

 
  1 p0 
p0
+ n
xi ln
ln k
p1
1 p1
  

 


1 p0
1 p0
p0
ln
ln
xi + n ln
ln k
p1
1 p1
1 p1


 :  



1 p0
p0
1 p0
ln
ln
xi ln k n ln
1 p1
p1
1 p1


xi ln

To nd the rejection region


for a xed
value
5

6 ;of
5 ,write
theregion 6as

p0
1p0
0
ln
xi C, where C = ln k n ln 1p
1p1
p1 ln 1p1

Students Solutions Manual 73

EXERCISES 7.3
7.3.1.

f (x) =

1
2

L(1 ) = 

5
6
2
exp (x)
2 2
5

6
(xi )2
1
exp

n
2
n
2

Here 0 =
Hence

02

2 3
and a = R 02


1

n
 2
1
(xi )2
L 0 = max
exp
202
20

Since the only unknown parameter in the parameter space  is 2 , < 2 < ; maximum
likelihood function is achieved when 2 equals to its maximum likelihood estimator
1
(xi x)2
n
n

2 =
mle

i=1


=

12

n/2

02


exp

(xi x)2

(xi )2
212

n/2

n02

(xi )2

202

n (xi )2
(xi )2

exp

2 (xi x)2
202

The likelihood ratio test has the rejection region:


Reject if k, which is equivalent to

3

 n
(x )2
(x )2
n
ln k
(xi x)2 2n ln n02 + 2
(xi x)2 2i 2
2 ln
7.3.3.

5
6
2
f (x) = 1 exp (x)
2
2
2
5

6
(x )2
L(1 ) =  1n n exp 2i 2
2

L(2 ) =  1n
2

L(22 )
L 12

2n

Let =   =

6
(y )2
exp 2i 2
2
1

(yi )2
exp

2 2
2

(xi )2
212

Thus the likelihood ratio test has the rejection region


Reject H0 if k

n ln

2
1


+

(yi )2
22

(yi )2
222

(xi )2
12

(xi )2
212

ln k


2 ln k 2n ln

2
1

74 CHAPTER 7 Hypothesis Testing

1
(xi x)2
n
n

12 =
22 =

1
n

i=1
n


(xi x)2

i=1

 
n (yi 2 )2
n (xi 1 )2
2

2
ln
k

2n
ln
1
(yi y)2
(xi x)2

The rejection region is


7.3.5.

(yi 2 )2
n (xi 1 )2

C
(yi y)2
(xi x)2

f (x) = 1 ex/ for x > 0


5
6
x
L() = 1n exp i
L(0 ) =

1
0n

L(1 ) =

1
1n

L(0 )
L(1 )

5
6
x
exp 0 i
5
6
x
exp 1 i

6
x
x
exp 1 i 0 i
 n
5

6
x
x
= 10 exp 1 i 0 i
=

1n
0n

We reject the null hypothesis if





1
1 

xi ln k
1
0

  


1
1 0
xi ln k n ln
0
0 1

n ln

1
0

Where we reject null hypothesis if xi m1 or xi m2


5
 6 

1 0
1
Provided m1 = ln k n ln 10
0 1 and m2 =

ln kn ln

1
0

0 1
1 0

EXERCISES 7.4
7.4.1.

n = 50, = 0.02, x = 62, s = 8


H0 : 64
H1 : < 64
(a) The observed test statistic (n 30) is
z=

x 0
62 64
2
= 1.769
=
=

1.13
s/ n
8/ 50

(b) p-value = p(z < 1.769) = p(z > 1.769) = 0.0384

Students Solutions Manual 75

(c) Smallest = 0.0384


p-value > 0.02
We fail to reject the null hypothesis.
7.4.3.

H0 : = 0.45
H1 : < 0.45
Here x =
s2

20.5
77

22.8
78

= 0.2923

= 0.2666

s = 0.16309
(a) Test statistic z =

x
0.2923 0.45
= 0.85
=

s/ n
0.16309/ 78

Rejection region is {z < z0.01 } = {z < 2.33}


Since z = 2.33 < 0.85, the null hypothesis is rejected at = 0.01
p-value = p{z < 0.85} = p{z > 0.85} = 0.1977
(b) Here z = 0.85
z/2 = z0.005 = 2.58
Rejection region is {z < z0.005 , z > z0.005 }, i.e. {z < 2.58, z > 2.58}
p-value = min{z < 0.85, z > 0.85} = 0.3954
(c) Assumptions: even though population standard deviation is unknown, because of the
large sample size, normal distribution is assumed.
7.4.5.

x = $58, 800, n = 15, s = $8, 300


(a) P(reject H0 |H0 is true) = 0
Since the probability of rejecting null hypothesis equals to zero. Therefore, the null
hypothesis is accepted.
(b) = 0.01
H0 : = 55, 648
H1 : > 55.648
t=

s/ n

58,80055,648

8,300/ 15

3152
2143.05

= 1.47

Rejection region is {t > t0.01,14 }


t0.01,14 = 2.624
Since t = 2.642 is greater than 1.47, we fail to reject the null hypothesis
7.4.7.

H0 : p0 = 0.3
H1 : p0 > 0.3
p =
z=

550
1500 =
pp

0

p0 q0
n

0.366

0.3660.3

0.30.7
1500

0.066
0.0118

Rejection region is {z > z0.01 }


z0.01 = 2.33

= 5.593

76 CHAPTER 7 Hypothesis Testing

i.e. {z > 2.33}


Yes, customer has preference over ivory color
7.4.9.

(a) x = 42.9, s = 6.3674, = 0.1


H0 : = 44
H1 :  = 44
The data is normally distributed.
z=

42.9 44
1.1
x
= 0.5644
=
=
2.013
s/ n
6.3674/ 10

2
Rejection region for z is |z| < z0.05 } where z0.05 = 1.645
z0.05 = 1.645 < 0.56444
We fail to reject the null hypothesis 

(b) 90% condence interval for is x z/2 sn , x z/2 sn = (42.9 1.645
2.013, 42.9 + 1.645 2.013) = (39.588, 46.21138)
(c) From a, we can see that it is reasonable to take = 44. The argument is supported by
the condence interval in b.
7.4.11.

x = 13.7, s = 1.655, n = 20
H0 : = 14.6
H1 : < 14.6
t=

13.714.6

1.655/ 10

0.9
0.523

= 1.7198

Rejection region is {z > z0.01 }


z0.01 = 2.33
We reject the null hypothesis. Thus there is a statistical evidence to support this claim.
7.4.13.

x = 32,277, s = 1,200, n = 100, = 0.05


H0 : = 30,692
H1 : > 30,692
z=

32,27730,692

1,200/ 100

= 13.20
2
3
Rejection region is z > z0.05
z0.05 = 1.645
Hence we reject the null hypothesis that the expenditure per consumer is increased from
1994 to 1995
7.4.15.

H0 : = 1.129
H1 :  = 1.129
t=

1.241.129

0.01/ 24

0.111
0.00204

= 54.41

t0.05,23 = 2.807
Where t = 54.41 > t0.05,23
Thus we reject the null hypothesis. That is price of gas is changed recently.

Students Solutions Manual 77

EXERCISES 7.5
7.5.1.

H0 : 1 2 = 0
H1 : 1 2  = 0
z=

(y1
y2 )(
1
2)
'
2
2
s1
s2
n1 + n2

7471
81 100
50 + 50

3
1.9026

= 1.5767

2
3
Rejection region for z is |z| > z0.025
z0.025 = 1.96
Since z0.025 > 1.57, we fail to reject the null hypothesis. To see the signicant difference we
need to have = 0.0582 level of signicance.
7.5.3.

x 1 = 58, 550, x 2 = 53, 700, s1 = 4000, s2 = 3200, n1 = 25, n2 = 23


H 0 : 1 2 = 0
H1 : 1 2 > 0
Sp2 =

(n1 1)s12 +(n2 1)s22


(n1 +n2 2)

24(4000)2 +22(3200)2
25+232

= 13245217

Sp = 3639.3979
t=

(x1x2 )0
Sp n1 + n1

(58,55053,700)0

1
1
3639 25
+ 23

= 4.61288

Rejection region is {t > t0.05,46 } i.e. {t > 1.679}


Since 1.679 < 4.61288, we reject the null hypothesis. Thus implies that there exists signicant
evidence to show that the males salary is higher than that of female.
7.5.5.

x 1 = 105.9, x 2 = 100.5, s12 = 0.21, s22 = 0.19, n1 = 80, n2 = 100


H 0 : 1 2 = 0
H1 : 1 2  = 0
Use t =

7.5.7.

('
x1 x2 )0
2
s1
n1

s2

and use two sided t-test.

+ n2

x 1 = 7.65, x 2 = 9.75, s1 = 0.9312, s2 = 0.852, n1 = 10, n2 = 10


(a) H0 : 1 2 = 0
H1 : 1 2 > 0
Sp2 =

(n1 1)s12 +(n2 1)s22


(n1 +n2 2)

9(0.9312)2 +9(0.852)2
18

Sp = 0.892479
t=

(x1x2 )0
Sp

1
1
n1 + n2

(7.659.75)0

1
1
0.89 10
+ 10

= 5.276

Rejection region is {t > t0.05,18 } i.e. {t > 1.734}


Thus fail to reject the null hypothesis.
(b) H0 : 12 = 22
H1 : 12  = 22
Test statistic F =

s12
s22

(0.9312)2
(0.852)2

= 1.1945

From the F -table F0.025(9,9) = 4.03

78 CHAPTER 7 Hypothesis Testing

F0.95(9,9) =

1
4.03

= 0.248

Rejection region is F > 4.03 and F < 0.248


Since the observed value of the statistic 1.1945 < 4.03, we fail to reject the null
hypothesis
(c) H0 : d = 0
H1 : d > 0
d = 2.1
Sd = 1.1670
t=

0
dd

Sd / n

2.1

1.670/ 10

= 3.16

From t-table t0.05,9 = 1.833


We reject the null hypothesis, this implies hat the down stream is less than the upstream.
7.5.9.

(a) x 1 = 2.04, x 2 = 3.55, s1 = 0.551, s2 = 0.6958, n1 = 14, n2 = 14


H0 : 1 2 = 0
H1 : 1 2 < 0
Since the variances are equal and unknown
Sp2 =
Sp =
t=

(n1 1)s12 +(n2 1)s22


(n1 +n2 2)

13(2.04)2 +13(3.55)2
26

= 8.38

8.38 = 2.89

(x1x2 )0
Sp n1 + n1

(2.043.55)

1
1
2.89 13
+ 13

= 1.13321

Rejection region is {t < t0.05,13 } i.e. {t < 1.77}


Here 1.13321 > 1.77, we fail to reject the null hypothesis.
(b) Test statistic F =

s12
s22

= 0.6281

From the F -table F0.025(13,13) = 3.11


F0.95(13,13) =

1
3.11

= 0.321

Rejection region is F > 3.11 and F < 0.321, we fail to reject the null hypothesis
(c) H0 : d = 0
H1 : d < 0
d = 1.5071
Sd = 0.7467
t=

0
dd

Sd / n

1.5071

0.7467/ 14

= 7.55196

From t-table t0.05,9 = 1.833


t = 7.5598 < t0.05,9 = 1.83
We reject the null hypothesis.
7.5.11.

x 1 = 106, x 2 = 109, s1 = 10, s2 = 7, n1 = 17, n2 = 14


Test statistic F =

s12
s22

100
49

= 2.0408

From the F -table F0.01(10,7) = 2.585


F0.90(10,7) =

1
2.585

= 0.3868

Students Solutions Manual 79

Rejection region is F > 2.585 and F < 0.3868


Since observed value of test statistic = 2.0408 < 2.585, we fail to reject the null hypothesis.

EXERCISES 7.6
7.6.1.

c = 3, r = 3
(c 1)(r 1) = 4
2
= 9.48
0.05,4
Hence the rejection region is 2 > 9.48
By using contingency table
2 = 43.86
2 falls in the rejection region at = 0.05, we reject the null hypothesis. That is collective
bargaining is dependent on employee classication.

7.6.3.

O1 = 12, O2 = 14, O3 = 78, O4 = 40, O5 = 6


We now compute i (i = 1, 2, 3, 4, 5) using continuity correction
1 = p(x 55)


2 = p z 65.570
4


3 = p z 75.570
4


4 = p z 85.570
4


5 = p z 95.570
4
Taking above probability we need to nd Ei and follow 6.6.5.

7.6.5.

E1 = 950 0.35, E2 = 950 0.15, E3 = 950 0.20, E4 = 950 0.30


O1 = 950 0.45, O2 = 950 0.25, O3 = 950 0.02, O4 = 950 0.28
2 = 834.7183
From Chi-Square table
2
= 7.81
0.05,3
Thus 2 = 834.71 > 7.81, we reject the null hypothesis, at least one probabilities is different
from hypothesized value.

This page intentionally left blank

Chapter

Linear Regression Models


EXERCISES 8.2
8.2.1.

(a) Proof see example 8.2.1


(b) SSE
2 follows central Kai square with degree of freedom n 2


E SSE
= n 2 and 2 is a constant
2
Therefore E(SSE) = (n 2) 2
(a) Least-squares regression line is y = 84.1674 + 5.0384x
(b)

150
50

100

200

250

8.2.3.

30

40

50

60

70

8.2.5.

8.2.7.

(a) Check the proof in Derivation of 0 and 1 .


(b) We know the line of best t is y = 0 + 1 x and plug in the point (x, y ) we get
0 = y 1 x . Complete the proof.
y = 1 x +
n
n

SSE = e2i = [yi ( 1 xi )]2


i=1

i=1

81

82 CHAPTER 8 Linear Regression Models

(SSE)
1

[yi ( 1 xi )]2

i=1

= 2 [yi ( 1 xi )]xi
n

i=1

= 2 [xi yi ( 1 xi2 )] = 0
n

1 =

i=1

(xi yi )

i=1
n

i=1

(xi2 )

y = 40.175 + .9984x
Least-squares regression line is y = .62875 + .83994x

35
20

25

30

40

45

50

8.2.9.

20

40
x

50

Least-squares regression line is y = 2.2752 + .00578x

8.2.11.

30

50

100
x

150

60

Students Solutions Manual 83

EXERCISES 8.3
(a) Least-squares regression line is y = 57.2383 .4367x
(b)

15

20

25

30

35

40

45

50

8.3.1.

40

50

60

70

80

90

(c) The 95% condence intervals for 0 is (40.5929, 73.8837)


The 95% condence intervals for 1 is (.6806, .1928)
8.3.3.

1 and y are both normally distributed. In order to show these two are independent we just
need to show the covariance of them is 0. (Property of normal distribution)





Cov 1 , y = E 1 y E 1 E (y)
 n 

Sxy
1
y 1 E
yi
=E
Sxx
n


i=1

 n



n
(yi y ) (xi x )
1 
i=1
(0 + 1 x )

n
y 1
=E
2
n
i=1 (xi x )
i=1

= 1 (0 + 1 x ) 1 0 12 x
=0

84 CHAPTER 8 Linear Regression Models

(a) Least-squares regression line is y = 474.76 + 24.95x


(b)

1000
0

500

1500

2000

8.3.5.

30

35

40
x

45

50

(c) The 95% condence intervals for 0 is (3418.974, 2469.454)


The 95% condence intervals for 1 is (45.429, 95.329)
8.3.7.

By assumption and from normal equation we know


n


(yi y )2 =

i=1

n


i = 0 and

i=1

i xi = 0

i=1

(yi y + y y )2

i=1

=
=
=
=
=
=
=

n

i=1
n

i=1
n

i=1
n

i=1
n

i=1
n

i=1
n

i=1

(yi y )2 +
(yi y )2 +
(yi y )2 +
(yi y )2 +
(yi y )2 +
(yi y )2 +
(yi y )2 +

n

i=1
n

i=1
n

i=1
n

i=1
n

i=1
n

i=1
n

i=1

(y y )2 + 2
(y y )2 + 2
(y y )2 + 2
(y y )2 + 2

n

i=1
n

i=1
n

i=1
n


(yi y )(y y )
(i )(y y )
(i )(y) 2

n


(i )(y)

i=1

(i )( 0 + 1 xi ) 2

i=1

(y y )2 + 2 0

n

i=1

n


(i )(y)

i=1

(i ) + 2 1

n


(i )(xi ) 2(y)

i=1

(y y )2 + 2 0 0 + 2 1 0 2(y) 0
(y y )2

n

i=1

(i )

Students Solutions Manual 85

EXERCISES 8.4
8.4.1.

The 95% prediction interval for x = 92 is from 83.6195 to 111.7445


We can conclude with 95% condence that the true value of Y at the point x = 92 will be
somewhere between 83.6195 and 111.7445

8.4.3.

The 95% prediction interval for x = 12 is from 84.20125 to 113.2464

8.4.5.

The 95% prediction interval for x = 85 is from 128.7849 to 191.9912


We can conclude with 95% condence that the true value of Y at the point x = 85 will be
somewhere between 128.7849 and 191.9912
The assumption is the linear regression still valid for beyond our domain and therefore our
y do make sense.

EXERCISES 8.5
8.5.1.

8.5.3.

(a) At 95% condence level, we got the z is 1.3853 which is less than critical value. Therefore
we do not reject Ho and it means X and Y are independent.
(b) P-value is 0.08297713.
(c) The assumption is (x, y) follows the bivariate normal distribution and this test procedure
is approximate.


xy
Sxy
=
=
E() = E 

Sxx Syy
xx yy
Therefore is not an unbiased estimator of the population coefcient .

8.5.5.

(a) At 95% condence level, we got the z is 1.237125 which is less than critical value.
Therefore we do not reject Ho and it means X and Y are independent.
(b) P-value is 0.108.
(c) y = 77.87 + .624x
(d) The usefulness of this model is we can use this model to make prediction.
(e) The assumption is (x, y) follows the bivariate normal distribution and this test procedure
is approximate.

EXERCISES 8.6
8.6.1.


y 1
y
2
(a) y = X where y =
y 3
y 4

11

23 24
11 24 39

(b) XT X = 9

41

.
,X = 1

X1

X2

/
43

and = 2
3

31

86 CHAPTER 8 Linear Regression Models

3.41489 .92553 .3936


(XT X)1 = .92553 .37234
.0319
.3936
.0319
.117

18
T

(X Y ) = 41
47

5.02128
(c) = .10638
.2766 31
(d) The estimation of error variance is 2.14286.
8.6.3.

y = 84.1674 + 5.0384x


8
347
T
X X=
347 16807


1.19648 .0247
(XT X)1 =
.0247 .0005695


1075
(XT Y ) =
55475

Chapter

Design of Experiments
EXERCISES 9.2
9.2.1.

Response is amount of fat was absorbed by has-brown potatoes


Factors are frying durations and different type of fats
Factor types are frying durations which quantitative and continuous and different type of
fats which is qualitative and discrete.
Treatments
There are 16 treatments
2 min with animal fat I, 2 min with animal fat II, 2 min with vegetable fat I and 2 min with
vegetable fat II
3 min with animal fat I, 3 min with animal fat II, 3 min with vegetable fat I and 3 min with
vegetable fat II
4 min with animal fat I, 4 min with animal fat II, 4 min with vegetable fat I and 4 min
with vegetable fat II
5 min with animal fat I, 5 min with animal fat II, 5 min with vegetable fat I and 5 min
with vegetable fat II

9.2.3.

Procedure for random assignment


1. Number the experimental units from 1 to 30.
2. Use a random number table or a statistical software to get a list of numbers that is a
random permutation of the numbers 1 to 30.
3. Give treatment 1 to the experimental units having the rst 10 numbers in the list. Treatment 2 will be given to the next 10 numbers in the list, and so on, give treatment 3 to
the last 10 units in the list.

87

88 CHAPTER 9 Design of Experiments

Here response is rose bushes and factor is different fertilizers. 24, 12, 30, 21, 8, 3, 20, 1,
11,18, 13, 15, 28, 5, 25, 29, 4, 10, 14, 19, 26, 9, 2, 6, 22, 16, 23, 7, 27,17
Brand
A
B
C
9.2.5.

24
13
26

12
15
9

30
28
2

21
5
6

20
4
23

1
10
7

11
14
27

18
19
17

Procedure for a randomized complete block design with 3 replications


1. Group the experimental units into 3 groups (called blocks), each containing 3*3
homogeneous experimental units.
2. In group 1, number the experimental units from 1 to 9 and generate a list of numbers
which are random permutation of the numbers 1 to 9.
3. In group 1, assign treatment 1 to the experimental units having numbers given by the
rst 3 numbers in the list. Assign treatment 2 to the experiments having next 3 numbers
in the list, and so on until treatment 3 receives 3 experimental units.
4. Repeat steps 2 and 3 for the remaining blocks of experimental units.
G
3(A)
1(A)
2(A)
6(B)
8(B)
7(B)
9(C)
5(C)
4(C)

9.2.7.

Subject
8
3
25 29
22 16

R
5(A)
4(A)
3(A)
9(B)
1(B)
6(B)
8(C)
7(C)
2(C)

J
1(A)
2(A)
4(A)
7(B)
8(B)
5(B)
6(C)
3(C)
9(C)

19, 37, 52, 42, 13, 34, 56, 48, 44, 43, 24, 12, 5, 32, 40, 23, 41, 11, 10, 6, 30, 26, 18, 8, 2, 29,
21, 36, 1, 54, 20, 39, 33, 27, 49, 16, 51, 15, 28, 47, 53, 35, 31, 3, 38, 25, 17, 55, 4, 50, 14,
22, 9, 46, 45, 7
Brand
A
B
C
D

19
40
1
31

37
23
54
3

52
41
20
38

42
11
39
25

13
10
33
17

34
6
27
55

Subject
56 48
30 26
49 16
4 50

44
18
51
14

43
8
15
22

24
2
28
9

12
29
47
46

5
21
53
45

32
36
35
7

Students Solutions Manual 89

9.2.9.

Start
Days
2
3
5
1
4

1
A
B
C
D
E

New material
3 2 4
B C D
C D E
D E A
E A B
A B C

Days
1
2
3
4
5

1
D
A
B
E
C

New material
3 2 4
E A B
B C D
C D E
A B C
D E A

5
C
E
A
D
B

Days
1
2
3
4
5

1
D
A
B
E
C

New material
2 3 4
A E B
C B D
D C E
B A C
E D A

5
C
E
A
D
B

5
E
A
B
C
D

Then

The nal is

9.2.11.
Grid
1
2
3
4

1
D
C
A
B

Grid
2 3
A B
B D
D C
C A

4
C
A
B
D

EXERCISES 9.3
9.3.1.

One factor at a time experiment to predict average amount of prot


Assume we x proportion at 40% then if we increase the quality from ordinary to ne we get
average prot decrease from 25,000 to 10,000. If we x proportion at 60% then if we increase
the quality from ordinary to ne we get average prot decrease from 9500 to 3000.

90 CHAPTER 9 Design of Experiments

Then if we change the setting from 40% and ordinary to 60% and ne then we get from
25,000 to 3000.
9.3.3.

In fractional factorial experiment, only a fraction of the possible treatments are actually used
in the experiment. A full factorial design is most ideal design through which we could obtain
information on all main effects and interactions. Due to prohibitive size of the experiments,
such designs are not practical to run. The total number of distinct treatments will be 22 = 4.
Fractional factorial experiments are used in which trials are conducted at only a well-balanced
subset of the possible combinations of levels of factors. This allows the experimenter to
obtain information about all main effects and interactions while maintaining the size of the
experiment manageable.
The experiment is carried out in a single systematic effort.

EXERCISES 9.4
9.4.1.
1 =
n1 =

4 = 2 and 2 =

9=3

2
3
100 = 40 and n2 =
100 = 60
2+3
2+3

EXERCISES 9.5
9.5.1.
= k[(X
T )2 + S 2 ] = k[(14.15 14.5)2 + .422 ] = .2946k.
L

Chapter

10

Analysis of Variance
EXERCISES 10.2
10.2.1.

(a) We need to test H0 : 1 = 2 vs. H1 : 1  = 2


From the random sample, we obtain the following needed estimates n1 = n2 = y,


i
y1 = 1.888889, y2 = 2.777778, i j yij2 = 120, i j yij = 42, Total SS = 2i=1 nj=1

2
2


i 
yij y = 22, SSE = 2i=1 nj=1
yij y = 18.4444, SST = 3.55556
Where Total SS = SSE+SST, then MST =
and F = MST
MSE = 3.084337

SST
1

= 3.55556, MSE =

SSE
n1 +n2 2

= 1.152778,

With = 0.05, F,(1,n1 +n2 2) = 4.49


Since 3.0843 is not greater than 4.49, H0 is not rejected.
There is not enough evidence to indicate that the means differs for the tho populations.
(b) S 2 = Sp2 = MSE = 1.152778, y1 = 1.888889, y2 = 2.777778
Then, the t-statistic is
T = '

y1 y2

= 1.7562
1
1
2
S n1 + n2

Now, t0.025,14 = 2.12, and the rejections region is {|t| > 2.12}
Since 1.7562 is not less than 2.12, H0 is not rejected, which implies that there is
no signicant difference between the mean for the two populations. t 2 = F , implying
that in the two sample case, t-test and F -test lead to the same result.
10.2.3.

(a) At = 0.01, we need to test H0 : 1 = 2 vs. H1 : 1  = 2

We need the estimates y1 = 48.84615, y2 = 45.91667,


i
j yij = 56582,
i
2
2

2
ni 

2
ni 
y
=
1186,
Total
SS
=
y

y
=
318.16,
SSE
=
y

y
ij
ij
ij
j
i=1
i=1
j=1
j=1
= 264.609, SST = 53.551, where Total SS = SSE + SST, then MST = SST
1 = 53551,
SSE
MSE = n1 +n2 2 = 11.50474, and F = 4.654693. At = 0.01, F,(1,n1 +n2 2) = 7.881134,
then F0.01 (1, 23) = 7.88

91

92 CHAPTER 10 Analysis of Variance

(b) Assumptions: The samples are assumed to be independent from the Normal population
with respective means 1 , 2 and equal but unknown variances.
(c) S 2 = Sp2 = MSE = 11.50474, y1 = 48.84615, y2 = 4591667. Then, the t-statistic is:
y y2
t = ' 1
= 2.1574
1
1
2
S n1 + n2

Now, t0.005,23 = 2.807, and the rejection region is {|t| > 2.807}. Since 2.1574 is
not greater than 2.807, H0 is not rejected, which implies that there is no signicant difference between the mean time to relief for the two populations, and
t 2 = F implies that in the two sample case, t-test and F = test lead to the same
result
10.2.5.

Let Xi , with Xi N(, 2 ) for i = 1, 2, 3, . . . , n1 , and let Yj , with Yj N(, 2 ) for


j = 1, 2, 3, . . . , n2 , be two set of independent
random
variables. To test H0 : 1 = 2






vs. Ha : 1  = 2 we reject H0 when  ' XY  > tn1 +n2 1; 2 . Now, for ANOVA, with
 n11 + n12 Sp2 
=

n1

n2
i=1 xi + i=1 yi
n1 +n2

, we have

n1

(x)2 +

n2

(y)2

i=1
i=1
MST
n1 (x2 2x + 2 ) + n2 (y2 2y + 2 )
21
F=
=
n1
=

n2
2
2
MSE
Sp2
i=1 (xi x) + i=1 (yi y)

n1 +n2 2

n1 x2 + n2 y2 (n1 + n2 )2
Sp2

n1 n2
2
2
n1 x + n2 y
n1 +n2 [x 2x y + y ]
, since =
n1 + n2
Sp2

= 

Therefore F =

2
 (XY ) .
1
1
Sp2
+
n
n
1

(x y)2

1
1
2
n 1 + n 2 Sp

Then, we reject H0 if

Since (tn1 +n2 2,/2 )2 F1,n1 +n2 2, and

2
 (XY )
1
1
Sp2
+
n
n
1

2
 (xy)
1
1
Sp2
n +n
1

> F1,n1 +n2 2, .








> k  ' XY  > k for appropri n11 + n12 Sp2 
2

ate values k and k , the probability for this events are the same. Hence, the two sample t-test
and the analysis of variance are equivalent for testing H0 : 1 = 2 vs. Ha : 1  = 2 .
Note: In the text, xi is y1i , x is y1 , yi is y2i , and y is y2 .

Students Solutions Manual 93

EXERCISES 10.3
10.3.1.

(a) Assuming that the samples are from populations which are normally distributed with
equals variances and means 1 , 2 , 3 . In our case, n1 = n2 = n3 = 4, k = 3,

ni
ti
N =
i=1 ni = 12, Ti =
j=1 yij , T i = ni , T1 = 1488, T2 = 1704, T3 = 1434,

k
ni


i
i=1
j=1 yij
T 1 = 372, T 2 = 426, T 3 = 358.3, y =
= 385.5, ki=1 nj=1
yij2 =
N

2

ni
k
2
1835388, CM = Ny = 1783323, Total SS =
= 52065, or
i=1
j=1 yij y
2

k
ni 

k
ni
2
Total SS =
= 41859,
i=1
i1
j=1 yij CM = 52065, SSE =
j=1 yij T i

2

k Ti2
SSE
SST = i=1 ni T i y = i=1 ni CM = 10206, S 2 = MSE = n1 +n2 ++nk k =
SSE
SST
MST
Nk = 4651, MST = k1 = 5103, F = MSE = 1.0972. At = 0.05F,(k1,Nk) =
4.2565
Therefore, the ANOVA table is
Source of
Variation
Treatments
Error
Total

Degrees of
Freedom
2
9
11

ANOVA Table
Sum of Mean
Squares Square
10206
5103
41859
4651
52065

F -Statistic
1.0972

p-Value
0.37462

From the table, since the p-value is more than 0.05, we reject at = 0.05 the null
hypothesis H0 : 1 = 2 = 3
(b) Letting H0 : The mean auto insurance premium paid per six months by all drivers
insured for each of these companies is the same. Based on the data, there is evidence
to suggest that the mean auto insurance premium pay per six months by all drivers
insured for each of these companies is the same.
10.3.3.

n1 = n2 = = nk = n

n

j=1

because

Therefore,
10.3.5.

(yij y)2 =

n

.

n
/2 
(yij T i ) + (T i y) =
(yij T i )2 + n (T i y)2 ,

j=1

j=1

ni

ni

ni

j=1 (yij T i ) =
j=1 yij n T i =
j=1 yij ni T i =
j=1 yij
j=1 yij
2

k
n

k
n

k 
2
2

i=1
j=1 (yij y) =
i=1
j=1 (yij T i ) + n
i=1 T i y

= 0.


2

(a) From exercise 10.3.4 we know that SST = ki=1 ni T i y (where T stand for
2


i 
Treatment), and SS Total = ki=1 nj=1
yij y . Then,
SSE = SS Total SST =

ni
k 



yij y

2

i=1 j=1

ni
k 


i=1 j=1

k



2
ni T i y

i=1

yij y

2

ni
k 


2
Ti y
i=1 j=1

94 CHAPTER 10 Analysis of Variance

and
ni
ni


2 
.
 
/2
yij y =
yij T i + T i y
j=1

j=1

ni


j=1

yij T i

2

ni



Ti y

2

j=1


ni

ni 

ni

ni
j=1 yij T i =
j=1 yij ni T i =
j=1 yij
j=1 yij = 0
2
k
ni 
2
2
k
ni 

k
ni 
Then i=1 j=1 yij y = i=1 j=1 yij T i + i=1 j=1 T i y
2


i 
Therefore, SSE = ki=1 nj=1
yij T i




ni yij T i 2
(b) Since yij N , 2 , j=1 2
n2i 1 , and since they are independent, SSE
=
2

2

k
ni yij T i

k
follows a chi-square distribution with i=1 (ni 1), or N k,
i=1
j=1
2
degrees of freedom.

i
yij , T i = ntii , T1 = 92,
n1 = n4 = 5, n2 = n3 = 6, k = 4, N = ki=1 ni = 22, Ti = nj=1
Since

10.3.7.

ni

yij

T2 = 69, T3 = 75, T4 = 94, T 1 = 18.4, T 2 = 11.5, T 3 = 12.5, T 4 = 18.8, y = i=1 N j=1


=
2

ni

k
ni 
2 = 5338, CM = Ny2 = 4950, Total SS =
15, ki=1
y
y

y
=
388,
i=1
j=1 ij
j=1 ij

2

yij2 CM = 388, SSE = ki1 nj i= 1 yij T i = 147, SST =


or Total SS = ki=1 nj=1

2
k Ti2

k
SSE
SSE
2
i=1 ni T i y =
i=1 ni CM = 241, S = MSE = n1 + n2 + + nk k = Nk = 8.16667,
SST
MST
MST = k1 = 80.3333, F = MSE = 9.8367
Therefore, the ANOVA table is
(a)
Source of
Variation
Treatments
Error
Total

Degrees of
Freedom
3
18
21

ANOVA Table
Sum of
Mean
Squares
Square
241
80.3333
147
8.166667
388

F -Statistic
9.8367

p-Value
0.00046

Assumptions: The samples are randomly selected from the 4 populations in an independent manner. The populations are normally distributed with equal variances 2
and means 1 , 2 , 3 , 4
(b) Since F is greater than critical value at = 0.05, there is sufcient evidence to indicate
a difference between the mean number of customers served by the 4 employees.
10.3.9.

Assumptions: The samples are normally selected from the population in an independent
manner. The populations are assumed to be normally distributed with common variances.

ni
ti
n1 = n2 = n3 = n4 = 5, k = 4, N = ki=1 ni = 20, Ti =
j=1 yij , T i = ni , T1 = 34.3,
T2 = 39.6, T3 = 45.9, T4 = 34.2, T 1 = 6.86, T 2 = 7.92, T 3 = 9.18, T 4 = 6.84,

Students Solutions Manual 95


2

k
ni 

k
CM = Ny2 = 1185.8, Total SS =
= 23.78, or Total SS =
i=1
i=1
j=1 yij y
2

2

ni

k
ni 

k
2 CM = 23.78, SSE =
y
y

T
=
5.36,
SST
=
n
T

y
=
ij
i
i
i
i1
i=1
j=1 ij
j=1

k Ti2
SSE
SSE
SST
2
i=1 ni CM = 18.42, S = MSE = ni + n2 + + nk k = Nk = 0.335, MST = k1 =
6.14, F = MST
MSE = 18.3284
At = 0.01F,(k1,Nk) = 5.2922
Since F > 5.2922, the sample evidence supports the alternative hypothesis that the true
rental and homeowner vacancy rates by area indeed different for all ve years at 0.01 l of
signicance level.

ni
ti
10.3.11. n1 = n2 = n3 = 6, k = 3, N =
i=1 ni = 18, Ti =
j=1 yij , T i = ni , T1 = 1273,
T2 = 1275, T3 = 1257, T 1 = 212.16667, T 2 = 212.5, T 3 = 209.5 CM = Ny2 =
2


i 


i
804334.7222, Total SS = ki=1 nj=1
yij y = 5994.27778, or Total SS = ki=1 nj=1
2

k
ni 

k
yij2 CM = 5994.27778, SSE =
= 5961.8333, SST =
i1
i=1
j=1 yij T i

2
k Ti2
SSE
SSE
2
ni T i y = i=1 ni CM = 32.4444, S = MSE = ni + n2 + + nk k = Nk = 397.456,
SST
MST = k1
= 16.2222, F = MST
MSE = 0.0408
At = 0.01 F,(k1,Nk) = 6.3589
Since F < 6.3589, based on the data there is not enough evidence to support the alternative hypothesis that the true mean cholesterol levels for all races in the United States
during 19781980 are different at 0.01 of signicance level.

EXERCISES 10.4
10.4.1.


 
 

yij y = yij T i Bj + y + T i y + Bj y

Then


yij y

2


2 
2 
2



= yij T i Bj + y + T i y + Bj y + 2 yij T i Bj + y T i y






+ 2 yij T i Bj + y Bj y + 2 T i y Bj y

Then
b 
k 

j=1 i=1

yij y

b 
k



yij T i Bj + y

2

j=1 i=1

+2

b 
k


b 
k



Ti y

j=1 i=1




yij T i Bj + y T i y

j=1 i=1

+2

k
b 




yij T i Bj + y Bj y
j=1 i=1

+2

k
b 




T i y Bj y
j=1 i=1

2

b 
k


2
Bj y
j=1 i=1

96 CHAPTER 10 Analysis of Variance

10.4.3.


b

b 

b
Now,
y , and by = b 1n
j=1 yij T i =
j=1 yij bT i = 0, since bT i =
%j=1
ij
&

k
b

k
b
b
k
b
k
b
1
1
y
=
y
=
y
=
y
= bj=1 Bj
ij
ij
ij
ij
i=1
j=1
i=1
j=1
j=1
i=1
j=1 k
i=1
bk
k









Then, bj=1 Bj y = bj=1 Bj by = 0, and bj=1 yij T i Bj + y = bj=1 yij T i





bj=1 Bj y = 0


%

b 
&


Then, bj=1 ki=1 yij T i Bj + y T i y = ki=1 T i y
j=1 yij T i Bj + y
=0


k %

k 
&

b
k 
and
j=1
i=1 yij T i Bj + y Bj y =
j=1 Bj y
i=1 yij T i Bj + y
=0


%

b 
&


= 0.
and bj=1 ki=1 T i y Bj y = ki=1 T i y
B

y
j
j=1
2
b
k 
2
b
k 
2

b
k 
Therefore, j=1 j=1 yij y = j=1 i=1 yij T i Bj + y + j=1 i=1 T i y

2
2
2

b
k 

+ ki=1 bj=1 Bj y =
+ b ki=1 T i y + k bj=1
j=1
i=1 yij T i Bj + y

2
Bj y

k
b
W = ki=1 bj=1 (yij i j )2 , then W
i=1
j=1 (yij i j )
= 2

k
b

k
b
If W
i=1
j=1 (yij i j )
i=1
j=1 = 0, and since by the restriction
= 0, then

k
=y
j=1 j =
i=1 i = 0, the solution is given by
Now, for any xed i W
i =

j=1 (yij

i j )

b
W
if W
j=1 (yij i j ) = 0. Then, since
j=1 j = 0 then i =
i = 0, then =
T i ,
for any i = 1, 2, . . . , k; i.e., i = T i y

k
W
For any xed j, W
i=1 (yij i j ). If i = 0, since
i=1 i = 0 the solution
i = 2

is j = Bj y, j = 1, 2, . . . , b.
10.4.5.

T1 = 602, T2 = 619, T3 = 427, T4 = 439B1 = 386, B2 = 390, B3 = 414, B4 = 437,


B5 = 460 b = 5, k = 4, n = bk = 20


b
SSB
CM = 1n
B
= 217778.5, SSB = 1k bj=1 Bj2 CM = 986.8, MSB = b1
= 246.7,
j
j=1

ni
k
k
1
SST
2
SST = b i=1 Ti CM = 6344.55, MST = k1 = 2114.85, Total SS = i=1 j=1 yij2 CM
SSE
= 7390.55, SSE = Total SS SSB SST = 59.2, MSE = nbk+1
= 4.9333
To test if the true income lower limits of top 5 percent of U.S. households for each races
are the same, F = MST
MSE = 5.35283, F0.05,3,12 = 3.49. Since the observed value F > 3.49, we
reject the null hypothesis and conclude that there is difference in the true income lower
limits of top 5 percent of U.S. households for each races.
To test if the true income lower limits of top 5 percent of U.S. households for each year
between 19941998 are the same, F = MSB
MSE = 50.00676, and F0.05,4,12 = 3.2592. Since the
observed value F > 3.2592, we conclude that there is difference in the true income lower
limits of top 5 percent of U.S. households for each year among 19941998

Students Solutions Manual 97

10.4.7.

T1 = 112, T2 = 110, T3 = 133, T4 = 157, B1 = 201, B2 = 165, B3 = 146 b = 3, k = 4,


n = bk =12
2

SSB
CM = 1n
= 21845.33, SSB = 1k bj=1 Bj2 CM = 390.1667, MSB = b1
=
j=1 Bj

k
k
1
SST
2
195.0833, SST = b i=1 Ti CM = 482, MST = k1 = 160.667, Total SS =
i=1

ni
SSE
2 CM = 1234.667, SSE = Total SS SSB SST = 362.5, MSE =
y
j=1 ij
nbk+1 =
60.41667
To test if the true mean performance for different hours of sleep are the same, F = MST
MSE =
2.6593, F0.05,3,6 = 4.7571. Since the observed value F < 4.7571, there is evidence to conclude that there is no difference in the true mean performance for different hours of sleep
To test if the true mean performance for each category of the test are the same, F = MSB
MSE =
3.2290, and F0.05,2,6 = 5.1433. Since the observed value F < 5.1433, there is evidence
to conclude that there is no difference in the true mean performance for each category of
the test.

EXERCISES 10.5
10.5.1.

(a) For simplicity of computation, we will use SPSS. The following is the output.
Oneway
ANOVA Average Time
Sum of
Mean
Squares df Square
Between Groups
.900
3
.300
Within Groups
3.919
12
.327
Total
4.818
15

F
.919

Sig.
.461

(b) Since there is no signicant difference, Tukeys method is not necessary.


(c) Since F is smaller than the critical value, F0.05,3,12 = 3.4903, there is evidence to
conclude there is no difference in the average time to process claim forms among the
four processing facilities.
Assumptions: The samples are randomly selected from the 4 populations in an independent manner. The population are normally distributed with equal variances 2 and
mean 1 , 2 , 3 , 4 .
10.5.3.

(a) Oneway

Between Groups
Within Groups
Total

ANOVA Income
Sum of Squares df Mean Square
6344.550
3
2114.850
1046.000
16
65.375
7390.550
19

F
32.350

Sig.
.000

98 CHAPTER 10 Analysis of Variance

Since F is greater than critical value F , F0.05,3,16 = 3.24, based on data provided,
there is evidence to conclude that there is difference in the income lower limits of
top 5 percents of U.S. households for each races for all ve years at 0.05 level of
signicance.
(b) Post Hoc Tests
Multiple Comparisons Income Tukey HSD
95% Condence
Interval
(I)
(J)
Mean
Std.
Lower
Upper
race_num race_num Difference (IJ) Error Sig.
Bound
Bound
1

3.4

5.11371 0.91

3
4

35.00000
32.60000

5.11371 0
5.11371 0

1
2

49.6304
47.2304

11.2304

18.0304

5.11371 0
5.11371 0

23.7696
21.3696

53.0304
50.6304

35.00000
38.40000

5.11371 0
5.11371 0

49.6304
53.0304

20.3696
23.7696

5.11371 0.97

17.0304

12.2304

3.4

5.11371 0.91

2.4

4
4

11.2304

20.3696
17.9696

38.40000
36.00000

1
3
4

18.0304

32.60000

5.11371 0

47.2304

17.9696

2
3

36.00000
2.4

5.11371 0
5.11371 0.97

50.6304
12.2304

21.3696
17.0304

(c)
i j

Ti Tj

Tukey Interval

Reject or N.R.

Conclusion

1 2

120.4 123.8

(18.03,11.23)

N.R.

1 = 2

1 3

120.4 85.4

(20.37, 49.63)

1  = 3

1 4

120.4 87.8

(17.97, 47.23)

1  = 4

2 3

123.8 85.4

(23.77, 53.03)

2  = 3

2 4

123.8 87.8

(21.37, 50.63)

2  = 4

3 4

85.4 87.8

(17.03, 12.23)

N.R.

3 = 4

Assuming the samples are randomly selected in an independent manner, the populations
are normally distributed with equal variances 2 , and based on 95% Tukey intervals, All
races is similar to White, and Black is similar to Hispanic. All other true income lower limits
for each races are different.

Students Solutions Manual 99

EXERCISE 10.7
10.7.1.

Oneway

Between Groups
Within Groups
Total

ANOVA Cholesterol
Sum of Squares df Mean Square
32.444
2
16.222
5961.833
15
397.456
5994.278
17

F
.041

Sig.
.960

Since F is smaller than critical value F , F0.01,2,15 = 6.36, based on data provided, there is
not enough evidence to conclude that there is difference in the true cholesterol levels for all
races in United States during 19871980.

This page intentionally left blank

Chapter

11

Bayesian Estimation and Inference


EXERCISES 11.2
11.2.1.

By Bayes rule,
P(The die is loaded |3 consecutive ves)
=

P(3 consecutive ves |Loaded)P(Loaded)


P(3 consecutive ves |Loaded)P(Loaded) + P(3 consecutive ves |Fair)P(Fair)

(0.6)3 0.02
= 0.488
(0.6)3 0.02 + (1/6)3 0.98

11.2.3.

(a) f (p|x) (p)f (x|p)


p2 {px (1 p)nx }
px+2 (1 p)nx , which is the kernel density of (x + 3, n x + 1).
Hence, the posterior distribution of p is (x + 3, n x + 1).
(b) f (p|x) (p)f (x|p)
{pa1 (1 p)b1 }{px (1 p)nx }
px+a1 (1 p)nx+b1 , which is the kernel density of (x + a, n x + b).
Hence, the posterior distribution of p is (x + a, n x + b).

11.2.5.

(a) Note that


f (|x) ()f (x|)
4
e (exi )
i

n e(+

xi ) , which

is the kernel density of (n + 1, +

Hence, the posterior distribution of is (n + 1, + ni=1 xi ).

n
(b) E(|x) = (n + 1)/( + i=1 xi ).

i=1 xi ).

101

102 CHAPTER 11 Bayesian Estimation and Inference

11.2.7.

(a) f (|x) ()f (x|)


e

(e xi )

 n

xi e(n+1) , which is the kernel density of 

Hence, the posterior distribution of is 



n

(b) E(|x) =
i=1 xi + 1 /(n + 1).


n

i=1 xi

i=1


xi + 1, n + 1 .


+ 1, n + 1 .

1

1
(xi )2
2 0
f (|x) ()f (x|) exp 2
exp
22
2

1

(xi )2
2
exp 2
22
2




1
1
n
2 nx

exp
+
2
2
2

2

n x
1 1

n
2
+

exp
,
1 + n
2 2

2
2

11.2.9.


which is the kernel density of N

2x
1
+n
2 2


,

1
+n
2 2

Hence the posterior distribution of is N

2x

.

n ,

1
+
2 2

1
1
n
2 +2

EXERCISES 11.3
11.3.1.

(a) We have seen from Example 11.2.7 that the posterior distribution of given
x1 , x2 , . . . , xn is normally distributed with
n x
n x
1
nx
1
9
2
Mean = 1 n = 9 n =
, and Variance = 1
=
n = 9 + n.
n
1
+
9
+
n
1
+
9
9
1 + 2
1 + 2

Hence, a 95% credible interval for is


nx
z0.025
9+n

'

'
9
nx
9
=
1.96
.
9+n
9+n
9+n

0.92 + 1.05 + + (4.78)


= 0.8725.
20
Using the results of part (a), a 95% credibel interval for is

(b) First note that n = 20 and x =

'
20(0.8725)
9
1.96
= (0.490, 1.694)
9 + 20
9 + 20

Students Solutions Manual 103

11.3.3.

We
of is
Exercise 11.2.8 that the posterior distribution

have seen from

50
50
x
+
,
50
+

=
(12.1,
52)
since

=
0.1,

=
2,
and
x
=

i=1 i
i=1 xi = 12.
Since (|x = 12) (12.1, 52). Using the procedures summarized in Section 11.3 we can
show that Pr(0.121 < < 0.381) = 0.95, thus (0.121, 0.381) is a 95% credible interval
for .

11.3.5.

Let x = sodium intake in this ethnic group, and = the mean sodium intake for this
group. Then x N(, 3002 ) and N(2700, 2502 ). From Example 11.2.7 the posterior
distribution of given x = 3000 is normally distributed with
1 2700 + 1 x
1 2700 + 1 3000
2
3002 = 2502
3002
Mean = 250 1
= 2822.95, and
1
1 + 1
+
2502
3002
2502
3002

Variance =

1
1 + 1
2502
3002

= 36885.25.

Hence,
a
95%
credible
interval
for

is
2822.95

z
36885.25 = 2822.95
0.025

1.96 36885.25 = (2446.52, 3199.38).


11.3.7.

Let x = the number of calls received in ve minutes. Then x Poi(5)


(|x = 25) ()f (x = 25|)
e2 {e5 (5)25 }
25 e7 , which is the kernel density of (25 + 1, 7).

Hence, the posterior distribution of is (26, 7).


Knowing the posterior distribution of we can show that Pr(2.43 < < 5.27) = 0.95. Thus
(2.43, 5.27) is a 95% credible interval of .

EXERCISES 11.4
11.4.1.

(a) Referring to Exercise 11.3.1. we know that x = 0.8725 and n = 20 and the posterior
distribution of is normal with
n x
20 0.8725
1
1
2
Mean = 1 n = 91 20 = 0.784, and Variance = 1
= 1 20 = 0.4045.
n
+
+
+
+
4
4
9
4
4
9
2
2

We can now compute



0 = P( 0|x = 0.784) = P


0.784
=P z
0.4045

0.784
0 0.784

0.4045
0.4045


= P(z 1.2327) = 0.109

104 CHAPTER 11 Bayesian Estimation and Inference

and
1 = P( > 0|x = 0.784) = 1 0
= 1 0.109 = 0.891

Thus, 0 /1 = 0.109/0.891 = 0.122 < 1, and we reject H0 .


(b) First compute
z=

0.8725
x 0
= 1.3.
=
/ n
9/20

Thus, z = 1.3 < z0.95 = 1.645, and we do not reject H0 .


In this case, we see that we obtain different decisions from using the Bayesian approach than
from using the classical approach.
11.4.3.

We have seen from Exercise 11.3.3. the posterior distribution of is (12.1, 52). Using any
statistical software, we can compute
0 = P( 0.1|x = 12) = P( 0.1 and (12.1, 52)) = 0.0067,

and 1 = P( > 0.1|x = 12) = 1 0 = 1 0.0067 = 0.9933.


Thus, 0 < 1 , and we reject H0 .
11.4.5.

We have seen from Exercise 11.3.5. the posterior distribution of is N(2822.95, 36885.25).
We can now compute


0 = P( 2400|x = 3000) = P

2822.95
2400 2822.95

36885.25
36885.25

= P(z 2.202) = 0.014


and 1 = P( > 2400|x = 3000) = 1 0 = 1 0.014 = 0.986.

Thus, 0 < 1 , and we reject H0 .

EXERCISES 11.5
11.5.1. Expected Re turn = 25P(H)P(H) + 15P(H)P(T ) + 15P(T )P(H) + (15)P(T )P(T )
= 25(1/2)(1/2) + 15(1/2)(1/2) + 15(1/2)(1/2) 15(1/2)(1/2)
= 10

Since the expected return is positive, in a long run we should win. Thus, we should play
this game.
11.5.3.

According to the government forecast, we write P(G) = 0.7 to denote that the economy will
expand with a 70% chance, and P(B) = 0.3 to denote that the economy will decline.

Expected Earning = 300000 P(G) + (−200000) P(B)
= 300000(0.7) − 200000(0.3)
= 150000.

Since the expected earning is greater than 50000, the optimal decision is to open a new office.
Here we made the assumption that the government forecast is correct with 100% certainty.
11.5.5.

(a) Let G and B denote the true states of nature (good and bad weather), and let G′ and B′
denote the weather person's predictions. Based on the record we have P(G′|G) = 6/8 = 3/4
and P(G′|B) = 3/7. Initially assume the prior P(G) = P(B) = 1/2.
Using Bayes' theorem, we obtain the posterior probabilities

P(G|G′) = P(G′|G)P(G) / [ P(G′|G)P(G) + P(G′|B)P(B) ]
        = (3/4)(1/2) / [ (3/4)(1/2) + (3/7)(1/2) ] = 7/11

and

P(G|B′) = P(B′|G)P(G) / [ P(B′|G)P(G) + P(B′|B)P(B) ]
        = (1/4)(1/2) / [ (1/4)(1/2) + (4/7)(1/2) ] = 7/23.

Updated prior when the weather person predicts good weather:

π(G) = P(G|G′) = 7/11; π(B) = 1 − π(G) = 4/11.

Updated prior when the weather person predicts bad weather:

π(G) = P(G|B′) = 7/23; π(B) = 1 − π(G) = 16/23.

(b) When the weather person predicts good weather:

Expected gain if we insure = 125 π(G) + 135 π(B) = 125(7/11) + 135(4/11) = 128.64,

and

Expected gain if we do not insure = 200 π(G) = 200(7/11) = 127.27.

Therefore our decision, given that the weather person predicts good weather, is to
insure.


When the weather person predicts bad weather:

Expected gain if we insure = 125 π(G) + 135 π(B) = 125(7/23) + 135(16/23) = 131.96,

and

Expected gain if we do not insure = 200 π(G) = 200(7/23) = 60.87.

Therefore our decision, given that the weather person predicts bad weather, is also to insure.
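The two expected-gain comparisons can be organized in a short R sketch (the function name is only illustrative):

gain <- function(piG, insure) {
  if (insure) 125 * piG + 135 * (1 - piG) else 200 * piG
}
c(good.insure = gain(7/11, TRUE), good.no = gain(7/11, FALSE),
  bad.insure = gain(7/23, TRUE), bad.no = gain(7/23, FALSE))
# 128.64 > 127.27 and 131.96 > 60.87, so we insure in both cases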
11.5.7.

By assuming a uniform prior we have π(θ₁) = π(θ₂) = π(θ₃) = 1/3.

The expected utility for decision d₁ = 0·π(θ₁) + 10·π(θ₂) + 4·π(θ₃)
= 0(1/3) + 10(1/3) + 4(1/3) = 4.67.

The expected utility for decision d₂ = (−2)·π(θ₁) + 5·π(θ₂) + 1·π(θ₃)
= (−2)(1/3) + 5(1/3) + 1(1/3) = 1.33.

Therefore our decision is d₁.


11.5.9.

(a)

                      States of Nature
Decision Space        p₁    p₂    ···   p_k
Predicting p₁ (d₁)    g     −l    ···   −l
Predicting p₂ (d₂)    −l    g     ···   −l
···                   ···   ···   ···   ···
Predicting p_k (d_k)  −l    −l    ···   g

(b) The expected utility for decision dᵢ is given by

E(U|dᵢ) = Σⱼ₌₁ᵏ U(dᵢ, pⱼ) π(pⱼ) = g(1/k) + (k − 1)(−l)(1/k) = [g − (k − 1)l]/k.

Therefore the expected utility is the same for every decision dᵢ.


(c) When X = x₁, compute the updated prior as

π(pᵢ|X = x₁) = P(X = x₁|pᵢ)π(pᵢ) / Σⱼ P(X = x₁|pⱼ)π(pⱼ)
             = aᵢ(1/k) / Σⱼ aⱼ(1/k) = aᵢ / Σⱼ aⱼ,   i = 1, 2, ..., k.

The expected utility for decision dᵢ = Σⱼ₌₁ᵏ U(dᵢ, pⱼ) π(pⱼ|X = x₁)
= g · aᵢ/Σⱼaⱼ − l · Σ_{j≠i} aⱼ/Σⱼaⱼ
= (g + l) · aᵢ/Σⱼaⱼ − l,   i = 1, 2, ..., k.

Since the above expression depends on i only through aᵢ, our decision is dᵢ, where i is
such that aᵢ is the largest among a₁, ..., a_k.

When X = x₂, compute the updated prior as

π(pᵢ|X = x₂) = P(X = x₂|pᵢ)π(pᵢ) / Σⱼ P(X = x₂|pⱼ)π(pⱼ)
             = (1 − aᵢ)(1/k) / Σⱼ (1 − aⱼ)(1/k) = (1 − aᵢ) / Σⱼ (1 − aⱼ),   i = 1, 2, ..., k.

The expected utility for decision dᵢ = Σⱼ₌₁ᵏ U(dᵢ, pⱼ) π(pⱼ|X = x₂)
= g · (1 − aᵢ)/Σⱼ(1 − aⱼ) − l · Σ_{j≠i} (1 − aⱼ)/Σⱼ(1 − aⱼ)
= (g + l) · (1 − aᵢ)/Σⱼ(1 − aⱼ) − l,   i = 1, 2, ..., k.

Since the above expression depends on i only through (1 − aᵢ), our decision is dᵢ, where
i is such that aᵢ is the smallest among a₁, ..., a_k.


Chapter 12

Nonparametric Tests
EXERCISES 12.2
12.2.1.

In this case n = 9, p = 0.5, X ~ Bin(n, p): if P(X ≤ a) ≤ 0.025, then from the Binomial
table a = 1, and b = n + 1 − a = 9. Then, using the first and ninth values in the ordered
list, an approximate 95% confidence interval is 2.7 < M < 8.5.

12.2.3.

(a) The following normal probability plot, generated by the SPSS statistical software,
shows that the normality assumption may not be satisfied.

[Normal P-P plot of the data: expected vs. observed cumulative probability.]



(b) Looking in the table for n = 10, p = 0.5, X ~ Bin(n, p): if P(X ≤ a) ≤ 0.025, then
from the Binomial table a = 1, and b = n + 1 − a = 10. Then an approximate 95%
confidence interval for the median is 57.3 < M < 66.7. That is, we have at least a 95%
chance that the true median air pollution index for the city falls in the interval
(57.3, 66.7).
12.2.5.

Looking in the table for n = 6, p = 0.5, X ~ Bin(n, p): if P(X ≤ a) ≤ 0.005, then from the
Binomial table a = 0.

12.2.7.

Looking in the table for n = 15, p = 0.5, X ~ Bin(n, p): if P(X ≤ a) ≤ 0.005, then from the
Binomial table a = 2, and b = n + 1 − a = 14. That is, we have a 99% chance that the true
median time required to prune an acre of grapes falls in the interval (4.2, 5.8).

12.2.9.

Looking in the table for n = 15, p = 0.5, X ~ Bin(n, p): if P(X ≤ a) ≤ 0.025, then from the
Binomial table a = 3, and b = n + 1 − a = 13. Then an approximate 95% confidence
interval for the median in-state tuition cost is 3683 < M < 5212. That is, we have a 95%
chance that the median in-state tuition cost falls in (3683, 5212).
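The order-statistic construction used throughout this section can be automated; a sketch under the convention stated above (a is the largest integer with P(X ≤ a) ≤ α/2 for X ~ Bin(n, 1/2), b = n + 1 − a, and the interval runs from the a-th to the b-th ordered value, so it requires a ≥ 1):

median.ci <- function(x, alpha = 0.05) {
  n <- length(x)
  a <- max(which(pbinom(0:n, n, 0.5) <= alpha/2)) - 1   # largest such a
  b <- n + 1 - a
  sort(x)[c(a, b)]                                      # interval endpoints
}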

EXERCISES 12.3
12.3.1.

Let m₀ = 7.75. We test H₀: M = m₀ vs. Hₐ: M ≠ m₀ at α = 0.01.

(i) Since there is a tie, n = 8. In this case, n⁺ = 3 and N⁺ ~ Bin(n, p = 1/2). Then
P(N⁺ ≥ 3) = 0.85547, which is not less than α/2, so H₀ is not rejected. Based on
the sample, the median interest rate in the city is not significantly different from 7.75.

(ii) Eliminating the tie, the following table can be obtained:

xᵢ:                7.625  7.875  7.625  8     7.5   8     7.375  7.25
zᵢ = |xᵢ − 7.75|:  0.125  0.125  0.125  0.25  0.25  0.25  0.375  0.5
Sign:              −      +      −      +     −     +     −      −
Rank:              2      2      2      5     5     5     7      8

Thus W⁺ = 12 with n = 8. H₀ is not rejected, since the rejection region is W⁺ ≤ 3 or
W⁺ ≥ 33. This is the same conclusion as the sign test.
12.3.3.

Let m₀ = 1000. We test H₀: M = m₀ vs. Hₐ: M > m₀ at α = 0.05.

(i) Since there is no tie, n = 10. In this case, n⁺ = 6 and N⁺ ~ Bin(n, p = 1/2). Then
P(N⁺ ≥ 6) = 0.37695, which is not less than α, so H₀ is not rejected. Based on the
sample, the median SAT score is not significantly greater than 1000.

(ii) The following table can be obtained:

xᵢ:                 986  1065  1089  890  1128  1157  1224  765  1355  567
zᵢ = |xᵢ − 1000|:   14   65    89    110  128   157   224   235  355   433
Sign:               −    +     +     −    +     +     +     −    +     −
Rank:               1    2     3     4    5     6     7     8    9     10

Thus W⁺ = 32 with n = 10, so W⁻ = 55 − 32 = 23. H₀ is not rejected, since the rejection
region is W⁻ ≤ 10. Based on the sample we cannot conclude that the median SAT score is
significantly greater than 1000, the same conclusion as the sign test.
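Base R reproduces both tests; a minimal sketch for this exercise:

x <- c(986, 1065, 1089, 890, 1128, 1157, 1224, 765, 1355, 567)
binom.test(sum(x > 1000), length(x), alternative = "greater")   # sign test
wilcox.test(x, mu = 1000, alternative = "greater")              # V = W+ = 32
# both p-values exceed 0.05, so H0 is not rejected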
12.3.5.

Let m₀ = 250. We test H₀: M = m₀ vs. Hₐ: M > m₀ at α = 0.05.

(i) Since there is no tie, n = 20. Using the large-sample approximation,
N⁺ ~ N(μ = np, σ² = np(1 − p)) with p = 1/2, so Z = (2N⁺ − n)/√n follows the standard
normal distribution. In this case n⁺ = 11, so z = 0.447214. Since z₀.₀₅ = 1.6448 and the
rejection region is z ≥ z₀.₀₅, H₀ is not rejected. Based on the sample we conclude that
the median weight of NFL players is not significantly greater than 250 pounds.

(ii) We have

xᵢ:                254  246  259  234  232  269  229  274  276  285
zᵢ = |xᵢ − 250|:   4    4    9    16   18   19   21   24   26   35
Sign:              +    −    +    −    −    +    −    +    +    +
Rank:              1.5  1.5  3    4    5    6    7    8    9    10.5

xᵢ:                285   288  211  296  298  193  192  311   189   178
zᵢ = |xᵢ − 250|:   35    38   39   46   48   57   58   61    61    72
Sign:              +     +    −    +    +    −    −    +     −     −
Rank:              10.5  12   13   14   15   16   17   18.5  18.5  20

Thus W⁺ = 108 with n = 20. Using the normal approximation,

Z = [W⁺ − n(n + 1)/4] / √(n(n + 1)(2n + 1)/24),

so z = 0.111998. Since z₀.₀₅ = 1.6448 and the rejection region is z ≥ z₀.₀₅, we reach the
same conclusion as the sign test.

12.3.7.

Using the difference (after − before), we test H₀: M = 0 vs. Hₐ: M < 0 at α = 0.05.

(i) We have

Before:      185  222  235  198  224  197  228  234
After:       188  217  229  190  226  185  225  231
Difference:  3    −5   −6   −8   2    −12  −3   −3
Sign:        +    −    −    −    +    −    −    −

Then n⁺ = 2. Using the large-sample approximation, N⁺ ~ N(μ = np, σ² = np(1 − p)) with
p = 1/2, so Z = (2N⁺ − n)/√n follows the standard normal distribution. In this case,
n = 8 and n⁺ = 2, so z = −1.4142. Since −z₀.₀₅ = −1.645 and the rejection region is
z ≤ −z₀.₀₅, H₀ is not rejected. Based on the sample, there is not enough evidence to
conclude that the new diet reduces the systolic blood pressure of individuals over 40
years old.

(ii) We have

xᵢ:         2  3  −3  −3  −5  −6  −8  −12
zᵢ = |xᵢ|:  2  3  3   3   5   6   8   12
Sign:       +  +  −   −   −   −   −   −
Rank:       1  3  3   3   5   6   7   8

Thus W⁺ = 4 with n = 8. Using the normal approximation,

Z = [W⁺ − n(n + 1)/4] / √(n(n + 1)(2n + 1)/24),

so z = −1.96. Since z = −1.96 < −z₀.₀₅ = −1.645, z falls in the rejection region and H₀ is
rejected: unlike the sign test, the signed-rank test finds enough evidence to conclude
that the new diet reduces systolic blood pressure.
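The same pair of tests in base R (with ties, wilcox.test falls back on a normal approximation):

before <- c(185, 222, 235, 198, 224, 197, 228, 234)
after <- c(188, 217, 229, 190, 226, 185, 225, 231)
binom.test(sum(after - before > 0), 8, alternative = "less")    # sign test
wilcox.test(after, before, paired = TRUE, alternative = "less") # signed rank
# the sign test does not reject at 0.05, while the signed-rank test does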

EXERCISES 12.4
Assumptions: Observations are randomly selected and n₁ ≤ n₂.
12.4.1.

We need to test H₀: m₁ = m₂ vs. Hₐ: m₁ ≠ m₂, where m₁ is the median for the American
conference and m₂ is the median for the National conference. In this case n₁ = n₂ = 6.
Combining the samples and keeping track of the populations, the following table is obtained:

Value:       0.455  0.545  0.545  0.636  0.636  0.636  0.727  0.727  0.818  0.818  0.818  0.909
Population:  N      A      N      N      N      N      A      A      A      A      N      A
Rank:        1      2.5    2.5    5      5      5      7.5    7.5    10     10     10     12

Then R = 28.5 and W = R − ½n₂(n₂ + 1), so w = 28.5 − ½(6)(6 + 1) = 7.5. For α = 0.05 the
rejection region is W ≤ 28 or W ≥ 50. There is enough evidence to conclude that the two
samples come from populations with different medians.
12.4.3.

If we select n₂ numbers from {1, 2, ..., n₁ + n₂} at random without replacement, and if Xᵢ
is the ith number selected, then

E( Σᵢ₌₁ⁿ² Xᵢ ) = n₂(n₁ + n₂ + 1)/2 and Var( Σᵢ₌₁ⁿ² Xᵢ ) = n₂(n₁ + n₂ + 1)n₁/12.

R is the sum of the ranks r(Xᵢ) of the observations from population 2. Under the null
hypothesis the two populations have the same distribution. Since the rank r(Xᵢ) takes one
of the values 1, 2, ..., n₁ + n₂, if (without loss of generality) all observations are
distinct, then R = Σᵢ₌₁ⁿ² r(Xᵢ) has

E(R) = n₂(n₁ + n₂ + 1)/2 and Var(R) = n₂(n₁ + n₂ + 1)n₁/12.


We have W = R − ½n₂(n₂ + 1); then

E(W) = E(R) − ½n₂(n₂ + 1) = n₂(n₁ + n₂ + 1)/2 − n₂(n₂ + 1)/2 = n₂n₁/2

and

Var(W) = Var(R) = n₂(n₁ + n₂ + 1)n₁/12.

12.4.5.

Using the Wilcoxon rank-sum test:

We need to test H₀: m₁ = m₂ vs. Hₐ: m₁ < m₂, where m₁ is the median net conversion in
female rats and m₂ is the median net conversion in male rats. In this case n₁ = 12 and
n₂ = 14. Combining the samples and keeping track of the populations, the following tables
are obtained:

Value:       5.1  5.5  6.5  7.2  7.5  9.5  9.8  9.8  10.4  11.2  11.6  12.8  13.1
Population:  F    F    F    F    F    F    M    F    F     F     M     M     M
Rank:        1    2    3    4    5    6    7.5  7.5  9     10    11    12    13

Value:       13.5  13.8  13.8  14.2  14.5  15.1  15.8  15.9  16.0  16.0  16.7  16.9  17.3
Population:  M     M     F     M     F     M     F     M     M     M     M     M     M
Rank:        14    15.5  15.5  17    18    19    20    21    22.5  22.5  24    25    26

Then R = 250 and W = R − ½n₂(n₂ + 1), so w = 250 − ½(14)(14 + 1) = 145. We use the
Wilcoxon rank-sum test for large samples, with the statistic

Z = [W − n₁n₂/2] / √(n₁n₂(n₁ + n₂ + 1)/12),

whose realization is z = 3.1375. For α = 0.05, z₀.₀₅ = 1.64485 and the rejection region is
z > 1.64485, so H₀ is rejected. There is enough evidence to conclude that the median net
conversion of progesterone in male rats is larger than in female rats.
Using the median test:

Testing the same hypothesis, the grand median is 13.3. The following tables can be obtained:

Sample:       5.1  5.5  6.5  7.2  7.5  9.5  9.8  9.8  10.4  11.2  11.6  12.8  13.1
Population:   F    F    F    F    F    F    M    F    F     F     M     M     M
Above/Below:  B    B    B    B    B    B    B    B    B     B     B     B     B

Sample:       13.5  13.8  13.8  14.2  14.5  15.1  15.8  15.9  16.0  16.0  16.7  16.9  17.3
Population:   M     M     F     M     F     M     F     M     M     M     M     M     M
Above/Below:  A     A     A     A     A     A     A     A     A     A     A     A     A
Then the following table can be obtained:

           Below   Above   Total
Sample 1   9       3       12
Sample 2   4       10      14
Total      13      13      26

The total above is Nₐ = 13, the total below is N_b = 13, and the sample sizes are n₁ = 12
and n₂ = 14. The large-sample statistic is

Z = [N₁ₐ − E(N₁ₐ)] / √Var(N₁ₐ),

whose realization is z = −2.31455, since N₁ₐ = 3 is the number of observations in sample 1
above the median, E(N₁ₐ) = Nₐn₁/n = (13)(12)/26 = 6, and
Var(N₁ₐ) = NₐN_b n₁n₂ / [n²(n − 1)] = 1.68.

At α = 0.05, z₀.₀₅ = 1.64485 and the rejection region is z < −1.64485. Therefore, the
same conclusion is reached.
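The rank-sum part of this exercise can be checked directly in base R (ties again force a normal approximation):

f <- c(5.1, 5.5, 6.5, 7.2, 7.5, 9.5, 9.8, 10.4, 11.2, 13.8, 14.5, 15.8)
m <- c(9.8, 11.6, 12.8, 13.1, 13.5, 13.8, 14.2, 15.1, 15.9, 16.0, 16.0,
       16.7, 16.9, 17.3)
wilcox.test(m, f, alternative = "greater")   # small p-value: males larger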
12.4.7.

We need to test H₀: m₁ = m₂ vs. Hₐ: m₁ > m₂, where m₁ is the median for sample II and
m₂ is the median for sample I. In this case, the sample sizes are n₁ = 8 and n₂ = 12.
Combining the samples and keeping track of the populations, the following table can be
obtained:

Value:       4   6   7   8    8    10  11  12   12   13  13  13  14  15    15    16  17    17    18  19
Population:  S1  S1  S1  S1   S1   S1  S1  S1   S1   S1  S2  S2  S1  S1    S2    S2  S2    S2    S2  S2
Rank:        1   2   3   4.5  4.5  6   7   8.5  8.5  11  11  11  13  14.5  14.5  16  17.5  17.5  19  20

Then R (the rank sum of sample I) = 83.5 and W = R − ½n₂(n₂ + 1), so
w = 83.5 − ½(12)(12 + 1) = 5.5. For α = 0.01 the rejection region consists of small values
of W; from a one-sided Wilcoxon rank-sum table for n₁ = 8 and n₂ = 12, the region is
W ≤ 17. Since w = 5.5 falls in the rejection region, H₀ is rejected: there is enough
evidence to suggest that the median for sample I is less than the median for sample II.
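A quick check in base R supports the rejection (this is a sketch; the p-value uses a normal approximation because of ties):

s1 <- c(4, 6, 7, 8, 8, 10, 11, 12, 12, 13, 14, 15)
s2 <- c(13, 13, 15, 16, 17, 17, 18, 19)
wilcox.test(s2, s1, alternative = "greater")   # p-value well below 0.01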

EXERCISES 12.5
12.5.1.

We need to test H₀: M₁ = M₂ = M₃ = M₄ = 0 vs. Hₐ: not all Mᵢ's equal 0, where Mᵢ is the
true median for group i, i = 1, 2, 3, 4. In this case n₁ = n₂ = n₃ = n₄ = 8,
N = Σᵢ₌₁⁴ nᵢ = 32, r₁ = 80, r₂ = 116, r₃ = 155, and r₄ = 177. Then

H = 12/(N(N + 1)) Σᵢ rᵢ²/nᵢ − 3(N + 1) = 7.832386.

(i) At α = 0.05, χ²_{α,k−1} = χ²_{0.05,3} = 7.8147. Since H > 7.8147, there is enough
evidence to suggest that not all Mᵢ's are equal to 0.
(ii) At α = 0.01, χ²_{α,k−1} = χ²_{0.01,3} = 11.3449. Since H < 11.3449, there is not
enough evidence to suggest that not all Mᵢ's are equal to 0.
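The chi-square cutoffs are available from R; a one-line check:

H <- 7.832386
qchisq(0.95, df = 3)   # 7.8147  -> H exceeds it: reject at alpha = 0.05
qchisq(0.99, df = 3)   # 11.3449 -> H is below it: do not reject at alpha = 0.01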


12.5.3.

For k = 2,

H = 12/(N(N+1)) Σᵢ₌₁² rᵢ²/nᵢ − 3(N + 1) = 12/(N(N+1)) Σᵢ₌₁² (1/nᵢ)[rᵢ − nᵢ(N+1)/2]².

Then, since r₁ + r₂ = N(N+1)/2, N = n₁ + n₂, and (−1)² = 1, it can be obtained that

H = 12/(N(N+1)) { (1/n₁)[r₁ − n₁(N+1)/2]² + (1/n₂)[r₂ − n₂(N+1)/2]² }
  = 12/(N(N+1)) (1/n₁ + 1/n₂) [r₁ − n₁(N+1)/2]²,

because r₂ − n₂(N+1)/2 = −[r₁ − n₁(N+1)/2]. Therefore,

H = 12N/(N(N+1)n₁n₂) [r₁ − n₁(N+1)/2]² = [r₁ − n₁(N+1)/2]² / [n₁n₂(N+1)/12].

Now, we reject H₀ if and only if H > k₀, for a certain value k₀ according to the rejection
rule of the Kruskal-Wallis test. And H > k₀ ⟺ [r₁ − n₁(N+1)/2]² > k₀′ for an appropriate
k₀′ ⟺ r₁ > c₁ or r₁ < c₂, for some c₁ and c₂, which corresponds to the rejection rule of
the Wilcoxon rank-sum test. Thus, they are equivalent.
12.5.5.

We need to test H₀: the yields of corn for the fertilizers are equal vs. Hₐ: not all
yields of corn for the fertilizers are equal. The corresponding table of ranks is

None:            11    1   2.5  5   6    2.5  8.5   8.5  4    rᵢ = 49
Fertilizer I:    15.5  14  10   12  7    13   18    15.5      rᵢ = 105
Fertilizer II:   26    23  23   21  29.5 20   19    28   17   rᵢ = 206.5
Fertilizer III:  31.5  36  23   33  31.5 25   29.5  38   27   rᵢ = 274.5
Fertilizer IV:   44    41  38   43  40   34   42    38   35   rᵢ = 355

In this case, n₁ = n₃ = n₄ = n₅ = 9, n₂ = 8, k = 5, and N = Σᵢ₌₁ᵏ nᵢ = 44, with r₁ = 49,
r₂ = 105, r₃ = 206.5, r₄ = 274.5, and r₅ = 355. Then

H = 12/(N(N + 1)) Σᵢ rᵢ²/nᵢ − 3(N + 1) = 39.29066.

At α = 0.01, χ²_{α,k−1} = χ²_{0.01,4} = 13.2767. Since H > 13.2767, there is enough
evidence to suggest a difference in the yields of corn from the different fertilizers.

Chapter 13

Empirical Methods

Statistical software R is used for this chapter. All outputs and code given are in R. R is
free statistical software, and it can be downloaded from the website www.r-project.org.

EXERCISES 13.2
13.2.1.

(a) Let θ = μ and θ̂ = the sample mean. Using the command jackknife in R (available,
for example, in the bootstrap package), we obtain the results below:
$jack.se
[1] 12.66607
$jack.bias
[1] 0
$jack.values
[1] 287.1818 289.2727 276.3636 279.4545 287.4545 284.0909
284.8182 288.0909
[9] 286.7273 289.0000 287.0000 288.5455

Using the same notation defined in Section 13.2, the R results correspond to our
notation as follows:

jack.values = (θ̂₍₁₎, θ̂₍₂₎, θ̂₍₃₎, ..., θ̂₍ₙ₎), the leave-one-out estimates;

jack.bias = (n − 1) [ (1/n) Σₖ₌₁ⁿ θ̂₍ₖ₎ − θ̂ ];

jack.se = the standard error of the jackknife estimate.

Thus, the jackknife estimate can be obtained by

θ̃ = (1/n) Σₖ₌₁ⁿ [ nθ̂ − (n − 1)θ̂₍ₖ₎ ] = nθ̂ − (n − 1)(1/n) Σₖ₌₁ⁿ θ̂₍ₖ₎
  = θ̂ − (n − 1)[ (1/n) Σₖ₌₁ⁿ θ̂₍ₖ₎ − θ̂ ] = θ̂ − jack.bias.

Since θ̂ = the sample mean of the complete data = 285.67, the jackknife estimate of μ is
θ̃ = θ̂ − jack.bias = 285.67 − 0 = 285.67.

(b) A 95% jackknife confidence interval for μ is

θ̃ ± t_{α/2,n−1} × jack.se = 285.67 ± 2.201 × 12.666 = (257.789, 313.545).

(c) Comparing the 95% jackknife confidence interval with Example 6.3.3, where the
confidence interval is (257.81, 313.59), we see that the two methods give a very close
confidence interval for μ.
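If the jackknife command is unavailable, the quantities above can be computed by hand; a self-contained sketch in which x stands in for the data of this exercise:

jack <- function(x, stat = mean) {
  n <- length(x)
  theta.hat <- stat(x)
  theta.k <- sapply(1:n, function(k) stat(x[-k]))             # leave-one-out values
  bias <- (n - 1) * (mean(theta.k) - theta.hat)               # jack.bias
  se <- sqrt((n - 1) / n * sum((theta.k - mean(theta.k))^2))  # jack.se
  list(estimate = theta.hat - bias, bias = bias, se = se)
}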
13.2.3.

Let θ = μ and θ̂ = the sample mean. From R we can obtain the following results:
$jack.se
[1] 1.050376
$jack.bias
[1] 0

Since θ̂ = the sample mean of the complete data = 61.22, the jackknife estimate of μ is
θ̃ = θ̂ − jack.bias = 61.22 − 0 = 61.22.
A 95% jackknife confidence interval for μ is

θ̃ ± t_{α/2,n−1} × jack.se = 61.22 ± 2.262 × 1.05 = (58.844, 63.596).

There is a 95% chance that the true mean falls in (58.844, 63.596).
13.2.5.

Let θ = σ² and θ̂ = the sample variance. From R we can obtain the following results:
$jack.se
[1] 2247.042
$jack.bias
[1] 0

Since θ̂ = the sample variance of the complete data = 9386, the jackknife estimate of σ²
is θ̃ = θ̂ − jack.bias = 9386 − 0 = 9386.
A 95% jackknife confidence interval for σ² is

θ̃ ± t_{α/2,n−1} × jack.se = 9386 ± 2.262 × 2247.042 = (4302.838, 14469.16).


There is a 95% chance that the true variance falls in (4302.838, 14469.16). Comparing the
95% jackknife confidence interval with Example 6.4.2, where the confidence interval is
(4442.3, 31299), we see that in this case the jackknife confidence interval for σ² is much
shorter than the classical one.
13.2.7.

(a) Let θ = μ and θ̂ = the sample mean. From R we can obtain the following results:
$jack.se
[1] 0.1837461
$jack.bias
[1] 0

Since θ̂ = the sample mean of the complete data = 2.317, the jackknife estimate of μ is
θ̃ = θ̂ − jack.bias = 2.317 − 0 = 2.317.
A 95% jackknife confidence interval for μ is

θ̃ ± t_{α/2,n−1} × jack.se = 2.317 ± 2.201 × 0.184 = (1.912, 2.721).

(b) Let θ = σ² and θ̂ = the sample variance. From R we can obtain the following results:
$jack.se
[1] 0.1682317
$jack.bias
[1] 0

Since θ̂ = the sample variance of the complete data = 0.405, the jackknife estimate of σ²
is θ̃ = θ̂ − jack.bias = 0.405 − 0 = 0.405.
A 95% jackknife confidence interval for σ² is

θ̃ ± t_{α/2,n−1} × jack.se = 0.405 ± 2.201 × 0.168 = (0.035, 0.775).

(c) There is a 95% chance that the true mean falls in (1.912, 2.721), and there is a 95%
chance that the true variance falls in (0.035, 0.775).

EXERCISES 13.3
Please note that the resampling procedure may produce different bootstrap samples every
time; hence, the results might differ. You can perform the bootstrapping using any
statistical software.
13.3.1.

Using the statistical software R, we created N = 8 bootstrap samples of size 20. Next we
calculated the mean of each bootstrap sample, denoted by X̄₁*, ..., X̄_N*. Then we have the
following results:

The bootstrap mean = X̄* = (1/N) Σᵢ₌₁ᴺ X̄ᵢ* = 5.875;

The standard error = √[ (1/(N − 1)) Σᵢ₌₁ᴺ (X̄ᵢ* − X̄*)² ] = 1.137.
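A minimal R sketch of this computation, with x standing in for the original sample (not reproduced here):

boot.means <- replicate(8, mean(sample(x, size = 20, replace = TRUE)))
c(mean(boot.means), sd(boot.means))   # bootstrap mean and standard error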


13.3.3.

Using the statistical software R, we created N = 199 bootstrap samples of size 10. Then
0.025 × (199 + 1) = 5 and 0.975 × (199 + 1) = 195. Thus, the 0.025 and 0.975 quantiles of
the sample means are, respectively, the 5th and 195th values of the ascending-ordered
sample means from the bootstrap samples. Then we get the 95% bootstrap confidence
interval for μ as (59.07, 63.16).

13.3.5.

(a) Using the statistical software R, we created N = 199 bootstrap samples of size 6. Then
0.025 × (199 + 1) = 5 and 0.975 × (199 + 1) = 195. Thus, the 0.025 and 0.975 quantiles of
the sample means are, respectively, the 5th and 195th values of the ascending-ordered
sample means from the bootstrap samples. Then we get the 95% bootstrap confidence
interval for μ as (150.667, 1149.5).
(b) Similarly, the 0.025 and 0.975 quantiles of the sample medians are the 5th and 195th
values of the ascending-ordered sample medians from the bootstrap samples. Then we get
the 95% bootstrap confidence interval for the population median as (110, 1366.5).
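The percentile method used in Exercises 13.3.3 and 13.3.5 is one line of R; a sketch with x standing in for the data and the median as the statistic of interest:

boot.stat <- replicate(199, median(sample(x, replace = TRUE)))
sort(boot.stat)[c(5, 195)]   # the 0.025(N+1)-th and 0.975(N+1)-th ordered values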

EXERCISES 13.4
13.4.1.

(a) Since S ~ N(0, θ²) and N ~ N(0, σ²), then Y = S + N ~ N(0, θ² + σ²). Then the
likelihood function of Y is

L(θ; y) = f_Y(y|θ) = [2π(θ² + σ²)]^(−1/2) exp{ −y²/(2(θ² + σ²)) }.

And

∂ ln L(θ; y)/∂θ = −θ/(θ² + σ²) + y²θ/(θ² + σ²)².

Solving ∂ ln L(θ; y)/∂θ = 0, we obtain the MLE as

θ̂_MLE = max{0, y² − σ²}^(1/2).

(b) The complete likelihood for Y and S is

L_C(θ; y, s) = f_{Y,S}(y, s|θ) = f_{S,N}(s, y − s|θ)|J|, where the Jacobian |J| = 1, so

L_C(θ; y, s) = f_S(s|θ) f_N(y − s|θ) = (1/(2πθσ)) exp{ −s²/(2θ²) − (y − s)²/(2σ²) }.


The conditional probability density of S given Y = y is

h(s|θ, y) = f_{Y,S}(y, s|θ)/f_Y(y|θ) ∝ exp{ −s²/(2θ²) − (y − s)²/(2σ²) }
          ∝ exp{ −(1/2)(1/θ² + 1/σ²) [ s − (y/σ²)/(1/θ² + 1/σ²) ]² }.

Thus S|θ, y ~ N( (y/σ²)/(1/θ² + 1/σ²) , 1/(1/θ² + 1/σ²) ).

Fix θ₀ and consider

Q(θ|θ₀, y) = E_{θ₀}[ln L_C(θ; y, S)|θ₀, y]
= E_{θ₀}[ constant − ln θ − S²/(2θ²) − (y − S)²/(2σ²) | θ₀, y ]
= (constant of θ) − ln θ − (1/(2θ²)) { 1/(1/θ₀² + 1/σ²) + [ (y/σ²)/(1/θ₀² + 1/σ²) ]² }.

Solving ∂Q(θ|θ₀, y)/∂θ = 0, we have

θ = { 1/(1/θ₀² + 1/σ²) + [ (y/σ²)/(1/θ₀² + 1/σ²) ]² }^(1/2).

Then we obtain the EM algorithm as

θ^(k+1) = { 1/(1/θ²₍ₖ₎ + 1/σ²) + [ (y/σ²)/(1/θ²₍ₖ₎ + 1/σ²) ]² }^(1/2).

13.4.3.

Let n = n₁ + n₂ + n₃ and θ = (p, q). Let x = (n₁, n₂, n₃) be the observed data and
z = (n₁₁, n₁₂, n₂₁, n₂₂, n₃) be the complete data, where n₁₁ is the number of male
identical pairs, n₂₁ is the number of female identical pairs, and n₁₂ and n₂₂ are the
numbers of non-identical male and female pairs, respectively. Here, the complete data set
z has a multinomial distribution with the likelihood given by

L(θ; z) = ( n over n₁₁, n₁₂, n₂₁, n₂₂, n₃ ) [p(1−q)]^(n₁₁) [(1−p)(1−q)²]^(n₁₂) [pq]^(n₂₁)
          × [(1−p)q²]^(n₂₂) [2(1−p)q(1−q)]^(n₃).

Then

ln L(θ; z) = (constant of θ) + (n₁₁ + n₂₁) ln p + (n₁₂ + n₂₂ + n₃) ln(1 − p)
           + (n₂₁ + 2n₂₂ + n₃) ln q + (n₁₁ + 2n₁₂ + n₃) ln(1 − q).

For a multinomial distribution the expected value of each class is n multiplied by the
probability of that class. Then, for θ^(k) = (p^(k), q^(k)) the kth-step estimate, using
the Bayes rule we have

n₁₁^(k) = E[n₁₁ | θ^(k), x] = n₁ p^(k)(1 − q^(k)) / [ p^(k)(1 − q^(k)) + (1 − p^(k))(1 − q^(k))² ],
n₁₂^(k) = E[n₁₂ | θ^(k), x] = n₁ − n₁₁^(k),
n₂₁^(k) = E[n₂₁ | θ^(k), x] = n₂ p^(k)q^(k) / [ p^(k)q^(k) + (1 − p^(k))(q^(k))² ],
n₂₂^(k) = E[n₂₂ | θ^(k), x] = n₂ − n₂₁^(k).

Then,

Q(θ|θ^(k), x) = E_{θ^(k)}[ln L(θ; Z)|θ^(k), x]
= (constant of θ) + (n₁₁^(k) + n₂₁^(k)) ln p + (n₁₂^(k) + n₂₂^(k) + n₃) ln(1 − p)
+ (n₂₁^(k) + 2n₂₂^(k) + n₃) ln q + (n₁₁^(k) + 2n₁₂^(k) + n₃) ln(1 − q).

Solving ∂Q/∂p = 0 and ∂Q/∂q = 0, we then obtain the EM algorithm as

p^(k+1) = (n₁₁^(k) + n₂₁^(k)) / n,
q^(k+1) = (n₂₁^(k) + 2n₂₂^(k) + n₃) / (n + n₁₂^(k) + n₂₂^(k) + n₃).

13.4.5.

(a) The survival function is S(y) = Pr(Y > y) = 1 − Φ(y − θ), where Y ~ N(θ, 1) and Φ is
the cdf of N(0, 1). Then the likelihood of x and y is

L(θ; x, y) = Πᵢ₌₁ⁿ¹ (1/√(2π)) exp{ −(xᵢ − θ)²/2 } · Πᵢ₌₁ⁿ² [1 − Φ(yᵢ − θ)], and

ln L(θ; x, y) = constant − (1/2) Σᵢ₌₁ⁿ¹ (xᵢ − θ)² + Σᵢ₌₁ⁿ² ln[1 − Φ(yᵢ − θ)].

Solving ∂ ln L(θ; x, y)/∂θ = 0, the MLE, θ̂_MLE, is the solution of

n₁(x̄ − θ̂_MLE) + Σᵢ₌₁ⁿ² φ(yᵢ − θ̂_MLE)/[1 − Φ(yᵢ − θ̂_MLE)] = 0,

where φ is the pdf of N(0, 1).


(b) Let z = (x₁, ..., x_{n₁}, z₁, ..., z_{n₂}) be the complete data set. Then the
likelihood is

L_C(θ; z) = (1/√(2π))^(n₁+n₂) exp{ −(1/2) Σᵢ₌₁ⁿ¹ (xᵢ − θ)² − (1/2) Σᵢ₌₁ⁿ² (zᵢ − θ)² }, and

ln L_C(θ; z) = constant − (1/2) Σᵢ₌₁ⁿ¹ (xᵢ − θ)² − (1/2) Σᵢ₌₁ⁿ² (zᵢ − θ)².

For i = 1, ..., n₂, the conditional pdf of Zᵢ given X = x, Y = y and θ = θ₀ is

h(zᵢ|θ₀, x, y) = f(zᵢ, zᵢ ≥ yᵢ|θ₀)/f(zᵢ ≥ yᵢ|θ₀) = φ(zᵢ − θ₀)/[1 − Φ(yᵢ − θ₀)], zᵢ ≥ yᵢ.

Then,

Q(θ|θ₀, x, y) = E_{θ₀}[ln L_C(θ; Z)|θ₀, x, y]
= constant − (1/2) Σᵢ₌₁ⁿ¹ (xᵢ − θ)² − (1/2) Σᵢ₌₁ⁿ² E_{θ₀}[(Zᵢ − θ)²]
= constant − (1/2) Σᵢ₌₁ⁿ¹ (xᵢ − θ)² − (1/2) Σᵢ₌₁ⁿ² ∫_{yᵢ}^∞ (zᵢ − θ)² h(zᵢ|θ₀, x, y) dzᵢ.

Solving ∂Q(θ|θ₀, x, y)/∂θ = 0, after a lengthy computation we then obtain

θ = n₁/(n₁ + n₂) · x̄ + n₂/(n₁ + n₂) · θ₀ + 1/(n₁ + n₂) Σᵢ₌₁ⁿ² φ(yᵢ − θ₀)/[1 − Φ(yᵢ − θ₀)].

Therefore, we obtain the EM algorithm as

θ^(k+1) = n₁/(n₁ + n₂) · x̄ + n₂/(n₁ + n₂) · θ^(k)
        + 1/(n₁ + n₂) Σᵢ₌₁ⁿ² φ(yᵢ − θ^(k))/[1 − Φ(yᵢ − θ^(k))].

EXERCISES 13.5
All the algorithms and generations of simple distributions in this section can be done by any statistical
software. The statistical software, R, is used here.


13.5.1.

The 1st iteration: We have x₀ = 6.
Step 1: Generate j from A = {a_{ij}}. Suppose the software generated j = 7.
Step 2: r = π(7)/π(6) = (e⁻³3⁷/7!)/(e⁻³3⁶/6!) = 3/7 = 0.4286.
Step 3: Generate u from U(0, 1). Suppose the software generated u = 0.5494.
Since r < u, we reject the new state 7 and stay at state 6. Set x₁ = x₀ = 6.

The 2nd iteration: Start with x₁ = 6.
Step 1: Generate j from A = {a_{ij}}. Suppose the software generated j = 5.
Step 2: r = π(5)/π(6) = (e⁻³3⁵/5!)/(e⁻³3⁶/6!) = 2.
Step 3: Since r > 1, set x₂ = j = 5.

The 3rd iteration: Start with x₂ = 5.
Step 1: Generate j from A = {a_{ij}}. Suppose the software generated j = 6.
Step 2: r = π(6)/π(5) = (e⁻³3⁶/6!)/(e⁻³3⁵/5!) = 0.5.
Step 3: Generate u from U(0, 1). Suppose the software generated u = 0.7594.
Since r < u, set x₃ = x₂ = 5.

The first 3 iterations are given above. The reader can follow the same algorithm to obtain
more sample points. Note that different results may appear due to the different generated
values each time.
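A full chain is easy to simulate; a sketch that assumes, for illustration, a nominating matrix proposing one of the two neighboring states with probability 1/2 each:

x <- numeric(1000); x[1] <- 6
for (t in 2:1000) {
  j <- x[t - 1] + sample(c(-1, 1), 1)                  # propose a neighbor state
  r <- if (j < 0) 0 else dpois(j, 3) / dpois(x[t - 1], 3)
  x[t] <- if (runif(1) <= r) j else x[t - 1]           # accept or stay
}
table(x) / 1000   # approximates the Poisson(3) pmf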
13.5.3.

The Metropolis-Hastings algorithm is given below:

For t = 0, start with an arbitrary point x^(0).
Step 1: Generate y from the proposal density Γ(⌊α⌋, ⌊α⌋/α).
Step 2: Compute

r = π(y) q_y(x^(t)) / [ π(x^(t)) q_{x^(t)}(y) ]
  = (y/x^(t))^(α−⌊α⌋) exp{ (x^(t) − y)(1/β − ⌊α⌋/α) },

for the Γ(α, β) target in shape-rate form.
Step 3: Acceptance/Rejection.
Generate u from U(0, 1).
If r ≥ u, set x^(t+1) = y (i.e., accept the proposed new state);
else set x^(t+1) = x^(t) (i.e., reject the proposed new state).
Step 4: Set t = t + 1, go to step 1.
13.5.5.

Use the nominating matrix

A = | 1/2  1/2  0    0   |
    | 1/2  0    1/2  0   |
    | 0    1/2  0    1/2 |
    | 0    0    1/2  1/2 |.


The Metropolis-Hastings algorithm is given below:


For k = 0, start with an arbitrary point, xk = i.
Step 1: Generate j from the proposal matrix as follows:
Generate u1 from U(0, 1).
For i = 0, if u1 0.5, set j = 1; else set j = 0.
For i = 1 or 2, if u1 0.5, set j = i + 1; else set j = i 1.
For i = 3, if u1 0.5, set j = 3; else set j = 2.
Step 2: Calculate r = π(j)/π(i) according to the target distribution π(x).
Step 3: Acceptance/Rejection.
Generate u2 from U(0, 1).
If r u2 , set xk+1 = j;
else set xk+1 = xk .
Step 4: Set k = k + 1, go to step 1.
13.5.7.

π is an exponential random variable with parameter θ, i.e., π(x) = (1/θ) exp(−x/θ), x > 0.
Let the proposal density be q_x(y) ∝ exp{ −(y − x)²/(2(0.5)²) }.

The Metropolis-Hastings algorithm is given below:

For t = 0, start with an arbitrary point x^(0) > 0.
Step 1: Generate y from the proposal density q_{x^(t)}(y) ∝ exp{ −(y − x^(t))²/(2(0.5)²) };
that is, generate y from N(x^(t), (0.5)²).
Step 2: Compute

r = π(y) q_y(x^(t)) / [ π(x^(t)) q_{x^(t)}(y) ]
  = [ (1/θ) exp(−y/θ) exp{ −(x^(t) − y)²/(2(0.5)²) } ]
    / [ (1/θ) exp(−x^(t)/θ) exp{ −(y − x^(t))²/(2(0.5)²) } ]
  = exp{ (x^(t) − y)/θ }.

Let ρ = min{1, r}.
Step 3: Acceptance/Rejection.
Generate u from U(0, 1).
If u ≤ ρ, set x^(t+1) = y;
else set x^(t+1) = x^(t).
Step 4: Set t = t + 1, go to step 1.
13.5.9.

From Example 13.5.5 with n = 15, α = 1, and β = 2, recall that

X|Y = y ~ Binomial(n, y) = Binomial(15, y), and
Y|X = x ~ Beta(x + α, n − x + β) = Beta(x + 1, 17 − x).
For y₀ = 1/3:
(i) Generate x₀ from Binomial(15, 1/3). Suppose the software generated x₀ = 5.
(ii) Generate y₁ from Beta(x₀ + 1, 17 − x₀) = Beta(6, 12). Suppose the software generated
y₁ = 0.46 (rounded to the second digit). Then generate x₁ from Binomial(15, 0.46),
resulting in x₁ = 5.
(iii) Generate y₂ from Beta(x₁ + 1, 17 − x₁) = Beta(6, 12), resulting in y₂ = 0.26. Then
generate x₂ from Binomial(15, 0.26), resulting in x₂ = 5.
Thus, for y₀ = 1/3 a particular realization of the Gibbs sampler for the first three
iterations is (5, 0.33), (5, 0.46), and (5, 0.26).

For y₀ = 1/2:
(i) Generate x₀ from Binomial(15, 1/2). Suppose the software generated x₀ = 10.
(ii) Generate y₁ from Beta(x₀ + 1, 17 − x₀) = Beta(11, 7). Suppose the software generated
y₁ = 0.31 (rounded to the second digit). Then generate x₁ from Binomial(15, 0.31),
resulting in x₁ = 5.
(iii) Generate y₂ from Beta(x₁ + 1, 17 − x₁) = Beta(6, 12), resulting in y₂ = 0.34. Then
generate x₂ from Binomial(15, 0.34), resulting in x₂ = 5.
Thus, for y₀ = 1/2 a particular realization of the Gibbs sampler for the first three
iterations is (10, 0.5), (5, 0.31), and (5, 0.34).

For y₀ = 2/3:
(i) Generate x₀ from Binomial(15, 2/3). Suppose the software generated x₀ = 11.
(ii) Generate y₁ from Beta(x₀ + 1, 17 − x₀) = Beta(12, 6). Suppose the software generated
y₁ = 0.61 (rounded to the second digit). Then generate x₁ from Binomial(15, 0.61),
resulting in x₁ = 9.
(iii) Generate y₂ from Beta(x₁ + 1, 17 − x₁) = Beta(10, 8), resulting in y₂ = 0.59. Then
generate x₂ from Binomial(15, 0.59), resulting in x₂ = 7.
Thus, for y₀ = 2/3 a particular realization of the Gibbs sampler for the first three
iterations is (11, 0.66), (9, 0.61), and (7, 0.59).
From the three cases with different initial values, we see that when y₀ = 2/3 the samples
have larger x and y values than the samples with y₀ = 1/3. Thus, the choice of initial
values may influence the samples. However, this influence becomes negligible if the
algorithm is run for a large number of iterations.
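The three chains above come from the following sampler; a sketch in R for n = 15, α = 1, β = 2:

n.iter <- 3; x <- numeric(n.iter); y <- numeric(n.iter + 1); y[1] <- 1/3
for (i in 1:n.iter) {
  x[i] <- rbinom(1, size = 15, prob = y[i])                     # X | Y = y
  y[i + 1] <- rbeta(1, shape1 = x[i] + 1, shape2 = 17 - x[i])   # Y | X = x
}
cbind(x, y = y[-1])   # one realization of three Gibbs iterations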

13.5.11. If (X, Y) ~ N₂( (μ₁, μ₂), ( σ₁² ρσ₁σ₂ ; ρσ₁σ₂ σ₂² ) ), then the conditional
distribution of X given Y = y is (X|Y = y) ~ N( μ₁ + ρ(σ₁/σ₂)(y − μ₂), σ₁²(1 − ρ²) ).

Applying the above result with μ₁ = μ₂ = 0 and σ₁ = σ₂ = 1, we have

(X|Y = y) ~ N(ρy, 1 − ρ²), and (Y|X = x) ~ N(ρx, 1 − ρ²).

The Gibbs sampler is given below:

Start with an arbitrary point y^(0). Then obtain x^(0) by generating a random value from
N(ρy^(0), 1 − ρ²).
For i = 1, ..., n, repeat:
Step 1: Generate y^(i) from N(ρx^(i−1), 1 − ρ²).
Step 2: Generate x^(i) from N(ρy^(i), 1 − ρ²).
Step 3: Obtain the ith sample as (x^(i), y^(i)). Set i = i + 1, go to step 1.

Chapter 14

Some Issues in Statistical Applications: An Overview
EXERCISES 14.2
14.2.1.

The following is a scatter plot of the data.

[Scatter plot: Percent Return vs. Percent Expense Ratio.]

The sample correlation coefficient is 0.3249, indicating a mild positive correlation.



14.2.3.

(a) The following is a scatter plot of the data.

[Scatter plot: Expenditure vs. Revenue.]

(b) r = 0.9918.

(c) The following is a Q-Q plot of revenue versus expenditure.

[Q-Q plot: Expenditure quantiles vs. Revenue quantiles.]

(d) From the scatter plot being close to a line and r = 0.9918, we see that there is a
strong positive linear relationship between revenue and expenditure. From the Q-Q plot we
see that the quantiles fall nearly along the 45-degree line. Thus, we may conjecture that
the revenue and the expenditure have the same probability distribution.


14.2.5.

The following is the dot plot for this data.

[Dot plot of the median house prices, ranging from about 80 to 180.]

The dot plot suggests that the distribution of the median house prices is skewed to the
right, because most of the observations are concentrated on the left.

EXERCISES 14.3
14.3.1.

(a) The following table summarizes the z-score, the modified z-score, and the
distribution-free z-score.

data     z score     dist-free z   modified z
1215.1   −0.09852    −0.011804     −0.12339
1109.9   −0.31406    −0.281755     −0.39335
1536.5   0.559969    0.812933      0.701342
1797.8   1.095325    1.483449      1.371858
1630.5   0.752558    1.054144      0.942553
939.7    −0.66277    −0.718501     −0.83009
1219.7   −0.0891     0             −0.11159
519.9    −1.52286    −1.79574      −1.90733
830      −0.88752    −1            −1.11159
780.1    −0.98976    −1.128047     −1.23964
1403.3   0.287066    0.471132      0.359541
1869.7   1.242635    1.66795       1.556359
2152.8   1.822656    2.394406      2.282815
1410     0.300793    0.488324      0.376733
532.8    −1.49643    −1.762638     −1.87423


Since no z-scores or modified z-scores have absolute values greater than 3.5, and no
distribution-free z-scores are greater than 5, we conclude that there are no obvious
outliers.
(b) The following is the boxplot.

[Boxplot of the motor vehicle theft rates.]

(c) An outlier in this case may represent an extreme observation with either a very high
or a very low rate of motor vehicle thefts.
14.3.3.

(a) The following table summarizes the z-score, the modified z-score, and the
distribution-free z-score.

data   z score     dist-free z   modified z
67     −0.09135    −0.037037     −0.1358
63     −0.29066    −0.333333     −0.4321
39     −1.48652    −2.111111     −2.20987
80     0.55641     0.925926      0.827163
64     −0.24083    −0.259259     −0.35802
95     1.303824    2.037037      1.938274
90     1.054686    1.666667      1.567904
93     1.204169    1.888889      1.790126
21     −2.38342    −3.444444     −3.54321
36     −1.636      −2.333333     −2.4321
44     −1.23738    −1.740741     −1.8395
66     −0.14118    −0.111111     −0.20987
100    1.552962    2.407407      2.308644
66     −0.14118    −0.111111     −0.20987
72     0.157789    0.333333      0.23457


data   z score     dist-free z   modified z
34     −1.73566    −2.481481     −2.58024
78     0.456755    0.777778      0.679015
66     −0.14118    −0.111111     −0.20987
68     −0.04152    0.037037      −0.06173
98     1.453307    2.259259      2.160496
74     0.257444    0.481481      0.382719
81     0.606237    1             0.901237
71     0.107961    0.259259      0.160496
100    1.552962    2.407407      2.308644
60     −0.44014    −0.555556     −0.65432
50     −0.93842    −1.296296     −1.39506
81     0.606237    1             0.901237
66     −0.14118    −0.111111     −0.20987
90     1.054686    1.666667      1.567904
89     1.004858    1.592593      1.49383
86     0.855375    1.37037       1.271607
49     −0.98825    −1.37037      −1.46913
77     0.406927    0.703704      0.604941
63     −0.29066    −0.333333     −0.4321
58     −0.5398     −0.703704     −0.80247
43     −1.28721    −1.814815     −1.91358

Using the z-score test and the distribution-free test, there are no outliers. Using the
modified z-score test, the observation 21 is a possible outlier.
(b) The following is the boxplot.

[Boxplot of the exam scores.]

Hence, the observation 21 is identified as an outlier using the boxplot.


EXERCISES 14.4
14.4.1.

[Normal Q-Q plot of the data: sample quantiles vs. theoretical quantiles.]

From the above normal probability plot we see that the data follows the straight line fairly
well. Hence, the normality of the data is not rejected and no transformation is needed.
14.4.3.

(a) The following is the normal probability plot of the data. The graph clearly shows
that the data does not follow a normal distribution.

[Normal Q-Q plot of the original data.]


(b) Take the transformation y = ln(x) and look at the normal probability plot of the
transformed data below.

[Normal Q-Q plot of the log-transformed data.]

With the transformation, we can see that the transformed data falls much closer to the
normal line.
14.4.5.

(a) The following is the normal probability plot of the data. The graph clearly shows
that the data does not follow a normal distribution.

[Normal Q-Q plot of the original data.]


(b) Take the transformation y = ln(x) and look at the normal probability plot of the
transformed data below.

[Normal Q-Q plot of the log-transformed data.]

With the transformation, we can see that the transformed data falls much closer to the
normal line.
14.4.7.

(a) & (b) The following is the normal probability plot of the data. We see that the data
follows the straight line except for one data point. This suggests that the data may
follow normality but with a possible outlier.

[Normal Q-Q plot of the data.]


(c) The following is the boxplot of the data.

[Boxplot of the data.]

Hence, the observation 52 is identified as a possible outlier using the boxplot. Further
investigation is needed to check whether there was a measurement error in this case;
alternatively, this observation may suggest that a particular car is significantly better
than the others in terms of mileage per gallon.
14.4.9.

Use the data from Exercise 14.2.1. Let X = the percent expense ratio and Y = the percent
return. Then we can calculate

F = s²_X / s²_Y = 0.0038.

Since F < F₀.₀₂₅(19, 19) = 0.3958, we reject the null hypothesis at level 0.05. Thus, we
suggest that the variances of the two populations are not equal.
14.4.11.

Let X = the bonus for females and Y = the bonus for males. The assumption of the test is
that the random samples of X and Y are from independent normal distributions. To test the
homogeneity of the variances of X and Y we calculate the ratio

F = s²_X / s²_Y = 0.8044.

Since F₀.₀₂₅(7, 7) = 0.2002 < F < F₀.₉₇₅(7, 7) = 4.9949, we do not reject the null
hypothesis at level 0.05. Thus, we suggest that the variances of the two populations are
equal.
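Base R performs this F test directly; a sketch with female and male standing in for the two bonus samples:

var.test(female, male)   # reports F = s_X^2 / s_Y^2 with a two-sided p-value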
14.4.13.

Let X₁, X₂ and X₃ be the scores of the students taught by the faculty member, the
teaching assistant and the adjunct, respectively. The assumption of the test is that the
random samples of X₁, X₂ and X₃ are from independent normal distributions. To test
homogeneity of variances of X₁, X₂ and X₃ we first compute x̄₁ = 81.6, x̄₂ = 78.8 and
x̄₃ = 70.4. Letting yᵢⱼ = |xᵢⱼ − x̄ᵢ|, we then obtain the following yᵢⱼ values.

                                Deviation
Faculty:             11.4  20.6  5.4  6.6   10.4
Teaching Assistant:  9.2   11.2  2.8  3.2   20.8
Adjunct:             15.6  14.4  2.6  19.6  23.4

The test statistic is

z = [ Σᵢ₌₁ᵏ nᵢ(ȳᵢ. − ȳ..)²/(k − 1) ] / [ Σᵢ₌₁ᵏ Σⱼ₌₁ⁿⁱ (yᵢⱼ − ȳᵢ.)²/(n − k) ]
  = MST/MSE = 43.59/50.39 = 0.8651,

where n₁ = 5, n₂ = 5, n₃ = 5, k = 3 and n = 15. Since z < F₀.₉₅(2, 12) = 3.8853, at level
0.05 we do not reject the null hypothesis. That is, we suggest that the variances of the
three populations are equal.

EXERCISES 14.5
14.5.1.

(a) The following is the dot plot of the data of Exercise 14.4.5.

[Dot plot of the state expenditures, ranging from about 10000 to 40000.]

(b) Mean = ȳ = 13373.53, median = 7145, and standard deviation = s = 11924.47.


(c) A 95% confidence interval for the mean is

ȳ ± t₀.₉₇₅(n − 1) · s/√n = 13373.53 ± 2.1448 × 11924.47/√15 = (6769.98, 19977.08).

(d) A 95% prediction interval is

ȳ ± t₀.₉₇₅(n − 1) · s · √(1 + 1/n) = 13373.53 ± 2.1448 × 11924.47 × √(1 + 1/15)
= (−13040.67, 39787.74).

Since state expenditure is nonnegative, we can take the 95% prediction interval as
(0, 39787.74).

(e) There is a 95% chance that the true mean falls in (6769.98, 19977.08), and there is a
95% chance that the next observation falls in (0, 39787.74). The assumption behind the
confidence interval and the prediction interval is that the data follow a normal
distribution, or that the sample size is large enough to employ the central limit theorem.
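The interval arithmetic in (c) and (d) is a few lines of R:

n <- 15; ybar <- 13373.53; s <- 11924.47
tq <- qt(0.975, df = n - 1)                 # 2.1448
ybar + c(-1, 1) * tq * s / sqrt(n)          # confidence interval for the mean
ybar + c(-1, 1) * tq * s * sqrt(1 + 1/n)    # prediction interval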
14.5.3.

(a) Let X = the midterm score and Y = the final score. The following is the scatter plot
of the data with the fitted regression line.

[Scatter plot: Final vs. Midterm, with the fitted regression line.]

(b) The data does not show any particular pattern. No transformation is needed in
this case.
(c) Fitting the data we obtain the linear regression model ŷ = 64.39 − 0.4048x. However,
we have R² = 0.1345, meaning only 13.45% of the variation in y is explained by the
variable x.

14.5.5.
(a) Let X = the in-state tuition and Y = the graduation rate. The following is the
scatter plot of the data with the fitted regression line.

[Scatter plot: Graduation Rate vs. In-State Tuition, with the fitted regression line.]

(b) Fitting the data by the least squares method we obtain the linear regression model
ŷ = 18.6887 + 0.0043x.
(c) We have R² = 0.1618, meaning only 16.18% of the variation in y is explained by the
variable x. Thus, from the small R² and the scatter plot above we suggest that the least
squares line is not a good model and must be improved.

EXERCISES 14.6
14.6.1.

(a) The normal probability plot of the data is given below. From the normal plot we can
see that the data significantly deviates from the normal line. Hence, we cannot assume
the data is normally distributed, and a nonparametric test is more appropriate.

[Normal Q-Q plot of the original data.]


(b) Take the transformation y = ln(x) and look at the normal probability plot of the
transformed data below.

[Normal Q-Q plot of the log-transformed data.]

We see that the transformed data does not deviate much from the normal line. Thus, a
parametric test can be used on the log-transformed data.

EXERCISES 14.7
14.7.1.

(a) Let X = total revenue and Y = pupils per teacher. The following is the dot plot of
the pupils-per-teacher data.

[Dot plot of pupils per teacher, ranging from about 14 to 20.]

The descriptive statistics of the pupils-per-teacher data are given below.

n    Mean     Std     Min   Q1      Median  Q3      Max
16   16.6625  2.0063  14.2  14.975  16.25   17.525  20.2

(b) The boxplot of the pupils-per-teacher data is given below. From the boxplot we see
that no outlier exists.

[Boxplot of pupils per teacher.]

The following is the normal probability plot of the pupils-per-teacher data. The data is
not normal.

[Normal Q-Q plot of pupils per teacher.]

The following normal plot, obtained by taking the transformation z = y⁻², shows that the
transformed data becomes approximately normal.

[Normal Q-Q plot of the transformed data z = y⁻², with sample quantiles between about
0.0025 and 0.0045.]

(c) A 95% confidence interval for the mean number of pupils per teacher is

ȳ ± t₀.₉₇₅(n − 1) · s/√n = 16.6625 ± 2.1315 × 2.0063/√16 = (15.59, 17.73).

(d) The following is the scatter plot of pupils per teacher vs. total revenue with the
fitted regression line.

[Scatter plot: Pupils per Teacher vs. Total Revenue (about 5.0e+06 to 2.5e+07), with the
fitted regression line.]


(e) Fitting the data we obtain the linear regression model ŷ = 17.06 − 5.512×10⁻⁸ x.
However, we have R² = 0.0324, meaning only 3.24% of the variation in y is explained by
the variable x. Thus, the regression model is not a good representation of the
relationship between total revenue and pupils per teacher.
14.7.3.

Let X = the in-state tuition and Y = the graduation rate. The following is the scatter
plot of graduation rate vs. in-state tuition with the fitted regression line.

[Scatter plot: Graduation Rate vs. In-State Tuition, with the fitted regression line.]

Fitting the data we then obtain the linear regression model ŷ = 18.6887 + 0.0043x with
R² = 0.1618.

To run residual model diagnostics we look at the following three plots.

[Residuals vs. observation order.]

[Residuals vs. fitted values (about 30 to 50).]

[Normal Q-Q plot of the standardized residuals.]

There is nothing unusual about the residual plots. Therefore, the basic assumptions in
regression analysis for the errors (independence, normality and homogeneity of variances)
have been checked, and there seems to be no reason to reject these assumptions.

