Department of Statistics
National Chengchi University
Taipei 11605, Taiwan
E-mail: chengt@nccu.edu.tw
Outline
• 2.1 Assembling the research design
• 2.2 How to randomize
• 2.3 Preparation of data files for the analysis
• 2.4 A statistical model for the experiment
• 2.5 Estimation of the model parameters with least squares
• 2.6 Sums of squares to identify important sources of variation
• 2.7 A treatment effects model
• 2.8 Degrees of freedom
• 2.9 Summaries in the analysis of variance table
• 2.10 Tests of hypotheses about linear models
• 2.11 Significance testing and tests of hypotheses
• 2.12 Standard errors and confidence intervals for treatment
means
• 2.13 Unequal replication of the treatments
• 2.14 How many replications of the F test?
• 2A.1 Appendix: Expected values
• 2A.2 Appendix: Expected mean squares
Objectives
• Research hypothesis
• Treatment design
• Experiment design or observational study design
Research design for a study
• Research hypothesis
• Treatment design
• Experiment design or observational study design
which leads to the constraint $\sum_{i=1}^{t} r_i\tau_i = 0$.
Remarks
• For the balanced case, $r_i = r$, $i = 1, \cdots, t$, then
$$\mu = \bar{\mu}_\cdot = \sum_{i=1}^{t} r\mu_i \Big/ \sum_{i=1}^{t} r = \sum_{i=1}^{t} \mu_i / t, \qquad \tau_i = \mu_i - \bar{\mu}_\cdot, \qquad \text{and} \qquad \sum_{i=1}^{t} \tau_i = 0.$$
• Eqs (2.1) and (2.13) are special cases of the general linear
model :
y = β0 + β1 x1 + β2 x2 + · · · + βk xk + ε
• Hence, these models can be written in vector notation as
$$Y = X\beta + \vec{\varepsilon}$$
• Without intercept:
  • β0 = 0, β1 = µ1, · · · , βt = µt
  • xi = I(belonging to the i-th treatment group), i = 1, · · · , t.
• With intercept:
  • β0 ≠ 0
  • Let xi = I(belonging to the i-th treatment group), i = 1, · · · , t − 1,
  • then β0 = µt, βi = µi − µt.
• How to write (2.13) as (2.2)?
Statistical hypothesis
• H0 : µ1 = · · · = µt
  Ha : µi ≠ µk for some i, k
• or
  H0 : τi = 0, i = 1, · · · , t
  Ha : τi ≠ 0 for some i
• Remarks:
  • H0 is associated with the reduced model: yij = µ + εij.
  • Ha is associated with the full model: yij = µi + εij or yij = µ + τi + εij.
  • Covariates measured prior to the experiment may be included in the model:
    yij = µi + βxij + εij or yij = µ + τi + βxij + εij
  • See Chapter 17, Analysis of Covariance.
2.5+2.6+2.13 Estimating parameters with least
squares and SS
For the full model,
$$Q = \sum_{i=1}^{t}\sum_{j=1}^{r_i} \hat{\varepsilon}_{ij}^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \hat{\mu}_i)^2$$
For the reduced model,
$$Q = \sum_{i=1}^{t}\sum_{j=1}^{r_i} \hat{\varepsilon}_{ij}^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \hat{\mu})^2$$
$$\frac{\partial Q}{\partial \mu} = \frac{\partial}{\partial \mu}\sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \hat{\mu})^2 = -2\sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \hat{\mu}) = 0$$
$$\hat{\mu} = \frac{1}{N}\sum_{i=1}^{t}\sum_{j=1}^{r_i} y_{ij} = \bar{y}_{..}$$
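As a numerical sanity check on this derivation, one can verify that the grand mean minimizes the reduced-model criterion Q; a minimal Python sketch with a small illustrative pooled sample (any data would do):

```python
# Reduced-model least squares criterion Q(m) = sum_ij (y_ij - m)^2.
# Numerical check that Q is minimized at the grand mean ybar..
y = [7.66, 6.98, 7.80, 5.26, 5.44, 5.80]  # illustrative pooled sample
ybar = sum(y) / len(y)

def Q(m):
    return sum((v - m) ** 2 for v in y)

# Q at the grand mean is no larger than at nearby candidate values
assert all(Q(ybar) <= Q(ybar + d) for d in (-0.5, -0.1, 0.1, 0.5))
print(round(ybar, 4))  # -> 6.49
```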
Estimator for reduced model (continued)
• Hence, for the reduced model,
$$SSE_r = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \hat{\mu})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{..})^2$$
• For the full model,
$$SSE = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \hat{\mu}_i)^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^{t} (r_i - 1)s_i^2$$
• Therefore,
$$SSE_r - SSE = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{..})^2 - \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{i.})^2$$
Partition of the total sum of squares (continued)
$$\sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} \left[(y_{ij} - \bar{y}_{i.}) + (\bar{y}_{i.} - \bar{y}_{..})\right]^2$$
The cross-product term vanishes, so
$$\sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (\bar{y}_{i.} - \bar{y}_{..})^2 + \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{i.})^2$$
$$SST = SSTR + SSE$$
Notes
• The sum of squares between the treatment means and the grand mean, the so-called treatment sum of squares, is denoted by
$$SSTR = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (\bar{y}_{i.} - \bar{y}_{..})^2 = \sum_{i=1}^{t} r_i(\bar{y}_{i.} - \bar{y}_{..})^2 = \sum_{i=1}^{t} r_i\hat{\tau}_i^2$$
• SST = SSTR + SSE ⟹ SSE = SST − SSTR.
• Although the textbook uses SST to denote the treatment sum of squares, that usage is quite unconventional.
• So in these notes we keep the conventional terminology: "SST" means the total sum of squares and "SSTR" means the treatment sum of squares.
• The textbook writes the total number of observations as nT; these notes use N.
Notes (continued)
                             Textbook               Notes
Treatment sum of squares     SST or SS Treatment    SSTR
Total sum of squares         SSEr or SS Total       SST
Error terms of model         e                      ε
Estimating treatment effects
τi = µi − µ̄·
τ̂i = µ̂i − µ̂ = ȳi . − ȳ..
yij = µ + τi + εij
I. Degrees of freedom for SST
• $SST = SSE_r = \sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{..})^2$, which is composed of the $(y_{ij} - \bar{y}_{..})$'s.
• But $\sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{..}) = 0$, which means that the $(y_{ij} - \bar{y}_{..})$'s are linearly dependent, and any one of them is the negative of the sum of the other $(N - 1)$ values.
• Hence, $df(SST) = N - 1$.
• Alternatively, $\bar{y}_{..}$ is used to estimate the grand mean $\mu$ (i.e. $\bar{\mu}_\cdot$), and 1 df is lost due to the estimation of the unknown parameter.
• $df(SST)$ = (sample size) − (# of parameters to be estimated) $= N - 1$.
II. Degrees of freedom for SSTR
• $SSTR = \sum_{i=1}^{t} r(\bar{y}_{i.} - \bar{y}_{..})^2$ is composed of the $(\bar{y}_{i.} - \bar{y}_{..})$'s.
• But $\sum_{i=1}^{t} (\bar{y}_{i.} - \bar{y}_{..}) = 0$, which means that the $(\bar{y}_{i.} - \bar{y}_{..})$'s are linearly dependent, and any one of them is the negative of the sum of the other $(t - 1)$ values.
• Hence, $df(SSTR) = t - 1$.
• Equivalently, the treatment effects are $\tau_1, \cdots, \tau_t$ with $\sum_{i=1}^{t} \tau_i = 0$, so only $t - 1$ of them need to be estimated.
• Hence, $df(SSTR) = t - 1$.
III. Degrees of freedom for SSE
• $SSE = \sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.})^2$ is composed of the $(y_{ij} - \bar{y}_{i.})$'s.
• But for each $i$, $\sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.}) = 0 \Longrightarrow r - 1$ degrees of freedom, and this holds for each of $i = 1, \cdots, t$.
• Hence, $df(SSE) = t \times (r - 1) = N - t$.
• Since the full model has $t$ unknown parameters $\mu_1, \cdots, \mu_t$ to be estimated,
  • $H_0$ : $\mu_1 = \cdots = \mu_t$ ⟹ Reduced model
  • $H_a$ : the $\mu_j$'s are not all equal ⟹ Full model
• Hence, the test statistic is
$$F = \frac{(SSE_r - SSE)/(t - 1)}{SSE/(N - t)} = \frac{MSTR}{MSE}$$
• (Cochran's theorem) Suppose
$$\sum_{i=1}^{\nu} Z_i^2 = Q_1 + Q_2 + \cdots + Q_s$$
with $s \le \nu$. Then $Q_1, Q_2, \cdots, Q_s$ are independent $\chi^2$ random variables with $\nu_1, \nu_2, \cdots, \nu_s$ d.f., respectively, if and only if $\nu = \nu_1 + \nu_2 + \cdots + \nu_s$.
Why F
Example 2.1
Storage method
Observation      A         B         C         D
1               7.66      5.26      7.41      3.51
2               6.98      5.44      7.33      2.91
3               7.80      5.80      7.04      3.66
y_i.           22.440    16.500    21.780    10.080
• Factor: Storage Method
• 4 treatments, t = 4.
• r1 = · · · = r4 = 3, N = 4 × 3 = 12.
Example 2.1 (continued)
$$SST = \sum_{i=1}^{4}\sum_{j=1}^{3} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{4}\sum_{j=1}^{3} y_{ij}^2 - \frac{y_{..}^2}{N} = 451.5196 - \frac{70.8^2}{12} = 33.7996$$
$$SSTR = \sum_{i=1}^{4}\sum_{j=1}^{3} (\bar{y}_{i.} - \bar{y}_{..})^2 = \sum_{i=1}^{4} 3(\bar{y}_{i.} - \bar{y}_{..})^2 = \sum_{i=1}^{4} \frac{y_{i.}^2}{3} - \frac{y_{..}^2}{N} = \frac{1351.778}{3} - 417.72 = 32.8728$$
• Note: Also, $SSTR = \sum_{i=1}^{4} 3\hat{\tau}_i^2$.
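These hand calculations are easy to verify; a small Python sketch (not part of the course's SAS workflow) recomputing the sums of squares from the Example 2.1 data:

```python
# Example 2.1 data (storage methods A-D, r = 3 observations each)
data = {
    "A": [7.66, 6.98, 7.80],
    "B": [5.26, 5.44, 5.80],
    "C": [7.41, 7.33, 7.04],
    "D": [3.51, 2.91, 3.66],
}
all_y = [v for ys in data.values() for v in ys]
N = len(all_y)              # 12
grand = sum(all_y) / N      # ybar.. = 70.8/12 = 5.9

# Total, treatment, and error sums of squares
sst = sum((v - grand) ** 2 for v in all_y)
sstr = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in data.values())
sse = sst - sstr

print(round(sst, 4), round(sstr, 4), round(sse, 4))  # -> 33.7996 32.8728 0.9268
```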
Example 2.1 (continued)
• H0 : µ1 = · · · = µ4
• Ha : µj ≠ µl for some j, l
• Test statistic: F = MSTR/MSE = (32.8728/3)/(0.9268/8) ≈ 94.58
• SAS proc glm solution output:

Parameter                  Estimate         Standard Error   t Value   Pr > |t|
Intercept  (= ȳ4.)         3.360000000 B    0.19651124       17.10     <.0001
method 1   (= ȳ1. − ȳ4.)   4.120000000 B    0.27790886       14.83     <.0001
method 2   (= ȳ2. − ȳ4.)   2.140000000 B    0.27790886        7.70     <.0001
method 3   (= ȳ3. − ȳ4.)   3.900000000 B    0.27790886       14.03     <.0001
method 4                   0.000000000 B    .                 .         .

NOTE: The X'X matrix has been found to be singular, and a generalized inverse was
used to solve the normal equations. Terms whose estimates are followed by the
letter 'B' are not uniquely estimable.

Least Squares Means
method    y LSMEAN
1         7.48000000  (= ȳ1.)
2         5.50000000  (= ȳ2.)
3         7.26000000  (= ȳ3.)
4         3.36000000  (= ȳ4.)
Example 2.2 Detecting the onset of Phlebitis
• CRD
• EU: rabbit
• Treatments: 3 intravenous treatments
• a vehicle solution + amiodarone
• a vehicle solution (as control)
• a saline solution (as placebo)
• Response: ear temperature difference
• Complications with the experiment protocol resulted in an
unbalanced design.
Example 2.2 – SAS

options nocenter nodate nonumber;
data phlebitis;
  input method y @@;
  datalines;
1 2.2 1 1.6 1 0.8 1 1.8 1 1.4 1 0.4 1 0.6 1 1.5 1 0.5
2 0.3 2 0.0 2 0.6 2 0.0 2 -0.3 2 0.2
3 0.1 3 0.1 3 0.2 3 -0.4 3 0.3 3 0.1 3 0.1 3 -0.5
;
proc glm;
  class method;
  model y = method / solution e;
  lsmeans method;
run;
Example 2.2 (continued)-Output
Dependent Variable: y
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 2 7.21623188 3.60811594 16.58 <.0001
Error 20 4.35333333 0.21766667
Corrected Total 22 11.56956522
Standard
Parameter Estimate Error t Value Pr > |t|
Intercept y3• -0.000000000 B 0.16494949 -0.00 1.0000
method 1 y1• − y3• 1.200000000 B 0.22670139 5.29 <.0001
method 2 y2• − y3• 0.133333333 B 0.25196450 0.53 0.6025
method 3 0.000000000 B . . .
NOTE: The X'X matrix has been found to be singular, and a generalized inverse was
used to solve the normal equations. Terms whose estimates are followed by the
letter 'B' are not uniquely estimable.
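The ANOVA table above can be reproduced from the raw data in the SAS step; a minimal Python sketch of the unbalanced CRD computations:

```python
# Example 2.2 data keyed by treatment (method), from the SAS datalines
groups = {
    1: [2.2, 1.6, 0.8, 1.8, 1.4, 0.4, 0.6, 1.5, 0.5],
    2: [0.3, 0.0, 0.6, 0.0, -0.3, 0.2],
    3: [0.1, 0.1, 0.2, -0.4, 0.3, 0.1, 0.1, -0.5],
}
all_y = [v for ys in groups.values() for v in ys]
N, t = len(all_y), len(groups)      # N = 23, t = 3
grand = sum(all_y) / N

# Treatment (model) and error sums of squares for the unbalanced CRD
sstr = sum(len(ys) * (sum(ys) / len(ys) - grand) ** 2 for ys in groups.values())
sse = sum((v - sum(ys) / len(ys)) ** 2 for ys in groups.values() for v in ys)

f_stat = (sstr / (t - 1)) / (sse / (N - t))
print(round(sstr, 4), round(sse, 4), round(f_stat, 2))  # -> 7.2162 4.3533 16.58
```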
Computations
• $y_{..} = \sum_{i=1}^{t}\sum_{j=1}^{r_i} y_{ij}$
• $y_{i.} = \sum_{j=1}^{r_i} y_{ij}$
• $SST = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} y_{ij}^2 - \dfrac{y_{..}^2}{N}$
• $SSTR = \sum_{i=1}^{t} r_i(\bar{y}_{i.} - \bar{y}_{..})^2 = \sum_{i=1}^{t} r_i\hat{\tau}_i^2 = \sum_{i=1}^{t} \dfrac{y_{i.}^2}{r_i} - \dfrac{y_{..}^2}{N}$
• $SSE = \sum_{i=1}^{t}\sum_{j=1}^{r_i} (y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r_i} y_{ij}^2 - \sum_{i=1}^{t} \dfrac{y_{i.}^2}{r_i} = SST - SSTR$
• Note: For a balanced design ($r_1 = r_2 = \cdots = r_t = r$),
  • $SST = \sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{..})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}^2 - \dfrac{y_{..}^2}{rt}$
  • $SSTR = \sum_{i=1}^{t} r(\bar{y}_{i.} - \bar{y}_{..})^2 = r\sum_{i=1}^{t} \hat{\tau}_i^2 = \sum_{i=1}^{t} \dfrac{y_{i.}^2}{r} - \dfrac{y_{..}^2}{rt}$
  • $SSE = \sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.})^2 = \sum_{i=1}^{t}\sum_{j=1}^{r} y_{ij}^2 - \sum_{i=1}^{t} \dfrac{y_{i.}^2}{r} = SST - SSTR$
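The shortcut formulas above translate directly into code; a sketch for an arbitrary (possibly unbalanced) layout, checked here against the Example 2.2 data:

```python
def crd_ss(groups):
    """Sums of squares for a one-way CRD via the shortcut formulas.

    groups: list of lists, one inner list of observations per treatment.
    Returns (SST, SSTR, SSE), with SSE obtained as SST - SSTR.
    """
    N = sum(len(g) for g in groups)
    y_dotdot = sum(sum(g) for g in groups)           # y..
    sum_sq = sum(v * v for g in groups for v in g)   # sum of y_ij^2
    sst = sum_sq - y_dotdot ** 2 / N
    sstr = sum(sum(g) ** 2 / len(g) for g in groups) - y_dotdot ** 2 / N
    return sst, sstr, sst - sstr

# Check against Example 2.2 (unbalanced: r1 = 9, r2 = 6, r3 = 8)
sst, sstr, sse = crd_ss([
    [2.2, 1.6, 0.8, 1.8, 1.4, 0.4, 0.6, 1.5, 0.5],
    [0.3, 0.0, 0.6, 0.0, -0.3, 0.2],
    [0.1, 0.1, 0.2, -0.4, 0.3, 0.1, 0.1, -0.5],
])
print(round(sst, 4), round(sstr, 4), round(sse, 4))  # -> 11.5696 7.2162 4.3533
```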
Remarks
• If $r_1 = r_2 = \cdots = r_t = r$, then
$$MSTR = \frac{\sum_{i=1}^{t} r(\bar{y}_{i.} - \bar{y}_{..})^2}{t - 1} = rS_{\bar{y}}^2$$
• If $r_1 = r_2 = \cdots = r_t = r$, then $N = rt$, $N - t = t(r - 1)$, and
$$MSE = \frac{\sum_{i=1}^{t}\sum_{j=1}^{r} (y_{ij} - \bar{y}_{i.})^2}{N - t} = \frac{\sum_{i=1}^{t} (r - 1)S_i^2}{N - t} = \frac{(r - 1)\sum_{i=1}^{t} S_i^2}{t(r - 1)} = \frac{1}{t}\sum_{i=1}^{t} S_i^2,$$
$$Y = X\beta + \vec{\varepsilon}$$
I. The estimator $\hat{\beta}$ of $Y = X\beta + \vec{\varepsilon}$
• Without the restriction $\sum_{i=1}^{t} r_i\tau_i = 0$.
• Then
$$Y = \begin{pmatrix} y_{11} \\ \vdots \\ y_{1r_1} \\ \vdots \\ y_{t1} \\ \vdots \\ y_{t,r_t} \end{pmatrix}_{N\times 1}, \qquad X = \begin{pmatrix} \mathbf{1}_{r_1} & \mathbf{1}_{r_1} & \mathbf{0}_{r_1} & \cdots & \mathbf{0}_{r_1} \\ \mathbf{1}_{r_2} & \mathbf{0}_{r_2} & \mathbf{1}_{r_2} & \cdots & \mathbf{0}_{r_2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \mathbf{1}_{r_t} & \mathbf{0}_{r_t} & \mathbf{0}_{r_t} & \cdots & \mathbf{1}_{r_t} \end{pmatrix}_{N\times(t+1)}$$
I (continued)
$$\beta = \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \vdots \\ \tau_{t-1} \\ \tau_t \end{pmatrix}_{(t+1)\times 1}$$
• $X'y = \left( y_{..},\; y_{1.},\; y_{2.},\; \cdots,\; y_{t-1,.},\; y_{t.} \right)'_{(t+1)\times 1}$
• Hence, $\hat{\beta} = (X'X)^{-}X'y$, where $(X'X)^{-}$ is a generalized inverse of $X'X$.
Case 1
• Let
$$(X'X)^{-} = \begin{pmatrix} 0 & 0 & 0 & \cdots & 0 \\ 0 & 1/r_1 & 0 & \cdots & 0 \\ 0 & 0 & 1/r_2 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1/r_t \end{pmatrix}$$
• Then
$$\hat{\beta} = (X'X)^{-}X'y = \begin{pmatrix} 0 \\ \bar{y}_{1.} \\ \bar{y}_{2.} \\ \vdots \\ \bar{y}_{t-1,.} \\ \bar{y}_{t.} \end{pmatrix} = \begin{pmatrix} 0 \\ \hat{\mu}+\hat{\tau}_1 \\ \hat{\mu}+\hat{\tau}_2 \\ \vdots \\ \hat{\mu}+\hat{\tau}_{t-1} \\ \hat{\mu}+\hat{\tau}_t \end{pmatrix} \quad (**)$$
• Notice that the $\hat{\beta}$ derived in Case 1 is a biased estimator, because
$$E(\hat{\beta}) = \begin{pmatrix} 0 \\ \mu+\tau_1 \\ \mu+\tau_2 \\ \vdots \\ \mu+\tau_{t-1} \\ \mu+\tau_t \end{pmatrix} \ne \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \vdots \\ \tau_{t-1} \\ \tau_t \end{pmatrix} = \beta$$
Case 2
• Let
$$(X'X)^{-} = \begin{pmatrix} (X_1'X_1)^{-1} & 0 \\ 0 & 0 \end{pmatrix}$$
be another generalized inverse of $X'X$.
• Then
$$\hat{\beta} = (X'X)^{-}X'y = \begin{pmatrix} \bar{y}_{t.} \\ \bar{y}_{1.}-\bar{y}_{t.} \\ \bar{y}_{2.}-\bar{y}_{t.} \\ \vdots \\ \bar{y}_{t-1,.}-\bar{y}_{t.} \\ 0 \end{pmatrix} = \begin{pmatrix} \hat{\mu}+\hat{\tau}_t \\ \hat{\tau}_1-\hat{\tau}_t \\ \hat{\tau}_2-\hat{\tau}_t \\ \vdots \\ \hat{\tau}_{t-1}-\hat{\tau}_t \\ 0 \end{pmatrix} \quad (***)$$
• $E(\hat{\beta}) = \begin{pmatrix} \mu+\tau_t \\ \tau_1-\tau_t \\ \tau_2-\tau_t \\ \vdots \\ \tau_{t-1}-\tau_t \\ 0 \end{pmatrix} \ne \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \vdots \\ \tau_{t-1} \\ \tau_t \end{pmatrix} = \beta$,
which is also biased.
Example 2.2
$$X'X = \begin{pmatrix} 23 & 9 & 6 & 8 \\ 9 & 9 & 0 & 0 \\ 6 & 0 & 6 & 0 \\ 8 & 0 & 0 & 8 \end{pmatrix}, \qquad X'y = \begin{pmatrix} 11.6 \\ 10.8 \\ 0.8 \\ 0 \end{pmatrix}$$
• Case 1) Let
$$(X'X)^{-} = \begin{pmatrix} 23 & 9 & 6 & 8 \\ 9 & 9 & 0 & 0 \\ 6 & 0 & 6 & 0 \\ 8 & 0 & 0 & 8 \end{pmatrix}^{-} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 1/9 & 0 & 0 \\ 0 & 0 & 1/6 & 0 \\ 0 & 0 & 0 & 1/8 \end{pmatrix}$$
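It can be checked numerically that this generalized-inverse solution satisfies the (singular) normal equations X'Xβ̂ = X'y; a small Python sketch using the Example 2.2 matrices above:

```python
# Example 2.2: X'X, X'y, and the Case-1 generalized inverse (plain lists)
xtx = [[23, 9, 6, 8],
       [ 9, 9, 0, 0],
       [ 6, 0, 6, 0],
       [ 8, 0, 0, 8]]
xty = [11.6, 10.8, 0.8, 0.0]
ginv = [[0, 0,   0,   0],
        [0, 1/9, 0,   0],
        [0, 0,   1/6, 0],
        [0, 0,   0,   1/8]]

def matvec(A, v):
    return [sum(a * b for a, b in zip(row, v)) for row in A]

beta_hat = matvec(ginv, xty)   # (0, ybar1., ybar2., ybar3.)
print([round(b, 4) for b in beta_hat])  # -> [0.0, 1.2, 0.1333, 0.0]

# beta_hat solves the normal equations even though X'X is singular
check = matvec(xtx, beta_hat)
assert all(abs(c - y) < 1e-9 for c, y in zip(check, xty))
```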
II. The estimator $\hat{\beta}$ with the restriction $\sum_{i=1}^{t} r_i\tau_i = 0$
• Under the restriction, $\tau_t = -\sum_{i=1}^{t-1} r_i\tau_i / r_t$, so
$$\beta = \begin{pmatrix} \mu \\ \tau_1 \\ \tau_2 \\ \vdots \\ \tau_{t-1} \end{pmatrix}_{t\times 1}$$
II (continued)
$$X'X = \begin{pmatrix} N & 0 & 0 & \cdots & 0 \\ 0 & r_1(1+r_1/r_t) & r_1r_2/r_t & \cdots & r_1r_{t-1}/r_t \\ 0 & r_1r_2/r_t & r_2(1+r_2/r_t) & \cdots & r_2r_{t-1}/r_t \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & r_1r_{t-1}/r_t & r_2r_{t-1}/r_t & \cdots & r_{t-1}(1+r_{t-1}/r_t) \end{pmatrix}_{t\times t}$$
$$X'y = \begin{pmatrix} y_{..} \\ r_1(\bar{y}_{1.}-\bar{y}_{t.}) \\ r_2(\bar{y}_{2.}-\bar{y}_{t.}) \\ \vdots \\ r_{t-1}(\bar{y}_{t-1,.}-\bar{y}_{t.}) \end{pmatrix}_{t\times 1}$$
II (continued)
• Note that the normal equations (or least squares equations) are $X'X\hat{\beta} = X'y$.
• Solving the first equation: $N\hat{\mu} = y_{..} \Longrightarrow \hat{\mu} = \bar{y}_{..}$
• Since $\sum_{i=1}^{t} r_i\tau_i = 0$, we obtain $\hat{\tau}_i = \bar{y}_{i.} - \bar{y}_{..}$, $i = 1, \ldots, t$.
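Applying these formulas to Example 2.2 (group totals 10.8, 0.8, 0.0 with r = 9, 6, 8) shows that the weighted restriction holds exactly; a quick sketch:

```python
# Example 2.2 group totals and sizes
totals = {1: 10.8, 2: 0.8, 3: 0.0}
sizes  = {1: 9,    2: 6,   3: 8}
N = sum(sizes.values())                       # 23
mu_hat = sum(totals.values()) / N             # ybar.. = 11.6/23
tau_hat = {i: totals[i] / sizes[i] - mu_hat for i in totals}

# Restriction sum_i r_i * tau_hat_i = 0 holds by construction
assert abs(sum(sizes[i] * tau_hat[i] for i in totals)) < 1e-9
print(round(mu_hat, 4), {i: round(v, 4) for i, v in tau_hat.items()})
```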
’Sample size’
Obs t df1 r df2 cv ncp power1
1 3 2 2 3 9.55209 3.4545 0.16643
2 3 2 3 6 5.14325 5.1818 0.33545
3 3 2 4 9 4.25649 6.9091 0.49701
4 3 2 5 12 3.88529 8.6364 0.63388
5 3 2 6 15 3.68232 10.3636 0.74186
6 3 2 7 18 3.55456 12.0909 0.82281
7 3 2 8 21 3.46680 13.8182 0.88113
8 3 2 9 24 3.40283 15.5455 0.92184
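The ncp column above follows the usual one-way ANOVA noncentrality formula ncp = r·Σᵢτᵢ²/σ² for the planned means 0.8, 0.1, 0.0; a sketch reproducing it (σ² = 0.22 is an assumption inferred from the tabled values, not stated in the output):

```python
# Planned treatment means and assumed error variance (sigma^2 = 0.22 is
# an assumption reverse-engineered from the tabled ncp values)
means = [0.8, 0.1, 0.0]
sigma2 = 0.22
grand = sum(means) / len(means)                   # 0.3
sum_tau2 = sum((m - grand) ** 2 for m in means)   # 0.38

# Noncentrality parameter ncp = r * sum(tau_i^2) / sigma^2 for r = 2..9
ncp = {r: round(r * sum_tau2 / sigma2, 4) for r in range(2, 10)}
print(ncp)
```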
Example 2.3 - SAS
proc power;
  onewayanova groupmeans = 0.8 | 0.1 | 0.0
    stddev = 0.469
    alpha = 0.05
    npergroup = .
    power = 0.9;
run;
Example 2.3 - Output