No analysis is done for any subfile group for which the number of non-empty
groups is less than 2 or the number of cases or sum of weights fails to exceed the
number of non-empty groups. An analysis may be stopped if no variables are
selected during variable selection or the eigenanalysis fails.
Notation
The following notation is used throughout this chapter unless otherwise stated:
g       Number of groups
p       Number of variables
q       Number of variables selected
$X_{ijk}$   Value of variable i for case k in group j
$f_{jk}$    Case weight for case k in group j
$m_j$   Number of cases in group j
$n_j$   Sum of case weights in group j
$n$     Total sum of case weights over all groups
Basic Statistics
Mean
$$\bar X_{ij} = \frac{\displaystyle\sum_{k=1}^{m_j} f_{jk} X_{ijk}}{n_j} \qquad \text{(variable $i$ in group $j$)}$$

$$\bar X_{i\cdot} = \frac{\displaystyle\sum_{j=1}^{g}\sum_{k=1}^{m_j} f_{jk} X_{ijk}}{n} \qquad \text{(variable $i$)}$$
Variances
$$S_{ij}^2 = \frac{\displaystyle\sum_{k=1}^{m_j} f_{jk} X_{ijk}^2 - n_j \bar X_{ij}^2}{n_j - 1} \qquad \text{(variable $i$ in group $j$)}$$

$$S_{i\cdot}^2 = \frac{\displaystyle\sum_{j=1}^{g}\sum_{k=1}^{m_j} f_{jk} X_{ijk}^2 - n \bar X_{i\cdot}^2}{n - 1} \qquad \text{(variable $i$)}$$
Within-Groups Sums of Squares and Cross-Products

$$w_{il} = \sum_{j=1}^{g}\sum_{k=1}^{m_j} f_{jk} X_{ijk} X_{ljk} - \sum_{j=1}^{g}\frac{\left(\sum_{k=1}^{m_j} f_{jk} X_{ijk}\right)\left(\sum_{k=1}^{m_j} f_{jk} X_{ljk}\right)}{n_j} \qquad i, l = 1, \ldots, p$$

Total Sums of Squares and Cross-Products

$$t_{il} = \sum_{j=1}^{g}\sum_{k=1}^{m_j} f_{jk} X_{ijk} X_{ljk} - \frac{\left(\sum_{j=1}^{g}\sum_{k=1}^{m_j} f_{jk} X_{ijk}\right)\left(\sum_{j=1}^{g}\sum_{k=1}^{m_j} f_{jk} X_{ljk}\right)}{n}$$

Within-Groups Covariance Matrix

$$C = \frac{W}{n-g} \qquad (n > g)$$
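As an illustrative sketch (not SPSS code), the weighted cross-product matrices above can be computed in NumPy; the data, group labels, and case weights here are hypothetical.

```python
import numpy as np

# Hypothetical data: 4 cases, 2 variables, 2 groups, unit case weights.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y = np.array([0, 0, 1, 1])           # group membership
f = np.array([1.0, 1.0, 1.0, 1.0])   # case weights f_jk

n = f.sum()                          # total sum of case weights
groups = np.unique(y)
g = len(groups)

# Total sums of squares and cross-products t_il
T = (f[:, None] * X).T @ X - np.outer(f @ X, f @ X) / n

# Within-groups sums of squares and cross-products w_il
W = np.zeros_like(T)
for j in groups:
    Xj, fj = X[y == j], f[y == j]
    W += (fj[:, None] * Xj).T @ Xj - np.outer(fj @ Xj, fj @ Xj) / fj.sum()

C = W / (n - g)                      # pooled within-groups covariance, n > g
```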
Individual Group Covariance Matrices

$$c_{il}^{(j)} = \frac{\displaystyle\sum_{k=1}^{m_j} f_{jk} X_{ijk} X_{ljk} - \bar X_{ij}\bar X_{lj}\, n_j}{n_j - 1}$$

Total Covariance Matrix

$$T' = \frac{T}{n-1}$$

Univariate F and Λ for Variable i

$$F_i = \frac{\left(t_{ii} - w_{ii}\right)(n-g)}{w_{ii}\,(g-1)}$$

with $g-1$ and $n-g$ degrees of freedom

$$\Lambda_i = \frac{w_{ii}}{t_{ii}}$$
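With $w_{ii}$ and $t_{ii}$ in hand, the univariate statistics follow directly; a tiny sketch with hypothetical values:

```python
# Univariate F and Wilks' lambda for one variable, from the diagonal
# entries of W and T (hypothetical values; df are g-1 and n-g).
w_ii, t_ii = 1.0, 5.0   # within- and total sums of squares for the variable
n, g = 4, 2             # total sum of case weights, number of groups

F_i = (t_ii - w_ii) * (n - g) / (w_ii * (g - 1))
lambda_i = w_ii / t_ii
```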
Method = Direct
For direct variable selection, variables are considered for inclusion in the order in
which they are written on the ANALYSIS = list. A variable is included in the
analysis if, when it is included, no variable in the analysis will have a tolerance less
than the specified tolerance limit (default = 0.001).
Method = Stepwise
• The order of entry of eligible variables with the same even inclusion level is determined by their order on the ANALYSIS = specification.
• The order of entry of eligible variables with the same odd level of inclusion is determined by their value on the entry criterion. The variable with the “best” value for the criterion statistic is entered first.
A variable with an odd inclusion number is ineligible for entry if:
• Its F-to-enter is less than the F-to-enter value, or
• If probability criteria are used, the significance level associated with its F-to-enter exceeds the probability to enter.
A variable with an even inclusion number is ineligible for entry if the first condition above is met.
When q variables are in the analysis, W is partitioned as

$$W = \begin{bmatrix} W_{11} & W_{12} \\ W_{21} & W_{22} \end{bmatrix}$$

where $W_{11}$ is $q \times q$. At this stage, the matrix $W^*$ is defined by

$$W^* = \begin{bmatrix} -W_{11}^{-1} & W_{11}^{-1}W_{12} \\ W_{21}W_{11}^{-1} & W_{22} - W_{21}W_{11}^{-1}W_{12} \end{bmatrix} = \begin{bmatrix} W_{11}^* & W_{12}^* \\ W_{21}^* & W_{22}^* \end{bmatrix}$$

In addition, when stepwise variable selection is used, $T$ is replaced by the matrix $T^*$, defined similarly.
Tolerance
$$\mathrm{TOL}_i = \begin{cases} 0 & \text{if } w_{ii} = 0 \\[1ex] w_{ii}^* / w_{ii} & \text{if variable $i$ is not in the analysis and } w_{ii} \neq 0 \\[1ex] -1 / \left(w_{ii}^*\, w_{ii}\right) & \text{if variable $i$ is in the analysis and } w_{ii} \neq 0 \end{cases}$$
If a variable’s tolerance is less than or equal to the specified tolerance limit, or its
inclusion in the analysis would reduce the tolerance of another variable in the
equation to or below the limit, the following statistics are not computed for it or
any set including it.
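The $W^*$ blocks and the tolerance above can be produced with a symmetric sweep operator; the following is a minimal sketch under that interpretation (the `sweep` helper and the example matrix are hypothetical, not part of SPSS).

```python
import numpy as np

def sweep(A, i):
    """Sweep pivot i of a symmetric matrix, producing the W* pattern above."""
    A = np.asarray(A, dtype=float)
    d = A[i, i]
    out = A - np.outer(A[:, i], A[i, :]) / d   # A_kl - A_ki * A_il / A_ii
    out[i, :] = A[i, :] / d
    out[:, i] = A[:, i] / d
    out[i, i] = -1.0 / d
    return out

W = np.array([[4.0, 2.0], [2.0, 3.0]])   # hypothetical W, variable 0 entered
W_star = sweep(W, 0)

# Tolerance of variable 1, which is not yet in the analysis: w*_11 / w_11
tol_1 = W_star[1, 1] / W[1, 1]
```

Sweeping each entered variable in turn yields the same blocks as the partitioned formula for $W^*$.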
F-to-Remove
$$F_i = \frac{\left(w_{ii}^* - t_{ii}^*\right)\left(n - q - g + 1\right)}{t_{ii}^*\,(g-1)}$$

with degrees of freedom $g-1$ and $n-q-g+1$.
F-to-Enter
$$F_i = \frac{\left(t_{ii}^* - w_{ii}^*\right)\left(n - q - g\right)}{w_{ii}^*\,(g-1)}$$

with degrees of freedom $g-1$ and $n-q-g$.
Wilks’ Lambda for Testing the Equality of Group Means

$$\Lambda = \frac{\left|W_{11}\right|}{\left|T_{11}\right|}$$

The approximate F test for Λ (the “overall F”), also known as Rao’s R (Tatsuoka, 1971), is

$$F = \frac{\left(1 - \Lambda^{1/s}\right)\left(rs + 1 - qh/2\right)}{\Lambda^{1/s}\,qh}$$

where

$$s = \begin{cases} \sqrt{\dfrac{q^2h^2 - 4}{q^2 + h^2 - 5}} & \text{if } q^2 + h^2 \neq 5 \\[2ex] 1 & \text{otherwise} \end{cases}$$

$$r = n - 1 - (q+g)/2$$

$$h = g - 1$$
Rao’s V (Lawley-Hotelling Trace)

$$V = -(n-g)\sum_{i=1}^{q}\sum_{l=1}^{q} w_{il}^*\left(t_{il} - w_{il}\right)$$

The Squared Mahalanobis Distance between Groups a and b

$$D_{ab}^2 = -(n-g)\sum_{i=1}^{q}\sum_{l=1}^{q} w_{il}^*\left(\bar X_{ia} - \bar X_{ib}\right)\left(\bar X_{la} - \bar X_{lb}\right)$$
The F Value for Testing the Equality of Means of Groups a and b

$$F_{ab} = \frac{\left(n - q - g + 1\right) n_a n_b\, D_{ab}^2}{q\,(n-g)\left(n_a + n_b\right)}$$

The Sum of Unexplained Variations

$$R = \sum_{a=1}^{g-1}\sum_{b=a+1}^{g} \frac{4}{4 + D_{ab}^2}$$
Classification Functions
Once a set of q variables has been selected, the classification functions (also known
as Fisher’s linear discriminant functions) can be computed using
$$b_{ij} = (n-g)\sum_{l=1}^{q} w_{il}^*\, \bar X_{lj} \qquad i = 1, 2, \ldots, q;\; j = 1, 2, \ldots, g$$

$$a_j = \log p_j - \frac{1}{2}\sum_{i=1}^{q} b_{ij}\, \bar X_{ij} \qquad j = 1, 2, \ldots, g$$

where $p_j$ is the prior probability for group j.
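A small sketch of the classification-function formulas above, taking the $w_{il}^*$ as the elements of the inverse within-groups block $W_{11}^{-1}$; all numeric inputs are hypothetical.

```python
import numpy as np

# Hypothetical inputs for the classification-function formulas.
n, g = 10, 2
Winv = np.array([[0.5, 0.0], [0.0, 0.25]])  # inverse of W11 (stands in for w*_il)
Xbar = np.array([[1.0, 3.0],                # q x g: mean of variable i in group j
                 [2.0, 1.0]])
priors = np.array([0.5, 0.5])               # prior probabilities p_j

B = (n - g) * Winv @ Xbar                   # coefficients b_ij
a = np.log(priors) - 0.5 * (B * Xbar).sum(axis=0)   # constants a_j

# Classify a case x into the group with the largest score a_j + sum_i b_ij x_i
x = np.array([1.0, 2.0])
scores = a + x @ B
group = int(scores.argmax())
```

Here the case equals the group-0 mean, so it is assigned to group 0.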
The canonical discriminant functions are obtained from the eigensystem

$$(T - W)V = \lambda W V$$

Using the decomposition

$$W = LU$$

the problem is reduced to the equivalent system

$$\left(L^{-1}(T - W)U^{-1} - \lambda I\right)(UV) = 0$$

$$V = U^{-1}(UV)$$

For each of the eigenvalues, which are ordered in descending magnitude, the following statistics are calculated:

Percentage of Between-Groups Variance

$$\frac{100\,\lambda_k}{\displaystyle\sum_{k=1}^{m} \lambda_k}$$
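The eigenanalysis can be sketched in NumPy by reducing the problem through a Cholesky factor of W (so that $U = L^T$); the example matrices are hypothetical.

```python
import numpy as np

# Hypothetical W and T for two variables (W must be positive definite).
W = np.array([[1.0, -0.5], [-0.5, 1.0]])
T = np.array([[5.0, 3.0], [3.0, 5.0]])

# Reduce (T - W)V = lambda * W V to a symmetric ordinary eigenproblem.
L = np.linalg.cholesky(W)                 # W = L L^T
Linv = np.linalg.inv(L)
A = Linv @ (T - W) @ Linv.T               # symmetric
lam, Y = np.linalg.eigh(A)                # eigenvalues in ascending order
lam, Y = lam[::-1], Y[:, ::-1]            # reorder to descending magnitude
V = Linv.T @ Y                            # back-transform the eigenvectors

pct = 100.0 * lam / lam.sum()             # percentage of variance per function
```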
Canonical Correlation
$$\sqrt{\frac{\lambda_k}{1 + \lambda_k}}$$
Wilks’ Lambda
Testing the significance of all the discriminating functions after the first k:
$$\Lambda_k = \prod_{i=k+1}^{m} \frac{1}{1 + \lambda_i} \qquad k = 0, 1, \ldots, m-1$$

$$\chi^2 = -\left(n - \frac{q+g}{2} - 1\right)\ln\Lambda_k ,$$

which is distributed as a $\chi^2$ with $(q-k)(g-k-1)$ degrees of freedom.
The standardized canonical discriminant coefficient matrix D is

$$D = S_{11} V$$

where

$$S = \mathrm{diag}\left(\sqrt{w_{11}}, \sqrt{w_{22}}, \ldots, \sqrt{w_{pp}}\right)$$

$S_{11}$ = partition containing the first q rows and columns of S

V = matrix of eigenvectors such that

$$V' W_{11} V = I$$

The correlations between the canonical discriminant functions and the discriminating variables are

$$R = S_{11}^{-1} W_{11} V$$
If some variables were not selected for inclusion in the analysis ($q < p$), the eigenvectors are implicitly extended with zeros to include the nonselected variables in the correlation matrix. Variables for which $w_{ii} = 0$ are excluded from S and W for this calculation; p then represents the number of variables with non-zero within-groups variance.
The unstandardized coefficients are obtained from

$$B = \sqrt{n-g}\; S_{11}^{-1} D$$
The associated constants are:
$$a_k = -\sum_{i=1}^{q} b_{ik}\, \bar X_{i\cdot}$$
The group centroids are the canonical discriminant functions evaluated at the group
means:
$$f_{kj} = a_k + \sum_{i=1}^{q} b_{ik}\, \bar X_{ij}$$

Box’s M is used to test the equality of the group covariance matrices:

$$M = (n-g)\log\left|C'\right| - \sum_{j=1}^{g}\left(n_j - 1\right)\log\left|C^{(j)}\right|$$
where

C′ = pooled within-groups covariance matrix excluding groups with singular covariance matrices

Determinants of C′ and $C^{(j)}$ are obtained from the Cholesky decomposition. If any diagonal element of the decomposition is less than $10^{-11}$, the matrix is considered singular and excluded from the analysis.
$$\log\left|C^{(j)}\right| = 2\sum_{i=1}^{p}\log l_{ii} - p\log\left(n_j - 1\right)$$

Similarly,

$$\log\left|C'\right| = 2\sum_{i=1}^{p}\log l_{ii} - p\log\left(n' - g\right)$$

where

$$\left(n' - g\right)C' = L'L$$

n′ = sum of weights of cases in all groups with nonsingular covariance matrices
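A quick sketch of the log-determinant computation via the Cholesky factor (the SSCP matrix and $n_j$ are hypothetical; NumPy returns the lower-triangular factor, whose diagonal matches $l_{ii}$):

```python
import numpy as np

# log|C(j)| from the Cholesky factor of the group SSCP (n_j - 1) * C(j).
nj = 11
S = np.array([[4.0, 2.0], [2.0, 3.0]])    # hypothetical (n_j - 1) * C(j)
p = S.shape[0]

L = np.linalg.cholesky(S)                 # S = L L^T, diagonal l_ii > 0
logdet_Cj = 2.0 * np.log(np.diag(L)).sum() - p * np.log(nj - 1)
```

Working in logs this way avoids underflow when determinants are near the $10^{-11}$ singularity threshold.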
The significance level is obtained from the F distribution with t1 and t2 degrees of
freedom using (Cooley and Lohnes, 1971):
$$F = \begin{cases} \dfrac{M}{b} & \text{if } e_2 > e_1^2 \\[2ex] \dfrac{t_2 M}{t_1\,(b - M)} & \text{if } e_2 < e_1^2 \end{cases}$$
where

$$e_1 = \left(\sum_{j=1}^{g}\frac{1}{n_j - 1} - \frac{1}{n-g}\right)\frac{2p^2 + 3p - 1}{6\,(g-1)(p+1)}$$

$$e_2 = \left(\sum_{j=1}^{g}\frac{1}{\left(n_j - 1\right)^2} - \frac{1}{(n-g)^2}\right)\frac{(p-1)(p+2)}{6\,(g-1)}$$

$$t_1 = (g-1)\,p\,(p+1)/2$$

$$t_2 = \frac{t_1 + 2}{\left|e_2 - e_1^2\right|}$$

$$b = \begin{cases} \dfrac{t_1}{1 - e_1 - t_1/t_2} & \text{if } e_2 > e_1^2 \\[2ex] \dfrac{t_2}{1 - e_1 + 2/t_2} & \text{if } e_2 < e_1^2 \end{cases}$$
If $e_2$ and $e_1^2$ are nearly equal (within a relative tolerance of 0.0001), the program uses Bartlett’s $\chi^2$ statistic rather than the F statistic:

$$\chi^2 = M\left(1 - e_1\right)$$

with $t_1$ degrees of freedom.
For testing the group covariance matrices of the canonical discriminant functions, the procedure is similar. The covariance matrices $C^{(j)}$ and C′ are replaced by $D_j$ and D′, where

$$D_j = B' C^{(j)} B$$

$$D' = \frac{(n-g)\,I_m - \sum_j \left(n_j - 1\right) D_j}{n' - g}$$

with the summation taken over the groups with singular covariance matrices.
Classification
The basic procedure for classifying a case is as follows:
• If X is the $1 \times q$ vector of discriminating variables for the case, the $1 \times m$ vector of canonical discriminant function values is

$$f = XB + a$$

• The squared distance from the case to the centroid of group j is

$$\chi_j^2 = \left(f - \bar f_j\right)D_j^{-1}\left(f - \bar f_j\right)'$$

where $D_j$ is the covariance matrix of canonical discriminant functions for group j and $\bar f_j$ is the group centroid vector. If the case is a member of group j, $\chi_j^2$ has a $\chi^2$ distribution with m degrees of freedom. $P(X \mid G_j)$ is the significance level of such a $\chi_j^2$.

• The posterior probability of membership in group j is

$$P\left(G_j \mid X\right) = \frac{p_j\left|D_j\right|^{-1/2} e^{-\chi_j^2/2}}{\displaystyle\sum_{j=1}^{g} p_j\left|D_j\right|^{-1/2} e^{-\chi_j^2/2}}$$

where $p_j$ is the prior probability for group j. A case is classified into the group for which $P(G_j \mid X)$ is highest.
Computationally, the discriminant scores

$$g_j = \log p_j - \frac{1}{2}\left(\log\left|D_j\right| + \chi_j^2\right)$$

are used, and

$$P\left(G_j \mid X\right) = \begin{cases} \dfrac{\exp\left(g_j - \max_j g_j\right)}{\displaystyle\sum_{j=1}^{g} \exp\left(g_j - \max_j g_j\right)} & \text{if } g_j - \max_j g_j > -46 \\[2ex] 0 & \text{otherwise} \end{cases}$$
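The max-shift in this formula is the standard guard against exponent underflow; a sketch with hypothetical scores:

```python
import numpy as np

# Hypothetical discriminant scores g_j for three groups.
gscores = np.array([-3.0, -5.0, -60.0])

shifted = gscores - gscores.max()            # largest score becomes 0
expo = np.where(shifted > -46.0, np.exp(shifted), 0.0)
posterior = expo / expo.sum()                # P(G_j | X)
```

Groups whose shifted score falls below the -46 cutoff receive an exact zero rather than a denormal value.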
If individual group covariances are not used in classification, the pooled within-
groups covariance matrix of the discriminant functions (an identity matrix) is
substituted for D j in the above calculation, resulting in considerable simplification.
When classification is based on only a subset of the canonical discriminant functions, $D_j^{-1}$ is replaced by the corresponding partition

$$\begin{bmatrix} D_{j11}^{-1} & 0 \\ 0 & 0 \end{bmatrix}$$
Cross-Validation
The following notation is used in this section:
$\mathbf{X}_{jk} = \left(X_{1jk}, \ldots, X_{qjk}\right)^T$, the vector of selected variables for case k in group j

$\mathbf{M}_j$ = sample mean of the jth group,

$$\mathbf{M}_j = \frac{1}{n_j}\sum_{k=1}^{m_j} f_{jk}\,\mathbf{X}_{jk}$$

$\mathbf{M}_{jk}$ = sample mean of the jth group excluding the point $\mathbf{X}_{jk}$,

$$\mathbf{M}_{jk} = \frac{1}{n_j - f_{jk}}\sum_{\substack{l=1 \\ l \neq k}}^{m_j} f_{jl}\,\mathbf{X}_{jl}$$

$\Sigma$ = pooled within-groups covariance matrix; $\Sigma_{jk}$ = pooled within-groups covariance matrix excluding the point $\mathbf{X}_{jk}$, whose inverse is computed as

$$\Sigma_{jk}^{-1} = \frac{n - g - f_{jk}}{n - g}\left(\Sigma^{-1} + \frac{n_j\,\Sigma^{-1}\left(\mathbf{X}_{jk} - \mathbf{M}_j\right)\left(\mathbf{X}_{jk} - \mathbf{M}_j\right)^T \Sigma^{-1}}{\left(n_j - f_{jk}\right)(n-g) - n_j\left(\mathbf{X}_{jk} - \mathbf{M}_j\right)^T \Sigma^{-1}\left(\mathbf{X}_{jk} - \mathbf{M}_j\right)}\right)$$

$$d^2\left(\mathbf{a}, \mathbf{b}\right) = \left(\mathbf{a} - \mathbf{b}\right)^T \Sigma_{jk}^{-1}\left(\mathbf{a} - \mathbf{b}\right)$$
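The updating formula for $\Sigma_{jk}^{-1}$ is a Sherman-Morrison rank-one adjustment; the sketch below checks it against direct recomputation on tiny hypothetical data with all case weights equal to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((12, 2))      # hypothetical cases x variables
y = np.repeat([0, 1], 6)              # two groups of six cases
n, g = 12, 2

def pooled_cov(X, y, df):
    """Pooled within-groups covariance with the given degrees of freedom."""
    S = np.zeros((X.shape[1], X.shape[1]))
    for j in np.unique(y):
        D = X[y == j] - X[y == j].mean(axis=0)
        S += D.T @ D
    return S / df

Sigma = pooled_cov(X, y, n - g)
Si = np.linalg.inv(Sigma)

# Remove case k = 0 of group j = 0 (f_jk = 1, n_j = 6) via the update formula.
nj = 6
d = X[0] - X[y == 0].mean(axis=0)
num = nj * np.outer(Si @ d, Si @ d)
den = (nj - 1) * (n - g) - nj * (d @ Si @ d)
Si_jk = (n - g - 1) / (n - g) * (Si + num / den)

# Direct recomputation without that case.
Si_direct = np.linalg.inv(pooled_cov(X[1:], y[1:], n - 1 - g))
```

Avoiding a fresh matrix inversion per held-out case is precisely why the procedure uses this update.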
During cross-validation, SPSS loops over all cases in the data set. Each case, say $\mathbf{X}_{jk}$, is extracted once and treated as test data. The remaining cases are treated as a new data set. The estimated error rate is the ratio of the sum of misclassified case weights to the sum of all case weights. To reduce computation time, the linear discriminant method is used instead of the canonical discriminant method; the theoretical solution is exactly the same for both methods.
Rotations
Varimax rotations may be performed on either the matrix of canonical discriminant
function coefficients or on that of the correlation between the canonical
discriminant functions and the discrimination variables (the structure matrix). The
actual algorithm for the rotation is described in FACTOR.
For the Kaiser normalization,

$$h_i^2 = \begin{cases} 1 + \dfrac{1}{w_{ii}\,w_{ii}^*} & \text{(squared multiple correlation) if coefficients are rotated} \\[2ex] \displaystyle\sum_{k=1}^{m} r_{ik}^2 & \text{if correlations are rotated} \end{cases}$$

where the $r_{ik}$ are the elements of

$$R = S_{11}^{-1} W_{11} V$$

The rotated coefficient and structure matrices are

$$D_R = DK \qquad\text{and}\qquad R_R = RK$$

where K is the orthogonal rotation matrix.
For the unrotated functions,

$$V'(T - W)V = \Lambda = \mathrm{diag}\left(\lambda_1, \lambda_2, \ldots, \lambda_m\right)$$

where the $\lambda_k$ are the eigenvalues. The corresponding matrix for the rotated functions,

$$V_R'\,(T - W)\,V_R ,$$

is not diagonal, meaning the rotated functions, unlike the unrotated ones, are correlated for the original sample, although their within-groups covariance matrix is the identity. The diagonals of the above matrix may still be interpreted as the
between-groups variances of the functions. They are the numerators for the
proportions of variance printed with the transformation matrix. The denominator is
their sum. After rotation, the columns of the transformation are exchanged, if
necessary, so that the diagonals of the matrix above are in descending order.
References
Anderson, T. W. 1958. Introduction to multivariate statistical analysis. New York:
John Wiley & Sons, Inc.
Cooley, W. W., and Lohnes, P. R. 1971. Multivariate data analysis. New York:
John Wiley & Sons, Inc.
Dixon, W. J., ed. 1973. BMD Biomedical computer programs. Los Angeles:
University of California Press.
Tatsuoka, M. M. 1971. Multivariate analysis. New York: John Wiley & Sons, Inc.