Multi Variate Discriminant

9. Discriminant Analysis

Example 9.1: Consider the following data on financial
ratios for solvent and bankrupt companies.
Financial Ratios of Bankrupt and Solvent Companies, Altman (1968)
Source: Morrison (1990). Multivariate Statistical Methods,
3rd ed. McGraw-Hill
X1 = Working Capital / Total Assets
X2 = Retained Earnings / Total Assets
X3 = Earnings Before Interest and Taxes / Total Assets
X4 = Market Value of Equity / Total Value of Liabilities
X5 = Sales / Total Assets
Group: 1 = Bankrupt, 2 = Solvent

Group       X1       X2       X3       X4     X5
    1     36.7    -62.8    -89.5     54.1    1.7
    1     24.0      3.3     -3.5     20.9    1.1
    1    -61.6   -120.8   -103.2     24.7    2.5
    1     -1.0    -18.1    -28.8     36.2    1.1
    1     18.9     -3.8    -50.6     26.4    0.9
    1    -57.2    -61.2    -56.6     11.0    1.7
    1      3.0    -20.3    -17.4      8.0    1.0
    1     -5.1   -194.5    -25.8      6.5    0.5
    1     17.9     20.8     -4.3     22.6    1.0
    1      5.4   -106.1    -22.9     23.8    1.5
    1     23.0    -39.4    -35.7     69.1    1.2
    1    -67.6   -164.1    -17.7      8.7    1.3
    1   -185.1   -308.9    -65.8     35.7    0.8
    1     13.5      7.2    -22.6     96.1    2.0
    1     -5.7   -118.3    -34.2     21.7    1.5
    1     72.4   -185.9   -280.0     12.5    6.7
    1     17.0    -34.6    -19.4     35.5    3.4
    1    -31.2    -27.9      6.3      7.0    1.3
    1     14.1    -48.2      6.8     16.6    1.6
    1    -60.6    -49.2    -17.2      7.2    0.3
    1     26.2    -19.2    -36.7     90.4    0.8
    1      7.0    -18.1     -6.5     16.5    0.9
    1     53.1    -98.0    -20.8     26.6    1.7
    1    -17.2   -129.0    -14.2    267.9    1.3
    1     32.7     -4.0    -15.8    177.4    2.1
    1     26.7     -8.7    -36.3     32.5    2.8
    1     -7.7    -59.2    -12.8     21.3    2.1
    1     18.0    -13.1    -17.6     14.6    0.9
    1      2.0    -38.0      1.6      7.7    1.2
    1    -35.3    -57.9      0.7     13.7    0.8
    1      5.1     -8.8     -9.1    100.9    0.9
    1      0.0    -64.7     -4.0      0.7    0.1
    1     25.2    -11.4      4.8      7.0    0.9
    2     35.2     43.0     16.4     99.1    1.3
    2     38.8     47.0     16.0    126.5    1.9
    2     14.0     -3.3      4.0     91.7    2.7
    2     55.1     35.0     20.8     72.3    1.9
    2     59.3     46.7     12.6    724.1    0.9
    2     33.6     20.8     12.5    152.8    2.4
    2     52.8     33.0     23.6    475.9    1.5
    2     45.6     26.1     10.4    287.9    2.1
    2     47.4     68.6     13.8    581.3    1.6
    2     40.0     37.3     33.4    228.8    3.5
    2     69.0     59.0     23.1    406.0    5.5
    2     34.2     49.6     23.8    126.6    1.9
    2     47.0     12.5      7.0     53.4    1.8
    2     15.4     37.3     34.1    570.1    1.5
    2     56.9     35.3      4.2    240.3    0.9
    2     43.8     49.5     25.1    115.0    2.6
    2     20.7     18.1     13.5     63.1    4.0
    2     33.8     31.4     15.7    144.8    1.9
    2     35.8     21.5    -14.4     90.0    1.0
    2     24.4      8.5      5.8    149.1    1.5
    2     48.9     40.6      5.8     82.0    1.8
    2     49.9     34.6     26.4    310.0    1.8
    2     54.8     19.9     26.7    239.9    2.3
    2     39.0     17.4     12.6     60.5    1.3
    2     53.0     54.7     14.6    771.7    1.7
    2     20.1     53.5     20.6    307.5    1.1
    2     53.7     35.6     26.4    289.5    2.0
    2     46.1     39.4     30.5    700.0    1.9
    2     48.3     53.1      7.1    164.4    1.9
    2     46.7     39.8     13.8    229.1    1.2
    2     60.3     59.5      7.0    226.6    2.0
    2     17.9     16.3     20.4    105.6    1.0
    2     24.7     21.7     -7.8    118.6    1.6
Relevant questions then are:

How do the companies in these two groups differ
from each other?
Which ratios best discriminate the groups?
Are the ratios useful for predicting bankruptcies?
Partial answers can be obtained by examining each
variable one at a time.
For example, sample statistics for each group
are:
Sample Statistics of Bankrupt data

Statistic            Group            X1         X2         X3          X4       X5
Mean                 Bankrupt      -2.83     -62.51     -31.78       40.05     1.50
                     Solvent       41.40      35.24      15.32      254.67     1.94
Median               Bankrupt       5.40     -39.40     -17.70       21.70     1.20
                     Solvent       45.60      35.60      14.60      164.40     1.80
Standard Deviation   Bankrupt      45.88      71.31      51.35       54.94     1.16
                     Solvent       14.21      16.51      10.87      206.57     0.93
Sample Variance      Bankrupt    2104.57    5085.48    2637.18     3018.22     1.35
                     Solvent      201.99     272.50     118.11    42669.19     0.86
Kurtosis             Bankrupt       6.95       3.31      17.55        9.51    12.30
                     Solvent       -0.63      -0.33       0.71        0.72     6.29
Skewness             Bankrupt      -2.09      -1.69      -3.82        2.91     3.03
                     Solvent       -0.37      -0.18      -0.56        1.31     2.18
Range                Bankrupt     257.50     329.70     286.80      267.20     6.60
                     Solvent       55.00      71.90      48.50      718.30     4.60
Minimum              Bankrupt    -185.10    -308.90    -280.00        0.70     0.10
                     Solvent       14.00      -3.30     -14.40       53.40     0.90
Maximum              Bankrupt      72.40      20.80       6.80      267.90     6.70
                     Solvent       69.00      68.60      34.10      771.70     5.50
Count                Bankrupt         33         33         33          33       33
                     Solvent          33         33         33          33       33
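As a quick check of how such descriptive statistics arise, the following minimal Python sketch recomputes a few of them for X5 (Sales / Total Assets) in the bankrupt group. The variable names are illustrative, and the values are copied from the data listing of Example 9.1; the sample variance uses the divisor n - 1.

```python
import statistics

# X5 (Sales / Total Assets) for the 33 bankrupt companies (Example 9.1 listing).
x5_bankrupt = [
    1.7, 1.1, 2.5, 1.1, 0.9, 1.7, 1.0, 0.5, 1.0, 1.5, 1.2,
    1.3, 0.8, 2.0, 1.5, 6.7, 3.4, 1.3, 1.6, 0.3, 0.8, 0.9,
    1.7, 1.3, 2.1, 2.8, 2.1, 0.9, 1.2, 0.8, 0.9, 0.1, 0.9,
]

summary = {
    "count": len(x5_bankrupt),
    "mean": statistics.mean(x5_bankrupt),
    "median": statistics.median(x5_bankrupt),
    "variance": statistics.variance(x5_bankrupt),  # sample variance, divisor n - 1
    "std_dev": statistics.stdev(x5_bankrupt),
    "minimum": min(x5_bankrupt),
    "maximum": max(x5_bankrupt),
}
summary["range"] = summary["maximum"] - summary["minimum"]

for name, value in summary.items():
    print(f"{name:>9}: {value:.2f}")
```

The printed mean, median, variance, and range should agree with the Bankrupt entries of the X5 column up to rounding.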
t-Test: Two-Sample Assuming Equal Variances

Sales / Total Assets
                         Bankrupt     Solvent
Mean                      1.50303    1.939394
Variance                 1.350928    0.864962
Observations                   33          33
Pooled Variance          1.107945
df                             64
t Stat                   -1.68396
P(T<=t) one-tail         0.048531
t Critical one-tail      1.669014
P(T<=t) two-tail         0.097061
t Critical two-tail      1.997728
F-Test Two-Sample for Variances of Sales / Total Assets

                         Bankrupt     Solvent
Mean                      1.50303    1.939394
Variance                 1.350928    0.864962
Observations                   33          33
df                             32          32
F                        1.561835
P(F<=f) one-tail         0.106347
F Critical one-tail       1.80448
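The t and F statistics for Sales / Total Assets can be reproduced from the group summary statistics alone. The sketch below is only an illustration under the equal-variance assumption of the t-test; the names are hypothetical and the inputs are the reported means, variances, and group sizes.

```python
import math

# Summary statistics for X5 (Sales / Total Assets), copied from the reported output.
n1, mean1, var1 = 33, 1.50303, 1.350928    # bankrupt
n2, mean2, var2 = 33, 1.939394, 0.864962   # solvent

df = n1 + n2 - 2
# Pooled variance: the two sample variances weighted by their degrees of freedom.
pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
# Two-sample t statistic assuming equal variances.
t_stat = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
# Variance-ratio F statistic (bankrupt variance in the numerator).
f_stat = var1 / var2

print(f"pooled variance = {pooled_var:.6f}")
print(f"t = {t_stat:.5f} on {df} df")
print(f"F = {f_stat:.6f} on ({n1 - 1}, {n2 - 1}) df")
```

With a two-tail p-value of 0.097 the difference in mean Sales / Total Assets is not significant at the five percent level, and the F test does not reject equal variances for this single ratio.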
Some graphics may also be helpful. For example,

Class limits   Bankrupt   Solvent
< -51                 6         0
-35                   3         0
-20                   6         0
-5                   10         2
10                    8         7
25                    0        17
40                    0         7
41 >                  0         0

[Figure: Histogram of EBIT / Total Assets, showing the frequency in each class for the Bankrupt and Solvent groups]
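The class frequencies behind such a histogram can be tallied directly from the data. The sketch below counts the bankrupt companies' X3 (EBIT / Total Assets) values per class. The upper class limits -50, -35, ..., 40 are an assumption (the printed labels are not fully precise); with these limits the bankrupt counts 6, 3, 6, 10, 8, 0, 0, 0 are reproduced.

```python
from bisect import bisect_left

# X3 (EBIT / Total Assets) for the 33 bankrupt companies (Example 9.1 listing).
x3_bankrupt = [
    -89.5, -3.5, -103.2, -28.8, -50.6, -56.6, -17.4, -25.8, -4.3, -22.9, -35.7,
    -17.7, -65.8, -22.6, -34.2, -280.0, -19.4, 6.3, 6.8, -17.2, -36.7, -6.5,
    -20.8, -14.2, -15.8, -36.3, -12.8, -17.6, 1.6, 0.7, -9.1, -4.0, 4.8,
]

# Assumed upper class limits; values above the last limit go to an overflow class.
upper_limits = [-50, -35, -20, -5, 10, 25, 40]

def class_frequencies(values, limits):
    """Count values per class, where class i holds limits[i-1] < v <= limits[i]."""
    counts = [0] * (len(limits) + 1)
    for v in values:
        counts[bisect_left(limits, v)] += 1
    return counts

bankrupt_counts = class_frequencies(x3_bankrupt, upper_limits)
print(bankrupt_counts)
```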
More complete use of group separation information, however, can be given by discriminant analysis (DA).
The General Setup for the Discriminant Analysis

Discriminant analysis is used for two purposes:
(1) describing major differences among the
groups, and
(2) classifying subjects on the basis of measurements.
Descriptive Discriminant Analysis

The starting setup:
p variables
q mutually exclusive groups
The goal of the descriptive DA is:

Form k new variables such that
1. The new variables are uncorrelated.
2. The first new variable has the best discriminating power w.r.t. the given groups.
The second new variable has the second
best discriminating power and is uncorrelated with the first one, the third has
the third best discriminating power and is
uncorrelated with the previous ones, etc.
Remark 9.1: $k \le \min(p, q - 1)$. For example, if $q = 2$
then $k = \min(p, 1) = 1$.
More precisely, suppose we have observations
on random variables $x_1, \ldots, x_p$ from $q$ groups.
Then the $j$th discriminant function is defined
as a linear combination of the original variables
(1)    $y_j = a_{j1}x_1 + \cdots + a_{jp}x_p$,
such that $\mathrm{Corr}(y_j, y_\ell) = 0$ for $j \neq \ell$, and $y_1$
has the best discriminating power, $y_2$ the second best, and so on.
Remark 9.2: In the basic case the assumption is that
the groups differ only with respect to the means of
the variables.
As a consequence, the variances of the variables and the correlations between them are assumed to be the same across the
groups (the groups have a common covariance structure).
The idea in deriving the discriminant functions is to divide the total variation into between-group and within-group variation
(2)    $T = B + W$,
where $T$ denotes the total covariance matrix,
$B$ the between covariance matrix, and $W$ the
within covariance matrix.
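The decomposition (2) can be checked numerically. The following sketch uses small hypothetical toy data (two groups, two variables) and works with sums-of-squares-and-cross-products matrices, for which the identity holds exactly; covariance matrices are obtained from these by dividing by the appropriate degrees of freedom. All names are illustrative.

```python
# Hypothetical toy data: two groups, two variables per observation.
groups = {
    "g1": [(1.0, 2.0), (2.0, 1.0), (3.0, 3.0)],
    "g2": [(6.0, 5.0), (7.0, 7.0), (8.0, 6.0)],
}

def mean_vector(rows):
    return [sum(col) / len(rows) for col in zip(*rows)]

def outer(u, v):
    return [[a * b for b in v] for a in u]

def add(m1, m2):
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

def scatter(rows, center):
    """Sum of outer products of the deviations of each row from `center`."""
    p = len(center)
    total = [[0.0] * p for _ in range(p)]
    for row in rows:
        d = [x - c for x, c in zip(row, center)]
        total = add(total, outer(d, d))
    return total

all_rows = [row for rows in groups.values() for row in rows]
grand_mean = mean_vector(all_rows)
p = len(grand_mean)

T = scatter(all_rows, grand_mean)            # total variation
W = [[0.0] * p for _ in range(p)]            # within-group variation
B = [[0.0] * p for _ in range(p)]            # between-group variation
for rows in groups.values():
    m = mean_vector(rows)
    W = add(W, scatter(rows, m))
    d = [a - b for a, b in zip(m, grand_mean)]
    B = add(B, [[len(rows) * v for v in r] for r in outer(d, d)])

print("T     =", T)
print("B + W =", add(B, W))
```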
Technically the problem reduces again to an
eigenvalue problem.
In this case the eigenvalues are extracted from
the matrix
(3)    $BW^{-1}$.
The resulting eigenvectors form the coefficients for the discriminant functions $y_j$, $j = 1, \ldots, k$,
with $k = \min(q - 1, p)$.
The functions are called canonical discriminant functions.
Example 9.2: Consider the bankruptcy data. The analysis can be run with SAS proc
candisc or SPSS (Analyze > Classify > Discriminant).
Below are the SAS results.

Example: Discriminant analysis applied to bankrupt data
Canonical Discriminant Analysis

66 Observations    65 DF Total
 5 Variables       64 DF Within Classes
 2 Classes          1 DF Between Classes

Class Level Information

GROUP   Frequency    Weight    Proportion
    1          33   33.0000      0.500000
    2          33   33.0000      0.500000
Within-Class Covariance Matrices

GROUP = 1    DF = 32

Variable          X1           X2           X3           X4         X5
X1         2104.5659    1834.1637    -266.4029     249.8980    18.0357
X2         1834.1637    5085.4767    1632.2018     177.7665   -15.6653
X3         -266.4029    1632.2018    2637.1822     168.3066   -46.6066
X4          249.8980     177.7665     168.3066    3018.2188     1.6108
X5           18.0357     -15.6653     -46.6066       1.6108     1.3509

GROUP = 2    DF = 32

Variable          X1           X2           X3           X4         X5
X1           201.986      117.413       16.740      974.165      1.921
X2           117.413      272.496       52.076     1630.092      0.879
X3            16.740       52.076      118.108      814.591      2.762
X4           974.165     1630.092      814.591    42669.190    -14.529
X5             1.921        0.879        2.762      -14.529      0.865

Simple Statistics

Total-Sample

Variable    N         Mean     Variance      Std Dev
X1         66     19.28485         1632     40.39972
X2         66    -13.63485         5064     71.15836
X3         66     -8.23182         1920     43.81308
X4         66    147.35909        34186    184.89362
X5         66      1.72121      1.13924      1.06735

GROUP = 1

Variable    N         Mean     Variance      Std Dev
X1         33     -2.83030         2105     45.87555
X2         33    -62.51212         5085     71.31253
X3         33    -31.78182         2637     51.35350
X4         33     40.04545         3018     54.93832
X5         33      1.50303      1.35093      1.16229

GROUP = 2

Variable    N         Mean     Variance      Std Dev
X1         33     41.40000    201.98563     14.21216
X2         33     35.24242    272.49627     16.50746
X3         33     15.31818    118.10841     10.86777
X4         33    254.67273        42669    206.56522
X5         33      1.93939      0.86496      0.93003
Univariate Test Statistics

F Statistics, Num DF = 1, Den DF = 64

Variable   Total STD   Pooled STD   Between STD   R-Squared   RSQ/(1-RSQ)         F   Pr > F
X1           40.3997      33.9599       31.2755    0.304266        0.4373   27.9892   0.0001
X2           71.1584      51.7589       69.1229    0.479063        0.9196   58.8555   0.0001
X3           43.8131      37.1166       33.3047    0.293363        0.4152   26.5698   0.0001
X4          184.8936     151.1413      151.7644    0.342055        0.5199   33.2726   0.0001
X5            1.0673       1.0526        0.3086    0.042428        0.0443    2.8357   0.0971

Average R-Squared: Unweighted = 0.2922351
                   Weighted by Variance = 0.3546308
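The univariate F statistics are one-way ANOVA F values and follow from the R-squared column. As a small sketch with illustrative names, $F = \frac{R^2/(q-1)}{(1-R^2)/(n-q)}$ with $n = 66$ observations and $q = 2$ groups; for two groups F is also the square of the equal-variance t statistic, which links the X5 row to the t-test shown earlier.

```python
n_obs, n_groups = 66, 2
num_df = n_groups - 1          # 1
den_df = n_obs - n_groups      # 64

# R-squared values copied from the Univariate Test Statistics output.
r_squared = {"X1": 0.304266, "X2": 0.479063, "X3": 0.293363,
             "X4": 0.342055, "X5": 0.042428}

f_stats = {
    name: (r2 / num_df) / ((1 - r2) / den_df)
    for name, r2 in r_squared.items()
}

for name, f in f_stats.items():
    print(f"{name}: F = {f:.4f}")

# For two groups, F equals the squared two-sample t statistic (X5: t = -1.68396).
t_x5 = -1.68396
print(f"t^2 for X5 = {t_x5 ** 2:.4f}")
```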
Multivariate Statistics and Exact F Statistics

S=1    M=1.5    N=29

Statistic                       Value          F   Num DF   Den DF   Pr > F
Wilks' Lambda             0.369760775    20.4534        5       60   0.0001
Pillai's Trace            0.630239225    20.4534        5       60   0.0001
Hotelling-Lawley Trace    1.704451275    20.4534        5       60   0.0001
Roy's Greatest Root       1.704451275    20.4534        5       60   0.0001

      Canonical    Adjusted Canonical    Approx Standard    Squared Canonical
    Correlation           Correlation              Error          Correlation
1      0.793876              0.781803           0.045863             0.630239

Eigenvalues of INV(E)*H = CanRsq/(1-CanRsq)

    Eigenvalue    Difference    Proportion    Cumulative
1       1.7045                      1.0000        1.0000

Test of H0: The canonical correlations in the
current row and all that follow are zero

    Likelihood Ratio    Approx F    Num DF    Den DF    Pr > F
1         0.36976078     20.4534         5        60    0.0001

NOTE: The F statistic is exact.
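With two groups there is a single canonical root, so all four multivariate statistics are functions of the one squared canonical correlation. The sketch below, with illustrative names and values taken from the SAS output, recovers the eigenvalue, Wilks' lambda, and the exact F statistic on (p, n - p - 1) = (5, 60) degrees of freedom.

```python
n_obs, n_vars = 66, 5
can_rsq = 0.630239225                 # squared canonical correlation (= Pillai's trace here)

eigenvalue = can_rsq / (1 - can_rsq)  # CanRsq / (1 - CanRsq)
wilks_lambda = 1 - can_rsq            # equals 1 / (1 + eigenvalue) when there is one root
# Exact F for two groups: ((1 - Lambda) / Lambda) * (n - p - 1) / p.
f_exact = ((1 - wilks_lambda) / wilks_lambda) * (n_obs - n_vars - 1) / n_vars

print(f"eigenvalue     = {eigenvalue:.6f}")
print(f"Wilks' lambda  = {wilks_lambda:.6f}")
print(f"exact F (5,60) = {f_exact:.4f}")
```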

Total Canonical Structure

Variable        CAN1
X1          0.694823
X2          0.871854
X3          0.682260
X4          0.736708
X5          0.259462

Between Canonical Structure

Variable        CAN1
X1          1.000000
X2          1.000000
X3          1.000000
X4          1.000000
X5          1.000000

Pooled Within Canonical Structure

Variable        CAN1
X1          0.506539
X2          0.734533
X3          0.493528
X4          0.552283
X5          0.161231

Total-Sample Standardized Canonical Coefficients

Variable            CAN1
X1          0.1404518774
X2          0.6028563830
X3          0.6695203123
X4          0.5616859665
X5          0.5320432994

Pooled Within-Class Standardized Canonical Coefficients

Variable            CAN1
X1          0.1180635365
X2          0.4385036080
X3          0.5671902048
X4          0.4591503359
X5          0.5246858501

Raw Canonical Coefficients

Variable            CAN1
X1          0.0034765558
X2          0.0084720383
X3          0.0152812900
X4          0.0030378872
X5          0.4984713894

Class Means on Canonical Variables

GROUP            CAN1
1        -1.285613175
2         1.285613175
The output includes several coefficient matrices.
The structure matrices describe the correlations of the original variables with the discriminant function.
The most useful of these for interpretation
purposes is the pooled within canonical structure.
In the case of multiple groups, the between
canonical structure may also give useful additional
information.
This structure tells how the means of the variables and the means of the discriminant functions are
correlated.
The standardized coefficients are obtained by
multiplying the raw coefficients by the standard
deviations of the variables.
These coefficients tell the marginal effect of
the (standardized) variable on the discriminant function.
Labeling the discriminant function is based
on those variables that have the largest correlations
and the largest standardized coefficients.
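The relation between raw and standardized coefficients can be verified from the bankruptcy output. In this sketch (names are illustrative) each raw coefficient is multiplied by the total-sample or the pooled within-class standard deviation of its variable; the products agree with the two standardized coefficient tables up to the rounding of the printed standard deviations.

```python
variables = ["X1", "X2", "X3", "X4", "X5"]

# Values copied from the SAS output of Example 9.2.
raw = [0.0034765558, 0.0084720383, 0.0152812900, 0.0030378872, 0.4984713894]
total_sd = [40.39972, 71.15836, 43.81308, 184.89362, 1.06735]
pooled_sd = [33.9599, 51.7589, 37.1166, 151.1413, 1.0526]

# Standardized coefficient = raw coefficient * standard deviation of the variable.
total_std_coef = [c * s for c, s in zip(raw, total_sd)]
pooled_std_coef = [c * s for c, s in zip(raw, pooled_sd)]

for name, t, w in zip(variables, total_std_coef, pooled_std_coef):
    print(f"{name}: total-sample {t:.4f}, pooled within-class {w:.4f}")
```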
Example 9.3: (Continued) From the pooled within canonical structure we observe that X2 (Retained earnings
/ Total assets) has the highest correlation with the
discriminant function. Next come X4 (Market value
of equity / Total value of liabilities), X1 (Working
capital / Total assets) and X3 (Earnings before interest and taxes / Total assets), whereas the correlation of X5 (Sales /
Total assets) is small, although it has a large standardized
coefficient.
Summing up, profitability and a high market
value are the properties that protect a company
from bankruptcy.
It should be noted that the basic assumption
in discriminant analysis is that the variables are normally distributed in each of the
groups, and that the covariance matrices are
the same.
The former assumption is harder to test. The
latter is easier (in SPSS select Box's M from
the options).
If the covariance matrices are not the same,
the linear discriminant function analysis is invalid.
One should then move to quadratic discriminant function analysis.
This method, however, is designed for classification purposes.
Example 9.4: Testing for the equality of the population covariance matrices.
(4)    $H_0: \Sigma_1 = \Sigma_2$,
where $\Sigma_i$ is the population covariance matrix of
population $i$ ($i = 1, 2$).
SPSS gives the result: Test Chi-Square Value = 186.18
with 15 degrees of freedom and p-value = 0.0001.
We observe that the null hypothesis is rejected; hence
the analysis results should be interpreted with caution.
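The 15 degrees of freedom can be checked by counting parameters, assuming the reported statistic is the usual chi-square approximation of Box's M test. Each covariance matrix has $p(p+1)/2$ distinct elements, and under $H_0$ the $q$ group matrices are replaced by one common matrix:

```latex
\mathrm{df} \;=\; \frac{(q-1)\,p\,(p+1)}{2}
            \;=\; \frac{(2-1)\cdot 5 \cdot 6}{2}
            \;=\; 15 .
```

Here $p = 5$ is the number of ratios and $q = 2$ the number of groups.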
Number of Discriminant Functions

In the case of multiple groups (> 2) the question
is: in how many dimensions do the groups
differ?
In the case of two groups this is not a major
problem, because the groups can differ in only one dimension.
Generally, however, there can be more discriminating dimensions if q > 2.
Example 9.5: The following data is a classic example
concerning different species of Iris (setosa, versicolor, and virginica).
The following measurements were made:
SL: Sepal Length
SW: Sepal Width
PL: Petal Length
PW: Petal Width
The CANDISC procedure produces the following results.

title;
data iris;
title 'Discriminant Analysis of Fisher (1936) Iris Data';
length species $ 10;   /* long enough to hold 'VERSICOLOR' */
input sepallen sepalwid petallen petalwid spec_no @@;
if spec_no=1 then species='SETOSA';
if spec_no=2 then species='VERSICOLOR';
if spec_no=3 then species='VIRGINICA';
label sepallen='Sepal Length in mm.'
      sepalwid='Sepal Width in mm.'
      petallen='Petal Length in mm.'
      petalwid='Petal Width in mm.';
datalines;
50 33 14 02 1 64 28 56 22 3 65 28 46 15 2 67 31 56 24 3
63 28 51 15 3 46 34 14 03 1 69 31 51 23 3 62 22 45 15 2
59 32 48 18 2 46 36 10 02 1 61 30 46 14 2 60 27 51 16 2
65 30 52 20 3 56 25 39 11 2 65 30 55 18 3 58 27 51 19 3
68 32 59 23 3 51 33 17 05 1 57 28 45 13 2 62 34 54 23 3
77 38 67 22 3 63 33 47 16 2 67 33 57 25 3 76 30 66 21 3
49 25 45 17 3 55 35 13 02 1 67 30 52 23 3 70 32 47 14 2
64 32 45 15 2 61 28 40 13 2 48 31 16 02 1 59 30 51 18 3
55 24 38 11 2 63 25 50 19 3 64 32 53 23 3 52 34 14 02 1
49 36 14 01 1 54 30 45 15 2 79 38 64 20 3 44 32 13 02 1
67 33 57 21 3 50 35 16 06 1 58 26 40 12 2 44 30 13 02 1
77 28 67 20 3 63 27 49 18 3 47 32 16 02 1 55 26 44 12 2
50 23 33 10 2 72 32 60 18 3 48 30 14 03 1 51 38 16 02 1
61 30 49 18 3 48 34 19 02 1 50 30 16 02 1 50 32 12 02 1
61 26 56 14 3 64 28 56 21 3 43 30 11 01 1 58 40 12 02 1
51 38 19 04 1 67 31 44 14 2 62 28 48 18 3 49 30 14 02 1
51 35 14 02 1 56 30 45 15 2 58 27 41 10 2 50 34 16 04 1
.
.
.
;
title 'Canonical Discriminant Analysis of IRIS data';
proc candisc data = iris;
class species;
var sepallen--petalwid;
run;
This gives the results:

Canonical Discriminant Analysis of IRIS data

150 Observations    149 DF Total
  4 Variables
  3 Classes

SPECIES       Frequency     Weight    Proportion
SETOSA               50    50.0000      0.333333
VERSICOLOR           50    50.0000      0.333333
VIRGINICA            50    50.0000      0.333333

Multivariate Statistics and F Approximations

S=2    M=0.5    N=71

Statistic                       Value          F   Num DF   Den DF   Pr > F
Wilks' Lambda             0.023438631    199.145        8      288   0.0001
Pillai's Trace            1.191898825    53.4665        8      290   0.0001
Hotelling-Lawley Trace    32.47732024    580.532        8      286   0.0001
Roy's Greatest Root        32.1919292    1166.96        4      145   0.0001

NOTE: F Statistic for Roy's Greatest Root is an upper bound.
NOTE: F Statistic for Wilks' Lambda is exact.
      Canonical    Adjusted Canonical    Approx Standard    Squared Canonical
    Correlation           Correlation              Error          Correlation
1      0.984821              0.984508           0.002468             0.969872
2      0.471197              0.461445           0.063734             0.222027

Eigenvalues of INV(E)*H = CanRsq/(1-CanRsq)

    Eigenvalue    Difference    Proportion    Cumulative
1      32.1919       31.9065        0.9912        0.9912
2       0.2854             .        0.0088        1.0000

Test of H0: The canonical correlations in the
current row and all that follow are zero

    Likelihood Ratio    Approx F    Num DF    Den DF    Pr > F
1         0.02343863    199.1453         8       288    0.0001
2         0.77797337     13.7939         3       145    0.0001
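With more than one root, the multivariate statistics combine the squared canonical correlations in different ways. The following sketch (illustrative names, inputs from the Iris output) shows how Wilks' lambda, Pillai's trace, the Hotelling-Lawley trace, Roy's greatest root, and the sequential likelihood ratios are related to them; small deviations from the printed values come from the six-decimal rounding of the inputs.

```python
import math

# Squared canonical correlations copied from the Iris output.
can_rsq = [0.969872, 0.222027]

eigenvalues = [r2 / (1 - r2) for r2 in can_rsq]

# Likelihood ratio for "roots m+1, ..., k are zero": product of (1 - r^2) over those roots.
wilks_all = math.prod(1 - r2 for r2 in can_rsq)        # row 1: all roots
wilks_row2 = math.prod(1 - r2 for r2 in can_rsq[1:])   # row 2: second root only

pillai_trace = sum(can_rsq)
hotelling_lawley = sum(eigenvalues)
roy_greatest_root = max(eigenvalues)
proportions = [ev / hotelling_lawley for ev in eigenvalues]

print(f"eigenvalues       = {[round(ev, 4) for ev in eigenvalues]}")
print(f"Wilks (all roots) = {wilks_all:.6f}")
print(f"Wilks (row 2)     = {wilks_row2:.6f}")
print(f"Pillai's trace    = {pillai_trace:.6f}")
print(f"Hotelling-Lawley  = {hotelling_lawley:.4f}")
print(f"proportions       = {[round(pr, 4) for pr in proportions]}")
```

The first discriminant function accounts for about 99 percent of the total of the eigenvalues, which is why the second one, although statistically significant, adds comparatively little separation.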
Total Canonical Structure

Variable    Label                       CAN1         CAN2
SEPALLEN    Sepal Length in mm.     0.791888     0.217593
SEPALWID    Sepal Width in mm.     -0.530759     0.757989
PETALLEN    Petal Length in mm.     0.984951     0.046037
PETALWID    Petal Width in mm.      0.972812     0.222902

Between Canonical Structure

Variable    Label                       CAN1         CAN2
SEPALLEN    Sepal Length in mm.     0.991468     0.130348
SEPALWID    Sepal Width in mm.     -0.825658     0.564171
PETALLEN    Petal Length in mm.     0.999750     0.022358
PETALWID    Petal Width in mm.      0.994044     0.108977

Pooled Within Canonical Structure

Variable    Label                       CAN1         CAN2
SEPALLEN    Sepal Length in mm.     0.222596     0.310812
SEPALWID    Sepal Width in mm.     -0.119012     0.863681
PETALLEN    Petal Length in mm.     0.706065     0.167701
PETALWID    Petal Width in mm.      0.633178     0.737242
Total-Sample Standardized Canonical Coefficients

Variable    Label                            CAN1             CAN2
SEPALLEN    Sepal Length in mm.      -0.686779533      0.019958173
SEPALWID    Sepal Width in mm.       -0.668825075      0.943441829
PETALLEN    Petal Length in mm.       3.885795047     -1.645118866
PETALWID    Petal Width in mm.        2.142238715      2.164135931

Pooled Within-Class Standardized Canonical Coefficients

Variable    Label                            CAN1             CAN2
SEPALLEN    Sepal Length in mm.     -0.4269548486     0.0124075316
SEPALWID    Sepal Width in mm.      -0.5212416758     0.7352613085
PETALLEN    Petal Length in mm.      0.9472572487    -0.4010378190
PETALWID    Petal Width in mm.       0.5751607719     0.5810398645

Raw Canonical Coefficients

Variable    Label                            CAN1             CAN2
SEPALLEN    Sepal Length in mm.     -0.0829377642     0.0024102149
SEPALWID    Sepal Width in mm.      -0.1534473068     0.2164521235
PETALLEN    Petal Length in mm.      0.2201211656    -0.0931921210
PETALWID    Petal Width in mm.       0.2810460309     0.2839187853

Class Means on Canonical Variables

SPECIES               CAN1            CAN2
SETOSA        -7.607599927     0.215133017
VERSICOLOR     1.825049490    -0.727899622
VIRGINICA      5.782550437     0.512766605
The Wilks' lambda test indicates that there
are two statistically significant discriminators
at the five percent level.
Generally, the hypotheses to be tested are, as
in factor analysis,
(5)    H0: The number of discriminators = m
       H1: More are needed
On the basis of the pooled within-class matrices the first
discriminator indicates that the species differ
with respect to the overall size of the leaves
and the second discriminator that the species differ also with respect to the width of the
leaves.
Example 9.6: Bankruptcy risk and signal to reorganization of a company (Laitinen, Luoma, Pynnönen
1996, UV, Discussion Papers 200)
Thus we have four groups.
The ratios used are:
Sample statistics:
              B1 (n=20)         B2 (n=20)         N3 (n=17)         N4 (n=23)      F for eq.
Variable    Mean  Std Dev     Mean  Std Dev     Mean  Std Dev     Mean  Std Dev     of means
ROI       -10.24     8.60     3.52     5.59     2.27     7.14    12.02     5.96     37.66***
TCF       -13.32    10.83     0.13     2.31     0.97     5.00     6.47     5.67     32.48***
QRA         0.58     0.39     0.57     0.55     1.14     0.70     0.85     0.42      4.95**
SCA        -0.61    20.22    -4.75    18.79    13.62    13.19    23.13    19.55     10.39***
DSR         1.09     0.55     0.69     0.25     0.88     0.34     0.57     0.28      7.62***

**  = significant at level 0.01
*** = significant at level 0.001
Number of canonical discriminant functions:
The results indicate that the third canonical discriminant function is also statistically significant.
Canonical structure and standardized coefficients:
Table 11. Canonical structure and standardized canonical coefficients, both as pooled within.

              Canonical structure*        Standardized coefficient
Variable     CAN1     CAN2     CAN3       CAN1     CAN2     CAN3
ROI         0.702    0.036    0.004      0.717    0.013   -0.737
TCF         0.643    0.059    0.467      0.372   -0.458    0.983
QRA         0.101    0.513    0.653     -0.061    0.563    0.661
SCA         0.252    0.773   -0.168      0.169    0.946   -0.522
DSR        -0.306    0.203    0.149     -0.722    0.034    0.16

*Correlation coefficients between original variables and canonical variables.
Interpretation of the discriminant functions:
Group differences:
CAN1, the financial performance, shows that financial performance is the main characteristic differentiating healthy and bankrupt firms (as expected).
CAN2, a contrast between dynamic liquidity and static ratios,
is the characteristic differentiating reorganizable
non-bankrupt firms from reorganizable bankrupt firms.
CAN3, a contrast between liquidity and the other ratios,
differentiates reorganizable non-bankrupt firms from healthy firms.
The distinction is probably due to the fact that non-bankrupt firms may have cash reserves (high liquidity),
but do not use them profitably.
Classification
The other main use of discriminant analysis is to predict which of the given
classes a given observation comes from
(disease diagnostics, bankruptcy prediction,
etc.).
The goal is to minimize the misclassification
rate (two groups labeled as 1 and 2)
(6)    $P(E) = p_1 P(2|1) + p_2 P(1|2)$,
where $P(E)$ denotes the misclassification probability, $p_i$ is the probability that an observation is from group $i$, and $P(j|i)$ denotes
the probability that an observation coming
from group $i$ is classified into group $j$,
$i, j = 1, 2$, and $p_1 + p_2 = 1$.
The probabilities $p_i$ are the prior probabilities, or the population proportions, of the
groups $i$.
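Formula (6) is a prior-weighted average of the two conditional error rates. A minimal sketch follows; the function name is hypothetical and the counts are purely illustrative (2 of 33 group-1 observations and 1 of 33 group-2 observations misclassified, with equal priors).

```python
def misclassification_rate(p1, p2, p_2_given_1, p_1_given_2):
    """P(E) = p1 * P(2|1) + p2 * P(1|2) for two groups with priors p1 + p2 = 1."""
    if abs(p1 + p2 - 1.0) > 1e-12:
        raise ValueError("prior probabilities must sum to 1")
    return p1 * p_2_given_1 + p2 * p_1_given_2

# Illustrative error rates estimated from counts, with equal priors.
rate = misclassification_rate(0.5, 0.5, 2 / 33, 1 / 33)
print(f"P(E) = {rate:.4f}")
```

Note that error rates estimated on the same data used to fit the rule tend to be optimistic, which is one reason cross-validated estimates are often requested.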
In the SAS system, the procedure DISCRIM can
be used for classification purposes.
Example 9.7: Consider the bankruptcy example.
OPTIONS LS = 72;
TITLE 'Example: Discriminant analysis applied to bankrupt data';
DATA bankrupt;
INFILE 'd:\tex\opetus\tmmt\bankrupt.dat' firstobs = 11;
INPUT group x1-x5;
PROC DISCRIM CROSSVALIDATE;
CLASS group;
VAR x1-x5;
RUN;
The results are:

66 Observations    65 DF Total
 5 Variables
 2 Classes

                                                    Prior
GROUP   Frequency     Weight    Proportion    Probability
    1          33    33.0000      0.500000       0.500000
    2          33    33.0000      0.500000       0.500000

Pooled Covariance Matrix Information

Covariance Matrix Rank                                        5
Natural Log of the Determinant of the Covariance Matrix       31.011359

Pairwise Generalized Squared Distances Between Groups

$D^2(i|j) = (\bar{X}_i - \bar{X}_j)' \, \mathrm{COV}^{-1} \, (\bar{X}_i - \bar{X}_j)$

Generalized Squared Distance to GROUP

From GROUP          1          2
         1          0    6.61120
         2    6.61120          0

Linear Discriminant Function

Constant = $-0.5 \, \bar{X}_j' \, \mathrm{COV}^{-1} \, \bar{X}_j$
Coefficient Vector = $\mathrm{COV}^{-1} \, \bar{X}_j$

                 GROUP
                  1           2
CONSTANT   -1.76280    -4.67181
X1          0.01113     0.02007
X2         -0.03003    -0.00825
X3          0.01810     0.05739
X4          0.00266     0.01047
X5          1.42947     2.71115
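The linear discriminant functions are used by evaluating each group's function at an observation and assigning the observation to the group with the larger score (with equal priors, as here, no prior adjustment is needed). The sketch below applies the printed coefficients to the first bankrupt and the first solvent company of Example 9.1; the function names are illustrative, and the two data rows assume that the listing of Example 9.1 is in row order.

```python
# Linear discriminant functions copied from the PROC DISCRIM output:
# score_j(x) = constant_j + sum_k coefficient_jk * x_k
ldf = {
    1: {"constant": -1.76280, "coef": [0.01113, -0.03003, 0.01810, 0.00266, 1.42947]},
    2: {"constant": -4.67181, "coef": [0.02007, -0.00825, 0.05739, 0.01047, 2.71115]},
}

def scores(x):
    """Evaluate both group functions at the observation x = (X1, ..., X5)."""
    return {
        g: f["constant"] + sum(c * v for c, v in zip(f["coef"], x))
        for g, f in ldf.items()
    }

def classify(x):
    """Assign x to the group with the largest discriminant score."""
    s = scores(x)
    return max(s, key=s.get)

first_bankrupt = [36.7, -62.8, -89.5, 54.1, 1.7]   # group 1 in the data
first_solvent = [35.2, 43.0, 16.4, 99.1, 1.3]      # group 2 in the data

for label, x in [("first bankrupt", first_bankrupt), ("first solvent", first_solvent)]:
    print(label, scores(x), "-> group", classify(x))
```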
Remark 9.3: In the two-group classification problem,
logit (or probit) regression is more popular.
Example 9.8: Logit regression of the bankruptcy data
(we use only variable x2 here because of convergence problems).
proc logistic data = a.bankruptcy;
* wcta (x1) reta (x2) ebitta (x3) mvetvl (x4) sta (x5);
model group = reta / ctable;
run;
Response Profile

Ordered                Total
  Value    Group   Frequency
      1        1          33
      2        2          33

Probability modeled is Group=2.

Model Convergence Status
Convergence criterion (GCONV=1E-8) satisfied.

Model Fit Statistics

                              Intercept
             Intercept              and
Criterion         Only       Covariates
AIC             93.495           19.804
SC              95.685           24.183
-2 Log L        91.495           15.804

Testing Global Null Hypothesis: BETA=0

Test                Chi-Square    DF    Pr > ChiSq
Likelihood Ratio       75.6917     1        <.0001
Score                  31.6182     1        <.0001
Wald                    9.5805     1        0.0020

Analysis of Maximum Likelihood Estimates

                               Standard          Wald
Parameter    DF    Estimate       Error    Chi-Square    Pr > ChiSq
Intercept     1      1.1666      0.8165        2.0417        0.1530
reta          1     -0.1767      0.0571        9.5805        0.0020

Classification Table

              Correct           Incorrect                      Percentages
 Prob               Non-              Non-              Sensi-   Speci-   False   False
Level    Event     Event    Event    Event   Correct    tivity   ficity     POS     NEG
0.500       31        32        1        2      95.5      93.9     97.0     3.1     5.9
In SPSS the analysis is run by choosing Analyze > Regression >
Binary Logistic and selecting the appropriate variables.
SPSS results:

Classification Table(a)
==========================================
                        Predicted
                      Group     Percentage
                     1      2      Correct
------------------------------------------
Observed Group 1    31      2         93.9
               2     1     32         97.0
------------------------------------------
Overall Percentage                    95.5
==========================================
a The cut value is .500

Variables in the Equation
======================================================
               B     S.E.     Wald   df    Sig.  Exp(B)
------------------------------------------------------
reta       0.177    0.057    9.580    1   0.002   1.193
Constant  -1.167    0.816    2.042    1   0.153   0.311
======================================================
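The fitted logit model turns a value of reta (X2) into a predicted probability, which is compared with the cut value. The sketch below uses the SPSS estimates and assumes that they model the probability of Group = 2 (solvent), which is consistent with the positive coefficient of reta; the function names are illustrative.

```python
import math

# SPSS estimates from "Variables in the Equation".
b_const, b_reta = -1.167, 0.177

def prob_group2(reta):
    """Predicted probability of Group = 2 (assumed to be the modeled category)."""
    return 1.0 / (1.0 + math.exp(-(b_const + b_reta * reta)))

def predict_group(reta, cut=0.5):
    return 2 if prob_group2(reta) >= cut else 1

# The probability equals the cut value 0.5 where the linear predictor is zero.
threshold = -b_const / b_reta
# Exp(B): multiplicative change in the odds of Group = 2 per unit increase in reta.
odds_ratio = math.exp(b_reta)

for reta in (-62.51, 35.24):   # group means of X2 for bankrupt and solvent firms
    print(f"reta = {reta:7.2f}: P(Group=2) = {prob_group2(reta):.4f}, "
          f"predicted group {predict_group(reta)}")
print(f"threshold reta = {threshold:.2f}, odds ratio = {odds_ratio:.3f}")
```

Under this reading, companies with retained earnings above roughly 6.6 percent of total assets are classified as solvent.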
