
“Multiple Linear Regression Project”

Applied Statistics for Engineers


Dr. Hussam Abu-Hajjar

Submission Date: 7-Nov-2019

Student Name : Khalid Akram Hilu

Student ID : 8180021
September 2019
Multiple Linear Regression

Contents
1 General Overview of The Project Problem
    Literature review about the case study
    Descriptive analysis and normality test for each Variable
2 Multiple Linear Regression
    Model 1
    Model 2
    Model 3
3 Logistic Model
    Logistic Model 1
    Logistic Model 2
4 Cross Validation On Model [1] in Multiple Linear Regression
5 Appendices
    Appendix 1 [Case Summary For Model 1]
    Appendix 2 [Case Summary For Model 2]
    Appendix 3 [Case Summary For Model 3]
    Appendix 4 [Case Summary For Logistic Model 1]
    Appendix 5 [Case Summary For Logistic Model 2]


1 General Overview of The Project Problem

Literature review about the case study:

The scope of this project is to find the relation between one dependent variable and
multiple independent variables through multiple linear regression and logistic regression,
using SPSS software.

The data used in this project are the number of fish in a water stream [Y], modeled as a
function of the following factors:

• The Area drained by the stream (in acres) [X1]
• Dissolved Oxygen (in mg/liter) [X2]
• The Maximum Depth in Studied Cross-section (in cm) [X3]
• Nitrate Concentration (mg/liter) [X4]
• Sulfate Concentration (mg/liter) [X5]
• The Water Temperature on The Sampling Date (in degrees C) [X6]

This experiment was carried out by the Maryland Biological Stream Survey (MBSS). The number
of fish is counted in a stream section with a 0.75 cm cross-sectional width, and the
concentrations of dissolved oxygen, nitrate (NO3) and sulfate (SO4) and the water
temperature are measured.

Moreover, the maximum depth of the stream is measured along the cross-sectional width.

Here is some real-life background information related to the experiment:

1- Dissolved Oxygen:

0-2 mg/L: not enough oxygen to support fish life.

2-4 mg/L: only a few fish and aquatic insects can survive.

4-7 mg/L: good for many aquatic animals, low for cold-water fish.

7-11 mg/L: very good for most stream fish.

2- Nitrate (NO3):

Nitrate is measured in mg/L. Natural levels of nitrate are usually less than 1 mg/L.
Concentrations over 10 mg/L will have an effect on the freshwater aquatic environment.
10 mg/L is also the maximum concentration allowed in human drinking water by the U.S.
Public Health Service. For a sensitive fish such as salmon, the recommended concentration
is 0.06 mg/L.

3- Temperature:
In hot weather, the first thing to do is to increase water movement. The warmer the water
is, the less oxygen will be dissolved in it, but at higher temperatures the fish's
metabolism will be higher, increasing their need for oxygen.

4- The Area drained by the stream:

The number of fish counted will depend on the watershed size. We assume that a larger
area drained by the stream will result in more fish, because as the drained area increases,
the flow rate and the collected water volume also increase.

5- Depth of Stream:
The stream channel depth varies along the stream cross-section. The experiment records the
maximum depth in the cross-section, since fish do not prefer to stay in shallow stream water.


6- Sulfate: Sulfate affects fish only at high concentrations, when it reaches hundreds of
mg/L.


Descriptive analysis and normality test for each Variable

All of the independent variables above will be modeled in order to find their relation with
the dependent variable, which is the number of fish in the stream [Y].

All variables are continuous. In the logistic regression section the dependent variable
will be converted into a binary categorical variable, and one independent variable will
also be converted into a binary categorical variable.


As shown above, the descriptive statistics table gives a quick overview of the data. We
have 68 sample cases, and we can compare the mean and standard deviation of each variable.
For example, [X4] has a mean of 5.6 and a standard deviation of 3.27, which is a large
standard deviation relative to the mean.
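The comparison between a standard deviation and its mean can be made explicit with the coefficient of variation. The following is a minimal Python sketch using the [X4] values quoted above; the function name is only illustrative and is not part of the SPSS output.

```python
def coefficient_of_variation(mean: float, std: float) -> float:
    """Return the standard deviation expressed as a fraction of the mean."""
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return std / mean


# Mean and standard deviation of nitrate concentration [X4] as quoted in the text
cv_x4 = coefficient_of_variation(mean=5.6, std=3.27)
print(f"CV of X4 = {cv_x4:.1%}")
```

A value above one half, as here, means that the spread is more than half the size of the mean itself.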

As shown above, the normality of each variable is tested using the Kolmogorov-Smirnov and
Shapiro-Wilk tests.

Null Hypothesis: the variable is normally distributed. Alternative: it is not.

We can note that only the number of fish [Y] and the maximum water depth [X3] are normally
distributed, because their Sig > 0.05 [the null is not rejected]. The other variables [X1],
[X2], [X4], [X5], [X6] are not normally distributed because their Sig < 0.05 [the null is
rejected].
However, non-normality of the variables is not important and is not an assumption of the
linear regression or logistic regression model. The major assumption for linear
regression is normality of the residuals, not of the variables themselves.


2 Multiple Linear Regression:

Model 1:
Model 1 is the first trial model. It contains all variables together, entered with the
forced entry method, and includes a constant [an intercept with the Y axis]. The results
are as follows.

We can see that all variables are entered without any exclusion; therefore the R2 of the
model reflects all of these variables with [Y].

From the Model Summary we can interpret the following:

1- 62.4% of the variance in [Y] is explained by the 6 independent variables.
2- The Adjusted R Square is 58.7%, which is close to 62.4%. The reduction from 62.4% to
58.7% is not large, so the model can be generalized.
3- The Durbin-Watson statistic, which checks for independent errors, equals 1.022. Values
between 1 and 3 are usually considered acceptable, so we are near the lower limit but
still fine.
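The relation between R Square and Adjusted R Square can be checked by hand. The sketch below assumes the usual adjustment formula for a model with an intercept, 1 - (1 - R2)(n - 1)/(n - k - 1); the function name is illustrative.

```python
def adjusted_r_squared(r2: float, n: int, k: int) -> float:
    """Adjusted R-squared for a model with an intercept, n cases and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Model 1: R Square = 0.624 with 68 cases and 6 predictors
adj_r2_model1 = adjusted_r_squared(r2=0.624, n=68, k=6)
print(round(adj_r2_model1, 3))  # 0.587
```

The result agrees with the 58.7% reported in the Model Summary.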


The F test is a statistical test for the overall significance of the regression model. We
look for a high F value with Sig < 0.05, which indicates that the observed relationship is
real and not due to chance.

Null Hypothesis: all slope coefficients are equal to zero, B1 = B2 = … = B6 = 0

Alternative Hypothesis: at least 1 slope coefficient ≠ 0

In our model Sig is equal to 0.000, which is less than 0.05, so the null hypothesis is
rejected and the observed relationship is real, not due to chance.
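The overall F statistic is tied directly to R Square. The sketch below assumes the standard identity F = (R2 / k) / ((1 - R2) / (n - k - 1)) for a model with an intercept. The F value itself is not quoted in the text, so the number printed here is only what the rounded R Square of Model 1 implies, not the SPSS figure.

```python
def f_from_r_squared(r2: float, n: int, k: int) -> float:
    """Overall F statistic implied by R-squared (model with an intercept)."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


# Model 1: R Square = 0.624, 68 cases, 6 predictors (degrees of freedom: 6 and 61)
f_model1 = f_from_r_squared(r2=0.624, n=68, k=6)
print(round(f_model1, 1))
```

A larger R Square with the same number of predictors always gives a larger F, which is why a high F and a small Sig go together.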


From the coefficients table we can see the B value of each variable and whether it is
significant or not.

The hypotheses for the coefficients test are:

Null hypothesis B=0

Alternative hypothesis B≠0

We can see from the coefficients table that the Constant, [X5] and [X6] have Sig higher
than 0.05; therefore the null hypothesis cannot be rejected [these coefficients do not
contribute significantly to our model].

On the other hand, [X1], [X2], [X3], [X4] have Sig values less than 0.05; therefore the
null hypothesis is rejected and they contribute significantly to the model.

From the coefficients table we can also read the VIF values, which help us to identify
collinearity.

First of all, no variable has VIF > 10. All predictors have VIF around 1, except the water
temperature [X6] and dissolved oxygen [X2], which have 3.187 and 3.451 respectively. As a
first indication, these two variables may have a multicollinearity problem, but let us
take a look at the correlation and covariance matrix.


From the coefficient correlations we can see that all variables have small values of r
between them, except water temperature [X6] and dissolved oxygen [X2], which have a
correlation of 0.821. This result is consistent with the VIF values in the coefficients
table, so water temperature and dissolved oxygen are very likely to be collinear. The next
model [Model 2, backward] may eliminate one of them if it does not add a significant
effect to the model, because with multicollinearity one of the variables is enough in the
model and the other is redundant.


The condition indices are computed as the square roots of the ratios of the largest
eigenvalue to each successive eigenvalue. Values greater than 15 indicate a possible
problem with collinearity; greater than 30, a serious problem. In our model only one
dimension, the one associated with temperature, has a condition index above 30 (53.511).

The problem with the eigenvalues themselves is that they have no cutoff point to judge
them by.
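The definition of the condition index can be illustrated with a short Python sketch. The eigenvalues below are hypothetical and chosen only for illustration; they are not taken from the SPSS collinearity diagnostics table.

```python
import math


def condition_indices(eigenvalues):
    """Square root of (largest eigenvalue / each eigenvalue)."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]


# Hypothetical eigenvalues, for illustration only
eigenvalues = [4.0, 1.0, 0.04, 0.004]
indices = condition_indices(eigenvalues)

# Apply the 15 / 30 rule of thumb described in the text
flags = ["serious" if ci > 30 else "possible" if ci > 15 else "ok" for ci in indices]
for ev, ci, flag in zip(eigenvalues, indices, flags):
    print(f"eigenvalue={ev:<6} condition index={ci:6.2f} {flag}")
```

Only the smallest eigenvalue in this made-up example produces an index above 30, which mirrors the situation of a single problematic dimension.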

Casewise diagnostics: there are only two cases in the model that are considered outliers,
because they have a standardized residual beyond 1.96 in absolute value. 2 cases out of 68
is 2/68 = 2.94%, which is fine since 2.94% < 5%. In my opinion outliers should not be
removed when fewer than 5% of the standardized residuals lie beyond 2 standard deviations,
because removing outliers can be an endless process: every time outliers are removed the
model is refitted and new outliers appear relative to the new fit, and so on.

I tried removing these two cases from the model and obtained three new outliers, which
supports the view that if no more than 5% of the standardized residuals lie beyond 2
standard deviations, the model is acceptable.
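The 5% rule used above is simple arithmetic and can be sketched in Python. The helper name and the short residual list in the usage note are hypothetical; only the counts 2 and 68 come from the text.

```python
def share_beyond(std_residuals, cutoff=1.96):
    """Fraction of cases whose absolute standardized residual exceeds the cutoff."""
    return sum(abs(r) > cutoff for r in std_residuals) / len(std_residuals)


# Counts reported in the text: 2 flagged cases out of 68
observed_share = 2 / 68
print(f"{observed_share:.2%}")   # 2.94%
print(observed_share < 0.05)     # True
```

With a full list of standardized residuals exported from SPSS, `share_beyond(residuals)` would give the same proportion directly.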


From the scatter plot we can see that the data show heteroscedasticity: the spread of the
residuals increases as the predicted value increases, so the variance of the residuals is
not constant at each level of the predictor variables. There is no clear evidence for or
against linearity.

Null Hypothesis: the residuals are normally distributed. Alternative: they are not.

From the test of normality above we can see that all residual types are normally
distributed because they have Sig > 0.05.


In order to check influential points:

1- Cook's distance: it should be < 1; if it is > 1, the case is concerning.

In our model no case has a Cook's distance of more than 1, which is what we want.

2- Leverage: the average leverage in our model is (k+1)/n = (6+1)/68 = 0.103. Cases with
more than 3 times the average leverage, 3 × 0.103 = 0.309, are concerning.

In our model only case 28 has a leverage value of more than 0.309, but its Cook's distance
is less than 1, so we can consider it acceptable.

3- Standardized DFBeta: values > 1 are concerning.

No DFBetas are more than 1 except for case No. 1, which is only slightly above 1, and its
Cook's distance is also fine.

4- Covariance ratios: values greater than $1 + 3(k+1)/n$ or less than $1 - 3(k+1)/n$ are
concerning. Here $3(6+1)/68 = 0.309$, so $1 \pm 0.309$ gives the acceptable range
[0.691 to 1.309]; any case outside this range is concerning.

Only cases 1, 7, 22, 28, 38, 48 and 57 have covariance ratios outside the range
[0.691 to 1.309], but their Cook's distances are all less than 1.

5- Mahalanobis distance: for a small sample such as ours (68 < 100), any value above 15
needs attention. Only cases 12 and 22 have values higher than 15, but their Cook's
distances are also fine.

Appendix 1 Shows All Case Summaries

As a conclusion, the model does not have many influential points: there are few indicators
of influence, and no case has a Cook's distance of more than 1.
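The leverage and covariance-ratio cutoffs used above all follow from k and n, so they can be collected in one helper. This is a minimal sketch of the rules of thumb stated in the text; the function and key names are illustrative.

```python
def influence_thresholds(k: int, n: int) -> dict:
    """Rule-of-thumb cutoffs for leverage and covariance ratio (k predictors, n cases)."""
    average_leverage = (k + 1) / n
    margin = 3 * average_leverage
    return {
        "average_leverage": average_leverage,
        "leverage_cutoff": margin,   # 3 times the average leverage
        "cvr_lower": 1 - margin,     # lower covariance-ratio bound
        "cvr_upper": 1 + margin,     # upper covariance-ratio bound
    }


# Model 1: 6 predictors, 68 cases
model1_limits = influence_thresholds(k=6, n=68)
for name, value in model1_limits.items():
    print(f"{name}: {value:.3f}")
```

Because the cutoffs depend on k, a model with fewer predictors has tighter limits; for example, `influence_thresholds(k=4, n=68)` gives a leverage cutoff of about 0.221.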


Model 2:

In this model we use the backward entry method.

This method starts the model with all variables [X1], [X2], [X3], [X4], [X5], [X6]. It then
removes the variable whose removal affects the model least, provided that the effect is
not significant, then the next such variable, and so on.

The removal criterion is set to a probability of 0.1: any variable whose removal gives
Sig > 0.1 is considered to make no significant change to the model and is removed.

Null Hypothesis: removing the variable does not change the model significantly

Alternative Hypothesis: removing the variable changes the model significantly

As we can see, the model starts with all variables, then removes the water temperature
because it has Sig 0.983 > 0.1. After that it removes the sulfate concentration because it
has Sig 0.921 > 0.1. The model then stops removing, because every remaining variable has
Sig < 0.1, so its removal would change the model significantly.


As we can see from the model summary, the first step starts with R Square 0.624, and
removing the 2 variables [X6] and [X5] does not affect it, but it narrows the difference
between R Square and Adjusted R Square: R Square is now 0.624 and Adjusted R Square is
0.601. Durbin-Watson for the independent errors check is 1.018 > 1, which is acceptable.

Reason for Eliminating Temperature [X6]: In my opinion the temperature is removed because
it is collinear with the dissolved oxygen. This appears clearly in Model 1, in the
relatively high VIF values and the high Pearson correlation between the two variables.

In real life it is known that temperature and dissolved oxygen have an inverse
relationship: whenever the temperature increases, the dissolved oxygen decreases.

Reason for Eliminating Sulfate [X5]:

In my opinion sulfate does not have any effect on fish at these low levels. The levels
measured in the samples do not affect fish significantly; sulfate affects fish only at
high concentrations, when it reaches hundreds of mg/l.


The ANOVA test also shows how F improves through the elimination process. We always look
for a higher F value with Sig < 0.05, and in this model Sig < 0.05, so it is very good.

As shown in the coefficients table, in Model [2] all variables now have significance less
than 0.05; only the constant has Sig higher than 0.05. Moreover, no VIF is > 10 and all
VIF values are around 1, which is much better than Model [1].


The condition indices are computed as the square roots of the ratios of the largest eigenvalue
to each successive eigenvalue. Values greater than 15 indicate a possible problem with
collinearity; greater than 30, a serious problem. In our model only [X4] has CI > 15, and
no value is more than 30.

As in Model [1], the same outliers appear, but they represent less than 5% of the sample,
so the outliers are acceptable.
Side note: there are no outliers beyond 3 standard deviations.


From the scatter plot we can see that the data show heteroscedasticity: the spread of the
residuals increases as the predicted value increases, so the variance of the residuals is
not constant at each level of the predictor variables. There is no clear evidence for or
against linearity.

We can see from the test of normality that all residuals are normally distributed with
Sig > 0.05, which is what we want.


In order to check influential points:

1- Cook's distance: it should be < 1; if it is > 1, the case is concerning.

No case has a Cook's distance of more than 1.

2- Leverage: the average leverage, computed as in Model 1 with all six predictors, is
(k+1)/n = (6+1)/68 = 0.103. Cases with more than 3 times the average leverage,
3 × 0.103 = 0.309, are concerning.

No case has a leverage value of more than 0.309.

3- Standardized DFBeta: values > 1 are concerning.

No DFBetas are more than 1.

4- Covariance ratios: values greater than $1 + 3(k+1)/n$ or less than $1 - 3(k+1)/n$ are
concerning. Here $3(6+1)/68 = 0.309$, so $1 \pm 0.309$ gives the acceptable range
[0.691 to 1.309]; any case outside this range is concerning.

No covariance ratios are outside the range [0.691 to 1.309].

5- Mahalanobis distance: for a small sample such as ours (68 < 100), any value above 15
needs attention.

No Mahalanobis distance is above 15.

Appendix 2 Shows The Case Summaries

As a conclusion, the model shows few indicators of influential points, and no case has a
Cook's distance of more than 1.


Side Note:

Model [2] can also be run with the forward entry method, and in this case it gives the
same result as backward (same R Square, same coefficients, same everything), but the
procedure that forward follows is completely different from backward.

Here is how the forward method runs for our model:

As we see here, the model starts without any variables, then adds the variable that gives
the most significant change in R Square with Sig < 0.05. After that [the addition of
Maximum Water Depth], nitrate is found to be the next variable whose addition improves the
model most, and so on.

In this experiment the forward and backward methods give the same result: the same R2 and
the same regression equation.


Model 3:

In this model we need to think in a more practical way, based on real life. In our
experiment the dependent variable is the number of fish in the water stream, which depends
on the area drained by the stream, the water depth of the stream, the dissolved oxygen,
etc.

Based on the above, if no area provides the stream with water [X1=0], then there is no
stream at all and hence no fish. The same concept applies to water depth: if the stream
depth is zero [X3=0], then there is no water and hence no fish.

Therefore Model 3 is carried out without a constant, using backward entry. This is done
from Options by unticking "include constant" in linear regression.


As discussed in Model 2, variables with Sig > 0.1 are excluded, because their removal does
not affect the model significantly.

Sulfate and temperature are excluded because they have Sig > 0.1.

Removing the constant gives a large increase in R Square, which jumps from 0.624 to 0.914,
and the difference between R Square and Adjusted R Square decreases. Note that for
regression through the origin R Square measures the variability of [Y] about zero rather
than about its mean, so it is not directly comparable with the R Square of the models that
include a constant. On the other hand, Durbin-Watson here is less than 1, but at 0.98 it is
very close to 1.
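To make the Durbin-Watson values quoted for the three models easier to read, the sketch below computes the statistic from its definition, the sum of squared successive differences of the residuals divided by the sum of squared residuals. The residual series are hypothetical and serve only to show how the statistic behaves.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: near 2 for independent errors, near 0 for positive autocorrelation."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Hypothetical residuals, for illustration only
alternating = [1, -1, 1, -1, 1, -1]   # sign flips every case
trending = [1, 1, 1, -1, -1, -1]      # long runs of the same sign

dw_alternating = durbin_watson(alternating)
dw_trending = durbin_watson(trending)
print(round(dw_alternating, 3), round(dw_trending, 3))
```

Long runs of same-signed residuals push the statistic below 1, which is the direction in which the models in this project drift.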


ANOVA test: all F values have Sig < 0.05, which is very good, and F increases with each
iteration.


The null hypothesis for coefficients test is

Null hypothesis B=0

Alternative hypothesis B≠0

All coefficients are significant and fit the model well: all of them have Sig values less
than 0.05.

The problem that appears here is that two variables, Dissolved Oxygen and Maximum Water
Depth, have VIF > 10, which may indicate multicollinearity.

The only problem with modeling through the origin is that the variables appear to be
multicollinear, with high Pearson correlation between them.

We therefore need to keep an eye on the high-VIF variables, because they also have high
correlation [Dissolved Oxygen and Maximum Water Depth], or remove one of them from the
model.


The condition indices are computed as the square roots of the ratios of the largest
eigenvalue to each successive eigenvalue. Values greater than 15 indicate a possible
problem with collinearity; greater than 30, a serious problem. In our model no dimension
has CI > 15.

We can see from the test of normality that all residuals are normally distributed with
Sig > 0.05, which is what we want.


From the scatter plot we can see that the data show heteroscedasticity: the spread of the
residuals increases as the predicted value increases, so the variance of the residuals is
not constant at each level of the predictor variables.
There is no clear evidence for or against linearity.


In order to check influential points:

1- Cook's distance: it should be < 1; if it is > 1, the case is concerning.

No case has a Cook's distance of more than 1.

2- Leverage: the average leverage, computed as in Model 1 with all six predictors, is
(k+1)/n = (6+1)/68 = 0.103. Cases with more than 3 times the average leverage,
3 × 0.103 = 0.309, are concerning.

No case has a leverage value of more than 0.309.

3- DFBeta: values > 1 are concerning.

No DFBetas are more than 1.

4- Covariance ratios: values greater than $1 + 3(k+1)/n$ or less than $1 - 3(k+1)/n$ are
concerning. Here $3(6+1)/68 = 0.309$, so $1 \pm 0.309$ gives the acceptable range
[0.691 to 1.309]; any case outside this range is concerning.

No values are outside the range [0.691 to 1.309].

5- Mahalanobis distance: for a small sample such as ours (68 < 100), any value above 15
needs attention.

No Mahalanobis distance is above 15.

Appendix 3 Shows The Case Summaries

As a conclusion, the model does not have many influential points: there are few indicators
of influence, and no case has a Cook's distance of more than 1.


3 Logistic Model:

In the logistic model we convert the continuous dependent variable into a binary
categorical variable using the following categories:

If the number of fish ranges from [0 – 60], then it is a stream poor in aquatic life. [0]

If the number of fish is more than 60, then it is a stream rich in aquatic life. [1]

We also need to convert one independent variable into a binary categorical variable.

I chose to convert the area drained by the stream:

If the area drained by the stream ranges from [0 to 7000] acres, then it is a small area [0]

If the area drained by the stream is more than [7000] acres, then it is a large area [1]
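The two recoding rules above can be written as small Python functions. This is only a sketch of the rules as stated; the function names and the example values are hypothetical and are not cases from the data set.

```python
def code_aquatic_life(fish_count: float) -> int:
    """1 = rich stream (more than 60 fish), 0 = poor stream (0 to 60 fish)."""
    return 1 if fish_count > 60 else 0


def code_drained_area(acres: float) -> int:
    """1 = large area (more than 7000 acres), 0 = small area (0 to 7000 acres)."""
    return 1 if acres > 7000 else 0


# Hypothetical (fish count, drained area) pairs, for illustration only
examples = [(12, 2500), (60, 7000), (61, 7001), (140, 15000)]
coded = [(code_aquatic_life(y), code_drained_area(x1)) for y, x1 in examples]
print(coded)  # [(0, 0), (0, 0), (1, 1), (1, 1)]
```

Note that the boundary values 60 and 7000 fall in the lower category, as in the rules above.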


Logistic Model 1 :

Using the forced entry method, we enter all variables, declare [X1] as a categorical
independent variable, and apply dummy coding through the Categorical option in binary
logistic regression.

The results for Block 0 are shown below:


The classification table gives a quick indication of the percentage of cases the model
classifies correctly; in Block 0 of our model it is 75%.

-2LL = 76.478. We always seek the minimum -2 log likelihood; this number should improve
(go down) in Block 1.

Note: there are no variables in Block 0, only the constant.

Exp(B) is the odds ratio. For a predictor this number should be far from 1, because it
shows how a 1-unit change in the variable changes the odds of the outcome coded “1”, in
our case “Rich Aquatic Life”. In Block 0 the value of 3 for the constant is the baseline
odds of a rich stream.
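The Block 0 figures can be reproduced from the class counts alone. The sketch below assumes the 51 rich and 17 poor streams reported in the Block 1 classification table, and uses the fact that a constant-only logistic model predicts the observed proportion for every case.

```python
import math

# Class counts taken from the classification table (51 rich, 17 poor)
n_rich, n_poor = 51, 17
n = n_rich + n_poor

p_rich = n_rich / n                     # share of rich streams
baseline_odds = p_rich / (1 - p_rich)   # Exp(B) of the constant in Block 0
b0 = math.log(baseline_odds)            # B of the constant

# -2 log likelihood of the constant-only model
neg2ll_null = -2 * (n_rich * math.log(p_rich) + n_poor * math.log(1 - p_rich))

print(f"percent correct = {p_rich:.0%}")        # 75%
print(f"Exp(B) = {baseline_odds:.1f}")          # 3.0
print(f"-2LL = {neg2ll_null:.3f}")              # 76.478
```

All three numbers agree with the Block 0 output described above.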

For the variables not yet in the equation, the null hypothesis is that their coefficients
are simultaneously equal to zero. If the test fails to reject the null hypothesis, this
suggests that leaving the variables out of the model will not substantially harm its fit.

Here Sig = 0.000 < 0.05, so we reject the null hypothesis, which is a good result.


Block 1:

As we see, the -2 Log Likelihood decreases with each iteration, which is what we want: we
seek the minimum -2 log likelihood for the model.

As we see from the classification table, the percentage correct increases from 75% to
95.6%, which is a very good percentage.

We can also see that the model predicts 100% of the cases with rich aquatic life, 51 out
of 51, and 82.4% of the cases with poor aquatic life, 14 out of 17.
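The percentages in the classification table follow directly from the counts. A minimal Python check, using only the counts quoted above:

```python
# Correctly classified cases and group sizes from the classification table
correct_rich, total_rich = 51, 51
correct_poor, total_poor = 14, 17

rich_accuracy = correct_rich / total_rich
poor_accuracy = correct_poor / total_poor
overall_accuracy = (correct_rich + correct_poor) / (total_rich + total_poor)

print(f"rich: {rich_accuracy:.1%}")        # 100.0%
print(f"poor: {poor_accuracy:.1%}")        # 82.4%
print(f"overall: {overall_accuracy:.1%}")  # 95.6%
```

The overall figure is a weighted average, so it sits closer to the accuracy of the larger group.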

From the Model Summary we can see that the new -2 Log Likelihood is 17.632, which has
decreased from the value in Block 0, and that is good. We can also see the two R Square
terms that are equivalent to the R Square in linear regression, Cox & Snell = 0.579 and
Nagelkerke = 0.858. Both depend on the -2 Log Likelihood, and both can be considered good
R Square values.
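Both pseudo R Square values can be recomputed from the two -2 Log Likelihood figures. The sketch below assumes the standard Cox & Snell and Nagelkerke definitions; the function names are illustrative.

```python
import math


def cox_snell_r2(neg2ll_null: float, neg2ll_model: float, n: int) -> float:
    """Cox & Snell R-squared from the -2LL of the null and fitted models."""
    return 1 - math.exp(-(neg2ll_null - neg2ll_model) / n)


def nagelkerke_r2(neg2ll_null: float, neg2ll_model: float, n: int) -> float:
    """Cox & Snell R-squared rescaled by its maximum attainable value."""
    max_cox_snell = 1 - math.exp(-neg2ll_null / n)
    return cox_snell_r2(neg2ll_null, neg2ll_model, n) / max_cox_snell


# -2LL values reported for Block 0 and Block 1, with 68 cases
cs = cox_snell_r2(76.478, 17.632, 68)
nk = nagelkerke_r2(76.478, 17.632, 68)
print(round(cs, 3), round(nk, 3))  # 0.579 0.858
```

Nagelkerke is always at least as large as Cox & Snell, because Cox & Snell cannot reach 1 even for a perfect model.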


One of the most important tests of model fit, based on the chi-square distribution, is the
Hosmer and Lemeshow test.

The Null Hypothesis is: the observed and expected proportions are the same across all
groups [this means a well-fitted model]

The alternative hypothesis is that the observed and expected proportions are not the same.

Therefore we would like not to reject the null hypothesis. In our model Sig = 0.89 > 0.05,
so we do not reject the null hypothesis and conclude that the model is well fitted.

The Omnibus test shows whether the reduction in -2LL is significant or not. For example, in
Block 0 the -2LL was equal to 76.47 and in Block 1 it was reduced to 17.623. Is this
reduction from 76.47 to 17.623 significant or not? The Omnibus test answers this question.

The null hypothesis of the Omnibus test is: the reduction in -2LL from the baseline model is
not significant

Alternative hypothesis: the reduction in -2LL is significant

In our model Sig from the Omnibus test is 0.000, which is < 0.05, so we reject the null
hypothesis and accept that the reduction in -2LL is significant.
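The Omnibus statistic is the drop in -2LL, compared against a chi-square distribution. The sketch below uses the Block 0 and Block 1 values from the Model Summary (76.478 and 17.632) and assumes 6 degrees of freedom, one per predictor entered; that degrees-of-freedom value is an assumption, since the Omnibus table itself is not reproduced in the text. For an even number of degrees of freedom the chi-square tail probability has a closed form, so no statistics library is needed.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable; valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive, even df")
    half = x / 2
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # (x/2)^i / i!
        total += term
    return math.exp(-half) * total


# Reduction in -2LL from Block 0 to Block 1
chi_square = 76.478 - 17.632
p_value = chi2_sf_even_df(chi_square, df=6)  # df = 6 is an assumption
print(round(chi_square, 3), p_value < 0.05)
```

The p-value is far below 0.05, consistent with the Sig of 0.000 shown by SPSS.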


Null hypothesis: B=0. Alternative: B≠0.

This Sig is for the Wald statistic, which in SPSS equals [B/SE]^2.

Now let us take a look at the Variables in the Equation table. First of all we note that
[X3], [X5], [X6], [X2] and the constant have Sig > 0.05, which is bad. On the other hand,
[X1] and [X4] have Sig < 0.05.

Another problem is that for [X5] and [X3] the EXP(B) [odds ratio] is too close to 1. This is
a bad sign, because it means that a 1-unit increase in the variable will hardly change the
odds of the outcome “1” (Rich Aquatic Life).

Another problem is the range of the C.I. for EXP(B). This interval should not include 1.
The only variables that meet this criterion are [X1] and [X4]; the intervals of all other
variables include 1, which is also a bad sign.
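The three checks above (Wald Sig, distance of EXP(B) from 1, and whether the C.I. includes 1) all come from B and its standard error. The sketch below shows the calculations with hypothetical B and SE values; they are not the coefficients of this model. The p-value uses the fact that a chi-square variable with 1 degree of freedom is a squared standard normal.

```python
import math


def wald_summary(b: float, se: float, z: float = 1.96) -> dict:
    """Wald statistic, its p-value (1 df), the odds ratio and its 95% C.I."""
    wald = (b / se) ** 2
    return {
        "wald": wald,
        "p_value": math.erfc(math.sqrt(wald / 2)),  # chi-square tail, 1 df
        "odds_ratio": math.exp(b),
        "ci_lower": math.exp(b - z * se),
        "ci_upper": math.exp(b + z * se),
    }


# Hypothetical coefficients, for illustration only
strong = wald_summary(b=0.8, se=0.3)    # clearly different from zero
weak = wald_summary(b=0.02, se=0.05)    # odds ratio close to 1

for label, result in (("strong", strong), ("weak", weak)):
    includes_one = result["ci_lower"] < 1 < result["ci_upper"]
    print(label, round(result["p_value"], 4), round(result["odds_ratio"], 3), includes_one)
```

A C.I. that includes 1 and a Wald Sig above 0.05 are two views of the same finding, so they normally flag the same variables.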

We can see from this plot that most values are clustered at the edges, which is what we
want: we do not want values near the 50% probability cutoff, because the more values lie
around 50%, the more classification error the model makes. Fortunately, in our model the
values are scattered at the edges.


We need to check linearity:

Linearity has been met because all Sig > 0.05.


Outliers More than 2 Std Deviation

In order to check influential points:

1- Cook's distance > 1: we have 3 cases with a Cook's distance of more than 1, case 5,
case 12 and case 17. We should pay attention to them, because these cases are outliers, as
shown in the casewise list, and they are also influential points.
2- Leverage: the average leverage in our model is (k+1)/n = (6+1)/68 = 0.103. Cases with
more than 3 times the average leverage, 3 × 0.103 = 0.309, are concerning.
In our model the cases marked with a red line in Appendix 4 have leverage values of more
than 0.309. All of them have Cook's distance less than 1, so we can ignore them; the only
exception is case 12.
3- DFBeta values should be less than 1; the cases that exceed this are marked with a red
line in Appendix 4.

Appendix 4 Shows The Case Summaries

Multicollinearity is checked by running an ordinary multiple linear regression.


All VIF values are less than 10, so this is fine, but we may need to watch temperature and
dissolved oxygen, because they already showed a high correlation in the linear regression
analysis.

Collinearity Statistics

Model 1                                                        Tolerance   VIF
(Constant)
Drained Area [X1]                                              .651        1.537
Dissolved Oxygen Concentration [mg/l] [X2]                     .289        3.459
Maximum Water Depth [cm] [X3]                                  .639        1.565
Nitrate Concentration [mg/l] [X4]                              .726        1.378
Sulfate Concentration [mg/l] [X5]                              .954        1.048
The Water Temperature on The Sampling Date [Degrees C] [X6]    .319        3.131
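Tolerance and VIF carry the same information, since VIF = 1 / Tolerance. The Python sketch below recomputes the VIF column from the tolerance column of the table; small differences in the last digit are expected because the tolerances are rounded to three decimals.

```python
# Tolerance values copied from the Collinearity Statistics table
tolerance = {"X1": 0.651, "X2": 0.289, "X3": 0.639, "X4": 0.726, "X5": 0.954, "X6": 0.319}
reported_vif = {"X1": 1.537, "X2": 3.459, "X3": 1.565, "X4": 1.378, "X5": 1.048, "X6": 3.131}

# VIF is the reciprocal of tolerance
vif = {name: 1 / tol for name, tol in tolerance.items()}

for name in tolerance:
    print(f"{name}: computed VIF = {vif[name]:.3f}, reported VIF = {reported_vif[name]:.3f}")
```

Since tolerance is 1 minus the R Square of a predictor regressed on the other predictors, a tolerance of .289 for [X2] means that about 71% of its variance is shared with the other predictors.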


Logistic Model [2]:

The purpose of this model is to build a Forward LR model in logistic regression. This
helps to eliminate multicollinear variables and to identify the non-significant variables
in the model.

We can see from the classification table that the percentage correct is 75%.


Block 1:

We can see that the percentage correct increases with each step of the forward method. The
final percentage correct is 95.6% [the same as Logistic Model 1 with forced entry], but
some variables have been eliminated.

We can see from the Block 1 model summary that each step has a lower -2 Log Likelihood
value, which means that each step gives a better model than the step before, and that in
step 3 the -2 Log Likelihood reaches the minimum value obtained.

The forward method starts with an empty model, then adds the most significant variable,
then the second most significant variable. The model stops when the reduction in -2 Log
Likelihood from adding an extra variable is no longer significant.

Our model has Cox & Snell R Square = 0.573 and Nagelkerke R Square = 0.849. As in Logistic
Model 1, these R Squares are calculated from the -2 Log Likelihood.


As we see, the model starts with [X4], because it is the most significant of all the
variables. In step 2 the model adds [X1], and then [X2]. After that the model stops,
because adding any of the remaining variables would not improve the model significantly:
the remaining variables [X3], [X5], [X6] all have Sig > 0.05, so no further addition
happens.

From the Variables in the Equation table we find that [X2], [X4], [X1] have Sig < 0.05, so
this model has better coefficient significance than Logistic Model 1.

The good news here is that the C.I. for EXP(B) does not include 1 for any variable, which
is better than Logistic Model 1.

The Null Hypothesis is: the observed and expected proportions are the same across all
groups [this means a well-fitted model]

The alternative is that the observed and expected proportions are not the same.

In our model step 3 has Sig = 0.763, so we do not reject the null hypothesis and the model
is well fitted.

The null hypothesis of the Omnibus test is: the reduction in -2LL from the baseline model
is not significant

Alternative hypothesis: the reduction in -2LL is significant

In our model Sig from the Omnibus test is 0.000, which is < 0.05, so we reject the null
hypothesis and accept that the reduction in -2LL is significant.


We can see from this chart that the predicted values are well scattered at the edges, which
means the model is a very good one: we do not have values around the 50% cutoff point. If a
model has many values around 50%, it is a bad one.

Outliers beyond 2 standard deviations


Influential cases

1- Cook's Distance: only two cases have a Cook's distance > 1, cases 5 and 12.
2- Leverage: the average leverage in our model is (k+1)/n = (3+1)/68 = 0.0588. Cases with more than 3 times the average leverage, 3 × 0.0588 = 0.1765, are concerning. Cases above this threshold are marked with a red line in Appendix 5, but apart from case 12 they are not concerning because their Cook's distance is less than 1.
3- DFBETA: cases with an absolute DFBETA > 1 are marked with a red line in Appendix 5.

Appendix 5 Shows The Case Summaries
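
The two screening rules above can be checked with a short Python sketch. The rows below are a handful of cases copied from Appendix 5 (case number, analog of Cook's influence statistic, leverage value), not the full table, and the variable names are chosen here only for illustration.

```python
# Number of predictors in Logistic Model 2 (X1, X2, X4) and sample size.
k, n = 3, 68

average_leverage = (k + 1) / n               # (3 + 1) / 68
leverage_threshold = 3 * average_leverage    # 3 times the average leverage

# (case, Cook's analog, leverage) for a few rows of Appendix 5.
cases = [
    (1, 0.0, 0.00011),
    (5, 1.38031, 0.1159),
    (8, 0.04485, 0.2315),
    (10, 0.08043, 0.30474),
    (12, 1.09908, 0.42899),
    (17, 0.86407, 0.09714),
    (20, 0.18024, 0.19921),
    (36, 0.33397, 0.27788),
    (39, 0.13322, 0.19809),
    (48, 0.03475, 0.23653),
    (61, 0.20037, 0.23977),
]

high_cook = [case for case, cook, _ in cases if cook > 1]
high_leverage = [case for case, _, lev in cases if lev > leverage_threshold]
# Cases that are flagged on both criteria deserve the closest look.
both = [case for case in high_leverage if case in high_cook]

print(f"average leverage = {average_leverage:.4f}")
print(f"threshold        = {leverage_threshold:.4f}")
print("Cook's distance > 1:", high_cook)
print("high leverage:      ", high_leverage)
print("flagged on both:    ", both)
```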

Linearity has been met here: all Sig values are > 0.05.


4 Cross Validation On Model [1] in Multiple Linear Regression:
A 20% subsample and an 80% subsample will be taken from the sample, and their R Square values will be compared. We have a sample size of 68, so 68 × 0.2 = 13.6 ≈ 14 and 68 × 0.8 = 54.4 ≈ 54, and we will divide the sample into two subsamples.

The first one has 14 cases and the second one has 54 cases; we then compare their R Square values.
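
A minimal Python sketch of such a split is shown below. It assumes the cases are assigned at random, which the report does not state, and the function name `split_sample` and the fixed seed are illustrative choices only.

```python
import random

def split_sample(case_ids, small_fraction=0.2, seed=0):
    """Randomly split case IDs into a small and a large subsample."""
    ids = list(case_ids)
    n_small = round(len(ids) * small_fraction)   # 68 * 0.2 = 13.6, rounded to 14
    rng = random.Random(seed)                    # fixed seed (an assumption) for reproducibility
    small = sorted(rng.sample(ids, n_small))
    small_set = set(small)
    large = [i for i in ids if i not in small_set]
    return small, large

# Cases are numbered 1 to 68, as in the appendices.
sample_1, sample_2 = split_sample(range(1, 69))
print(len(sample_1), len(sample_2))  # 14 54
```

The regression would then be refitted on each subsample and the two R Square values compared.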

Sample 1

Model Summary For Sample 1:


Sample 2 :


Model Summary For Sample 2:

Sample 1, with 14 cases, has R Square 0.757 and adjusted R Square 0.548.

Sample 2, with 54 cases, has R Square 0.621 and adjusted R Square 0.572.

The R Square for the full sample in Model 1 is shown on page 8: R Square 0.624 and adjusted R Square 0.587.

As we note, Sample 2 is closer to Model 1 because it has a larger number of observations (54), while Sample 1 has only 14 observations, which makes its R Square of 0.757 somewhat different from 0.621. We may need a larger sample to make a strong cross-validation check.
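
The gap between R Square and adjusted R Square in the small subsample can be reproduced from the standard formula, adjusted R² = 1 − (1 − R²)(n − 1)/(n − k − 1). The sketch below assumes k = 6 predictors (X1 to X6, as listed in Appendix 1); the function name is illustrative. Because the R Square inputs are already rounded to three decimals, the results agree with the reported values only to within about 0.001.

```python
def adjusted_r_square(r_square: float, n: int, k: int) -> float:
    """Adjusted R Square for n cases and k predictors."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)

k = 6  # assumed number of predictors in Model 1 (X1..X6)

# (label, R Square, number of cases, reported adjusted R Square)
results = [
    ("Sample 1", 0.757, 14, 0.548),
    ("Sample 2", 0.621, 54, 0.572),
    ("Full sample", 0.624, 68, 0.587),
]

for label, r2, n, reported in results:
    computed = adjusted_r_square(r2, n, k)
    print(f"{label}: computed {computed:.4f}, reported {reported}")
```

With only 14 cases and 6 predictors the penalty factor (n − 1)/(n − k − 1) = 13/7 is large, which is why the adjusted value for Sample 1 falls so far below its R Square.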

This section is only a quick application of the method, to show how it is applied.


5 Appendices

Appendix 1 [ Case Summary For Model 1]


Columns, in order: Case; Cook's Distance; Centered Leverage Value; Mahalanobis Distance; COVRATIO; Standardized DFBETA for Intercept, X1, X2, X3, X4, X5, X6.

1 .00069 .15025 10.06677 1.34066 1.02138 -.00451 -.03909 -.00078 -.00248 .00807 -.00514
2 .00658 .14421 9.66225 1.29790 .05623 .00235 -.03863 -.10179 .03122 -.09536 -.00271
3 .00318 .15063 10.09191 1.32780 .00209 -.01670 -.04817 .01180 .02779 .07068 .01238
4 .01419 .10022 6.71468 1.16121 .01580 .04986 .09660 -.13285 -.03750 .02337 -.06391
5 .04988 .08951 5.99744 .88037 -.24060 .24578 .09800 .26243 .41436 .11511 .06246
6 .00288 .13559 9.08451 1.30403 .02732 -.01279 -.05957 .01575 .06016 .06395 -.02854
7 .15941 .14914 9.99236 .67612 -.10569 .40476 .14591 -.73090 -.64427 .20693 .34099
8 .03160 .10753 7.20438 1.06330 .30011 .06450 -.35407 .09146 -.24468 -.05557 -.24064
9 .00585 .08436 5.65182 1.19381 -.03886 .05666 -.02142 .04704 -.12855 .05272 .05971
10 .01547 .06336 4.24506 1.04984 -.12692 .12196 .13166 -.00706 .16230 .11941 .02553
11 .00049 .06990 4.68307 1.22116 .01846 -.00956 -.00303 -.03125 .01067 -.02927 -.00780
12 .14901 .26662 17.86387 1.14273 .33179 .08331 -.22796 -.21340 -.54978 -.73364 -.05450
13 .00068 .06719 4.50203 1.21534 -.02767 -.00038 .01314 -.00553 .01405 .00066 .03905
14 .00272 .06989 4.68249 1.19768 -.00537 .00107 -.00020 -.03216 .02438 -.06248 .03775
15 .01129 .09172 6.14491 1.16378 .05123 .01179 -.03649 -.06013 .15184 -.13737 -.02324
16 .00235 .12614 8.45117 1.29170 .04462 -.03031 -.00855 -.02725 .08195 -.01097 -.06313
17 .02383 .03277 2.19535 .79394 -.05308 -.18512 .19256 -.17194 .00437 -.04692 .02825
18 .02504 .10568 7.08039 1.10013 .06304 .17499 -.23836 .21971 .10246 .01237 -.07642
19 .00010 .09956 6.67076 1.26672 -.00770 .00438 -.00086 .02048 .01186 -.00240 .00231
20 .00106 .03453 2.31335 1.16160 -.04707 .04504 .03438 .01692 .03786 -.02085 .03351
21 .02560 .11504 7.70796 1.12231 -.32096 .19553 .27130 -.08757 .18763 .12234 .26377
22 .10289 .09015 6.03996 .59617 .00223 .34822 .00880 -.60408 .10836 .26364 .05974
23 .00248 .01524 1.02107 1.08475 -.06366 .03662 .04169 .01575 .07467 .02247 .03577
24 .00797 .15209 10.18976 1.30490 .00319 .08266 -.11265 .13202 .06633 .08982 -.02617
25 .00987 .05296 3.54803 1.07857 .01509 .06476 -.13598 .09842 .01512 .10613 -.01173
26 .01211 .14502 9.71612 1.26915 .10294 .09422 -.00954 -.11543 .12495 -.20149 -.11921
27 .00300 .04508 3.02032 1.14956 .00023 .01844 .01906 -.01683 .10811 .00440 -.03985
28 .03805 .36387 24.37918 1.71788 .20273 -.19940 -.14527 -.08424 -.13627 .36842 -.22369
29 .02059 .11886 7.96330 1.16291 .19611 -.15862 -.13562 -.11492 -.26659 -.08619 -.09947
30 .00017 .03041 2.03740 1.17229 .01465 -.00852 -.00619 -.01436 -.01469 .01202 -.01135
31 .03732 .09296 6.22819 .97692 .01463 -.36965 -.08344 .26839 .13969 -.01517 -.05436
32 .00003 .04284 2.87023 1.19075 .00271 -.00286 .00299 -.00727 -.00555 -.00732 .00000
33 .00153 .04237 2.83906 1.16660 .08152 -.00505 -.06631 .00484 .01090 -.04558 -.07508
34 .00052 .07023 4.70541 1.22138 .00332 -.01246 .02043 -.01934 .02397 -.02913 -.01153
35 .00014 .10459 7.00734 1.27370 -.01677 .01192 .02296 -.01139 .01609 -.00982 .01403
36 .00256 .14018 9.39174 1.31357 .06260 .08096 -.06307 -.02926 .03936 .02320 -.06817
37 .03623 .06047 4.05147 .84059 -.07031 -.22257 .11887 -.25032 -.02292 .10887 .11776
38 .00059 .21269 14.25001 1.45074 -.02941 -.04635 .02424 .01545 -.02323 .01924 .03055


39 .00414 .02359 1.58025 1.07319 -.06072 -.05061 .03406 -.00211 .04930 .06656 .04960
40 .00319 .07894 5.28887 1.20824 .04196 -.02132 -.00383 .01864 .09687 -.05831 -.07140
41 .00207 .05313 3.55987 1.17712 -.08978 .02981 .07691 -.02045 -.00791 .00099 .10410
42 .00166 .02062 1.38135 1.12205 -.06710 .01601 .04276 .01322 .00172 .01608 .07602
43 .00014 .07952 5.32780 1.23813 .00279 -.00829 -.00473 .01383 .01225 -.01839 -.00445
44 .00020 .03234 2.16653 1.17423 -.01487 -.02114 .01061 .00437 -.00838 .00149 .01787
45 .00200 .10300 6.90090 1.25720 -.09762 .02883 .09747 -.00760 .04573 -.01718 .08950
46 .00532 .03030 2.02985 1.07298 .06689 -.05129 -.08048 -.03225 .03165 .04200 -.05883
47 .00000 .05314 3.56066 1.20438 .00012 -.00102 -.00057 .00123 .00043 .00009 -.00028
48 .00004 .18885 12.65286 1.40943 .00350 -.00756 .00222 -.00320 .00588 .00081 -.00695
49 .00137 .01545 1.03485 1.11715 -.03236 -.00723 .00633 .03988 -.02893 .02894 .04040
50 .00290 .04527 3.03292 1.15138 -.07696 .03317 .08000 -.03390 -.04122 -.04238 .10330
51 .00116 .04045 2.71032 1.16945 -.04467 .03070 .03600 -.00685 -.03255 -.01513 .06095
52 .01504 .09674 6.48182 1.14669 -.21589 .16041 .22961 -.09883 .07216 -.11718 .23272
53 .07472 .17385 11.64785 1.06339 -.55395 -.23474 .50208 .01523 -.08383 .35079 .53766
54 .00042 .04811 3.22362 1.19192 -.00797 -.02747 .01091 -.00028 .02103 -.01498 .00652
55 .00036 .07910 5.29974 1.23542 -.00785 -.02024 -.00302 .03923 -.00583 .02200 .00074
56 .00687 .02462 1.64943 1.01987 .15619 -.06248 -.12029 .03157 -.06349 -.00625 -.14992
57 .00022 .16033 10.74208 1.35974 .00164 -.02893 -.00357 .01113 -.00725 -.01183 .00306
58 .00249 .05697 3.81698 1.17833 .01596 -.00919 -.04061 .05214 -.05457 .08834 -.02076
59 .01012 .04501 3.01560 1.04931 -.01418 .01867 .02344 .13157 .14975 -.08761 -.04382
60 .01275 .04268 2.85928 1.00447 .08575 -.11786 -.11374 .20262 -.07803 -.04460 -.08120
61 .04674 .06237 4.17861 .76431 .12311 -.31207 -.10821 .37256 .19029 -.06223 -.22489
62 .03023 .06335 4.24451 .90855 -.11401 -.27798 .21738 .01923 -.17083 .05091 .10262
63 .02461 .17143 11.48592 1.26450 .00541 -.15093 -.05739 .35594 .06446 -.07628 -.06697
64 .02681 .03342 2.23894 .76002 .09741 -.27857 -.02610 .16583 -.11004 -.00386 -.12350
65 .00884 .04455 2.98477 1.06523 .05372 .09861 -.00889 .02986 .01904 -.07915 -.08344
66 .01239 .02170 1.45421 .89085 .07098 .06867 -.03627 .05534 -.06443 -.11314 -.06680
67 .02092 .03876 2.59716 .87518 .18214 .10130 -.18048 .11286 -.10699 -.03605 -.18265
68 .01724 .03862 2.58728 .92325 .07954 .18303 -.04189 .00220 -.07044 -.05912 -.08553
Total N 68 68 68 68 68 68 68 68 68 68 68


Appendix 2 [ Case Summary For Model 2]


Case Summaries (a)

Columns, in order: Case; Cook's Distance; Centered Leverage Value; Mahalanobis Distance; COVRATIO; Standardized DFBETA for Intercept, X1, X2, X3, X4, X5, X6.

1 .00090 .14688 9.84122 1.28969 .04621 -.00260 -.05183 -.00245 -.00189 . .


2 .00762 .11240 7.53083 1.21548 .11992 -.01107 -.07358 -.08998 .02038 . .
3 .00286 .11247 7.53580 1.23150 .05201 -.00800 -.07117 .00167 .03418 . .
4 .01906 .09460 6.33822 1.14310 -.10953 .06413 .23056 -.14481 -.04010 . .
5 .06771 .08481 5.68211 .93761 -.46669 .25980 .09761 .25554 .44032 . .
6 .00248 .09666 6.47602 1.20949 .01978 .00039 -.04385 .00307 .06033 . .
7 .19152 .12872 8.62454 .78586 .61325 .38693 -.14240 -.73719 -.59417 . .
8 .03111 .07494 5.02115 1.04809 .19516 .09206 -.26748 .07913 -.26614 . .
9 .00685 .06976 4.67365 1.14877 .05771 .05584 -.09450 .04414 -.11802 . .
10 .01861 .05290 3.54397 1.04820 -.24473 .13717 .19283 -.02381 .17934 . .
11 .00058 .04773 3.19778 1.15145 .02356 -.01359 -.00026 -.02953 .00715 . .
12 .07896 .12546 8.40578 1.03563 .47853 -.01055 -.35579 -.09125 -.56217 . .
13 .00065 .04051 2.71421 1.14155 .02231 -.00621 -.02597 -.00235 .01732 . .
14 .00279 .04449 2.98099 1.13131 .06247 -.01388 -.05660 -.01992 .02053 . .
15 .01213 .06587 4.41317 1.11491 .04149 -.00423 -.05193 -.04156 .13635 . .
16 .00233 .09114 6.10625 1.20208 -.03786 -.02230 .05802 -.03047 .07449 . .
17 .03387 .03190 2.13714 .85634 -.08875 -.20326 .26207 -.16752 .00099 . .
18 .03443 .10148 6.79942 1.10350 -.01548 .19349 -.27974 .21642 .10013 . .
19 .00016 .09759 6.53828 1.21972 -.01694 .00405 -.00478 .02289 .01271 . .
20 .00120 .02337 1.56578 1.11266 -.05165 .03888 .01047 .02400 .03928 . .
21 .01775 .05673 3.80060 1.06365 -.16976 .16810 .12164 -.08104 .21225 . .
22 .13143 .08065 5.40341 .71118 .23103 .38443 -.01404 -.65065 .14536 . .
23 .00320 .01230 .82405 1.06341 -.07795 .03558 .02624 .01569 .08169 . .
24 .00841 .12460 8.34852 1.23290 -.02872 .09608 -.12260 .11163 .07190 . .
25 .01133 .04159 2.78626 1.06397 .04225 .08245 -.17841 .08146 .02678 . .
26 .00575 .04617 3.09318 1.11346 -.07128 .08032 .08827 -.09015 .08812 . .
27 .00392 .04037 2.70449 1.11618 -.09765 .02561 .07807 -.02139 .10747 . .
28 .00538 .08142 5.45503 1.17464 .06345 -.07357 .06302 -.10290 -.06977 . .
29 .02582 .10372 6.94952 1.13792 .26054 -.16003 -.10717 -.11220 -.28852 . .
30 .00016 .01926 1.29066 1.11934 .01415 -.00496 .00550 -.01652 -.01348 . .
31 .05318 .09171 6.14442 1.01218 -.10120 -.37663 -.06853 .27426 .13728 . .
32 .00004 .02755 1.84569 1.13063 .00635 -.00483 .00418 -.00757 -.00787 . .
33 .00065 .00296 .19858 1.08704 .02099 -.00038 -.02111 .00558 -.00003 . .
34 .00060 .04781 3.20341 1.15143 -.02969 -.01603 .04295 -.01694 .02104 . .
35 .00017 .06491 4.34922 1.17608 -.01560 .01003 .02019 -.01020 .01874 . .
36 .00211 .09278 6.21593 1.20532 .00650 .08862 -.01220 -.03621 .03446 . .
37 .04663 .05355 3.58765 .89718 .13429 -.23055 .06304 -.26286 -.00142 . .
38 .00043 .14372 9.62895 1.28605 .00152 -.04231 .00392 .01316 -.01628 . .
39 .00450 .01496 1.00263 1.05279 -.02201 -.04954 .00494 -.00841 .06142 . .


40 .00283 .04495 3.01155 1.13177 -.08026 -.01932 .06897 .02193 .08535 . .


41 .00066 .00158 .10596 1.08383 .01515 .01453 -.00381 -.01162 -.00006 . .
42 .00113 .00245 .16427 1.07427 .01161 .00719 -.02113 .01745 .00935 . .
43 .00016 .04489 3.00735 1.15075 -.01024 -.01209 -.00589 .01922 .01144 . .
44 .00022 .02150 1.44040 1.12132 .00458 -.02445 -.00456 .00585 -.00706 . .
45 .00102 .03015 2.02038 1.12447 -.04609 .01315 .04131 .00270 .04967 . .
46 .00636 .02356 1.57855 1.05676 .04742 -.03750 -.04836 -.04459 .03261 . .
47 .00000 .05128 3.43551 1.15981 -.00024 -.00064 -.00035 .00079 .00028 . .
48 .00002 .14954 10.01908 1.29612 -.00539 -.00467 .00843 -.00284 .00393 . .
49 .00146 .00804 .53866 1.08124 .02108 -.00933 -.03343 .03952 -.02283 . .
50 .00141 .00621 .41594 1.07780 .03555 .01216 -.00525 -.01869 -.03882 . .
51 .00081 .01225 .82103 1.10052 .02615 .02016 -.01905 .00074 -.03040 . .
52 .00612 .02075 1.39050 1.05082 -.03929 .10686 .05853 -.05888 .07312 . .
53 .02093 .03618 2.42373 .97514 -.05495 -.23189 .17096 .00547 -.00236 . .
54 .00058 .04201 2.81462 1.14402 -.01022 -.03266 .00708 .00279 .02093 . .
55 .00033 .06080 4.07328 1.16986 -.01184 -.01598 -.00178 .03304 -.00293 . .
56 .00508 .00629 .42115 1.00641 .04879 -.04182 -.00988 .01994 -.07570 . .
57 .00037 .14235 9.53763 1.28410 .01005 -.03701 -.01299 .01571 -.00993 . .
58 .00156 .02161 1.44754 1.10580 .01627 .00604 -.02246 .03484 -.04338 . .
59 .01270 .03731 2.49965 1.04155 -.17514 .01326 .07510 .14586 .13968 . .
60 .01663 .03747 2.51057 1.01233 .01798 -.11582 -.08862 .20832 -.09134 . .
61 .05574 .05051 3.38428 .83521 -.24356 -.29433 .08881 .36983 .16852 . .
62 .04071 .05885 3.94292 .94995 -.03971 -.29476 .22635 .02055 -.16053 . .
63 .03322 .16091 10.78087 1.23473 -.17561 -.15654 -.02238 .37180 .05162 . .
64 .03541 .02968 1.98846 .82973 -.04434 -.26918 .10651 .16010 -.12230 . .
65 .01006 .03262 2.18553 1.04861 -.08535 .10274 .07285 .03599 .00362 . .
66 .01451 .01497 1.00325 .92151 -.00647 .06473 .00446 .06929 -.08465 . .
67 .02280 .02669 1.78818 .91230 .02956 .12601 -.06961 .10478 -.12644 . .
68 .02272 .03419 2.29064 .95275 -.01423 .19364 .02709 .00426 -.08573 . .
Total N 68 68 68 68 68 68 68 68 68

a. Limited to first 100 cases.


Appendix 3 [ Case Summary For Model 3]


Case Summaries (a)

Columns, in order: Case; Cook's Distance; Centered Leverage Value; Mahalanobis Distance; COVRATIO; Standardized DFBETA for Intercept, X1, X2, X3, X4, X5, X6.

1 .00000 .08348 5.67640 1.16201 . .00008 -.00257 .00144 .00374 . .
2 .00335 .07854 5.34050 1.14446 . -.00148 .00546 -.03678 .10714 . .
3 .00160 .10276 6.98801 1.18287 . -.00302 -.03646 .01689 .07173 . .
4 .02312 .09550 6.49369 1.11430 . .05903 .22213 -.21113 -.16806 . .
5 .03233 .03762 2.55844 .89500 . .22652 -.28102 .08832 .16517 . .
6 .00249 .10780 7.33050 1.18755 . .00173 -.03734 .01026 .09295 . .
7 .12481 .09139 6.21440 .84867 . .39054 .31251 -.49212 -.21289 . .
8 .02538 .06790 4.61734 1.04628 . .09937 -.17112 .15182 -.16840 . .
9 .00663 .07617 5.17924 1.12981 . .05598 -.06958 .06560 -.10004 . .
10 .00911 .02429 1.65205 .99502 . .12432 .04387 -.12977 .01560 . .
11 .00033 .05026 3.41744 1.11965 . -.00886 .01519 -.01673 .02412 . .
12 .03018 .06073 4.12971 1.00724 . .02166 -.04497 .07872 -.26826 . .
13 .00046 .04663 3.17091 1.11451 . -.00369 -.01221 .00525 .03675 . .
14 .00186 .04246 2.88701 1.10060 . -.00790 -.01770 .00315 .07556 . .
15 .01405 .07828 5.32279 1.10841 . -.00105 -.03192 -.02744 .22207 . .
16 .00346 .09264 6.29949 1.16384 . -.02943 .05123 -.05602 .07780 . .
17 .04171 .04452 3.02749 .88539 . -.21401 .27502 -.21975 -.08425 . .
18 .04397 .11603 7.88996 1.10704 . .19512 -.38997 .22969 .12483 . .
19 .00057 .07044 4.79019 1.14357 . .00597 -.04573 .03852 .00312 . .
20 .00108 .02094 1.42410 1.07418 . .04003 -.03577 .00581 .00585 . .
21 .01644 .04829 3.28381 1.03115 . .16312 .01361 -.16311 .13780 . .
22 .14680 .08826 6.00189 .78436 . .39524 .18013 -.59643 .41162 . .
23 .00275 .01666 1.13318 1.03976 . .03145 -.03532 -.01527 .04084 . .
24 .01143 .13654 9.28493 1.21131 . .09923 -.19827 .11461 .07574 . .
25 .01328 .05452 3.70709 1.06294 . .08454 -.19698 .10312 .07582 . .
26 .00673 .05002 3.40154 1.08572 . .08021 .05863 -.13437 .05749 . .
27 .00305 .02796 1.90129 1.06694 . .02021 .02021 -.06906 .06147 . .
28 .00439 .08156 5.54593 1.14532 . -.06049 .12228 -.07491 -.03154 . .
29 .01041 .05611 3.81575 1.07977 . -.11577 .07061 -.01328 -.12388 . .
30 .00008 .02540 1.72686 1.09196 . -.00280 .01415 -.00869 -.00370 . .
31 .06681 .10240 6.96294 1.02210 . -.39423 -.18387 .26072 .09534 . .
32 .00001 .03451 2.34641 1.10302 . -.00187 .00477 -.00241 -.00207 . .
33 .00063 .01524 1.03657 1.07047 . .00116 -.00921 .01375 .01891 . .
34 .00086 .04378 2.97733 1.10856 . -.02355 .03992 -.03878 .00106 . .
35 .00043 .05665 3.85217 1.12695 . .01494 .02220 -.02910 .01859 . .
36 .00248 .10704 7.27898 1.18653 . .08687 -.01023 -.03540 .05219 . .
37 .05202 .06316 4.29482 .93288 . -.21680 .19765 -.22496 .12320 . .
38 .00048 .15825 10.76082 1.26444 . -.03999 .00618 .01399 -.01985 . .
39 .00571 .02903 1.97390 1.04541 . -.05237 -.01297 -.01827 .06505 . .


40 .00247 .03215 2.18639 1.08012 . -.02901 .02433 -.00984 .04722 . .


41 .00073 .01515 1.02991 1.06869 . .01527 .00796 -.00624 .01389 . .
42 .00134 .01674 1.13849 1.06192 . .00801 -.01773 .02323 .02366 . .
43 .00033 .05185 3.52600 1.12158 . -.01751 -.02279 .02252 .00821 . .
44 .00024 .03552 2.41521 1.10244 . -.02262 -.00192 .00762 -.00505 . .
45 .00101 .02583 1.75627 1.08284 . .01139 .01713 -.01841 .02913 . .
46 .00704 .03555 2.41748 1.05248 . -.03319 -.02222 -.02831 .08786 . .
47 .00001 .06327 4.30205 1.13689 . -.00429 -.00435 .00486 .00102 . .
48 .00053 .11927 8.11041 1.20806 . -.02660 .03383 -.02730 .00158 . .
49 .00162 .02134 1.45126 1.06816 . -.00755 -.02530 .04976 -.01117 . .
50 .00131 .01712 1.16399 1.06336 . .01421 .02303 -.00557 -.01885 . .
51 .00070 .02232 1.51752 1.08095 . .02042 -.00222 .01041 -.01571 . .
52 .00765 .03367 2.28937 1.04288 . .10690 .04454 -.08134 .06519 . .
53 .02631 .04944 3.36161 .98526 . -.24099 .18235 -.01649 -.05643 . .
54 .00085 .05465 3.71592 1.12246 . -.03691 .00050 -.00121 .02109 . .
55 .00061 .06907 4.69707 1.14170 . -.02136 -.01607 .03894 -.01929 . .
56 .00559 .01903 1.29396 1.00959 . -.03766 .02918 .04052 -.05722 . .
57 .00013 .14845 10.09443 1.25045 . -.01976 -.00460 .01140 -.00225 . .
58 .00176 .03506 2.38412 1.09038 . .00707 -.01513 .04273 -.04300 . .
59 .00902 .02695 1.83281 1.00809 . -.00005 -.05619 .09096 .02757 . .
60 .02074 .05198 3.53436 1.02089 . -.11486 -.10213 .23192 -.10896 . .
61 .05767 .05201 3.53638 .85602 . -.32010 -.09725 .30674 .00113 . .
62 .05179 .07300 4.96401 .97132 . -.30251 .26959 .00626 -.26244 . .
63 .03842 .14289 9.71670 1.17252 . -.18189 -.19601 .35240 -.10204 . .
64 .04475 .04391 2.98614 .86639 . -.27643 .10391 .15648 -.21314 . .
65 .01157 .04047 2.75226 1.03579 . .10018 .02291 .00463 -.07892 . .
66 .01840 .02966 2.01709 .94151 . .06491 .00025 .07263 -.12383 . .
67 .02835 .04109 2.79387 .93805 . .12873 -.06671 .12493 -.14641 . .
68 .02887 .04881 3.31902 .97030 . .19491 .02377 -.00114 -.13295 . .
Total N 68 68 68 68 68 68 68 68
a. Limited to first 100 cases.


Appendix 4 [ Case Summary For Logistic Model 1]


Case Summaries (a)

Columns, in order: Case; Analog of Cook's influence statistics; Leverage value; DFBETA for constant; DFBETA for X1(1); DFBETA for X2; DFBETA for X3; DFBETA for X4; DFBETA for X5; DFBETA for X6.

1 .00000 .00001 .00000 .00000 .00000 .00000 .00000 .00000 .00000


2 .00000 .00004 .00001 -.00001 .00000 .00000 .00000 .00000 .00000
3 .00000 .00001 .00000 .00000 .00000 .00000 .00000 .00000 .00000
4 .00000 .00034 .00013 -.00006 .00003 .00000 -.00003 .00000 -.00001
5 2.18002 .16787 -3.58578 -.00102 -.41987 .01958 .33708 .03960 .09051
6 .00000 .00002 .00000 .00000 .00000 .00000 .00000 .00000 .00000
7 .11306 .48025 .36801 -.45116 .10861 -.00655 -.11247 .02768 -.00614
8 .01049 .23349 .58392 -.12311 .01390 .00257 -.04741 .00603 -.02861
9 .00049 .06355 .05209 -.02660 .00791 .00047 -.01021 .00136 -.00418
10 .05328 .34356 .22819 -.32053 .14793 .00355 -.10038 .01494 -.04871
11 .00002 .01744 .00944 -.00467 .00193 .00010 -.00194 .00023 -.00093
12 2.68435 .71950 4.85399 .10402 .11223 .00396 -.35071 -.23769 -.06691
13 .00000 .00101 .00053 -.00020 .00009 .00000 -.00008 .00001 -.00004
14 .00000 .00023 .00008 -.00004 .00002 .00000 -.00002 .00000 -.00001
15 .00000 .00133 .00063 -.00026 .00012 .00000 -.00011 .00001 -.00006
16 .00021 .04296 .01949 -.01779 .00598 .00030 -.00659 .00066 -.00220
17 1.20738 .10665 -3.45663 1.04256 .10131 -.02258 .20741 -.01150 .11407
18 .00001 .00880 .00146 -.00179 .00190 .00000 -.00102 -.00002 -.00037
19 .12767 .41399 .60069 .06566 .10388 -.01154 -.04535 .00235 -.00402
20 .25290 .27929 1.96122 .47193 -.10642 -.00499 -.00945 .01300 -.03919
21 .00126 .08209 .17911 -.01462 .00928 .00077 -.01406 -.00002 -.00944
22 .00000 .00030 .00012 -.00007 .00003 .00000 -.00003 .00000 -.00001
23 .00002 .01012 .01349 -.00593 .00238 .00005 -.00201 .00000 -.00085
24 .00000 .00401 .00059 -.00074 .00076 .00000 -.00039 -.00003 -.00014
25 .00000 .00063 .00019 -.00018 .00011 .00000 -.00006 .00000 -.00003
26 .37724 .52106 -.29371 .35568 -.10813 .01426 -.11381 .07871 -.01273
27 .00030 .06659 .07981 -.00827 .00440 .00013 -.00676 .00164 -.00430
28 .00002 .00959 .00880 -.00563 .00203 .00006 -.00190 .00002 -.00061
29 .02999 .18135 -.52960 -.12191 .07287 .00280 -.00965 -.00394 -.00098
30 .00001 .00841 .01068 -.00416 .00175 .00002 -.00149 .00009 -.00066
31 .00000 .00040 .00022 -.00011 .00006 .00000 -.00004 .00000 -.00002
32 .00000 .00358 .00188 -.00150 .00080 .00000 -.00050 -.00003 -.00017
33 .00787 .10228 .44185 -.08959 .01062 .00066 -.01755 -.00477 -.01520
34 .01115 .13478 .27572 -.09080 -.01629 .00168 -.02388 .00731 -.00729
35 .00381 .11746 -.22759 -.04705 .04189 -.00078 -.00413 -.00533 .00487
36 .53149 .44375 5.42398 -.21552 -.42762 -.00789 .06954 .04349 -.10122
37 .00000 .00097 .00061 -.00025 .00012 .00001 -.00010 .00000 -.00005
38 .00000 .00128 .00093 -.00041 .00019 .00001 -.00016 .00000 -.00007


39 .14593 .32758 2.50756 -.21346 -.07409 -.00181 .00924 -.02242 -.06187


40 .00222 .08438 .07895 -.04427 .02474 .00046 -.01074 -.00304 -.00745
41 .01216 .13897 -.54007 -.08200 .04295 -.00164 .00111 -.00647 .02264
42 .01985 .12681 -.50853 -.10738 .02680 -.00104 .00556 -.00663 .02469
43 .00000 .00183 .00125 -.00061 .00029 .00000 -.00024 .00002 -.00010
44 .00003 .01288 .01870 -.00646 .00273 .00008 -.00233 .00006 -.00124
45 .00941 .15928 -.56301 -.06549 .05983 -.00084 -.00033 -.00842 .01574
46 .00000 .00036 .00013 -.00009 .00005 .00000 -.00004 .00000 -.00002
47 .00000 .00160 .00088 -.00058 .00030 .00000 -.00021 .00000 -.00008
48 .23415 .52520 -.62172 -.18754 -.20397 .00441 .03553 -.00859 .08559
49 .00028 .02545 .01935 -.02184 .00783 .00027 -.00659 .00015 -.00205
50 .00005 .01403 -.00739 -.00849 .00460 .00001 -.00228 -.00028 -.00021
51 .00002 .01015 -.00009 -.00620 .00301 .00004 -.00190 -.00009 -.00040
52 .00948 .16039 -.48571 -.06575 .05269 -.00144 .00123 -.00955 .01707
53 .08448 .33609 -2.02550 .08910 .17523 -.00232 -.02371 .00472 .04774
54 .00000 .00058 .00040 -.00016 .00007 .00000 -.00006 .00000 -.00003
55 .00000 .00197 .00112 -.00054 .00024 .00001 -.00022 .00001 -.00011
56 .00000 .00535 .00656 -.00214 .00088 .00003 -.00082 .00003 -.00042
57 .00000 .00138 .00094 -.00038 .00019 .00000 -.00016 .00001 -.00008
58 .00000 .00152 .00121 -.00043 .00017 .00001 -.00018 .00002 -.00009
59 .00006 .01877 .01087 -.00801 .00423 .00015 -.00271 -.00021 -.00137
60 .00000 .00568 .00523 -.00212 .00086 .00004 -.00081 .00001 -.00040
61 .20736 .46088 .98656 .16238 .12054 .01502 -.09172 -.01812 -.10155
62 .00000 .00321 .00107 -.00062 .00054 .00001 -.00037 .00001 -.00016
63 .00000 .00302 .00135 -.00066 .00032 .00002 -.00027 -.00001 -.00014
64 .00000 .00112 .00070 -.00033 .00017 .00001 -.00013 .00000 -.00006
65 .00000 .00027 .00014 -.00006 .00003 .00000 -.00002 .00000 -.00001
66 .00000 .00112 .00073 -.00034 .00017 .00001 -.00013 .00000 -.00006
67 .00000 .00084 .00065 -.00022 .00010 .00000 -.00009 .00000 -.00005
68 .00000 .00031 .00018 -.00008 .00004 .00000 -.00003 .00000 -.00001
Total N 68 68 68 68 68 68 68 68 68

a. Limited to first 100 cases.


Appendix 5 [ Case Summary For Logistic Model 2]


Case Summaries (a)

Columns, in order: Case; Analog of Cook's influence statistics; Leverage value; DFBETA for constant; DFBETA for X1(1); DFBETA for X2; DFBETA for X4.

1 0 0.00011 -0.00006 -0.00002 0.00001 -0.00001


2 0 0.0005 -0.00034 -0.0001 0.00008 -0.00003
3 0 0.00008 -0.00005 -0.00001 0.00001 0
4 0 0.00175 -0.00136 -0.00038 0.00031 -0.00013
5 1.38031 0.1159 0.02937 -0.13116 -0.28695 0.27082
6 0 0.00014 -0.00008 -0.00003 0.00002 -0.00001
7 0.01098 0.13747 -0.02945 -0.12845 0.0349 -0.02962
8 0.04485 0.2315 0.21078 -0.25009 0.02448 -0.05045
9 0.00116 0.06142 -0.03785 -0.03974 0.01614 -0.01059
10 0.08043 0.30474 -1.1339 -0.30908 0.21185 -0.05737
11 0.00019 0.03206 -0.03806 -0.01475 0.0095 -0.00422
12 1.09908 0.42899 2.12492 0.33152 -0.21055 -0.164
13 0 0.00585 -0.00522 -0.0017 0.00126 -0.00054
14 0 0.00116 -0.00081 -0.00028 0.0002 -0.00009
15 0 0.00619 -0.00565 -0.00177 0.00134 -0.00056
16 0.00039 0.04076 -0.02166 -0.02234 0.00942 -0.00624
17 0.86407 0.09714 -2.14388 0.86595 0.11755 0.07421
18 0.00002 0.01207 -0.01422 -0.00176 0.00317 -0.00121
19 0.01808 0.11984 -0.15214 0.0318 0.0504 -0.02872
20 0.18024 0.19921 0.48006 0.4022 -0.0655 0.00113
21 0.0074 0.0967 -0.0441 0.0021 0.02854 -0.02166
22 0 0.00112 -0.0009 -0.00036 0.00025 -0.00011
23 0.00005 0.01214 -0.00817 -0.00816 0.00374 -0.00216
24 0 0.00594 -0.0061 -0.00089 0.0014 -0.00056
25 0 0.00126 -0.00112 -0.00038 0.00029 -0.00012
26 0.05819 0.17164 0.17949 0.12369 0.00959 -0.03219
27 0.00111 0.05405 -0.09018 -0.00714 0.02175 -0.00925
28 0.00003 0.0103 -0.00248 -0.00617 0.00237 -0.00168
29 0.02266 0.14132 -0.35461 -0.1162 0.06826 -0.00725
30 0.00003 0.00944 -0.00867 -0.00565 0.00303 -0.00154
31 0 0.00084 -0.00061 -0.00026 0.00018 -0.00008
32 0 0.00429 -0.0038 -0.00197 0.00119 -0.00058
33 0.011 0.08129 0.17013 -0.10541 -0.00225 -0.00889
34 0.00711 0.09294 0.21023 -0.07644 -0.00998 -0.011
35 0.00082 0.04448 -0.08696 -0.02651 0.01931 -0.00572
36 0.33397 0.27788 1.96968 -0.17238 -0.3229 0.11407
37 0 0.00291 -0.00262 -0.00117 0.00076 -0.00035
38 0 0.00289 -0.00241 -0.0012 0.00074 -0.00036
39 0.13322 0.19809 0.95249 -0.18108 -0.14011 0.05107
40 0.00469 0.07749 -0.17782 -0.06346 0.03927 -0.00965
41 0.00113 0.03582 -0.01974 -0.04029 0.01365 -0.00778


42 0.00474 0.05789 0.04056 -0.07781 0.01236 -0.01021


43 0 0.00227 -0.00185 -0.00089 0.00056 -0.00027
44 0.00012 0.01719 -0.02158 -0.01218 0.00672 -0.00305
45 0.00102 0.04328 -0.08532 -0.03215 0.02049 -0.0066
46 0 0.00093 -0.00075 -0.00028 0.0002 -0.00009
47 0 0.00229 -0.00199 -0.00087 0.00057 -0.00027
48 0.03475 0.23653 0.70589 -0.12255 -0.06892 -0.01272
49 0.00032 0.02345 -0.01334 -0.02107 0.0081 -0.00485
50 0.00003 0.0098 -0.0086 -0.00599 0.00313 -0.00163
51 0.00002 0.00893 -0.00676 -0.00535 0.00268 -0.00147
52 0.00073 0.03315 -0.05158 -0.03044 0.01558 -0.00638
53 0.00991 0.12641 -0.32489 0.00328 0.06614 -0.02164
54 0 0.00149 -0.00108 -0.00054 0.00034 -0.00017
55 0 0.00404 -0.00391 -0.00174 0.00113 -0.00052
56 0.00002 0.00891 -0.00807 -0.00523 0.00282 -0.00144
57 0 0.00303 -0.0031 -0.00106 0.00079 -0.00032
58 0 0.00308 -0.00237 -0.00132 0.00078 -0.0004
59 0.00023 0.0242 -0.03982 -0.01558 0.01012 -0.00375
60 0.00002 0.00903 -0.00732 -0.00541 0.00278 -0.00148
61 0.20037 0.23977 -0.98644 0.37756 0.13672 -0.00437
62 0.00001 0.00741 -0.00788 -0.00113 0.00183 -0.00074
63 0.00001 0.00719 -0.00832 -0.00342 0.00227 -0.00097
64 0 0.00269 -0.00246 -0.00104 0.0007 -0.00032
65 0 0.00108 -0.00091 -0.00033 0.00024 -0.0001
66 0 0.00266 -0.00229 -0.00106 0.00068 -0.00032
67 0 0.00248 -0.00195 -0.001 0.00062 -0.00031
68 0 0.00106 -0.00082 -0.00034 0.00023 -0.00011
Total N 68 68 68 68 68 68

a. Limited to first 100 cases.
