Multiple Regression Project PDF

“Multiple Linear Regression Project”
Applied Statistics for Engineers

Dr. Hussam Abu-Hajjar
Submission Date: 7-Nov-2019
Student Name: Khalid Akram Hilu
Student ID : 8180021
September 2019
Multiple Linear Regression
Contents
1 General Overview of The Project Problem
   Literature review about the case study
   Descriptive analysis and normality test for each variable
2 Multiple Linear Regression
   Model 1
   Model 2
   Model 3
3 Logistic Model
   Logistic Model 1
   Logistic Model [2]
4 Cross Validation On Model [1] in Multiple Linear Regression
5 Appendices
   Appendix 1 [Case Summary For Model 1]
   Appendix 2 [Case Summary For Model 2]
   Appendix 3 [Case Summary For Model 3]
   Appendix 4 [Case Summary For Logistic Model 1]
   Appendix 5 [Case Summary For Logistic Model 2]
1 General Overview of The Project Problem
Literature review about the case study:

The scope of this project is to find the relation between one dependent variable and multiple
independent variables through multiple linear regression and logistic regression, using SPSS
software.
The experiment used in this project studies the Number of Fish in a Water Stream [Y] as a
function of multiple factors:
• The Area Drained by the Stream (in acres) [X1]
• Dissolved Oxygen (in mg/liter) [X2]
• The Maximum Depth in the Studied Cross-section (in cm) [X3]
• Nitrate Concentration (mg/liter) [X4]
• Sulfate Concentration (mg/liter) [X5]
• The Water Temperature on the Sampling Date (in degrees C) [X6]
This experiment was carried out by the Maryland Biological Stream Survey (MBSS). It counts the
number of fish in a water stream with a 0.75 cm cross-sectional width, measures the
concentrations of dissolved oxygen, nitrate (NO3) and sulfate (SO4), and records the water
temperature.
Moreover, the maximum depth of the stream is measured along the cross-sectional width.
Here is some real-life background information related to the experiment:
1- Dissolved Oxygen:
0-2 mg/L: not enough oxygen to support fish life.
2-4 mg/L: only a few fish and aquatic insects can survive.
4-7 mg/L: good for many aquatic animals, low for cold-water fish.
7-11 mg/L: very good for most stream fish.
2- Nitrate (NO3):
Nitrate is measured in mg/L. Natural levels of nitrate are usually less than 1 mg/L.
Concentrations over 10 mg/L will have an effect on the freshwater aquatic environment.
10 mg/L is also the maximum concentration allowed in human drinking water by the U.S.
Public Health Service. For a sensitive fish such as salmon the recommended concentration
is 0.06 mg/L.
3- Temperature:
The warmer the water is, the less oxygen is dissolved in it, while at higher temperatures the
fish's metabolism is higher, which increases their need for oxygen. In hot weather, the first
thing to do is to increase water movement.
4- The Area Drained by the Stream:
The number of fish counted depends on the watershed size. We assume that a larger area
drained by the stream results in more fish, because the flow rate and the collected water
volume increase with the drained area.
5- Depth of Stream:
The stream channel depth varies along the stream cross-section. The experiment measures the
maximum depth in the stream cross-section; fish do not prefer shallow stream water.
6- Sulfate:
Sulfate affects fish only at high concentrations, when it reaches hundreds of mg/L.
Descriptive analysis and normality test for each Variable
All of the above independent variables will be modeled in order to find their relation with the
dependent variable, which is the Number of Fish in the stream [Y].
All variables are continuous. In the last section we will convert the dependent variable into a
binary categorical variable and convert one independent variable into a binary categorical
variable.
The descriptive statistics table gives a quick overview of the data. There are 68 sample cases,
and for each variable we can compare the mean with the standard deviation. For example, [X4]
has a mean of 5.6 and a standard deviation of 3.27, which is about 58% of the mean and
indicates a widely spread variable.
The normality of each variable is tested using the Kolmogorov-Smirnov and Shapiro-Wilk tests.
Null Hypothesis: the variable is normally distributed. Alternative: it is not.

Only Number of Fish [Y] and Maximum Water Depth [X3] can be treated as normally
distributed because their Sig > 0.05 [we fail to reject the null], while the other variables [X1],
[X2], [X4], [X5], [X6] are not normally distributed because their Sig < 0.05 [null is rejected].
However, normality of the variables themselves is not an assumption of the linear regression
or logistic regression models. The relevant assumption for linear regression is normality of
the residuals, not of the variables.
2 Multiple Linear Regression:
Model 1:
Model 1 is the first trial model. It contains all variables together, entered with the Forced
Entry method, and it includes a constant [an intercept with the Y axis]. The results are as
follows.
All variables are entered without any exclusion, so the R Square of the model reflects all of
these variables with [Y].
From the Model Summary we can interpret the following:
1- 62.4% of the variance in [Y] is explained by the 6 independent variables.
2- The Adjusted R Square of this model is 58.7%, which is close to 62.4%. The reduction from
62.4% to 58.7% is small, so the model can be expected to generalize reasonably well.
3- The Durbin-Watson statistic, which checks for independent errors, equals 1.022. The ideal
value is 2 and values between 1 and 3 are commonly treated as acceptable, so we are near
the lower limit [some positive autocorrelation of the residuals is possible] but still within
the acceptable range.
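The relation between R Square and Adjusted R Square, and the definition of the Durbin-Watson statistic, can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the R Square, n and k values are the ones quoted for Model 1, and the residuals are toy numbers rather than the project data.

```python
def adjusted_r_square(r_square, n, k):
    """Adjusted R Square for a model with an intercept, n cases and k predictors."""
    return 1 - (1 - r_square) * (n - 1) / (n - k - 1)


def durbin_watson(residuals):
    """Sum of squared successive residual differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Values quoted in the text for Model 1: R Square = 0.624, 68 cases, 6 predictors.
adj_model_1 = adjusted_r_square(0.624, n=68, k=6)
print(round(adj_model_1, 3))

# Hypothetical residuals that alternate in sign (negative autocorrelation),
# used only to show the calculation.
toy_residuals = [1.0, -1.0, 1.0, -1.0]
dw_toy = durbin_watson(toy_residuals)
print(dw_toy)
```

With the quoted inputs the first value reproduces the 58.7% reported in the Model Summary. Residuals that alternate in sign push the statistic above 2, while residuals that drift slowly push it toward 0.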
The F test is a statistical test for the overall significance of the regression model. We look
for a high F value with Sig < 0.05 to state that the observed relation is real and not due to
chance.
Null Hypothesis: all slope coefficients are equal to zero, B1 = B2 = … = B6 = 0

Alternative Hypothesis: at least one slope coefficient ≠ 0
In our model Sig is reported as 0.000 [p < 0.001], which is less than 0.05, so the null
hypothesis is rejected and the observed relation is real and not due to chance.
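The overall F statistic is tied directly to R Square, so it can be approximated from the Model Summary alone. The sketch below assumes the rounded R Square of 0.624 quoted above, so the result may differ slightly from the F value printed by SPSS; the function name is hypothetical.

```python
def f_from_r_square(r_square, n, k):
    """Overall F statistic of a regression with an intercept, from R Square."""
    explained_per_df = r_square / k
    unexplained_per_df = (1 - r_square) / (n - k - 1)
    return explained_per_df / unexplained_per_df


# Model 1: R Square = 0.624, 68 cases, 6 predictors (values quoted in the text).
f_model_1 = f_from_r_square(0.624, n=68, k=6)
print(round(f_model_1, 2))
```

The statistic has k and n - k - 1 degrees of freedom, here 6 and 61.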
From the Coefficients table we can see the B value for each variable and whether it is
significant or not.
The hypotheses for each coefficient test are:
Null hypothesis B=0
Alternative hypothesis B≠0
The constant, [X5] and [X6] have Sig higher than 0.05, therefore we fail to reject the null
hypothesis [these coefficients do not contribute significantly to our model].
On the other hand, [X1], [X2], [X3], [X4] have Sig less than 0.05, therefore the null
hypothesis is rejected and they contribute significantly to the model.
The VIF values in the Coefficients table help us to identify collinearity. No variable has
VIF > 10, and all predictors have VIF around 1 except Water Temperature [X6] and Dissolved
Oxygen [X2], which have 3.187 and 3.451 respectively. As a first indication, these two
variables may have a multicollinearity problem, so let us take a look at the correlation and
covariance matrix.
From the Coefficient Correlations table we can see that all variables have small values of r
between them except Water Temperature [X6] and Dissolved Oxygen [X2], which have a
correlation of 0.821. This result is consistent with the VIF values in the Coefficients table,
so Water Temperature and Dissolved Oxygen are likely to be collinear. The next model
[Model 2] [Backward] may eliminate one of them if it does not add a significant effect to the
model, because with multicollinearity one of the variables is enough in the model and the
other one is redundant.
The condition indices are computed as the square roots of the ratios of the largest
eigenvalue to each successive eigenvalue. Values greater than 15 indicate a possible
problem with collinearity; greater than 30, a serious problem. In our model only one
dimension, the one associated with temperature, has a condition index above 30 [53.511].
The drawback of the eigenvalues themselves is that they have no cutoff point to judge by.
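The definition of the condition index above can be written out as a short sketch. The eigenvalues used here are hypothetical round numbers chosen to make the arithmetic easy to follow, not the values from the SPSS collinearity diagnostics table, and the function names are illustrative.

```python
import math


def condition_indices(eigenvalues):
    """Square root of the largest eigenvalue divided by each eigenvalue."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]


def collinearity_flag(condition_index):
    """Apply the 15 / 30 rule of thumb quoted in the text."""
    if condition_index > 30:
        return "serious"
    if condition_index > 15:
        return "possible"
    return "ok"


# Hypothetical eigenvalues, for illustration only.
toy_indices = condition_indices([4.0, 1.0, 0.25])
print(toy_indices)

# The largest condition index reported for Model 1.
print(collinearity_flag(53.511))
```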
Casewise diagnostics: only two cases in the model are considered outliers because their
absolute standardized residual exceeds 1.96. 2 cases out of 68 is 2/68 = 2.94%, which is
fine because 2.94% < 5%. In my opinion outliers should not be removed when fewer than 5% of
the standardized residuals lie beyond 2 standard deviations, because removing outliers can
be an endless process: every time outliers are removed the model is refitted and new outliers
appear relative to the new fit, and so on.
I tried removing these two cases from the model and obtained three new outliers, which
supports the rule that the model is acceptable if 5% or fewer of the standardized residuals lie
beyond 2 standard deviations.
From the scatter plot we can see that the data show heteroscedasticity: the residuals increase
as the predicted value increases, so the variance of the residuals is not constant at each
level of the predictor variables. There is no clear evidence for or against linearity.
Null Hypothesis: the residuals are normally distributed. Alternative: they are not.
From the test of normality above we can see that all residual types are normally distributed
because they have Sig > 0.05.
In order to check influential points:
1- Cook's distance: it should be < 1; a case with a value > 1 is concerning.
In our model no case has a Cook's distance of more than 1, which is good.
2- Leverage: the average leverage in Model 1 is (k+1)/n = (6+1)/68 = 0.103. Cases with more
than 3 times the average leverage, 3 × 0.103 = 0.309, are concerning.
In our model only case 28 has a leverage value of more than 0.309, but its Cook's distance is
less than 1, so we can consider it acceptable.
3- Standardized DFBeta: values > 1 are concerning.
No DFBetas are more than 1 except for case No. 1, whose value is only slightly above 1 and
whose Cook's distance is also fine.
4- Covariance ratios: values greater than $1 + \frac{3(k+1)}{n}$ or less than
$1 - \frac{3(k+1)}{n}$ are concerning. Here $1 \pm \frac{3(6+1)}{68} = 1 \pm 0.309$, so the
acceptable range is [0.691 to 1.309] and any case outside this range is concerning.
Only cases 1, 7, 22, 28, 38, 48, 57 have covariance ratios outside the range
[0.691 to 1.309], but their Cook's distances are all less than 1.
5- Mahalanobis distance: for a small sample like ours (68 < 100), any value above 15 needs
attention. Only cases 12 and 28 have values higher than 15 [17.86 and 24.38 in Appendix 1],
but their Cook's distances are also fine.
Appendix 1 shows all case summaries.
In conclusion, the model does not have many influential points: few cases are flagged by the
indicators, and none of them has a Cook's distance of more than 1.
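The leverage and covariance ratio cutoffs used above follow from n and k only, so they can be recomputed with a small helper. This is a sketch with hypothetical names; the leverage of case 28 is copied from Appendix 1.

```python
def influence_cutoffs(n, k):
    """Rule-of-thumb cutoffs for leverage and covariance ratio (n cases, k predictors)."""
    average_leverage = (k + 1) / n
    margin = 3 * (k + 1) / n
    return {
        "average_leverage": average_leverage,
        "leverage_cutoff": 3 * average_leverage,
        "cvr_lower": 1 - margin,
        "cvr_upper": 1 + margin,
    }


cutoffs = influence_cutoffs(n=68, k=6)
for name, value in cutoffs.items():
    print(name, round(value, 3))

# Centered leverage of case 28 as listed in Appendix 1.
case_28_leverage = 0.36387
print(case_28_leverage > cutoffs["leverage_cutoff"])
```

Because the cutoffs depend on k, a model that keeps fewer predictors would have tighter limits than the ones computed for the full six-predictor model.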
Model 2:
In this model we use the Backward entry method.

This method starts the model with all variables [X1], [X2], [X3], [X4], [X5], [X6]. It then
removes the variable whose removal affects the model least, provided the effect is not
significant, then the next such variable, and so on.
The removal criterion is set to a probability of 0.1: any variable with Sig > 0.1 is considered
to make no significant change to the model if it is removed, so it is removed.
Null Hypothesis: removing the variable does not change the model significantly
Alternative Hypothesis: removing the variable changes the model significantly
As we can see, the model starts with all variables, then removes Water Temperature because
it has Sig 0.983 > 0.1, and then removes Sulfate Concentration because it has Sig
0.921 > 0.1. After that the model stops removing variables because every remaining variable
has Sig < 0.1, so its removal would significantly change the model.
As we can see from the Model Summary, the first step starts with R Square 0.624, and removing
the 2 variables [X6] and [X5] does not reduce it, while it narrows the difference between
R Square and Adjusted R Square: R Square is now 0.624 and Adjusted R Square 0.601.
Durbin-Watson for the independent errors check is 1.018 > 1, which is acceptable.
Reason for eliminating Temperature [X6]: in my opinion Temperature is removed because it is
collinear with Dissolved Oxygen, which appeared clearly in the previous Model [1] through the
high VIF values in the Coefficients table and the high correlation in the Coefficient
Correlations table.
From real-life experience it is known that temperature and dissolved oxygen have an inverse
relationship: whenever the temperature increases, the dissolved oxygen decreases.
Reason for eliminating Sulfate [X5]:
In my opinion sulfate does not have any effect on fish at these low levels. The levels
measured in the samples do not affect fish significantly; sulfate affects fish at high
concentrations, when it reaches hundreds of mg/L.
The ANOVA table also shows how F improves through the elimination process. We look for a
higher F value with Sig < 0.05, and in this model Sig < 0.05, so it is very good.
As shown in the Coefficients table, in Model [2] all variables have a significance of less than
0.05; only the constant has Sig higher than 0.05. Moreover, no VIF is > 10 and all VIF values
are around 1, which is much better than Model [1].
The condition indices are computed as the square roots of the ratios of the largest eigenvalue
to each successive eigenvalue. Values greater than 15 indicate a possible problem with
collinearity; greater than 30, a serious problem. In our model only [X4] has CI > 15, and no
value is more than 30.
As in Model [1], the same outliers appear, but they represent less than 5% of the sample,
so the outliers are acceptable.
Side note: there are no outliers beyond 3 standard deviations.
The scatter plot suggests that the variance of the residuals is not constant at each level of
the predictor variables. There is no clear evidence for or against linearity.
From the test of normality we can see that all residuals are normally distributed with
Sig > 0.05.
Influential points:
1- Cook's distance: no case has a Cook's distance of more than 1.
2- Leverage: cases with more than 3 times the average leverage, 3 × 0.103 = 0.309, are
concerning. No case has a leverage value of more than 0.309.
3- Standardized DFBeta: values > 1 are concerning. No DFBetas are more than 1.
4- Covariance ratios: with $1 \pm \frac{3(k+1)}{n} = 1 \pm \frac{3(6+1)}{68}$, no values are
outside the range [0.691 to 1.309].
5- Mahalanobis distance: values above 15 need attention. No Mahalanobis distance is above 15.
Appendix 2 shows the case summaries.
In conclusion, the model shows few indicators of influential points, and no case has a Cook's
distance of more than 1.
Side Note:
Model [2] can also be run with the Forward entry method, and in this model it gives the same
result as Backward [same R Square, same coefficients], although the procedure that Forward
follows is completely different from Backward.
Here is how the Forward method runs for our model:
The model starts without any variables and then adds the variable that gives the most
significant change in R Square with Sig < 0.05. After that [the addition of Maximum Water
Depth], nitrate is found to be the next variable whose addition improves the model most,
and so on.
In this experiment the Forward and Backward methods give the same result: the same R Square
and the same regression equation.
Model 3:
In this model we need to think in a more practical way based on real life. In our experiment
the dependent variable is the number of fish in the water stream, which depends on the area
drained by the stream, the water depth of the stream, the dissolved oxygen, etc.
Based on the above, if no area provides the stream with water [X1=0] then there is no
stream at all and no fish. The same concept applies to water depth: if the stream depth is
zero [X3=0] then there is no water and no fish.
Therefore Model 3 is carried out without a constant, using Backward entry. This is done from
the linear regression options by unticking the include constant option.
As discussed in Model 2, variables with Sig > 0.1 are excluded because their removal does not
affect the model significantly.
Sulfate and Temperature are excluded because they have Sig > 0.1.
Removing the constant raises R Square from 0.624 to 0.914 and decreases the difference
between R Square and Adjusted R Square. Note, however, that for regression through the origin
R Square measures the proportion of the variability of [Y] about the origin rather than about
its mean, so it cannot be compared directly with the R Square of models that include a
constant. On the other hand, Durbin-Watson here is slightly less than 1 [0.98], which is very
close to 1.
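Why R Square tends to jump when the constant is removed can be seen on a small example. The sketch below uses hypothetical toy data with a single predictor (not the fish data) and hypothetical function names. It compares the usual R Square, measured about the mean of y, with the R Square of a fit through the origin, measured about zero.

```python
def r_square_with_intercept(x, y):
    """R Square of simple linear regression with a constant (squared correlation)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy ** 2 / (sxx * syy)


def r_square_through_origin(x, y):
    """R Square of regression without a constant: 1 - SSE / sum(y^2)."""
    slope = sum(xi * yi for xi, yi in zip(x, y)) / sum(xi ** 2 for xi in x)
    sse = sum((yi - slope * xi) ** 2 for xi, yi in zip(x, y))
    return 1 - sse / sum(yi ** 2 for yi in y)


# Hypothetical data: y stays near 11 and is only weakly related to x.
toy_x = [1, 2, 3, 4, 5]
toy_y = [12, 9, 11, 10, 13]

r2_intercept = r_square_with_intercept(toy_x, toy_y)
r2_origin = r_square_through_origin(toy_x, toy_y)
print(round(r2_intercept, 3), round(r2_origin, 3))
```

On this toy data the model with a constant explains about 9% of the variation about the mean, while the no-constant fit reports about 83% because its baseline is zero rather than the mean of y. The two figures answer different questions, which is why the comparison of 0.624 with 0.914 needs care.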
ANOVA test: all F values have Sig < 0.05, which is very good, and F increases with each
iteration.
The hypotheses for each coefficient test are:
Null hypothesis B=0
Alternative hypothesis B≠0
All coefficients are significant, with Sig values of less than 0.05.
The problem that appears here is that two variables have VIF > 10, Dissolved Oxygen and
Maximum Water Depth, which may indicate multicollinearity.
The main problem with modeling through the origin is that the variables appear to be
collinear, with a high Pearson correlation between them.
We therefore need to keep an eye on the high-VIF variables [Dissolved Oxygen and Maximum
Water Depth], because they are also highly correlated, or remove one of them from the model.
The condition indices are computed as the square roots of the ratios of the largest
eigenvalue to each successive eigenvalue. Values greater than 15 indicate a possible
problem with collinearity; greater than 30, a serious problem. We do not have any variable
with CI > 15.
From the test of normality we can see that all residuals are normally distributed with
Sig > 0.05.
The scatter plot suggests that the variance of the residuals is not constant at each level of
the predictor variables.
There is no clear evidence for or against linearity.
Influential points:
1- Cook's distance: no case has a Cook's distance of more than 1.
2- Leverage: cases with more than 3 times the average leverage, 3 × 0.103 = 0.309, are
concerning. No case has a leverage value of more than 0.309.
3- DFBeta: values > 1 are concerning. No DFBetas are more than 1.
4- Covariance ratios: with $1 \pm \frac{3(k+1)}{n} = 1 \pm \frac{3(6+1)}{68}$, no values are
outside the range [0.691 to 1.309].
5- Mahalanobis distance: values above 15 need attention. No Mahalanobis distance is above 15.
In conclusion, the model does not have many influential points: few cases are flagged by the
indicators, and none of them has a Cook's distance of more than 1.
3 Logistic Model:
In the logistic model we convert the continuous dependent variable into a binary
categorical variable using the following categories:
If the number of fish is in the range [0 to 60] then it is a stream poor in aquatic life. [0]
If the number of fish is more than 60 then it is a stream rich in aquatic life. [1]
We also need to convert one independent variable into a binary categorical variable.
I chose to convert the area drained by the stream:
If the area drained by the stream is in the range [0 to 7000] acres, then it is a small area [0]
If the area drained by the stream is more than 7000 acres, then it is a large area [1]
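The two recoding rules above can be expressed as small functions. This is a sketch with hypothetical function names; in the project the recoding was done in SPSS.

```python
def code_aquatic_life(number_of_fish):
    """0 = poor stream (0 to 60 fish), 1 = rich stream (more than 60 fish)."""
    return 1 if number_of_fish > 60 else 0


def code_drained_area(acres):
    """0 = small area (0 to 7000 acres), 1 = large area (more than 7000 acres)."""
    return 1 if acres > 7000 else 0


# Boundary values fall into the lower category under the rules stated in the text.
print(code_aquatic_life(60), code_aquatic_life(61))
print(code_drained_area(7000), code_drained_area(7001))
```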
Logistic Model 1 :
Using the forced Enter method, we enter all variables, declare [X1] as a categorical
independent variable and apply dummy coding through the Categorical option in binary
logistic regression.
The results for Block 0 are as follows:
The classification table gives a quick indication of the percentage of cases classified
correctly; in Block 0 of our model it is 75%.
-2LL = 76.478. We always seek the minimum -2 log likelihood; this number should improve
[go down] in Block 1.
Note: there is no variable in Block 0, only the constant.
Exp(B) is the odds ratio. For a predictor it tells how a 1-unit change in the variable changes
the odds of the outcome coded “1”, in our case “Rich Aquatic Life”, so it should be far from 1.
In Block 0, Exp(B) of the constant is simply the odds of a rich stream, 51/17 = 3.
The null hypothesis is that the coefficients of interest are simultaneously equal to zero. If the
test fails to reject the null hypothesis, the variables could be left out of the model without
substantially harming its fit.
Here Sig = .000 < 0.05, so we reject the null hypothesis, which is a good sign.
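The Block 0 figures can be reproduced from the class counts alone, since the constant-only model predicts the same probability for every case. The sketch below assumes the 51 rich and 17 poor streams reported in the classification table; the variable names are illustrative.

```python
import math

# Class counts from the classification table: 51 rich streams, 17 poor streams.
rich, poor = 51, 17
n_cases = rich + poor

p_rich = rich / n_cases       # share classified correctly by always predicting "rich"
odds_rich = rich / poor       # Exp(B) of the constant in Block 0
b_constant = math.log(odds_rich)

# -2 log likelihood of the constant-only model.
neg2ll_null = -2 * (rich * math.log(p_rich) + poor * math.log(1 - p_rich))

print(p_rich, odds_rich, round(b_constant, 3), round(neg2ll_null, 3))
```

The computed -2LL agrees with the 76.478 reported for Block 0, and the 75% correct classification is simply the share of the larger class.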
Block 1:
As we see, the -2 log likelihood decreases at each iteration, which is what we want: the
minimum -2LL for the model.
From the classification table, the percentage of correct classification increases from 75% to
95.6%, which is very good.
The model predicts 100% of the rich aquatic life cases correctly [51 out of 51] and 82.4% of
the poor aquatic life cases [14 out of 17].
From the Model Summary, the new -2LL is 17.632, which is lower than the value in Block 0,
and that is good. We can also see the two pseudo R Square measures, which are analogous to
R Square in linear regression: Cox & Snell = 0.579 and Nagelkerke = 0.858. Both depend on
-2LL, and both can be considered good.
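Both pseudo R Square measures can be recomputed from the two -2LL values and the sample size. The sketch below uses the standard Cox & Snell and Nagelkerke formulas with the values quoted in the text; the function names are hypothetical.

```python
import math


def cox_snell_r_square(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R Square from the -2LL of the null and fitted models."""
    return 1 - math.exp(-(neg2ll_null - neg2ll_model) / n)


def nagelkerke_r_square(neg2ll_null, neg2ll_model, n):
    """Cox & Snell R Square rescaled by its maximum attainable value."""
    maximum = 1 - math.exp(-neg2ll_null / n)
    return cox_snell_r_square(neg2ll_null, neg2ll_model, n) / maximum


# Values quoted in the text: Block 0 -2LL = 76.478, Block 1 -2LL = 17.632, 68 cases.
cox_snell = cox_snell_r_square(76.478, 17.632, 68)
nagelkerke = nagelkerke_r_square(76.478, 17.632, 68)
print(round(cox_snell, 3), round(nagelkerke, 3))
```

Cox & Snell cannot reach 1 even for a perfect model, which is why Nagelkerke divides it by its maximum and is always the larger of the two.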
One of the most important tests of model fit is the Hosmer and Lemeshow test, which uses the
chi-square distribution.
Null Hypothesis: the observed and expected proportions are the same across all
groups [this means a well-fitted model]
Alternative Hypothesis: the observed and expected proportions are not the same
Therefore we would like to retain the null hypothesis. In our model Sig = 0.89 > 0.05, so we
fail to reject the null hypothesis and conclude that the model is well fitted.
The Omnibus test shows whether the reduction in -2LL is significant. For example, -2LL was
76.47 in Block 0 and was reduced to 17.623 in Block 1; the Omnibus test answers whether this
reduction is significant.
Null hypothesis of the Omnibus test: the reduction in -2LL from the baseline model is not
significant
Alternative hypothesis: the reduction in -2LL is significant
In our model Sig from the Omnibus test is .000, which is < 0.05, so we reject the null
hypothesis and accept that the reduction in -2LL is significant.
Null hypothesis: B=0. Alternative: B≠0.
This Sig is for the Wald statistic, which SPSS computes as Wald = [B/SE]^2.
Looking at the Variables in the Equation table, we note first that [X2], [X3], [X5], [X6] and
the constant have Sig > 0.05, which is bad. On the other hand, [X1] and [X4] have Sig < 0.05.
Another problem is that for [X5] and [X3] the Exp(B) [odds ratio] is too close to 1, which is a
bad sign: it means that a 1-unit increase in the variable hardly changes the odds of “1”
(Rich Aquatic Life).
A further problem is the confidence interval of Exp(B). This interval should not contain 1.
The only variables that meet this criterion are [X1] and [X4]; the intervals of all other
variables contain 1, which is also a bad sign.
From the plot we can see that most values are clustered at the edges, which is what we want:
values near the 50% probability cutoff increase the classification error, but in our model the
values are scattered at the edges.
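The link between the Wald statistic, its Sig value and the confidence interval of Exp(B) can be sketched as follows. The coefficient and standard error are hypothetical illustration values, not taken from the SPSS output, and the function names are illustrative. The p-value uses the chi-square distribution with 1 degree of freedom, written with the complementary error function from the standard library.

```python
import math


def wald_statistic(b, se):
    """Wald = (B / SE)^2."""
    return (b / se) ** 2


def wald_p_value(wald):
    """Upper tail of the chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(wald / 2))


def odds_ratio_ci(b, se, z=1.96):
    """Approximate 95% confidence interval for Exp(B)."""
    return math.exp(b - z * se), math.exp(b + z * se)


# Hypothetical coefficient and standard error, for illustration only.
b_toy, se_toy = 2.0, 1.0
wald_toy = wald_statistic(b_toy, se_toy)
p_toy = wald_p_value(wald_toy)
ci_low, ci_high = odds_ratio_ci(b_toy, se_toy)

print(wald_toy, round(p_toy, 4))
print(round(ci_low, 3), round(ci_high, 3))
```

In this toy case Sig is just below 0.05 and the lower limit of the interval is just above 1. The two criteria used in the text are closely related: a coefficient with Sig < 0.05 has a 95% interval for Exp(B) that excludes 1.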
We need to check linearity:
Linearity is met because all Sig > 0.05.
Outliers beyond 2 standard deviations and influential cases:
1- Cook's distance > 1: 3 cases have a Cook's distance of more than 1 [case 5, case 12 and
case 17]. These need attention because they are outliers, as shown in the casewise list, and
they are also influential points.
2- Leverage: the average leverage is (k+1)/n = (6+1)/68 = 0.103. Cases with more than 3 times
the average leverage, 3 × 0.103 = 0.309, are concerning.
In our model the cases marked with a red line in Appendix 4 have values of more than 0.309.
All of them have a Cook's distance of less than 1, so we can ignore them, except for case 12.
3- DFBeta values should be less than 1; the cases that exceed 1 are marked with a red line in
Appendix 4.
Multicollinearity is checked by running an ordinary multiple regression:
All VIF values are less than 10, so this is fine, but we may need to take care of Temperature
and Dissolved Oxygen because they already showed a high correlation in the linear regression
models.
Collinearity Statistics
Variable                                                       Tolerance   VIF
Drained Area [X1]                                              .651        1.537
Dissolved Oxygen Concentration [mg/l] [X2]                     .289        3.459
Maximum Water Depth [cm] [X3]                                  .639        1.565
Nitrate Concentration [mg/l] [X4]                              .726        1.378
Sulfate Concentration [mg/l] [X5]                              .954        1.048
The Water Temperature on The Sampling Date [Degrees C] [X6]    .319        3.131
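Tolerance and VIF carry the same information, because VIF = 1 / Tolerance. The short check below uses the values from the Collinearity Statistics table; small gaps are expected since the table rounds both columns to three decimals.

```python
# (Tolerance, VIF) pairs copied from the Collinearity Statistics table.
collinearity = {
    "X1": (0.651, 1.537),
    "X2": (0.289, 3.459),
    "X3": (0.639, 1.565),
    "X4": (0.726, 1.378),
    "X5": (0.954, 1.048),
    "X6": (0.319, 3.131),
}

for name, (tolerance, vif) in collinearity.items():
    # VIF recomputed from the rounded tolerance, next to the tabulated VIF.
    print(name, round(1 / tolerance, 3), vif)

largest_gap = max(abs(1 / tol - vif) for tol, vif in collinearity.values())
print(round(largest_gap, 4))
```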
Logistic Model [2] :

The purpose of this model is to build a Forward LR model in logistic regression. This helps to
eliminate the collinear variables and to find the non-significant variables in the model.
From the classification table we can see that the percentage of correct classification is 75%.
Block 1:
The percentage of correct classification increases with each step of the forward method. The
final value is 95.6% [the same as Model 1 with the forced entry method], although some
variables are eliminated.
From the Model Summary of Block 1 we can see that each step has a lower -2 log likelihood
than the step before, which means that each step gives a better model, and that step 3
reaches the minimum -2LL obtained.
The forward method starts with an empty model and adds the most significant variable, then
the next most significant variable. The model stops when adding an extra variable no longer
reduces -2LL significantly.
Our model has Cox & Snell R Square = 0.573 and Nagelkerke R Square = 0.849. These are
calculated from -2LL [as in Model 1].
The model starts with [X4] because it is the most significant of all the variables, then adds
[X1] in step 2 and then [X2]. After that the model stops because adding any of the
remaining variables [X3], [X5], [X6] would not improve the model significantly: all of them
have Sig > 0.05, so no further addition happens.
From the Variables in the Equation table we find that [X1], [X2] and [X4] have Sig < 0.05, so
this model has better coefficient significance than Logistic Model 1.
In addition, the confidence interval of Exp(B) does not contain 1 for any variable, which is
better than Logistic Model 1.
Hosmer and Lemeshow test:
Null Hypothesis: the observed and expected proportions are the same across all
groups [this means a well-fitted model]
Alternative Hypothesis: the observed and expected proportions are not the same
In our model Sig = 0.763 in step 3, so we fail to reject the null hypothesis and the model is
well fitted.
Null hypothesis of the Omnibus test: the reduction in -2LL from the baseline model is not
significant
Alternative hypothesis: the reduction in -2LL is significant
In our model Sig from the Omnibus test is .000, which is < 0.05, so we reject the null
hypothesis and accept that the reduction in -2LL is significant.
From this chart we can see that the predicted probabilities are scattered at the edges, which
means the model is a very good one: we do not have any values around the 50% cutoff point.
A model with many values around 50% would be a bad one.
Outliers beyond 2 standard deviations
Influential cases
1- Cook's distance: only two cases have a Cook's distance > 1, cases 5 and 12.
2- Leverage: the average leverage is (k+1)/n = (3+1)/68 = 0.0588. Cases with more than 3
times the average leverage, 3 × 0.0588 = 0.1765, are concerning.
The cases with values above this cutoff are marked with a red line in Appendix 5, but they are
not concerning because they have a Cook's distance of less than 1.
3- DFBeta > 1: the cases with DFBeta > 1 are marked with a red line in Appendix 5.
Linearity is met here: all Sig > 0.05.
4 Cross Validation On Model [1] in Multiple Linear Regression:
The sample is split into a 20% part and an 80% part, and the R Square values of the two parts
are compared. With a sample size of 68, 68 × 0.2 ≈ 14 and 68 × 0.8 ≈ 54,
so the sample is divided into two subsamples:
the first with 14 cases and the second with 54 cases.
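A 20% / 80% split of the 68 cases can be sketched as below. The random seed and the function name are assumptions made for the illustration; the report does not state how the cases were assigned to the two samples in SPSS.

```python
import random


def split_cases(n_cases, fraction, seed):
    """Randomly split case numbers 1..n_cases into two disjoint samples."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    case_ids = list(range(1, n_cases + 1))
    size = round(n_cases * fraction)
    sample_1 = sorted(rng.sample(case_ids, size))
    chosen = set(sample_1)
    sample_2 = [case for case in case_ids if case not in chosen]
    return sample_1, sample_2


# Hypothetical seed; 68 cases split into roughly 20% and 80%.
sample_1, sample_2 = split_cases(68, 0.2, seed=2019)
print(len(sample_1), len(sample_2))
```

Each subsample would then be fitted separately with the Model 1 predictors and the two R Square values compared.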
Sample 1
Model Summary for Sample 1:
Sample 2
Model Summary for Sample 2:
Sample 1 with [14] cases has R Square 0.757 and Adjusted R Square 0.548.
Sample 2 with [54] cases has R Square 0.621 and Adjusted R Square 0.572.
For the full sample, Model 1 has R Square 0.624 and Adjusted R Square 0.587, as shown in its
Model Summary.
Sample [2] is closer to Model 1 because it has a larger number of observations [54], while
Sample 1 has only 14 observations, which makes its R Square [0.757] somewhat different from
0.621. A larger sample may be needed for a strong cross-validation check.
This section is intended only as a quick application of the method to show how it is applied.
5 Appendices
Appendix 1 [ Case Summary For Model 1]
Columns: Case, Cook's Distance, Centered Leverage Value, Mahalanobis Distance, COVRATIO, Standardized DFBETA (Intercept, X1, X2, X3, X4, X5, X6)
1 .00069 .15025 10.06677 1.34066 1.02138 -.00451 -.03909 -.00078 -.00248 .00807 -.00514
2 .00658 .14421 9.66225 1.29790 .05623 .00235 -.03863 -.10179 .03122 -.09536 -.00271
3 .00318 .15063 10.09191 1.32780 .00209 -.01670 -.04817 .01180 .02779 .07068 .01238
4 .01419 .10022 6.71468 1.16121 .01580 .04986 .09660 -.13285 -.03750 .02337 -.06391
5 .04988 .08951 5.99744 .88037 -.24060 .24578 .09800 .26243 .41436 .11511 .06246
6 .00288 .13559 9.08451 1.30403 .02732 -.01279 -.05957 .01575 .06016 .06395 -.02854
7 .15941 .14914 9.99236 .67612 -.10569 .40476 .14591 -.73090 -.64427 .20693 .34099
8 .03160 .10753 7.20438 1.06330 .30011 .06450 -.35407 .09146 -.24468 -.05557 -.24064
9 .00585 .08436 5.65182 1.19381 -.03886 .05666 -.02142 .04704 -.12855 .05272 .05971
10 .01547 .06336 4.24506 1.04984 -.12692 .12196 .13166 -.00706 .16230 .11941 .02553
11 .00049 .06990 4.68307 1.22116 .01846 -.00956 -.00303 -.03125 .01067 -.02927 -.00780
12 .14901 .26662 17.86387 1.14273 .33179 .08331 -.22796 -.21340 -.54978 -.73364 -.05450
13 .00068 .06719 4.50203 1.21534 -.02767 -.00038 .01314 -.00553 .01405 .00066 .03905
14 .00272 .06989 4.68249 1.19768 -.00537 .00107 -.00020 -.03216 .02438 -.06248 .03775
15 .01129 .09172 6.14491 1.16378 .05123 .01179 -.03649 -.06013 .15184 -.13737 -.02324
16 .00235 .12614 8.45117 1.29170 .04462 -.03031 -.00855 -.02725 .08195 -.01097 -.06313
17 .02383 .03277 2.19535 .79394 -.05308 -.18512 .19256 -.17194 .00437 -.04692 .02825
18 .02504 .10568 7.08039 1.10013 .06304 .17499 -.23836 .21971 .10246 .01237 -.07642
19 .00010 .09956 6.67076 1.26672 -.00770 .00438 -.00086 .02048 .01186 -.00240 .00231
20 .00106 .03453 2.31335 1.16160 -.04707 .04504 .03438 .01692 .03786 -.02085 .03351
21 .02560 .11504 7.70796 1.12231 -.32096 .19553 .27130 -.08757 .18763 .12234 .26377
22 .10289 .09015 6.03996 .59617 .00223 .34822 .00880 -.60408 .10836 .26364 .05974
23 .00248 .01524 1.02107 1.08475 -.06366 .03662 .04169 .01575 .07467 .02247 .03577
24 .00797 .15209 10.18976 1.30490 .00319 .08266 -.11265 .13202 .06633 .08982 -.02617
25 .00987 .05296 3.54803 1.07857 .01509 .06476 -.13598 .09842 .01512 .10613 -.01173
26 .01211 .14502 9.71612 1.26915 .10294 .09422 -.00954 -.11543 .12495 -.20149 -.11921
27 .00300 .04508 3.02032 1.14956 .00023 .01844 .01906 -.01683 .10811 .00440 -.03985
28 .03805 .36387 24.37918 1.71788 .20273 -.19940 -.14527 -.08424 -.13627 .36842 -.22369
29 .02059 .11886 7.96330 1.16291 .19611 -.15862 -.13562 -.11492 -.26659 -.08619 -.09947
30 .00017 .03041 2.03740 1.17229 .01465 -.00852 -.00619 -.01436 -.01469 .01202 -.01135
31 .03732 .09296 6.22819 .97692 .01463 -.36965 -.08344 .26839 .13969 -.01517 -.05436
32 .00003 .04284 2.87023 1.19075 .00271 -.00286 .00299 -.00727 -.00555 -.00732 .00000
33 .00153 .04237 2.83906 1.16660 .08152 -.00505 -.06631 .00484 .01090 -.04558 -.07508
34 .00052 .07023 4.70541 1.22138 .00332 -.01246 .02043 -.01934 .02397 -.02913 -.01153
35 .00014 .10459 7.00734 1.27370 -.01677 .01192 .02296 -.01139 .01609 -.00982 .01403
36 .00256 .14018 9.39174 1.31357 .06260 .08096 -.06307 -.02926 .03936 .02320 -.06817
37 .03623 .06047 4.05147 .84059 -.07031 -.22257 .11887 -.25032 -.02292 .10887 .11776
38 .00059 .21269 14.25001 1.45074 -.02941 -.04635 .02424 .01545 -.02323 .01924 .03055
39 .00414 .02359 1.58025 1.07319 -.06072 -.05061 .03406 -.00211 .04930 .06656 .04960
40 .00319 .07894 5.28887 1.20824 .04196 -.02132 -.00383 .01864 .09687 -.05831 -.07140
41 .00207 .05313 3.55987 1.17712 -.08978 .02981 .07691 -.02045 -.00791 .00099 .10410
42 .00166 .02062 1.38135 1.12205 -.06710 .01601 .04276 .01322 .00172 .01608 .07602
43 .00014 .07952 5.32780 1.23813 .00279 -.00829 -.00473 .01383 .01225 -.01839 -.00445
44 .00020 .03234 2.16653 1.17423 -.01487 -.02114 .01061 .00437 -.00838 .00149 .01787
45 .00200 .10300 6.90090 1.25720 -.09762 .02883 .09747 -.00760 .04573 -.01718 .08950
46 .00532 .03030 2.02985 1.07298 .06689 -.05129 -.08048 -.03225 .03165 .04200 -.05883
47 .00000 .05314 3.56066 1.20438 .00012 -.00102 -.00057 .00123 .00043 .00009 -.00028
48 .00004 .18885 12.65286 1.40943 .00350 -.00756 .00222 -.00320 .00588 .00081 -.00695
49 .00137 .01545 1.03485 1.11715 -.03236 -.00723 .00633 .03988 -.02893 .02894 .04040
50 .00290 .04527 3.03292 1.15138 -.07696 .03317 .08000 -.03390 -.04122 -.04238 .10330
51 .00116 .04045 2.71032 1.16945 -.04467 .03070 .03600 -.00685 -.03255 -.01513 .06095
52 .01504 .09674 6.48182 1.14669 -.21589 .16041 .22961 -.09883 .07216 -.11718 .23272
53 .07472 .17385 11.64785 1.06339 -.55395 -.23474 .50208 .01523 -.08383 .35079 .53766
54 .00042 .04811 3.22362 1.19192 -.00797 -.02747 .01091 -.00028 .02103 -.01498 .00652
55 .00036 .07910 5.29974 1.23542 -.00785 -.02024 -.00302 .03923 -.00583 .02200 .00074
56 .00687 .02462 1.64943 1.01987 .15619 -.06248 -.12029 .03157 -.06349 -.00625 -.14992
57 .00022 .16033 10.74208 1.35974 .00164 -.02893 -.00357 .01113 -.00725 -.01183 .00306
58 .00249 .05697 3.81698 1.17833 .01596 -.00919 -.04061 .05214 -.05457 .08834 -.02076
59 .01012 .04501 3.01560 1.04931 -.01418 .01867 .02344 .13157 .14975 -.08761 -.04382
60 .01275 .04268 2.85928 1.00447 .08575 -.11786 -.11374 .20262 -.07803 -.04460 -.08120
61 .04674 .06237 4.17861 .76431 .12311 -.31207 -.10821 .37256 .19029 -.06223 -.22489
62 .03023 .06335 4.24451 .90855 -.11401 -.27798 .21738 .01923 -.17083 .05091 .10262
63 .02461 .17143 11.48592 1.26450 .00541 -.15093 -.05739 .35594 .06446 -.07628 -.06697
64 .02681 .03342 2.23894 .76002 .09741 -.27857 -.02610 .16583 -.11004 -.00386 -.12350
65 .00884 .04455 2.98477 1.06523 .05372 .09861 -.00889 .02986 .01904 -.07915 -.08344
66 .01239 .02170 1.45421 .89085 .07098 .06867 -.03627 .05534 -.06443 -.11314 -.06680
67 .02092 .03876 2.59716 .87518 .18214 .10130 -.18048 .11286 -.10699 -.03605 -.18265
68 .01724 .03862 2.58728 .92325 .07954 .18303 -.04189 .00220 -.07044 -.05912 -.08553
Total N 68 68 68 68 68 68 68 68 68 68 68
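The table above can be screened against the cutoffs quoted in the report's influence checks (Cook's distance above 1, leverage above 0.309, COVRATIO outside [0.691, 1.309], Mahalanobis distance above 15, standardized DFBETA above 1 in absolute value). The sketch below is a minimal illustration, not the procedure used in SPSS. It assumes the leverage and COVRATIO limits come from the common rule of thumb 3(k+1)/n and 1 ± 3(k+1)/n with n = 68 and k = 6, which gives the same 0.309 and [0.691, 1.309]. The helper names `cutoffs` and `flag` are hypothetical, and only four rows are copied from the table.

```python
def cutoffs(n, k):
    """Rule-of-thumb cutoffs for n cases and k predictors (assumed formulas)."""
    avg_leverage = (k + 1) / n
    return {
        "cooks": 1.0,
        "leverage": 3 * avg_leverage,
        "mahalanobis": 15.0,  # fixed value quoted in the report
        "covratio": (1 - 3 * avg_leverage, 1 + 3 * avg_leverage),
        "dfbeta": 1.0,
    }


def flag(row, c):
    """Return the diagnostics on which one case exceeds its cutoff."""
    cooks, leverage, mahalanobis, covratio, dfbetas = row
    low, high = c["covratio"]
    reasons = []
    if cooks > c["cooks"]:
        reasons.append("cooks")
    if leverage > c["leverage"]:
        reasons.append("leverage")
    if mahalanobis > c["mahalanobis"]:
        reasons.append("mahalanobis")
    if not low <= covratio <= high:
        reasons.append("covratio")
    if any(abs(d) > c["dfbeta"] for d in dfbetas):
        reasons.append("dfbeta")
    return reasons


# Cases 7, 12, 28 and 47 copied from the Appendix 1 table:
# (Cook's, leverage, Mahalanobis, COVRATIO, DFBETAs for Intercept, X1..X6)
rows = {
    7: (0.15941, 0.14914, 9.99236, 0.67612,
        (-0.10569, 0.40476, 0.14591, -0.73090, -0.64427, 0.20693, 0.34099)),
    12: (0.14901, 0.26662, 17.86387, 1.14273,
         (0.33179, 0.08331, -0.22796, -0.21340, -0.54978, -0.73364, -0.05450)),
    28: (0.03805, 0.36387, 24.37918, 1.71788,
         (0.20273, -0.19940, -0.14527, -0.08424, -0.13627, 0.36842, -0.22369)),
    47: (0.00000, 0.05314, 3.56066, 1.20438,
         (0.00012, -0.00102, -0.00057, 0.00123, 0.00043, 0.00009, -0.00028)),
}

c = cutoffs(n=68, k=6)
flagged = {case: flag(row, c) for case, row in rows.items()}
for case, reasons in flagged.items():
    print(case, reasons or "no flags")
```

Applied to all 68 rows, the same loop would list every case that deserves a closer look. A flag is only a prompt to inspect the case and is not by itself a reason to delete it.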
Appendix 2 [ Case Summary For Model 2]
Case Summaries (a)
Columns: Case, Cook's Distance, Centered Leverage Value, Mahalanobis Distance, COVRATIO, Standardized DFBETA (Intercept, X1, X2, X3, X4, X5, X6)
A "." marks a value that is not reported; X5 and X6 are not in Model 2.
1 .00090 .14688 9.84122 1.28969 .04621 -.00260 -.05183 -.00245 -.00189 . .

2 .00762 .11240 7.53083 1.21548 .11992 -.01107 -.07358 -.08998 .02038 . .
3 .00286 .11247 7.53580 1.23150 .05201 -.00800 -.07117 .00167 .03418 . .
4 .01906 .09460 6.33822 1.14310 -.10953 .06413 .23056 -.14481 -.04010 . .
5 .06771 .08481 5.68211 .93761 -.46669 .25980 .09761 .25554 .44032 . .
6 .00248 .09666 6.47602 1.20949 .01978 .00039 -.04385 .00307 .06033 . .
7 .19152 .12872 8.62454 .78586 .61325 .38693 -.14240 -.73719 -.59417 . .
8 .03111 .07494 5.02115 1.04809 .19516 .09206 -.26748 .07913 -.26614 . .
9 .00685 .06976 4.67365 1.14877 .05771 .05584 -.09450 .04414 -.11802 . .
10 .01861 .05290 3.54397 1.04820 -.24473 .13717 .19283 -.02381 .17934 . .
11 .00058 .04773 3.19778 1.15145 .02356 -.01359 -.00026 -.02953 .00715 . .
12 .07896 .12546 8.40578 1.03563 .47853 -.01055 -.35579 -.09125 -.56217 . .
13 .00065 .04051 2.71421 1.14155 .02231 -.00621 -.02597 -.00235 .01732 . .
14 .00279 .04449 2.98099 1.13131 .06247 -.01388 -.05660 -.01992 .02053 . .
15 .01213 .06587 4.41317 1.11491 .04149 -.00423 -.05193 -.04156 .13635 . .
16 .00233 .09114 6.10625 1.20208 -.03786 -.02230 .05802 -.03047 .07449 . .
17 .03387 .03190 2.13714 .85634 -.08875 -.20326 .26207 -.16752 .00099 . .
18 .03443 .10148 6.79942 1.10350 -.01548 .19349 -.27974 .21642 .10013 . .
19 .00016 .09759 6.53828 1.21972 -.01694 .00405 -.00478 .02289 .01271 . .
20 .00120 .02337 1.56578 1.11266 -.05165 .03888 .01047 .02400 .03928 . .
21 .01775 .05673 3.80060 1.06365 -.16976 .16810 .12164 -.08104 .21225 . .
22 .13143 .08065 5.40341 .71118 .23103 .38443 -.01404 -.65065 .14536 . .
23 .00320 .01230 .82405 1.06341 -.07795 .03558 .02624 .01569 .08169 . .
24 .00841 .12460 8.34852 1.23290 -.02872 .09608 -.12260 .11163 .07190 . .
25 .01133 .04159 2.78626 1.06397 .04225 .08245 -.17841 .08146 .02678 . .
26 .00575 .04617 3.09318 1.11346 -.07128 .08032 .08827 -.09015 .08812 . .
27 .00392 .04037 2.70449 1.11618 -.09765 .02561 .07807 -.02139 .10747 . .
28 .00538 .08142 5.45503 1.17464 .06345 -.07357 .06302 -.10290 -.06977 . .
29 .02582 .10372 6.94952 1.13792 .26054 -.16003 -.10717 -.11220 -.28852 . .
30 .00016 .01926 1.29066 1.11934 .01415 -.00496 .00550 -.01652 -.01348 . .
31 .05318 .09171 6.14442 1.01218 -.10120 -.37663 -.06853 .27426 .13728 . .
32 .00004 .02755 1.84569 1.13063 .00635 -.00483 .00418 -.00757 -.00787 . .
33 .00065 .00296 .19858 1.08704 .02099 -.00038 -.02111 .00558 -.00003 . .
34 .00060 .04781 3.20341 1.15143 -.02969 -.01603 .04295 -.01694 .02104 . .
35 .00017 .06491 4.34922 1.17608 -.01560 .01003 .02019 -.01020 .01874 . .
36 .00211 .09278 6.21593 1.20532 .00650 .08862 -.01220 -.03621 .03446 . .
37 .04663 .05355 3.58765 .89718 .13429 -.23055 .06304 -.26286 -.00142 . .
38 .00043 .14372 9.62895 1.28605 .00152 -.04231 .00392 .01316 -.01628 . .
39 .00450 .01496 1.00263 1.05279 -.02201 -.04954 .00494 -.00841 .06142 . .
40 .00283 .04495 3.01155 1.13177 -.08026 -.01932 .06897 .02193 .08535 . .

41 .00066 .00158 .10596 1.08383 .01515 .01453 -.00381 -.01162 -.00006 . .
42 .00113 .00245 .16427 1.07427 .01161 .00719 -.02113 .01745 .00935 . .
43 .00016 .04489 3.00735 1.15075 -.01024 -.01209 -.00589 .01922 .01144 . .
44 .00022 .02150 1.44040 1.12132 .00458 -.02445 -.00456 .00585 -.00706 . .
45 .00102 .03015 2.02038 1.12447 -.04609 .01315 .04131 .00270 .04967 . .
46 .00636 .02356 1.57855 1.05676 .04742 -.03750 -.04836 -.04459 .03261 . .
47 .00000 .05128 3.43551 1.15981 -.00024 -.00064 -.00035 .00079 .00028 . .
48 .00002 .14954 10.01908 1.29612 -.00539 -.00467 .00843 -.00284 .00393 . .
49 .00146 .00804 .53866 1.08124 .02108 -.00933 -.03343 .03952 -.02283 . .
50 .00141 .00621 .41594 1.07780 .03555 .01216 -.00525 -.01869 -.03882 . .
51 .00081 .01225 .82103 1.10052 .02615 .02016 -.01905 .00074 -.03040 . .
52 .00612 .02075 1.39050 1.05082 -.03929 .10686 .05853 -.05888 .07312 . .
53 .02093 .03618 2.42373 .97514 -.05495 -.23189 .17096 .00547 -.00236 . .
54 .00058 .04201 2.81462 1.14402 -.01022 -.03266 .00708 .00279 .02093 . .
55 .00033 .06080 4.07328 1.16986 -.01184 -.01598 -.00178 .03304 -.00293 . .
56 .00508 .00629 .42115 1.00641 .04879 -.04182 -.00988 .01994 -.07570 . .
57 .00037 .14235 9.53763 1.28410 .01005 -.03701 -.01299 .01571 -.00993 . .
58 .00156 .02161 1.44754 1.10580 .01627 .00604 -.02246 .03484 -.04338 . .
59 .01270 .03731 2.49965 1.04155 -.17514 .01326 .07510 .14586 .13968 . .
60 .01663 .03747 2.51057 1.01233 .01798 -.11582 -.08862 .20832 -.09134 . .
61 .05574 .05051 3.38428 .83521 -.24356 -.29433 .08881 .36983 .16852 . .
62 .04071 .05885 3.94292 .94995 -.03971 -.29476 .22635 .02055 -.16053 . .
63 .03322 .16091 10.78087 1.23473 -.17561 -.15654 -.02238 .37180 .05162 . .
64 .03541 .02968 1.98846 .82973 -.04434 -.26918 .10651 .16010 -.12230 . .
65 .01006 .03262 2.18553 1.04861 -.08535 .10274 .07285 .03599 .00362 . .
66 .01451 .01497 1.00325 .92151 -.00647 .06473 .00446 .06929 -.08465 . .
67 .02280 .02669 1.78818 .91230 .02956 .12601 -.06961 .10478 -.12644 . .
68 .02272 .03419 2.29064 .95275 -.01423 .19364 .02709 .00426 -.08573 . .
Total N 68 68 68 68 68 68 68 68 68
a. Limited to first 100 cases.
Appendix 3 [ Case Summary For Model 3]
Case Summaries (a)
Columns: Case, Cook's Distance, Centered Leverage Value, Mahalanobis Distance, COVRATIO, Standardized DFBETA (Intercept, X1, X2, X3, X4, X5, X6)
A "." marks a value that is not reported.
1 .00000 .08348 5.67640 1.16201 . .00008 -.00257 .00144 .00374 . .
2 .00335 .07854 5.34050 1.14446 . -.00148 .00546 -.03678 .10714 . .
3 .00160 .10276 6.98801 1.18287 . -.00302 -.03646 .01689 .07173 . .
4 .02312 .09550 6.49369 1.11430 . .05903 .22213 -.21113 -.16806 . .
5 .03233 .03762 2.55844 .89500 . .22652 -.28102 .08832 .16517 . .
6 .00249 .10780 7.33050 1.18755 . .00173 -.03734 .01026 .09295 . .
7 .12481 .09139 6.21440 .84867 . .39054 .31251 -.49212 -.21289 . .
8 .02538 .06790 4.61734 1.04628 . .09937 -.17112 .15182 -.16840 . .
9 .00663 .07617 5.17924 1.12981 . .05598 -.06958 .06560 -.10004 . .
10 .00911 .02429 1.65205 .99502 . .12432 .04387 -.12977 .01560 . .
11 .00033 .05026 3.41744 1.11965 . -.00886 .01519 -.01673 .02412 . .
12 .03018 .06073 4.12971 1.00724 . .02166 -.04497 .07872 -.26826 . .
13 .00046 .04663 3.17091 1.11451 . -.00369 -.01221 .00525 .03675 . .
14 .00186 .04246 2.88701 1.10060 . -.00790 -.01770 .00315 .07556 . .
15 .01405 .07828 5.32279 1.10841 . -.00105 -.03192 -.02744 .22207 . .
16 .00346 .09264 6.29949 1.16384 . -.02943 .05123 -.05602 .07780 . .
17 .04171 .04452 3.02749 .88539 . -.21401 .27502 -.21975 -.08425 . .
18 .04397 .11603 7.88996 1.10704 . .19512 -.38997 .22969 .12483 . .
19 .00057 .07044 4.79019 1.14357 . .00597 -.04573 .03852 .00312 . .
20 .00108 .02094 1.42410 1.07418 . .04003 -.03577 .00581 .00585 . .
21 .01644 .04829 3.28381 1.03115 . .16312 .01361 -.16311 .13780 . .
22 .14680 .08826 6.00189 .78436 . .39524 .18013 -.59643 .41162 . .
23 .00275 .01666 1.13318 1.03976 . .03145 -.03532 -.01527 .04084 . .
24 .01143 .13654 9.28493 1.21131 . .09923 -.19827 .11461 .07574 . .
25 .01328 .05452 3.70709 1.06294 . .08454 -.19698 .10312 .07582 . .
26 .00673 .05002 3.40154 1.08572 . .08021 .05863 -.13437 .05749 . .
27 .00305 .02796 1.90129 1.06694 . .02021 .02021 -.06906 .06147 . .
28 .00439 .08156 5.54593 1.14532 . -.06049 .12228 -.07491 -.03154 . .
29 .01041 .05611 3.81575 1.07977 . -.11577 .07061 -.01328 -.12388 . .
30 .00008 .02540 1.72686 1.09196 . -.00280 .01415 -.00869 -.00370 . .
31 .06681 .10240 6.96294 1.02210 . -.39423 -.18387 .26072 .09534 . .
32 .00001 .03451 2.34641 1.10302 . -.00187 .00477 -.00241 -.00207 . .
33 .00063 .01524 1.03657 1.07047 . .00116 -.00921 .01375 .01891 . .
34 .00086 .04378 2.97733 1.10856 . -.02355 .03992 -.03878 .00106 . .
35 .00043 .05665 3.85217 1.12695 . .01494 .02220 -.02910 .01859 . .
36 .00248 .10704 7.27898 1.18653 . .08687 -.01023 -.03540 .05219 . .
37 .05202 .06316 4.29482 .93288 . -.21680 .19765 -.22496 .12320 . .
38 .00048 .15825 10.76082 1.26444 . -.03999 .00618 .01399 -.01985 . .
39 .00571 .02903 1.97390 1.04541 . -.05237 -.01297 -.01827 .06505 . .
40 .00247 .03215 2.18639 1.08012 . -.02901 .02433 -.00984 .04722 . .

41 .00073 .01515 1.02991 1.06869 . .01527 .00796 -.00624 .01389 . .
42 .00134 .01674 1.13849 1.06192 . .00801 -.01773 .02323 .02366 . .
43 .00033 .05185 3.52600 1.12158 . -.01751 -.02279 .02252 .00821 . .
44 .00024 .03552 2.41521 1.10244 . -.02262 -.00192 .00762 -.00505 . .
45 .00101 .02583 1.75627 1.08284 . .01139 .01713 -.01841 .02913 . .
46 .00704 .03555 2.41748 1.05248 . -.03319 -.02222 -.02831 .08786 . .
47 .00001 .06327 4.30205 1.13689 . -.00429 -.00435 .00486 .00102 . .
48 .00053 .11927 8.11041 1.20806 . -.02660 .03383 -.02730 .00158 . .
49 .00162 .02134 1.45126 1.06816 . -.00755 -.02530 .04976 -.01117 . .
50 .00131 .01712 1.16399 1.06336 . .01421 .02303 -.00557 -.01885 . .
51 .00070 .02232 1.51752 1.08095 . .02042 -.00222 .01041 -.01571 . .
52 .00765 .03367 2.28937 1.04288 . .10690 .04454 -.08134 .06519 . .
53 .02631 .04944 3.36161 .98526 . -.24099 .18235 -.01649 -.05643 . .
54 .00085 .05465 3.71592 1.12246 . -.03691 .00050 -.00121 .02109 . .
55 .00061 .06907 4.69707 1.14170 . -.02136 -.01607 .03894 -.01929 . .
56 .00559 .01903 1.29396 1.00959 . -.03766 .02918 .04052 -.05722 . .
57 .00013 .14845 10.09443 1.25045 . -.01976 -.00460 .01140 -.00225 . .
58 .00176 .03506 2.38412 1.09038 . .00707 -.01513 .04273 -.04300 . .
59 .00902 .02695 1.83281 1.00809 . -.00005 -.05619 .09096 .02757 . .
60 .02074 .05198 3.53436 1.02089 . -.11486 -.10213 .23192 -.10896 . .
61 .05767 .05201 3.53638 .85602 . -.32010 -.09725 .30674 .00113 . .
62 .05179 .07300 4.96401 .97132 . -.30251 .26959 .00626 -.26244 . .
63 .03842 .14289 9.71670 1.17252 . -.18189 -.19601 .35240 -.10204 . .
64 .04475 .04391 2.98614 .86639 . -.27643 .10391 .15648 -.21314 . .
65 .01157 .04047 2.75226 1.03579 . .10018 .02291 .00463 -.07892 . .
66 .01840 .02966 2.01709 .94151 . .06491 .00025 .07263 -.12383 . .
67 .02835 .04109 2.79387 .93805 . .12873 -.06671 .12493 -.14641 . .
68 .02887 .04881 3.31902 .97030 . .19491 .02377 -.00114 -.13295 . .
Total N 68 68 68 68 68 68 68 68
Appendix 4 [ Case Summary For Logistic Model 1]
Case Summaries (a)
Columns: Case, Analog of Cook's influence statistics, Leverage value, DFBETA for constant, DFBETA for X1(1), DFBETA for X2, DFBETA for X3, DFBETA for X4, DFBETA for X5, DFBETA for X6
1 .00000 .00001 .00000 .00000 .00000 .00000 .00000 .00000 .00000

2 .00000 .00004 .00001 -.00001 .00000 .00000 .00000 .00000 .00000
3 .00000 .00001 .00000 .00000 .00000 .00000 .00000 .00000 .00000
4 .00000 .00034 .00013 -.00006 .00003 .00000 -.00003 .00000 -.00001
5 2.18002 .16787 -3.58578 -.00102 -.41987 .01958 .33708 .03960 .09051
6 .00000 .00002 .00000 .00000 .00000 .00000 .00000 .00000 .00000
7 .11306 .48025 .36801 -.45116 .10861 -.00655 -.11247 .02768 -.00614
8 .01049 .23349 .58392 -.12311 .01390 .00257 -.04741 .00603 -.02861
9 .00049 .06355 .05209 -.02660 .00791 .00047 -.01021 .00136 -.00418
10 .05328 .34356 .22819 -.32053 .14793 .00355 -.10038 .01494 -.04871
11 .00002 .01744 .00944 -.00467 .00193 .00010 -.00194 .00023 -.00093
12 2.68435 .71950 4.85399 .10402 .11223 .00396 -.35071 -.23769 -.06691
13 .00000 .00101 .00053 -.00020 .00009 .00000 -.00008 .00001 -.00004
14 .00000 .00023 .00008 -.00004 .00002 .00000 -.00002 .00000 -.00001
15 .00000 .00133 .00063 -.00026 .00012 .00000 -.00011 .00001 -.00006
16 .00021 .04296 .01949 -.01779 .00598 .00030 -.00659 .00066 -.00220
17 1.20738 .10665 -3.45663 1.04256 .10131 -.02258 .20741 -.01150 .11407
18 .00001 .00880 .00146 -.00179 .00190 .00000 -.00102 -.00002 -.00037
19 .12767 .41399 .60069 .06566 .10388 -.01154 -.04535 .00235 -.00402
20 .25290 .27929 1.96122 .47193 -.10642 -.00499 -.00945 .01300 -.03919
21 .00126 .08209 .17911 -.01462 .00928 .00077 -.01406 -.00002 -.00944
22 .00000 .00030 .00012 -.00007 .00003 .00000 -.00003 .00000 -.00001
23 .00002 .01012 .01349 -.00593 .00238 .00005 -.00201 .00000 -.00085
24 .00000 .00401 .00059 -.00074 .00076 .00000 -.00039 -.00003 -.00014
25 .00000 .00063 .00019 -.00018 .00011 .00000 -.00006 .00000 -.00003
26 .37724 .52106 -.29371 .35568 -.10813 .01426 -.11381 .07871 -.01273
27 .00030 .06659 .07981 -.00827 .00440 .00013 -.00676 .00164 -.00430
28 .00002 .00959 .00880 -.00563 .00203 .00006 -.00190 .00002 -.00061
29 .02999 .18135 -.52960 -.12191 .07287 .00280 -.00965 -.00394 -.00098
30 .00001 .00841 .01068 -.00416 .00175 .00002 -.00149 .00009 -.00066
31 .00000 .00040 .00022 -.00011 .00006 .00000 -.00004 .00000 -.00002
32 .00000 .00358 .00188 -.00150 .00080 .00000 -.00050 -.00003 -.00017
33 .00787 .10228 .44185 -.08959 .01062 .00066 -.01755 -.00477 -.01520
34 .01115 .13478 .27572 -.09080 -.01629 .00168 -.02388 .00731 -.00729
35 .00381 .11746 -.22759 -.04705 .04189 -.00078 -.00413 -.00533 .00487
36 .53149 .44375 5.42398 -.21552 -.42762 -.00789 .06954 .04349 -.10122
37 .00000 .00097 .00061 -.00025 .00012 .00001 -.00010 .00000 -.00005
38 .00000 .00128 .00093 -.00041 .00019 .00001 -.00016 .00000 -.00007
39 .14593 .32758 2.50756 -.21346 -.07409 -.00181 .00924 -.02242 -.06187

40 .00222 .08438 .07895 -.04427 .02474 .00046 -.01074 -.00304 -.00745
41 .01216 .13897 -.54007 -.08200 .04295 -.00164 .00111 -.00647 .02264
42 .01985 .12681 -.50853 -.10738 .02680 -.00104 .00556 -.00663 .02469
43 .00000 .00183 .00125 -.00061 .00029 .00000 -.00024 .00002 -.00010
44 .00003 .01288 .01870 -.00646 .00273 .00008 -.00233 .00006 -.00124
45 .00941 .15928 -.56301 -.06549 .05983 -.00084 -.00033 -.00842 .01574
46 .00000 .00036 .00013 -.00009 .00005 .00000 -.00004 .00000 -.00002
47 .00000 .00160 .00088 -.00058 .00030 .00000 -.00021 .00000 -.00008
48 .23415 .52520 -.62172 -.18754 -.20397 .00441 .03553 -.00859 .08559
49 .00028 .02545 .01935 -.02184 .00783 .00027 -.00659 .00015 -.00205
50 .00005 .01403 -.00739 -.00849 .00460 .00001 -.00228 -.00028 -.00021
51 .00002 .01015 -.00009 -.00620 .00301 .00004 -.00190 -.00009 -.00040
52 .00948 .16039 -.48571 -.06575 .05269 -.00144 .00123 -.00955 .01707
53 .08448 .33609 -2.02550 .08910 .17523 -.00232 -.02371 .00472 .04774
54 .00000 .00058 .00040 -.00016 .00007 .00000 -.00006 .00000 -.00003
55 .00000 .00197 .00112 -.00054 .00024 .00001 -.00022 .00001 -.00011
56 .00000 .00535 .00656 -.00214 .00088 .00003 -.00082 .00003 -.00042
57 .00000 .00138 .00094 -.00038 .00019 .00000 -.00016 .00001 -.00008
58 .00000 .00152 .00121 -.00043 .00017 .00001 -.00018 .00002 -.00009
59 .00006 .01877 .01087 -.00801 .00423 .00015 -.00271 -.00021 -.00137
60 .00000 .00568 .00523 -.00212 .00086 .00004 -.00081 .00001 -.00040
61 .20736 .46088 .98656 .16238 .12054 .01502 -.09172 -.01812 -.10155
62 .00000 .00321 .00107 -.00062 .00054 .00001 -.00037 .00001 -.00016
63 .00000 .00302 .00135 -.00066 .00032 .00002 -.00027 -.00001 -.00014
64 .00000 .00112 .00070 -.00033 .00017 .00001 -.00013 .00000 -.00006
65 .00000 .00027 .00014 -.00006 .00003 .00000 -.00002 .00000 -.00001
66 .00000 .00112 .00073 -.00034 .00017 .00001 -.00013 .00000 -.00006
67 .00000 .00084 .00065 -.00022 .00010 .00000 -.00009 .00000 -.00005
68 .00000 .00031 .00018 -.00008 .00004 .00000 -.00003 .00000 -.00001
Total N 68 68 68 68 68 68 68 68 68
Appendix 5 [ Case Summary For Logistic Model 2]
Case Summaries (a)
Columns: Case, Analog of Cook's influence statistics, Leverage value, DFBETA for constant, DFBETA for X1(1), DFBETA for X2, DFBETA for X4
1 0 0.00011 -0.00006 -0.00002 0.00001 -0.00001

2 0 0.0005 -0.00034 -0.0001 0.00008 -0.00003
3 0 0.00008 -0.00005 -0.00001 0.00001 0
4 0 0.00175 -0.00136 -0.00038 0.00031 -0.00013
5 1.38031 0.1159 0.02937 -0.13116 -0.28695 0.27082
6 0 0.00014 -0.00008 -0.00003 0.00002 -0.00001
7 0.01098 0.13747 -0.02945 -0.12845 0.0349 -0.02962
8 0.04485 0.2315 0.21078 -0.25009 0.02448 -0.05045
9 0.00116 0.06142 -0.03785 -0.03974 0.01614 -0.01059
10 0.08043 0.30474 -1.1339 -0.30908 0.21185 -0.05737
11 0.00019 0.03206 -0.03806 -0.01475 0.0095 -0.00422
12 1.09908 0.42899 2.12492 0.33152 -0.21055 -0.164
13 0 0.00585 -0.00522 -0.0017 0.00126 -0.00054
14 0 0.00116 -0.00081 -0.00028 0.0002 -0.00009
15 0 0.00619 -0.00565 -0.00177 0.00134 -0.00056
16 0.00039 0.04076 -0.02166 -0.02234 0.00942 -0.00624
17 0.86407 0.09714 -2.14388 0.86595 0.11755 0.07421
18 0.00002 0.01207 -0.01422 -0.00176 0.00317 -0.00121
19 0.01808 0.11984 -0.15214 0.0318 0.0504 -0.02872
20 0.18024 0.19921 0.48006 0.4022 -0.0655 0.00113
21 0.0074 0.0967 -0.0441 0.0021 0.02854 -0.02166
22 0 0.00112 -0.0009 -0.00036 0.00025 -0.00011
23 0.00005 0.01214 -0.00817 -0.00816 0.00374 -0.00216
24 0 0.00594 -0.0061 -0.00089 0.0014 -0.00056
25 0 0.00126 -0.00112 -0.00038 0.00029 -0.00012
26 0.05819 0.17164 0.17949 0.12369 0.00959 -0.03219
27 0.00111 0.05405 -0.09018 -0.00714 0.02175 -0.00925
28 0.00003 0.0103 -0.00248 -0.00617 0.00237 -0.00168
29 0.02266 0.14132 -0.35461 -0.1162 0.06826 -0.00725
30 0.00003 0.00944 -0.00867 -0.00565 0.00303 -0.00154
31 0 0.00084 -0.00061 -0.00026 0.00018 -0.00008
32 0 0.00429 -0.0038 -0.00197 0.00119 -0.00058
33 0.011 0.08129 0.17013 -0.10541 -0.00225 -0.00889
34 0.00711 0.09294 0.21023 -0.07644 -0.00998 -0.011
35 0.00082 0.04448 -0.08696 -0.02651 0.01931 -0.00572
36 0.33397 0.27788 1.96968 -0.17238 -0.3229 0.11407
37 0 0.00291 -0.00262 -0.00117 0.00076 -0.00035
38 0 0.00289 -0.00241 -0.0012 0.00074 -0.00036
39 0.13322 0.19809 0.95249 -0.18108 -0.14011 0.05107
40 0.00469 0.07749 -0.17782 -0.06346 0.03927 -0.00965
41 0.00113 0.03582 -0.01974 -0.04029 0.01365 -0.00778
42 0.00474 0.05789 0.04056 -0.07781 0.01236 -0.01021

43 0 0.00227 -0.00185 -0.00089 0.00056 -0.00027
44 0.00012 0.01719 -0.02158 -0.01218 0.00672 -0.00305
45 0.00102 0.04328 -0.08532 -0.03215 0.02049 -0.0066
46 0 0.00093 -0.00075 -0.00028 0.0002 -0.00009
47 0 0.00229 -0.00199 -0.00087 0.00057 -0.00027
48 0.03475 0.23653 0.70589 -0.12255 -0.06892 -0.01272
49 0.00032 0.02345 -0.01334 -0.02107 0.0081 -0.00485
50 0.00003 0.0098 -0.0086 -0.00599 0.00313 -0.00163
51 0.00002 0.00893 -0.00676 -0.00535 0.00268 -0.00147
52 0.00073 0.03315 -0.05158 -0.03044 0.01558 -0.00638
53 0.00991 0.12641 -0.32489 0.00328 0.06614 -0.02164
54 0 0.00149 -0.00108 -0.00054 0.00034 -0.00017
55 0 0.00404 -0.00391 -0.00174 0.00113 -0.00052
56 0.00002 0.00891 -0.00807 -0.00523 0.00282 -0.00144
57 0 0.00303 -0.0031 -0.00106 0.00079 -0.00032
58 0 0.00308 -0.00237 -0.00132 0.00078 -0.0004
59 0.00023 0.0242 -0.03982 -0.01558 0.01012 -0.00375
60 0.00002 0.00903 -0.00732 -0.00541 0.00278 -0.00148
61 0.20037 0.23977 -0.98644 0.37756 0.13672 -0.00437
62 0.00001 0.00741 -0.00788 -0.00113 0.00183 -0.00074
63 0.00001 0.00719 -0.00832 -0.00342 0.00227 -0.00097
64 0 0.00269 -0.00246 -0.00104 0.0007 -0.00032
65 0 0.00108 -0.00091 -0.00033 0.00024 -0.0001
66 0 0.00266 -0.00229 -0.00106 0.00068 -0.00032
67 0 0.00248 -0.00195 -0.001 0.00062 -0.00031
68 0 0.00106 -0.00082 -0.00034 0.00023 -0.00011
Total N 68 68 68 68 68 68
