Chapter Outline
1) Overview
2) Product-Moment Correlation
3) Partial Correlation
4) Nonmetric Correlation
5) Regression Analysis
6) Bivariate Regression
7) Statistics Associated with Bivariate Regression Analysis
8) Conducting Bivariate Regression Analysis
   i.    Scatter Diagram
   iii.  Estimation of Parameters
   iv.   Standardized Regression Coefficient
   v.    Significance Testing
   vi.   Strength and Significance of Association
   vii.  Prediction Accuracy
   viii. Assumptions
9) Multiple Regression
Product-Moment Correlation

r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^2 \sum_{i=1}^{n}(Y_i - \bar{Y})^2}}

Division of the numerator and denominator by (n - 1) gives

r = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})/(n-1)}{\sqrt{\frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{n-1}}\sqrt{\frac{\sum_{i=1}^{n}(Y_i - \bar{Y})^2}{n-1}}} = \frac{COV_{xy}}{S_x S_y}
Data for the worked example (12 respondents). Importance attached to weather was also recorded for each respondent and is used later in the multiple regression.

Respondent   Duration of Residence (X)   Attitude Toward City (Y)
 1           10                           6
 2           12                           9
 3           12                           8
 4            4                           3
 5           12                          10
 6            6                           4
 7            8                           5
 8            2                           2
 9           18                          11
10            9                           9
11           17                          10
12            2                           2
The means are

\bar{X} = (10 + 12 + 12 + 4 + 12 + 6 + 8 + 2 + 18 + 9 + 17 + 2)/12 = 9.333

\bar{Y} = (6 + 9 + 8 + 3 + 10 + 4 + 5 + 2 + 11 + 9 + 10 + 2)/12 = 6.583

The sums of squares and cross products are

\sum_{i=1}^{12}(X_i - \bar{X})(Y_i - \bar{Y}) = 179.6668

\sum_{i=1}^{12}(X_i - \bar{X})^2 = 304.6668

\sum_{i=1}^{12}(Y_i - \bar{Y})^2 = 120.9168

Thus,

r = \frac{179.6668}{\sqrt{(304.6668)(120.9168)}} = 0.9361
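The calculation can be sketched in a few lines of Python (an illustrative sketch, not part of the original text; the function and variable names are mine):

```python
# Product-moment correlation for the duration-of-residence (X) and
# attitude-toward-the-city (Y) data from the worked example.
from math import sqrt

X = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]   # duration of residence
Y = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]      # attitude toward the city

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))  # 179.6667
    sxx = sum((xi - mx) ** 2 for xi in x)                     # 304.6667
    syy = sum((yi - my) ** 2 for yi in y)                     # 120.9167
    return sxy / sqrt(sxx * syy)

print(round(pearson_r(X, Y), 4))  # 0.9361
```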
Decomposition of the total variation:

r^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{SS_x}{SS_y}
    = \frac{\text{Total variation} - \text{Error variation}}{\text{Total variation}} = \frac{SS_y - SS_{error}}{SS_y}
The statistical significance of the relationship is tested via the hypotheses

H_0: \rho = 0
H_1: \rho \neq 0

using the test statistic

t = r\left(\frac{n-2}{1-r^2}\right)^{1/2}

which follows a t distribution with n - 2 degrees of freedom.

Copyright 2010 Pearson Education, Inc.
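As a quick sketch (my own illustration, using r = 0.9361 and n = 12 from the worked example), the test statistic comes out close to the 8.414 reported later in Table 17.2:

```python
# t statistic for testing H0: rho = 0 in the worked example.
from math import sqrt

r, n = 0.9361, 12
t = r * sqrt((n - 2) / (1 - r ** 2))
print(round(t, 2))  # ~8.42; matches the 8.414 in Table 17.2 up to rounding of r
```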
Partial Correlation
A partial correlation coefficient measures the
association between two variables after
controlling for, or adjusting for, the effects of one
or more additional variables.
r_{xy.z} = \frac{r_{xy} - (r_{xz})(r_{yz})}{\sqrt{1 - r_{xz}^2}\sqrt{1 - r_{yz}^2}}
Part Correlation

The part correlation coefficient measures the correlation between Y and X when the effect of Z has been removed from X (but not from Y):

r_{y(x.z)} = \frac{r_{xy} - r_{yz}r_{xz}}{\sqrt{1 - r_{xz}^2}}

The partial correlation coefficient is generally viewed as more important than the part correlation coefficient.
Nonmetric Correlation

If the nonmetric variables are ordinal and numeric, Spearman's rho, \rho_s, and Kendall's tau, \tau, are two measures of nonmetric correlation which can be used to examine the correlation between them.

Both these measures use rankings rather than the absolute values of the variables, and the basic concepts underlying them are quite similar. Both vary from -1.0 to +1.0 (see Chapter 15).
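For rankings with no ties, Spearman's rho reduces to a simple formula in the rank differences; a minimal sketch (my illustration, with hypothetical rankings):

```python
# Spearman's rho for tie-free rankings: rho = 1 - 6*sum(d^2) / (n(n^2 - 1)),
# where d is the difference between paired ranks.
def spearman_rho(rank_x, rank_y):
    n = len(rank_x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Two hypothetical rankings of five objects:
print(round(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]), 4))  # 0.8
```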
Regression Analysis

Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:

- Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
- Determine how much of the variation in the dependent variable can be explained by the independent variables: strength of the relationship.
- Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
- Predict the values of the dependent variable.
- Control for other independent variables when evaluating the contributions of a specific variable or set of variables.

Regression analysis is concerned with the nature and degree of association between variables and does not imply or assume any causality.
Bivariate Regression Model

Y_i = \beta_0 + \beta_1 X_i + e_i

where Y_i = dependent variable, X_i = independent variable, \beta_0 = intercept, \beta_1 = slope, and e_i = error term.
Fig. 17.3: Scatter diagram of Attitude (vertical axis) against Duration of Residence (horizontal axis).

Fig. 17.4: Several candidate straight lines (Line 2, Line 3, Line 4, ...) drawn through the same scatter of points.
Bivariate Regression

Fig. 17.5: The fitted regression line Y = \beta_0 + \beta_1 X. For an observation with value Y_J, the residual e_J is the vertical distance between Y_J and the estimated value \hat{Y}_J on the line (X_1 through X_5 mark the observed X values).
\hat{Y}_i = a + bX_i

where \hat{Y}_i is the estimated or predicted value of Y_i, and a and b are estimators of \beta_0 and \beta_1, respectively. The slope may be computed as

b = \frac{COV_{xy}}{S_x^2} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}{\sum_{i=1}^{n}(X_i - \bar{X})^2} = \frac{\sum_{i=1}^{n} X_i Y_i - n\bar{X}\bar{Y}}{\sum_{i=1}^{n} X_i^2 - n\bar{X}^2}
For the worked example,

\sum_{i=1}^{12} X_i Y_i = (10)(6) + (12)(9) + (12)(8) + (4)(3) + (12)(10) + (6)(4) + (8)(5) + (2)(2) + (18)(11) + (9)(9) + (17)(10) + (2)(2) = 917

\sum_{i=1}^{12} X_i^2 = 1350

With \bar{X} = 9.333, \bar{Y} = 6.583, and n = 12, b can be calculated as:

b = \frac{917 - (12)(9.333)(6.583)}{1350 - (12)(9.333)^2} = 0.5897

a = \bar{Y} - b\bar{X} = 6.583 - (0.5897)(9.333) = 1.0793
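The same least-squares estimates can be reproduced in Python (an illustrative sketch using the worked-example data; the variable names are mine):

```python
# Least-squares estimates a and b for attitude (Y) regressed on
# duration of residence (X), via the computational formula.
X = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
Y = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n
sum_xy = sum(x * y for x, y in zip(X, Y))   # 917
sum_x2 = sum(x ** 2 for x in X)             # 1350

b = (sum_xy - n * mean_x * mean_y) / (sum_x2 - n * mean_x ** 2)
a = mean_y - b * mean_x
print(round(b, 4), round(a, 4))  # 0.5897 1.0793
```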
Significance Testing

The statistical significance of the linear relationship may be tested by examining the hypotheses

H_0: \beta_1 = 0
H_1: \beta_1 \neq 0

using a t statistic with n - 2 degrees of freedom:

t = \frac{b}{SE_b}

where SE_b denotes the standard deviation (standard error) of b.
The total variation may be decomposed as follows:

SS_y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2

SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2

SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2
Fig. 17.6: Decomposition of the total variation: the total variation SS_y splits into the explained variation SS_{reg} and the residual variation SS_{res} (shown at observed values X_1 through X_5).
The strength of association may then be calculated as follows:

r^2 = \frac{SS_{reg}}{SS_y} = \frac{SS_y - SS_{res}}{SS_y}

To illustrate the calculation of r^2, let us again consider the effect of duration of residence on attitude toward the city. It may be recalled from the earlier calculations of the simple correlation coefficient that:

SS_y = \sum_{i=1}^{12}(Y_i - \bar{Y})^2 = 120.9168
The predicted values \hat{Y}_i = a + bX_i give

SS_{reg} = \sum_{i=1}^{12}(\hat{Y}_i - \bar{Y})^2
= (6.9763 - 6.5833)^2 + (8.1557 - 6.5833)^2 + (8.1557 - 6.5833)^2 + (3.4381 - 6.5833)^2
+ (8.1557 - 6.5833)^2 + (4.6175 - 6.5833)^2 + (5.7969 - 6.5833)^2 + (2.2587 - 6.5833)^2
+ (11.6939 - 6.5833)^2 + (6.3866 - 6.5833)^2 + (11.1042 - 6.5833)^2 + (2.2587 - 6.5833)^2
= 0.1544 + 2.4724 + 2.4724 + 9.8922 + 2.4724 + 3.8643 + 0.6184 + 18.7021
+ 26.1182 + 0.0387 + 20.4385 + 18.7021
= 105.9524
Therefore,

r^2 = SS_{reg}/SS_y = 105.9524/120.9168 = 0.8762
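The whole decomposition can be verified numerically (an illustrative sketch on the worked-example data; the names are mine):

```python
# Decompose the total variation for the worked example and recover
# r^2 = SS_reg / SS_y.
X = [10, 12, 12, 4, 12, 6, 8, 2, 18, 9, 17, 2]
Y = [6, 9, 8, 3, 10, 4, 5, 2, 11, 9, 10, 2]

n = len(X)
mean_x, mean_y = sum(X) / n, sum(Y) / n
b = ((sum(x * y for x, y in zip(X, Y)) - n * mean_x * mean_y)
     / (sum(x ** 2 for x in X) - n * mean_x ** 2))
a = mean_y - b * mean_x
Y_hat = [a + b * x for x in X]

ss_y = sum((y - mean_y) ** 2 for y in Y)                  # total variation
ss_reg = sum((yh - mean_y) ** 2 for yh in Y_hat)          # explained variation
ss_res = sum((y - yh) ** 2 for y, yh in zip(Y, Y_hat))    # residual variation

print(round(ss_y, 2), round(ss_reg, 2), round(ss_res, 2))  # 120.92 105.95 14.96
print(round(ss_reg / ss_y, 4))                             # 0.8762
```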
17-41
F=
SS reg
SS res /(n2)
H 0 : = 0
H1 : 0
Copyright 2010 Pearson Education, Inc.
17-42
17-43
Bivariate Regression

Table 17.2

Multiple R       0.93608
R^2              0.87624
Adjusted R^2     0.86387
Standard Error   1.22329

ANALYSIS OF VARIANCE
             df   Sum of Squares   Mean Square
Regression    1        105.95222     105.95222
Residual     10         14.96444       1.49644

F = 70.80266    Significance of F = 0.0000

Variable     b         SE_b      Beta      T       Significance of T
Duration     0.58972   0.07008   0.93608   8.414   0.0000
(Constant)   1.07932   0.74335             1.452
Prediction Accuracy

The standard error of estimate, SEE, is

SEE = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-2}}

or

SEE = \sqrt{\frac{SS_{res}}{n-2}}

or, more generally, with k independent variables,

SEE = \sqrt{\frac{SS_{res}}{n-k-1}}

For the data given in Table 17.2, the SEE is estimated as follows:

SEE = \sqrt{14.9644/(12-2)} = 1.22329
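Both the F statistic and the SEE follow directly from the ANOVA quantities in Table 17.2 (a sketch of my own, using SS_reg = 105.9522, SS_res = 14.9644, n = 12):

```python
# F test and standard error of estimate from the bivariate ANOVA table.
from math import sqrt

ss_reg, ss_res, n = 105.9522, 14.9644, 12

F = ss_reg / (ss_res / (n - 2))     # 1 and n-2 degrees of freedom
see = sqrt(ss_res / (n - 2))
print(round(F, 2), round(see, 4))   # 70.8 1.2233
```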
Assumptions
Multiple Regression

The general form of the multiple regression model is:

Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \beta_3 X_3 + \ldots + \beta_k X_k + e
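The coefficients of a multiple regression are the solution of the normal equations (X'X)b = X'y. A minimal sketch of the mechanics (my own illustration; the data are synthetic, with y constructed exactly as 1 + 2x1 + 3x2, so the known coefficients are recovered):

```python
# Multiple regression via the normal equations, with a small
# Gaussian-elimination solver (illustrative only).

def solve(A, v):
    """Solve A x = v by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def fit_multiple(rows, y):
    """rows: list of [x1, ..., xk] observations; prepends a constant column.
    Returns [a, b1, ..., bk]."""
    X = [[1.0] + r for r in rows]
    k = len(X[0])
    XtX = [[sum(X[i][p] * X[i][q] for i in range(len(X))) for q in range(k)]
           for p in range(k)]
    Xty = [sum(X[i][p] * y[i] for i in range(len(X))) for p in range(k)]
    return solve(XtX, Xty)

# Synthetic, noise-free data: y = 1 + 2*x1 + 3*x2
rows = [[1, 2], [2, 1], [3, 5], [4, 2], [5, 7], [6, 3]]
y = [1 + 2 * x1 + 3 * x2 for x1, x2 in rows]
coeffs = fit_multiple(rows, y)
print([round(c, 4) for c in coeffs])  # [1.0, 2.0, 3.0]
```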
Multiple Regression

Table 17.3

Multiple R       0.97210
R^2              0.94498
Adjusted R^2     0.93276
Standard Error   0.85974

ANALYSIS OF VARIANCE
             df   Sum of Squares   Mean Square
Regression    2        114.26425      57.13213
Residual      9          6.65241       0.73916

F = 77.29364    Significance of F = 0.0000

Variable     b         SE_b      Beta      T       Significance of T
IMPORTANCE   0.28865   0.08608   0.31382   3.353   0.0085
DURATION     0.48108   0.05895   0.76363   8.160   0.0000
As in bivariate regression, the total variation decomposes as

SS_y = \sum_{i=1}^{n}(Y_i - \bar{Y})^2

SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2

SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2

and the strength of association is measured by

R^2 = \frac{SS_{reg}}{SS_y}
The overall significance of the regression is tested via

H_0: \beta_1 = \beta_2 = \beta_3 = \ldots = \beta_k = 0

The overall test can be conducted by using an F statistic:

F = \frac{SS_{reg}/k}{SS_{res}/(n-k-1)} = \frac{R^2/k}{(1-R^2)/(n-k-1)}

which has an F distribution with k and (n - k - 1) degrees of freedom.
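Plugging in the values reported in Table 17.3 (R^2 = 0.94498, k = 2 predictors, n = 12) reproduces the tabulated F (an illustrative check of my own):

```python
# Overall F statistic computed from R^2 for the multiple regression.
r2, k, n = 0.94498, 2, 12

F = (r2 / k) / ((1 - r2) / (n - k - 1))
print(round(F, 2))  # 77.29, matching F = 77.29364 in Table 17.3
```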
Testing for the significance of the \beta_i's can be done in a manner similar to that in the bivariate case by using t tests. The significance of the partial coefficient for importance attached to weather may be tested by the following equation:

t = \frac{b}{SE_b}

which has a t distribution with n - k - 1 degrees of freedom.
Examination of Residuals

Fig. 17.7: Residuals plotted against predicted \hat{Y} values.

Fig. 17.8: Residuals plotted against time.

Fig. 17.9: Residuals plotted against predicted \hat{Y} values.
Stepwise Regression
The purpose of stepwise regression is to select, from a large
number of predictor variables, a small subset of variables that
account for most of the variation in the dependent or criterion
variable. In this procedure, the predictor variables enter or are
removed from the regression equation one at a time. There are
several approaches to stepwise regression.
Forward inclusion. Initially, there are no predictor variables
in the regression equation. Predictor variables are entered one
at a time, only if they meet certain criteria specified in terms of
F ratio. The order in which the variables are included is based
on the contribution to the explained variance.
Backward elimination. Initially, all the predictor variables
are included in the regression equation. Predictors are then
removed one at a time based on the F ratio for removal.
Stepwise solution. Forward inclusion is combined with the
removal of predictors that no longer meet the specified criterion
at each step.
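Forward inclusion can be sketched in code (my own simplified illustration, not the text's procedure: it adds, at each step, the candidate that most increases R^2, and stops when the gain falls below a hypothetical threshold rather than a formal F-to-enter; the data are synthetic, with x3 pure noise):

```python
# Forward inclusion sketch: greedily add predictors by R^2 gain.
import random

def fit_r2(cols, y):
    """R^2 of the least-squares fit of y on the given predictor columns
    (plus a constant), via the normal equations."""
    n, k = len(y), len(cols) + 1
    X = [[1.0] + [c[i] for c in cols] for i in range(n)]
    A = [[sum(X[i][p] * X[i][q] for i in range(n)) for q in range(k)]
         for p in range(k)]
    v = [sum(X[i][p] * y[i] for i in range(n)) for p in range(k)]
    M = [row[:] + [v[i]] for i, row in enumerate(A)]
    for col in range(k):                      # Gaussian elimination
        piv = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, k):
            f = M[r][col] / M[col][col]
            for c in range(col, k + 1):
                M[r][c] -= f * M[col][c]
    b = [0.0] * k
    for r in range(k - 1, -1, -1):
        b[r] = (M[r][k] - sum(M[r][c] * b[c] for c in range(r + 1, k))) / M[r][r]
    y_hat = [sum(X[i][j] * b[j] for j in range(k)) for i in range(n)]
    my = sum(y) / n
    ss_y = sum((yi - my) ** 2 for yi in y)
    ss_res = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))
    return 1 - ss_res / ss_y

def forward_inclusion(predictors, y, min_gain=0.01):
    chosen, r2 = [], 0.0
    remaining = dict(predictors)
    while remaining:
        name, gain = max(
            ((nm, fit_r2([c for _, c in chosen] + [col], y) - r2)
             for nm, col in remaining.items()),
            key=lambda t: t[1])
        if gain < min_gain:       # hypothetical entry criterion
            break
        chosen.append((name, remaining.pop(name)))
        r2 += gain
    return [nm for nm, _ in chosen]

random.seed(0)
x1 = [float(i) for i in range(20)]
x2 = [random.uniform(0, 10) for _ in range(20)]
x3 = [random.uniform(0, 10) for _ in range(20)]   # irrelevant noise predictor
y = [2 * a + 5 * b + random.gauss(0, 0.1) for a, b in zip(x1, x2)]
selected = forward_inclusion({"x1": x1, "x2": x2, "x3": x3}, y)
print(sorted(selected))  # x1 and x2 enter; x3 is never included
```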
Multicollinearity
Multicollinearity arises when intercorrelations among
the predictors are very high.
Multicollinearity can result in several problems, including:
The partial regression coefficients may not be
estimated precisely. The standard errors are likely to
be high.
The magnitudes, as well as the signs of the partial
regression coefficients, may change from sample to
sample.
It becomes difficult to assess the relative importance of
the independent variables in explaining the variation in
the dependent variable.
Predictor variables may be incorrectly included or
removed in stepwise regression.
Cross-Validation
Regression with Dummy Variables

Product usage can be coded with dummy variables, with heavy users as the reference category:

Product Usage Category   Original Code   D1   D2   D3
Nonusers                 1               1    0    0
Light Users              2               0    1    0
Medium Users             3               0    0    1
Heavy Users              4               0    0    0
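The coding scheme above can be expressed as a small helper (an illustrative sketch; the function name is mine):

```python
# Dummy coding for the four usage categories; Heavy Users is the
# reference level and maps to all zeros.
CATEGORIES = ["Nonusers", "Light Users", "Medium Users", "Heavy Users"]

def dummy_code(category):
    """Return (D1, D2, D3): one indicator per non-reference category."""
    return tuple(int(category == c) for c in CATEGORIES[:3])

for c in CATEGORIES:
    print(c, dummy_code(c))
# Nonusers (1, 0, 0) ... Heavy Users (0, 0, 0)
```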
In regression with dummy variables, the predicted value for each category is the mean value of Y for that category:

Product Usage Category   Predicted Value \hat{Y}   Mean Value \bar{Y}
Nonusers                 a + b_1                   a + b_1
Light Users              a + b_2                   a + b_2
Medium Users             a + b_3                   a + b_3
Heavy Users              a                         a
Regression with dummy variables is equivalent to one-way ANOVA:

Regression                                          One-Way ANOVA
SS_{res} = \sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2        = SS_{within} = SS_{error}
SS_{reg} = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2    = SS_{between} = SS_x
Overall F test                                      = F test
SPSS Windows

The CORRELATE program computes Pearson product-moment correlations and partial correlations with significance levels. Univariate statistics, covariance, and cross-product deviations may also be requested. To select these procedures using SPSS for Windows, click:

Analyze > Correlate > Bivariate
Analyze > Correlate > Partial

Scatterplots can be obtained by clicking:

Graphs > Scatter > Simple > Define

REGRESSION calculates bivariate and multiple regression equations, associated statistics, and plots. It allows for an easy examination of residuals. This procedure can be run by clicking:

Analyze > Regression > Linear

Copyright 2010 Pearson Education, Inc.