
Business Research Methods

William G. Zikmund

Chapter 23 Bivariate Analysis: Measures of Associations

Copyright 2000 by Harcourt, Inc.

All rights reserved. Requests for permission to make copies of any part of the work should be mailed to the following address: Permissions Department, Harcourt, Inc., 6277 Sea Harbor Drive, Orlando, Florida 32887-6777.

Measures of association
A general term that refers to a number of bivariate statistical techniques used to measure the strength of a relationship between two variables.


RELATIONSHIPS AMONG VARIABLES


CORRELATION ANALYSIS
BIVARIATE REGRESSION ANALYSIS

Type of Measurement          Measure of Association
Interval and Ratio Scales    Correlation Coefficient; Bivariate Regression
Ordinal Scales               Chi-Square; Rank Correlation
Nominal                      Chi-Square; Phi Coefficient; Contingency Coefficient

Correlation coefficient
A statistical measure of the covariation or association between two variables. Are dollar sales associated with advertising dollar expenditures?


The correlation coefficient for two variables, X and Y, is $r_{xy}$.

Correlation coefficient
r ranges from +1 to -1
r = +1: a perfect positive linear relationship
r = -1: a perfect negative linear relationship
r = 0: no correlation

Simple Correlation Coefficient

$$ r_{xy} = r_{yx} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})(Y_i - \bar{Y})}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 \sum_{i=1}^{n} (Y_i - \bar{Y})^2}} $$

Simple Correlation Coefficient

$$ r_{xy} = r_{yx} = \frac{\sigma_{xy}}{\sqrt{\sigma_x^2 \sigma_y^2}} $$

Simple Correlation Coefficient -alternative method


$\sigma_x^2$ = Variance of X
$\sigma_y^2$ = Variance of Y
$\sigma_{xy}$ = Covariance of X and Y
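As a minimal sketch of the two ways of writing the simple correlation coefficient (sums of deviations about the means, and covariance over the root of the product of the variances), the Python below computes both and shows that they agree. The data and the names `advertising`, `sales`, `pearson_r`, and `pearson_r_from_moments` are hypothetical and are not taken from the slides.

```python
import math

def pearson_r(xs, ys):
    """Correlation from sums of deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def pearson_r_from_moments(xs, ys):
    """Alternative method: covariance divided by the root of the variance product."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    cov_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / n
    var_x = sum((x - x_bar) ** 2 for x in xs) / n
    var_y = sum((y - y_bar) ** 2 for y in ys) / n
    return cov_xy / math.sqrt(var_x * var_y)

# Hypothetical data (not from the slides).
advertising = [1, 2, 3, 4, 5]
sales = [2, 4, 5, 4, 5]

r_dev = pearson_r(advertising, sales)
r_mom = pearson_r_from_moments(advertising, sales)
print(round(r_dev, 4), round(r_mom, 4))
```

Because the divisor n cancels between the covariance and the two variances, both functions return the same value, and swapping the arguments leaves r unchanged.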

CORRELATION PATTERNS

[Figure: three scatter diagrams. Panels: NO CORRELATION; PERFECT NEGATIVE CORRELATION, r = -1.0; A HIGH POSITIVE CORRELATION, r = +.98]

Calculation of r

$$ r = \frac{6.3389}{\sqrt{(17.837)(5.589)}} = \frac{6.3389}{\sqrt{99.712}} = .635 $$

Coefficient of Determination

$$ r^2 = \frac{\text{Explained variance}}{\text{Total variance}} $$

CORRELATION DOES NOT MEAN CAUSATION


High correlation: roosters crowing and the rising of the sun
The rooster does not cause the sun to rise
Teachers' salaries and the consumption of liquor co-vary because they are both influenced by a third variable
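The third-variable point can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variable `income` stands in for a hypothetical common influence, the coefficients and noise levels are invented, and none of the numbers come from the slides.

```python
import math
import random

def pearson_r(xs, ys):
    """Simple correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

random.seed(0)  # fixed seed so the run is repeatable
n = 2000

# Hypothetical third variable that drives both observed variables.
income = [random.gauss(0, 1) for _ in range(n)]
salaries = [z + 0.5 * random.gauss(0, 1) for z in income]
liquor = [z + 0.5 * random.gauss(0, 1) for z in income]

# The two observed variables co-vary strongly although neither causes the other.
r_raw = pearson_r(salaries, liquor)

# Removing the common influence (known exactly in this simulation) leaves only noise.
salaries_net = [s - z for s, z in zip(salaries, income)]
liquor_net = [l - z for l, z in zip(liquor, income)]
r_net = pearson_r(salaries_net, liquor_net)

print(round(r_raw, 2), round(r_net, 2))
```

With these assumed settings the raw correlation is high, while the correlation after removing the shared influence is close to zero.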

Correlation matrix
The standard form for reporting correlation results.


CORRELATION MATRIX
        Var1    Var2    Var3
Var1    1.0     0.45    0.31
Var2    0.45    1.0     0.10
Var3    0.31    0.10    1.0
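A correlation matrix like this one can be built by computing r for every pair of variables. The sketch below assumes three short hypothetical columns; the values and the helper names `pearson_r` and `correlation_matrix` are illustrative, so the printed numbers will not match the matrix on the slide.

```python
import math

def pearson_r(xs, ys):
    """Simple correlation coefficient from deviations about the means."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

def correlation_matrix(columns):
    """Return a nested dict holding r for every ordered pair of columns."""
    names = list(columns)
    return {a: {b: pearson_r(columns[a], columns[b]) for b in names} for a in names}

# Hypothetical data (not the data behind the slide's matrix).
data = {
    "Var1": [1, 2, 3, 4, 5],
    "Var2": [2, 4, 5, 4, 5],
    "Var3": [5, 3, 4, 1, 2],
}

matrix = correlation_matrix(data)
names = list(data)
print(" " * 6 + "".join(f"{name:>7}" for name in names))
for row in names:
    print(f"{row:<6}" + "".join(f"{matrix[row][col]:7.2f}" for col in names))
```

The diagonal is always 1.0 and the matrix is symmetric, which is why published tables often show only the lower triangle.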

Walkup's First Laws of Statistics

Law No. 1: Everything correlates with everything, especially when the same individual defines the variables to be correlated.
Law No. 2: It won't help very much to find a good correlation between the variable you are interested in and some other variable that you don't understand any better.
Law No. 3: Unless you can think of a logical reason why two variables should be connected as cause and effect, it doesn't help much to find a correlation between them. In Columbus, Ohio, the mean monthly rainfall correlates very nicely with the number of letters in the names of the months!

REGRESSION
DICTIONARY DEFINITION: GOING OR MOVING BACKWARD
Going back to previous conditions
Tall men's sons

BIVARIATE REGRESSION
A MEASURE OF LINEAR ASSOCIATION THAT INVESTIGATES A STRAIGHT-LINE RELATIONSHIP
USEFUL IN FORECASTING

Bivariate linear regression


A measure of linear association that investigates a straight-line relationship
Y = a + bX
where
Y is the dependent variable
X is the independent variable
a and b are two constants to be estimated

Y intercept
a
An intercepted segment of a line
The point at which a regression line intercepts the Y-axis

Slope
b
The inclination of a regression line as compared to a base line
Rise over run
$\Delta$: notation for "a change in"

Scatter Diagram and Eyeball Forecast

[Figure: scatter diagram of Y (axis 80 to 160) against X (axis 70 to 190) with two eyeballed forecast lines labeled "My line" and "Your line"]

REGRESSION LINE AND SLOPE

[Figure: regression line on Y (axis 80 to 130) against X (axis 80 to 190), labeled with a, b, $\Delta X$, and $\Delta Y$]

Least-Squares Regression Line

[Figure: scatter diagram of Y (axis 80 to 160) against X (axis 70 to 190) with the least-squares regression line, marking the actual Y and Y hat for Dealer 3 and for Dealer 7]

Scatter Diagram of Explained and Unexplained Variation

[Figure: Y (axis 80 to 130) against X (axis 80 to 190), showing the total deviation split into the deviation explained by the regression and the deviation not explained]

The least-squares method

Uses the criterion of attempting to make the least amount of total error in prediction of Y from X. More technically, the procedure generates a straight line that minimizes the sum of squared deviations of the actual values from this predicted regression line.
A relatively simple mathematical technique that ensures that the straight line will most closely represent the relationship between X and Y.

Regression - least-square method

$$ \sum_{i=1}^{n} e_i^2 \text{ is minimum} $$

$e_i = Y_i - \hat{Y}_i$ (the residual)
$Y_i$ = actual value of the dependent variable
$\hat{Y}_i$ = estimated value of the dependent variable (Y hat)
n = number of observations
i = number of the observation

The logic behind the least-squares technique

No straight line can completely represent every dot in the scatter diagram
There will be a discrepancy between most of the actual scores (each dot) and the predicted score
Uses the criterion of attempting to make the least amount of total error in prediction of Y from X
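The criterion can be made concrete by comparing the sum of squared errors of several candidate lines. In this sketch the data are hypothetical, the least-squares coefficients a = 2.2 and b = 0.6 were worked out by hand for that data, and the two "eyeball" lines are arbitrary alternatives chosen for illustration.

```python
def sum_squared_errors(xs, ys, a, b):
    """Sum of squared residuals e_i = Y_i - (a + b * X_i) for a candidate line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))

# Hypothetical data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

# Least-squares line for this data (a = 2.2, b = 0.6) versus two eyeballed lines.
least_squares_sse = sum_squared_errors(xs, ys, a=2.2, b=0.6)
my_line_sse = sum_squared_errors(xs, ys, a=2.0, b=0.7)
your_line_sse = sum_squared_errors(xs, ys, a=2.5, b=0.5)

print(round(least_squares_sse, 2), round(my_line_sse, 2), round(your_line_sse, 2))
```

Neither eyeballed line fits every point, and each leaves a larger total squared error than the least-squares line.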

Bivariate Regression

$$ \hat{a} = \bar{Y} - \hat{b}\bar{X} $$

Bivariate Regression

$$ \hat{b} = \frac{n\left(\sum XY\right) - \left(\sum X\right)\left(\sum Y\right)}{n\left(\sum X^2\right) - \left(\sum X\right)^2} $$

$\hat{b}$ = estimated slope of the line (the regression coefficient)
$\hat{a}$ = estimated intercept of the Y axis
Y = dependent variable
$\bar{Y}$ = mean of the dependent variable
X = independent variable
$\bar{X}$ = mean of the independent variable
n = number of observations
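The slope and intercept formulas translate directly into code. The following is a minimal sketch; the function name `fit_line` and the data are hypothetical and chosen so the result is easy to verify by hand.

```python
def fit_line(xs, ys):
    """Least-squares estimates (a, b) for Y = a + bX using the summation formulas."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Slope: [n * sum(XY) - sum(X) * sum(Y)] / [n * sum(X^2) - (sum(X))^2]
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    # Intercept: mean of Y minus slope times mean of X
    a = sum_y / n - b * (sum_x / n)
    return a, b

# Hypothetical data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
print(round(a, 2), round(b, 2))  # prints: 2.2 0.6
```

The denominator is zero when every X value is identical, so a real implementation would need to guard against that case.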

$$ \hat{b} = \frac{15(193{,}345) - 2{,}806{,}875}{15(245{,}759) - 3{,}515{,}625} = \frac{2{,}900{,}175 - 2{,}806{,}875}{3{,}686{,}385 - 3{,}515{,}625} = \frac{93{,}300}{170{,}760} = .54638 $$

$$ \hat{a} = 99.8 - .54638(125) = 99.8 - 68.3 = 31.5 $$
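The arithmetic for the slope and intercept can be re-checked from the totals that appear in the calculation. In this sketch the variable names are hypothetical, and reading n = 15 off the "15(...)" terms is an assumption about the garbled slide, not something it states in words.

```python
# Totals as they appear in the slide's arithmetic.
n = 15                            # assumed from the "15(...)" terms
sum_xy = 193_345                  # sum of X * Y
sum_x_times_sum_y = 2_806_875     # (sum of X) * (sum of Y)
sum_x_squared = 245_759           # sum of X^2
square_of_sum_x = 3_515_625       # (sum of X)^2
x_bar = 125                       # mean of X
y_bar = 99.8                      # mean of Y

# Slope, then intercept, following the summation formulas.
b_hat = (n * sum_xy - sum_x_times_sum_y) / (n * sum_x_squared - square_of_sum_x)
a_hat = y_bar - b_hat * x_bar

print(round(b_hat, 5), round(a_hat, 1))  # prints: 0.54638 31.5
```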


$$ \hat{Y} = 31.5 + .546(X) = 31.5 + .546(89) = 31.5 + 48.6 = 80.1 $$


Dealer 7 (actual Y value = 129)
$$ \hat{Y}_7 = 31.5 + .546(165) = 121.6 $$
Dealer 3 (actual Y value = 80)
$$ \hat{Y}_3 = 31.5 + .546(95) = 83.4 $$
$$ e_9 = Y_9 - \hat{Y}_9 = 97 - 96.5 = 0.5 $$


$$ \hat{Y}_9 = 31.5 + .546(119) $$
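The predictions and residuals for the three dealers can be reproduced with the rounded equation Y hat = 31.5 + .546X. This is a sketch only: the X values and actual Y values are those shown in the slides, while the names `predict` and `dealers` are hypothetical.

```python
A_HAT = 31.5    # estimated intercept (rounded, as used in the slides)
B_HAT = 0.546   # estimated slope (rounded, as used in the slides)

def predict(x):
    """Predicted Y (Y hat) for a given X from the fitted line."""
    return A_HAT + B_HAT * x

# dealer number -> (X, actual Y), values as shown in the slides
dealers = {7: (165, 129), 3: (95, 80), 9: (119, 97)}

results = {}
for dealer, (x, actual) in dealers.items():
    y_hat = predict(x)
    residual = actual - y_hat   # e_i = Y_i - Y hat_i
    results[dealer] = (y_hat, residual)
    print(f"Dealer {dealer}: Y hat = {y_hat:.1f}, residual = {residual:.1f}")
```

A positive residual (Dealer 7) means the line under-predicts the actual value; a negative residual (Dealer 3) means it over-predicts.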

F-test (regression)
A procedure to determine whether there is more variability explained by the regression or unexplained by the regression.
Analysis of variance summary table

Total deviation can be partitioned into two parts


Total deviation = Deviation explained by the regression + Deviation unexplained by the regression

"We are always acting on what has just finished happening. It happened at least 1/30th of a second ago. We think we're in the present, but we aren't. The present we know is only a movie of the past."
Tom Wolfe in The Electric Kool-Aid Acid Test

PARTITIONING THE VARIANCE

$$ (Y_i - \bar{Y}) = (\hat{Y}_i - \bar{Y}) + (Y_i - \hat{Y}_i) $$
Total deviation = Deviation explained by the regression + Deviation unexplained by the regression (residual error)

$\bar{Y}$ = Mean of the total group
$\hat{Y}$ = Value predicted with regression equation
$Y_i$ = Actual value

$$ \sum (Y_i - \bar{Y})^2 = \sum (\hat{Y}_i - \bar{Y})^2 + \sum (Y_i - \hat{Y}_i)^2 $$
Total variation = Explained variation + Unexplained variation (residual)

SUM OF SQUARES

$$ SS_t = SS_r + SS_e $$

Coefficient of Determination - r2
The proportion of variance in Y that is explained by X (or vice versa)
A measure obtained by squaring the correlation coefficient; that proportion of the total variance of a variable that is accounted for by knowing the value of another variable.

Coefficient of Determination - r2

$$ r^2 = \frac{SS_r}{SS_t} = 1 - \frac{SS_e}{SS_t} $$
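The partition of the total sum of squares, and the two equivalent expressions for the coefficient of determination, can be checked numerically. The sketch below uses hypothetical data and a hypothetical helper `partition_sums_of_squares`; it fits the line with the summation formulas and then computes the total, regression, and error sums of squares.

```python
def partition_sums_of_squares(xs, ys):
    """Fit Y = a + bX by least squares and return (SSt, SSr, SSe)."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * (sum_x / n)

    y_bar = sum_y / n
    y_hats = [a + b * x for x in xs]
    ss_total = sum((y - y_bar) ** 2 for y in ys)                     # total variation
    ss_regression = sum((y_hat - y_bar) ** 2 for y_hat in y_hats)    # explained
    ss_error = sum((y - y_hat) ** 2 for y, y_hat in zip(ys, y_hats)) # unexplained
    return ss_total, ss_regression, ss_error

# Hypothetical data (not from the slides).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

ss_t, ss_r, ss_e = partition_sums_of_squares(xs, ys)
r_squared = ss_r / ss_t
r_squared_alt = 1 - ss_e / ss_t
print(round(ss_t, 2), round(ss_r, 2), round(ss_e, 2), round(r_squared, 2))
```

For this data the explained and unexplained parts add up to the total, and both expressions give the same proportion of explained variance.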

Source of variation          Degrees of freedom   Sum of squares   Mean squared
Explained by regression      k - 1                SSr              SSr / (k - 1)
Unexplained by regression    n - k                SSe              SSe / (n - k)

where k = number of estimated constants (variables) and n = number of observations

r2 in the example

$$ r^2 = \frac{3{,}398.49}{3{,}882.4} = .875 $$
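The analysis of variance summary and the example value of r squared can be tied together in a few lines. This is a sketch under assumptions: the two sums of squares are the figures shown in the example, SSe is derived here as SSt minus SSr, n = 15 is inferred from the slope calculation, k = 2 counts the two estimated constants a and b, and the F ratio is taken as the usual ratio of the two mean squares, which the slides do not write out.

```python
ss_regression = 3398.49   # SSr, from the example
ss_total = 3882.4         # SSt, from the example
ss_error = ss_total - ss_regression   # SSe, derived here (not shown in the slides)

n = 15   # assumed number of observations
k = 2    # number of estimated constants (a and b)

r_squared = ss_regression / ss_total

mean_square_regression = ss_regression / (k - 1)   # SSr / (k - 1)
mean_square_error = ss_error / (n - k)             # SSe / (n - k)
f_ratio = mean_square_regression / mean_square_error

print(round(r_squared, 3), round(ss_error, 2), round(f_ratio, 1))
```

A large F ratio indicates that far more variability is explained by the regression than is left unexplained; judging significance would still require comparing it with a critical F value for (k - 1, n - k) degrees of freedom.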

MULTIPLE REGRESSION
EXTENSION OF BIVARIATE REGRESSION
MULTIDIMENSIONAL WHEN THREE OR MORE VARIABLES ARE INVOLVED
SIMULTANEOUSLY INVESTIGATES THE EFFECT OF TWO OR MORE VARIABLES ON A SINGLE DEPENDENT VARIABLE
DISCUSSED IN CHAPTER 24
