Documenti di Didattica
Documenti di Professioni
Documenti di Cultura
Regression Models
To accompany
Quantitative Analysis for Management, Tenth Edition,
by Render, Stair, and Hanna 2008 Prentice-Hall, Inc.
Power Point slides created by Jeff Heyl 2009 Prentice-Hall, Inc.
Learning Objectives
After completing this chapter, students will be able to:
4.1 Introduction
4.2 Scatter Diagrams
4.3 Simple Linear Regression
4.4 Measuring the Fit of the Regression
Model
4.5 Using Computer Software for
Regression
4.6 Assumptions of the Regression
Model
10
8
Sales ($100,000)
| | | | | | | |
0 1 2 3 4 5 6 7 8
Payroll ($100 million)
Figure 4.1
where
Y = dependent variable (response)
X = independent variable (predictor or explanatory)
0 = intercept (value of Y when X = 0)
1 = slope of the regression line
= random error
Y b0 b1 X
where
^
Y = dependent variable (response)
X = independent variable (predictor or explanatory)
b0 = intercept (value of Y when X = 0)
b1 = slope of the regression line
Y = Sales
X = Area payroll
The line chosen in Figure 4.1 is the one that
minimizes the errors
X
X
average (mean) of X values
n
Y
Y
average (mean) of Y values
n
b1
( X X )(Y Y )
(X X ) 2
b0 Y b1 X
2009 Prentice-Hall, Inc. 4 14
Triple A Construction
Regression calculations
Y X (X X)2 (X X)(Y Y)
6 3 (3 4)2 = 1 (3 4)(6 7) = 1
8 4 (4 4)2 = 0 (4 4)(8 7) = 0
9 6 (6 4)2 = 4 (6 4)(9 7) = 4
5 4 (4 4)2 = 0 (4 4)(5 7) = 0
4.5 2 (2 4)2 = 4 (2 4)(4.5 7) = 5
9.5 5 (5 4)2 = 1 (5 4)(9.5 7) = 2.5
Y = 42 X = 24 (X X)2 = 10 (X X)(Y Y) = 12.5
Y = 42/6 = 7 X = 24/6 = 4
Table 4.2
X
X 24
4
6 6
Y
Y 42
7
6 6
b1
( X X )(Y Y ) 12.5
1.25
(X X ) 2
10
b0 Y b1 X 7 (1.25 )( 4 ) 2
Therefore Y 2 1.25 X
2009 Prentice-Hall, Inc. 4 16
Triple A Construction
Regression calculations
X
X 24
4
6 6 sales = 2 + 1.25(payroll)
Y
Y 42
7
If the payroll next
6 6 year is $600 million
b1
) 2 12
( X X )(Y YY
1.5
.25(6) 9.5 or $ 950,000
1.25
(X X ) 2
10
b0 Y b1 X 7 (1.25 )( 4 ) 2
Therefore Y 2 1.25 X
2009 Prentice-Hall, Inc. 4 17
Measuring the Fit
of the Regression Model
Regression models can be developed
for any variables X and Y
How do we know the model is actually
helpful in predicting Y based on X?
We could just take the average error, but
the positive and negative errors would
cancel each other out
Three measures of variability are
SST Total variability about the mean
SSE Variability about the regression line
SSR Total variability that is explained by
the model
2009 Prentice-Hall, Inc. 4 18
Measuring the Fit
of the Regression Model
Sum of the squares total
SST (Y Y )2
An important relationship
SST SSR SSE
(Y Y)2 =
(Y Y)2 = 22.5 (Y Y)
^2 = 6.875 ^
15.625
Y=7 SST = 22.5 SSE = 6.875 SSR = 15.625
Table 4.3
SST = 22.5
Sum of the squared error SSE = 6.875
SSR=2 15.625
SSE e (Y Y )
2
An important relationship
SST SSR SSE
^
8 YY
YY
Sales ($100,000)
^
6 YY Y
| | | | | | | |
0 1 2 3 4 5 6 7 8
Payroll ($100 million)
Figure 4.2
r r2
r 0.6944 0.8333
Program 4.1A
2009 Prentice-Hall, Inc. 4 26
Using Computer Software
for Regression
Program 4.1B
2009 Prentice-Hall, Inc. 4 27
Using Computer Software
for Regression
Program 4.1C
2009 Prentice-Hall, Inc. 4 28
Using Computer Software
for Regression
Program 4.1D
2009 Prentice-Hall, Inc. 4 29
Using Computer Software
for Regression
Correlation coefficient is
called Multiple R in Excel
Program 4.1D
2009 Prentice-Hall, Inc. 4 30
Assumptions of the Regression Model
Figure 4.4A
2009 Prentice-Hall, Inc. 4 32
Residual Plots
Nonconstant error variance
Error
Figure 4.4B
2009 Prentice-Hall, Inc. 4 33
Residual Plots
Nonlinear relationship
Error
Figure 4.4C
2009 Prentice-Hall, Inc. 4 34
Estimating the Variance
Errors are assumed to have a constant
variance ( 2), but we usually dont know
this
It can be estimated using the mean
squared error (MSE),
MSE s2
SSE
s MSE
2
n k 1
where
n = number of observations in the sample
k = number of independent variables
0.05
F = 7.71 9.09
Figure 4.5
2009 Prentice-Hall, Inc. 4 45
Analysis of Variance (ANOVA) Table
Table 4.4
2009 Prentice-Hall, Inc. 4 46
ANOVA for Triple A Construction
Program 4.1D
(partial)
Y b0 b1 X 1 b2 X 2 ... bk X k
where
Y = predicted value of Y
b0 = sample intercept (and is an estimate
of 0)
bi = sample coefficient of the ith variable
(and is an estimate of i)
Program 4.2
Y 146631 44 X 1 2899 X 2
2009 Prentice-Hall, Inc. 4 52
Evaluating Multiple Regression Models
Program 4.3
2009 Prentice-Hall, Inc. 4 58
Jenny Wilson Realty
Y 121,658 56.43 X 1 3,962 X 2 33,162 X 3 47,369 X 4
F-value
indicates
significance
Low p-values
indicate each
variable is
significant
Program 4.3
2009 Prentice-Hall, Inc. 4 59
Model Building
The best model is a statistically significant
model with a high r2 and few variables
As more variables are added to the model,
the r2-value usually increases
For this reason, the adjusted r2 value is
often used to determine the usefulness of
an additional variable
The adjusted r2 takes into account the
number of independent variables in the
model
SST /( n 1)
* * * *
*
** * * ** *
*
*** * ** *
Linear relationship Nonlinear relationship
20
15
10
5
0 | | | | |
1.00 2.00 3.00 4.00 5.00
Weight (1,000 lb.)
Figure 4.6A
Program 4.4
35
30
25
MPG
20
15
10
5
0 | | | | |
1.00 2.00 3.00 4.00 5.00
Figure 4.6B Weight (1,000 lb.)
2009 Prentice-Hall, Inc. 4 67
Colonel Motors
The nonlinear model is a quadratic model
The easiest way to work with this model is to
develop a new variable
X 2 ( weight)2
This gives us a model that can be solved with
linear regression software
Y b0 b1 X 1 b2 X 2
Program 4.5