
BINARY CHOICE MODELS: LINEAR PROBABILITY MODEL

Why do some people go to college while others do not?

Why do some women enter the labor force while others do not?

Why do some people buy houses while others rent?

Why do some people migrate while others stay?
Financial economists are often interested in the factors behind the decision-making of individuals and enterprises; the questions above are examples.


The models that have been developed for this purpose are known as qualitative response or binary choice models. The outcome, which we will denote Y, is assigned the value 1 if the event occurs and 0 otherwise.


Models with more than two possible outcomes have also been developed, but we will
confine our attention to binary choice models.


$$p_i = p(Y_i = 1) = \beta_1 + \beta_2 X_i$$

The simplest binary choice model is the linear probability model where, as the name
implies, the probability of the event occurring, p, is assumed to be a linear function of a set
of explanatory variables.

[Figure: the line p = β1 + β2Xi plotted against Xi, with Y, p on the vertical axis and intercept β1]

Graphically, the relationship is as shown, if there is just one explanatory variable.


Of course, p is unobservable. One has data on only the outcome, Y. In the linear probability model, Y is used, in the manner of a dummy variable, as the dependent variable.

Why do some people graduate from high school while others drop out?

As an illustration, we will take the question shown above. We will define a variable GRAD
which is equal to 1 if the individual graduated from high school, and 0 otherwise.

. g GRAD=0

. replace GRAD=1 if S>11
(523 real changes made)

. reg GRAD ASVABC

  Source |       SS       df       MS                  Number of obs =     570
---------+------------------------------               F(  1,   568) =  112.59
   Model |  7.13422753     1  7.13422753               Prob > F      =  0.0000
Residual |  35.9903339   568  .063363264               R-squared     =  0.1654
---------+------------------------------               Adj R-squared =  0.1640
   Total |  43.1245614   569   .07579009               Root MSE      =  .25172

------------------------------------------------------------------------------
    GRAD |      Coef.   Std. Err.       t     P>|t|      [95% Conf. Interval]
---------+--------------------------------------------------------------------
  ASVABC |   .0121518   .0011452     10.611   0.000      .0099024     .0144012
   _cons |   .3081194   .0583932      5.277   0.000      .1934264     .4228124
------------------------------------------------------------------------------

The Stata output above shows the construction of the variable GRAD. It is first set to 0 for
all respondents, and then changed to 1 for those who had more than 11 years of schooling.
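As an aside, the same variable can be constructed in a single command using a logical expression. This is a minimal sketch, not part of the original output; note that Stata treats a missing value of S as larger than any number, so missing cases are excluded explicitly:

. g GRAD = (S > 11) if !missing(S)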


Turning to the regression itself, the output suggests that every additional point on the ASVABC score increases the probability of graduating by 0.012, that is, by 1.2 percentage points.


The intercept has no sensible interpretation. Taken literally, it suggests that a respondent with an ASVABC score of 0 would have a 0.31 probability of graduating. However, a score of 0 is not actually possible, so the fitted line should not be evaluated there.
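To make the fitted relationship concrete, here is a worked example using the estimates above (the score of 50 is chosen purely for illustration):

$$\hat{p} = 0.3081 + 0.0122 \times 50 \approx 0.92$$

A respondent with an ASVABC score of 50 therefore has a fitted probability of graduating of about 0.92. Note also that the fitted value exceeds 1 for any score above $(1 - 0.3081)/0.0122 \approx 57$, a point we return to below.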


Unfortunately, the linear probability model has some serious shortcomings. First, there are
problems with the disturbance term.

$$Y_i = E(Y_i) + u_i$$

As usual, the value of the dependent variable Yi in observation i has a nonstochastic component and a random component. The nonstochastic component depends on Xi and the parameters. The random component is the disturbance term.

$$E(Y_i) = 1 \cdot p_i + 0 \cdot (1 - p_i) = p_i = \beta_1 + \beta_2 X_i$$

The nonstochastic component in observation i is its expected value in that observation. This is simple to compute, because Yi can take only two values: 1 with probability pi and 0 with probability (1 − pi). The expected value in observation i is therefore β1 + β2Xi.

$$Y_i = \beta_1 + \beta_2 X_i + u_i$$

This means that we can rewrite the model as shown.



The probability function is thus also the nonstochastic component of the relationship
between Y and X.

$$Y_i = 1: \quad u_i = 1 - \beta_1 - \beta_2 X_i$$
$$Y_i = 0: \quad u_i = -\beta_1 - \beta_2 X_i$$

In observation i, for Yi to be 1, ui must be (1 − β1 − β2Xi). For Yi to be 0, ui must be (−β1 − β2Xi).

[Figure: observations A (Y = 1) and B (Y = 0) relative to the line p = β1 + β2Xi, at vertical distances 1 − β1 − β2Xi above and β1 + β2Xi below the line]

The two possible values, which give rise to the observations A and B, are illustrated in the
diagram. Since u does not have a normal distribution, the standard errors and test
statistics are invalid. Its distribution is not even continuous.

$$\sigma_{u_i}^2 = (\beta_1 + \beta_2 X_i)(1 - \beta_1 - \beta_2 X_i)$$

Further, it can be shown that the population variance of the disturbance term in observation i is given by (β1 + β2Xi)(1 − β1 − β2Xi). This changes with Xi, and so the distribution is heteroscedastic.
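The slide asserts this variance rather than deriving it, so here is the short calculation filling the gap. The disturbance ui = Yi − pi takes the value (1 − pi) with probability pi and (−pi) with probability (1 − pi), so

$$E(u_i) = p_i(1 - p_i) + (1 - p_i)(-p_i) = 0$$

$$\sigma_{u_i}^2 = E(u_i^2) = p_i(1 - p_i)^2 + (1 - p_i)p_i^2 = p_i(1 - p_i) = (\beta_1 + \beta_2 X_i)(1 - \beta_1 - \beta_2 X_i)$$

This is just the variance of a Bernoulli(pi) variable: largest where pi = 0.5 and shrinking toward 0 as pi approaches 0 or 1.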


Yet another shortcoming of the linear probability model is that it may predict probabilities of more than 1: the fitted line rises above 1 for sufficiently large values of Xi. It may also predict probabilities of less than 0.


. predict PROB

The Stata command for saving the fitted values from a regression is predict, followed by the
name that you wish to give to the fitted values. We are calling them PROB.
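Before tabulating, a quick check for fitted values outside the unit interval can be run with standard Stata commands; a minimal sketch, not part of the original output:

. sum PROB
. count if PROB > 1
. count if PROB < 0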

. tab PROB if PROB>1

       PROB |      Freq.     Percent        Cum.
------------+-----------------------------------
   1.000773 |         33       18.75       18.75
   1.012925 |         16        9.09       27.84
   1.025077 |         21       11.93       39.77
   1.037229 |         26       14.77       54.55
    1.04938 |         35       19.89       74.43
   1.061532 |         20       11.36       85.80
   1.073684 |         16        9.09       94.89
   1.085836 |          3        1.70       96.59
   1.097988 |          6        3.41      100.00
------------+-----------------------------------
      Total |        176      100.00

tab is the Stata command for tabulating the values of a variable, and for cross-tabulating
two or more variables. We see that there are 176 observations where the fitted value is
greater than 1.
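Note the scale of the problem: 176 of the 570 observations, roughly 31 percent, have fitted "probabilities" exceeding 1.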

. tab PROB if PROB<0
no observations

In this example there were no fitted values of less than 0.



The main advantage of the linear probability model over logit and probit analysis, the
alternatives considered in the next two sequences, is that it is much easier to fit. For this
reason it used to be recommended for initial, exploratory work.


However, this consideration is no longer relevant, now that computers are so fast and
powerful, and logit and probit are typically standard features of regression applications.
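For comparison, fitting the logit and probit counterparts of the regression above takes one command each in Stata; a minimal sketch (output omitted):

. logit GRAD ASVABC
. probit GRAD ASVABC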
