
Alex's ECON241 Notes

Alex Cooper
June 8, 2011
Contents

1 Basics
2 Two-Variable Regression
3 Nonlinear models
4 Analysis of variance (ANOVA)
5 Multiple regression
6 Reporting a regression model
7 Heteroscedasticity
8 Auto-correlation
9 Multicollinearity
10 Qualitative Analysis: Dummy Variables
1 Basics
1.1 Parameters
Sample mean $\bar{X}$:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$$
Population mean $\mu$:
$$\mu = \sum_x x\,P(x) \qquad \mu = E[X] = \int x\,p(x)\,dx$$
Sample variance $S^2$:
$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2$$
Population variance $\sigma^2$:
$$\sigma^2 = \sum_{i}^{n}(X_i - \mu)^2 P(X_i) \qquad \sigma^2 = E\left[(X-\mu)^2\right] = \int (x-\mu)^2\,p(x)\,dx$$
1.2 Properties of Estimators
An estimator $\hat{\theta}$ for a population parameter $\theta$ is unbiased if its expected value equals the true value of the parameter:
$$E[\hat{\theta}] = \theta$$
If $P(\hat{\theta} < \theta) = P(\hat{\theta} > \theta)$, then $\theta$ is the population median of the distribution of $\hat{\theta}$. If $E[\hat{\theta}] = \theta$, then $\theta$ is the mean of the population distribution of $\hat{\theta}$.
In the case that $E[\hat{\theta}] \neq \theta$, we call $\hat{\theta}$ biased. The bias of $\hat{\theta}$ is the difference between its expectation and the true value:
$$\mathrm{BIAS}(\hat{\theta}) = E[\hat{\theta}] - \theta$$
Efficiency is both a relative and an absolute concept.
Consider two unbiased estimators, $\hat{\theta}_1$ and $\hat{\theta}_2$. If $V[\hat{\theta}_1] < V[\hat{\theta}_2]$, then $\hat{\theta}_1$ is relatively more efficient than $\hat{\theta}_2$.
$\hat{\theta}$ is absolutely efficient if it is at least as efficient as any other unbiased estimator $\tilde{\theta}$ of $\theta$.
For $X \sim N[\mu, \sigma^2]$, it can be shown that the minimum variance of an unbiased estimator of $\mu$ is $\sigma^2/n$. Therefore, $\bar{X}$ is the efficient estimator of $\mu$.
An estimator is said to be linear if it is a linear function of all the sample values. Thus $\bar{X}$ is a linear estimator of $\mu$:
$$\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i = \frac{1}{n}X_1 + \frac{1}{n}X_2 + \cdots + \frac{1}{n}X_n$$
An estimator is a best linear unbiased estimator (B.L.U.E.) if it is (1) linear, (2) unbiased, and (3) efficient.
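As an illustration of these formulas (not part of the original notes), the following Python sketch computes the sample mean and the sample variance with the $n-1$ divisor for a small made-up sample; the data values are arbitrary and numpy is assumed to be available.

import numpy as np

x = np.array([4.2, 5.1, 3.8, 6.0, 5.5])   # arbitrary sample values

n = len(x)
x_bar = x.sum() / n                        # sample mean, (1/n) * sum(X_i)
s2 = ((x - x_bar) ** 2).sum() / (n - 1)    # sample variance with the n-1 divisor

# np.var(x, ddof=1) uses the same n-1 divisor and should agree
print(x_bar, s2, np.var(x, ddof=1))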
2 Two-Variable Regression
The model is given as a Population Regression Function, where $\varepsilon_i$ is the $i$th value of an unobservable random error which has an average value of zero:
$$Y_i = (\beta_1 + \beta_2 X_i) + \varepsilon_i \qquad i = 1, 2, \ldots, n$$
We estimate this model with a Sample Regression Function:
$$Y_i = b_1 + b_2 X_i + e_i$$
where $b_1$ estimates $\beta_1$, $b_2$ estimates $\beta_2$, and $e_i$ estimates $\varepsilon_i$. $e_i$ is referred to as the $i$th residual.
This yields the estimated model:
$$\hat{Y}_i = b_1 + b_2 X_i$$
And note that the residual is the difference between the actual value of $Y_i$ and the estimated value $\hat{Y}_i$:
$$e_i = Y_i - \hat{Y}_i$$
2.1 Assumptions
1. Y is an approximate linear function of X:
   $Y_i = \beta_1 + \beta_2 X_i + \varepsilon_i$
2. The expected value of the error term is zero:
   $E[\varepsilon_i] = 0$
3. The variance of the error term is constant:
   $V[\varepsilon_i] = \sigma^2$
4. The error terms are statistically independent:
   $E[\varepsilon_i \varepsilon_j] = 0 \quad (i \neq j)$
5. The explanatory variable X and the error term are statistically independent:
   $E[X_i \varepsilon_j] = 0 \quad (\text{all } i, j)$
6. The error terms are normally distributed:
   $\varepsilon \sim \text{Normal}$

Note that assumptions 2, 3, 4 and 6 jointly imply that:
$$\varepsilon \sim N[0, \sigma^2]$$
2.2 Terminology
Y: Explained variable, dependent variable, regressand
X: Explanatory variable, independent variable, regressor
$\varepsilon$: Random error term, disturbance term
e: Estimated value of $\varepsilon$, residual
2.3 Estimation of regression model
Ordinary least squares (OLS) estimators for $\beta_1$ and $\beta_2$:
$$b_2 = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} \qquad b_1 = \bar{Y} - b_2\bar{X}$$
Different forms of the $b_2$ equation:
$$b_2 = \frac{\sum XY - n\bar{X}\bar{Y}}{\sum X^2 - n\bar{X}^2} = \frac{n\sum XY - (\sum X)(\sum Y)}{n\sum X^2 - (\sum X)^2} = \frac{\sum (X-\bar{X})(Y-\bar{Y})}{\sum (X-\bar{X})^2}$$
The last version is often written as $\sum xy / \sum x^2$.
If the assumptions of the regression model are valid, then the OLS estimates are BLUE:
1. Both coefficients $b_1$ and $b_2$ are linear functions of Y.
2. OLS estimates are unbiased: $E[b_j] = \beta_j$.
3. OLS estimators are efficient (Gauss-Markov Theorem): $V[b_j] \leq V[\tilde{\beta}_j]$, where $\tilde{\beta}_j$ is any other linear unbiased estimator of $\beta_j$.
2.4 Estimation of the error variance $\sigma^2_\varepsilon$
The sample estimator $S^2_\varepsilon$ is an unbiased estimator of the variance of the error term, $\sigma^2_\varepsilon$:
$$S^2_\varepsilon = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2 = \frac{1}{n-2}\sum_{i=1}^{n}\left(Y_i - b_1 - b_2 X_i\right)^2$$
2.5 Coefficient of determination $R^2$
Measures goodness of fit of the regression model.
$$R^2 = \frac{\text{Variation in } Y \text{ explained by model}}{\text{Total variation in } Y} = \frac{\text{Total variation in } Y - \text{Unexplained variation in } Y}{\text{Total variation in } Y} = \frac{\sum\left(Y-\bar{Y}\right)^2 - \sum e^2}{\sum\left(Y-\bar{Y}\right)^2} = 1 - \frac{\sum e^2}{\sum\left(Y-\bar{Y}\right)^2}$$
$R^2$ is the proportion of variability in Y explained by the regression model (equivalently, by the variability of X).
2.6 Issues
1. The relationship between X and Y may not be approximately linear.
2. Y may be an approximate linear function of more than one explanatory variable.
3. The assumptions involving the error term, $\varepsilon$, may not be valid.

Also, we need at least two points to define a line. But if n = 2, then $S^2_\varepsilon$ is not defined (division by zero). So using OLS we can only estimate the two coefficients if n > 2. In practice, we need n much larger than that.
2.7 Assumptions about X
We don't require that X is fixed in repeated samples (often required for basic regression). However, it is required that:
1. X is statistically independent of the error term, and
2. X is a stationary variable.
The population mean and variance of X and Y need to be constant over time; otherwise the regression may be spurious. Spurious regression is beyond the scope of this course.
(See Gujarati section 6.1)
2.8 Estimating $\beta_2$
For the slope $\beta_2$, we can show that:
1. The population mean of $b_2$ is $\beta_2$:
   $$E[b_2] = \beta_2$$
2. The population variance of $b_2$, $\sigma^2_{b_2}$, is
   $$V[b_2] = \sigma^2_{b_2} = \frac{\sigma^2_\varepsilon}{\sum_i\left(X_i - \bar{X}\right)^2}$$
3. And $b_2$ is normally distributed:
   $$b_2 \sim N\left(\beta_2, \sigma^2_{b_2}\right) \qquad \frac{b_2 - \beta_2}{\sigma_{b_2}} \sim Z$$

Since $\sigma^2_{b_2}$ is unknown, estimate it with $S^2_{b_2}$, which is an unbiased estimator:
$$S^2_{b_2} = \frac{S^2_\varepsilon}{\sum_i\left(X_i - \bar{X}\right)^2} \qquad S^2_\varepsilon = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2$$
Note the (n - 2): because we are estimating a line, and a line needs two points to define it, estimating the two coefficients uses up two degrees of freedom.
It can be shown that:
$$\frac{b_2 - \beta_2}{S_{b_2}} = t \sim t_{n-2}$$
Notation: $S_{b_2} = se(b_2)$.
$$100(1-\alpha)\% \text{ C.I. for } \beta_2 = b_2 \pm t_{n-2,\,\alpha/2}\, se(b_2)$$
Use the same $se(b_2)$ in the test statistic when drawing inferences about $\beta_2$. Under $H_0$, we have:
$$\frac{b_2 - \beta_2}{S_{b_2}} = t \sim t_{n-2}$$
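A sketch of the inference steps for $b_2$ (illustrative only, not from the notes; the data are invented and scipy is assumed to be available for the t critical value):

import numpy as np
from scipy import stats

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
Y = np.array([2.3, 3.8, 6.4, 7.9, 10.2, 11.7])
n = len(X)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()

e = Y - (b1 + b2 * X)                        # residuals
s2_eps = (e ** 2).sum() / (n - 2)            # S^2_eps with the n-2 divisor
se_b2 = np.sqrt(s2_eps / (x ** 2).sum())     # se(b2)

t_stat = b2 / se_b2                          # test of H0: beta2 = 0
t_crit = stats.t.ppf(0.975, df=n - 2)        # two-sided 5% critical value
ci = (b2 - t_crit * se_b2, b2 + t_crit * se_b2)
print(b2, se_b2, t_stat, ci)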
2.9 Estimating $\beta_1$
It can be shown that:
1. The population mean of $b_1$ is $\beta_1$:
   $$E[b_1] = \beta_1$$
2. The variance of $b_1$, $\sigma^2_{b_1}$, is
   $$V[b_1] = \sigma^2_{b_1} = \frac{\sigma^2_\varepsilon \sum_{i=1}^{n} X_i^2}{n\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}$$
3. And $b_1$ is normally distributed.

Therefore, we have:
$$b_1 \sim N\left(\beta_1, \sigma^2_{b_1}\right) \qquad \frac{b_1 - \beta_1}{\sigma_{b_1}} \sim Z$$
Since $\sigma^2_{b_1}$ is unknown, estimate it with $S^2_{b_1}$, which is unbiased:
$$S^2_{b_1} = \frac{S^2_\varepsilon \sum_{i=1}^{n} X_i^2}{n\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2} \qquad S^2_\varepsilon = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2$$
We can also write $S_{b_1}$ as $se(b_1)$.
$$100(1-\alpha)\% \text{ C.I. for } \beta_1 = b_1 \pm t_{n-2,\,\alpha/2}\, se(b_1)$$
Use the same $se(b_1)$ in the test statistic when drawing inferences about $\beta_1$. Under $H_0$, we have:
$$\frac{b_1 - \beta_1}{S_{b_1}} = t \sim t_{n-2}$$
2.10 Prediction interval for $\hat{Y}$
Predicted values are found by plugging values for X into the model, i.e. at $X = X_p$:
$$\hat{Y}_p = b_1 + b_2 X_p$$
But since $b_1$ and $b_2$ are random variables, so is $\hat{Y}$. The forecast error $\left(\hat{Y}_p - Y_p\right)$ is the difference between the predicted value $\hat{Y}_p$ and the actual value at $X_p$, $Y_p$.
It can be shown that:
$$\left(\hat{Y}_p - Y_p\right) \sim N\left(0, \sigma^2_{FE}\right)$$
where the population variance of the forecast error is:
$$\sigma^2_{FE} = \sigma^2_\varepsilon\left[1 + \frac{1}{n} + \frac{\left(X_p - \bar{X}\right)^2}{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}\right]$$
Standardization gives:
$$\frac{\hat{Y}_p - Y_p}{\sigma_{FE_p}} \sim Z$$
Since $\sigma^2_{FE_p}$ is unknown, we estimate it with the unbiased estimator $S^2_{FE_p}$:
$$S^2_{FE_p} = S^2_\varepsilon\left[1 + \frac{1}{n} + \frac{\left(X_p - \bar{X}\right)^2}{\sum_{i=1}^{n}\left(X_i - \bar{X}\right)^2}\right] \qquad S^2_\varepsilon = \frac{1}{n-2}\sum_{i=1}^{n} e_i^2$$
It can be shown that:
$$\frac{\hat{Y}_p - Y_p}{S_{FE_p}} = t \sim t_{n-2}$$
Therefore, when $X = X_p$, the $100(1-\alpha)\%$ prediction interval is given by:
$$100(1-\alpha)\% \text{ P.I. for } Y_p = \hat{Y}_p \pm t_{n-2,\,\alpha/2}\, S_{FE_p}$$
Prediction bands vary in width: they are narrowest at $X_p = \bar{X}$ and widen the further $X_p$ is from $\bar{X}$.
Note: this is the prediction interval for an individual value $Y_p$, not for the mean $E[Y_p]$, which isn't part of this course.
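A sketch of the prediction-interval calculation (illustrative only; the data and the prediction point $X_p$ are made up, and scipy is assumed for the t critical value):

import numpy as np
from scipy import stats

X = np.array([10.5, 11.0, 11.8, 12.5, 13.0, 13.6, 14.2, 15.0])
Y = np.array([95.0, 90.0, 84.0, 80.0, 77.0, 72.0, 70.0, 65.0])
n = len(X)

x = X - X.mean()
b2 = (x * (Y - Y.mean())).sum() / (x ** 2).sum()
b1 = Y.mean() - b2 * X.mean()
e = Y - (b1 + b2 * X)
s2_eps = (e ** 2).sum() / (n - 2)

Xp = 12.0                                          # point at which to predict (arbitrary)
Yp_hat = b1 + b2 * Xp
s2_fe = s2_eps * (1 + 1/n + (Xp - X.mean())**2 / (x ** 2).sum())   # S^2_FE
t_crit = stats.t.ppf(0.975, df=n - 2)
lower = Yp_hat - t_crit * np.sqrt(s2_fe)
upper = Yp_hat + t_crit * np.sqrt(s2_fe)
print(Yp_hat, (lower, upper))                      # point prediction and 95% prediction interval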
2.11 Significance testing
We test the significance of the model by testing whether $\beta_2$ could be zero:
$$H_0: \beta_2 = 0 \qquad H_1: \beta_2 \neq 0$$
$$t = \frac{b_2 - \beta_2}{se(b_2)} \sim t_{n-2}$$
3 Nonlinear models
Theory might suggest that the relationship between regressand and regressors is not linear. However, we can often transform the relationship into one that is approximately linear.
Lin-log model:
$$Y = \beta_1 + \beta_2 \ln(X) + \varepsilon$$
Log-log model (aka log-linear or double-log):
$$\ln(Y) = \beta_1 + \beta_2 \ln(X) + \varepsilon$$
For quadratics, we regress:
$$Y = \beta_1 + \beta_2 X + \beta_3 Z + \varepsilon \qquad \text{where } Z = X^2$$
We just require that Y has a linear relationship with each regressor; in other words, for each regressor $X_i$, $\partial Y / \partial X_i$ is a constant.
NB: many relationships cannot use OLS to estimate $\beta_1$ and $\beta_2$, for example:
$$Y = \beta_1 X^{\beta_2} + \varepsilon \qquad Y = (\beta_1 + X)^{\beta_2} + \varepsilon$$
Example of a log-log relationship given in lectures: the Cobb-Douglas production function. This is a short-run production function (short-run because the capital stock is presumed to be fixed):
$$Q = A L^{\alpha} e^{\varepsilon}$$
which can be easily transformed into a double-log model:
$$\ln Q = \ln A + \alpha \ln L + \varepsilon$$
where $\beta_1 = \ln A$ and $\beta_2 = \alpha$, giving:
$$\ln Q = \beta_1 + \beta_2 \ln L + \varepsilon$$
3.1 Elasticity
Elasticity is defined as the percentage change in Y with respect to a percentage change in X. This turns out to be:
$$\eta = \frac{dY}{dX}\cdot\frac{X}{Y} = \frac{d\ln Y}{d\ln X}$$
In a double-log model $\ln Y = \beta_1 + \beta_2 \ln X + \varepsilon$, the elasticity of Y with respect to X corresponds to the slope coefficient, $\beta_2$. Note the elasticity is constant: a 1% change in X leads to a change in Y of $\beta_2\%$ (on average).
3.1.1 Interpretation of $\beta_2$
A change in $\ln X$ of 1 unit leads to a change in $\ln Y$ of $\beta_2$ units, on average. $\beta_2$ is the elasticity of Y w.r.t. X. A 1% change in X leads to a change in Y of $\beta_2\%$, on average.
3.1.2 Interpretation of $\beta_1$
If $\ln X$ is equal to zero (X = 1), then on average $\ln Y$ will have a value equal to $\beta_1$.
3.1.3 Interpretation of $b_2$
A change in $\ln X$ of 1 unit leads to a change in the estimated value of $\ln Y$ of $b_2$ units, on average. $b_2$ is the estimated elasticity of Y w.r.t. X. A 1% change in X leads to a change in the estimated value of Y of $b_2\%$, on average.
3.1.4 Interpretation of $b_1$
If $\ln X$ is equal to zero (X = 1), then on average the predicted or estimated value of $\ln Y$ will have a value equal to $b_1$.
4 Analysis of variance (ANOVA)
Sample variance of the dependent variable Y:
$$S^2_Y = \frac{1}{n-1}\sum_{i=1}^{n}\left(Y_i - \bar{Y}\right)^2$$
The total sum of squares decomposes as:
$$\sum\left(Y_i - \bar{Y}\right)^2 = \sum\left(\hat{Y}_i - \bar{Y}\right)^2 + \sum e_i^2$$
Total variability of Y = Explained variability of Y + Unexplained variability of Y
Total SS = Explained SS + Unexplained SS
Total SS = SS from regression + SS of error term
Total SS = Explained SS + Residual SS
TSS = ESS + RSS
Degrees of freedom associated with these measures:
Total = Explained + Unexplained
$$(n-1) = (k-1) + (n-k)$$
Coefficient of determination:
$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}} = \frac{ESS}{TSS} = \frac{\sum_i\left(\hat{Y}_i - \bar{Y}\right)^2}{\sum_i\left(Y_i - \bar{Y}\right)^2}$$
4.1 ANOVA table
ANOVA from the mean, with $k \geq 2$:

Source       SS                                 df        MSS = SS/df                                F
Regression   $\sum_i (\hat{Y}_i - \bar{Y})^2$   $k - 1$   $\sum_i (\hat{Y}_i - \bar{Y})^2 / (k-1)$   F
Error        $\sum_i e_i^2$                     $n - k$   $\sum_i e_i^2 / (n-k)$                     -
Total        $\sum_i (Y_i - \bar{Y})^2$         $n - 1$   $\sum_i (Y_i - \bar{Y})^2 / (n-1)$         -

Note the MSS for the regression row doesn't really have a name, because there's no particular use for that number.
Estimator of $\sigma^2_Y$:
$$S^2_Y = \frac{1}{n-1}\sum\left(Y - \bar{Y}\right)^2$$
Estimator of $\sigma^2_\varepsilon$:
$$S^2_\varepsilon = \frac{1}{n-k}\sum e_i^2$$
And significance measures:
$$R^2 = \frac{ESS}{TSS} = \frac{\sum_i\left(\hat{Y}_i - \bar{Y}\right)^2}{\sum_i\left(Y_i - \bar{Y}\right)^2} \qquad F = \frac{ESS/(k-1)}{RSS/(n-k)} = \frac{\sum_i\left(\hat{Y}_i - \bar{Y}\right)^2/(k-1)}{\sum_i e_i^2/(n-k)}$$
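The ANOVA quantities can be computed directly from a fitted model. This sketch (illustrative, not from the notes; the data are invented) builds TSS, ESS, RSS, $R^2$ and F for a two-variable regression, i.e. k = 2:

import numpy as np

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
Y = np.array([3.1, 4.2, 6.0, 7.1, 9.4, 10.2, 12.1])
n, k = len(X), 2                                  # k = number of estimated coefficients

A = np.column_stack([np.ones_like(X), X])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
Y_hat = A @ b
e = Y - Y_hat

TSS = ((Y - Y.mean()) ** 2).sum()
ESS = ((Y_hat - Y.mean()) ** 2).sum()
RSS = (e ** 2).sum()

R2 = ESS / TSS
F = (ESS / (k - 1)) / (RSS / (n - k))
print(TSS, ESS + RSS, R2, F)                      # TSS should equal ESS + RSS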
4.2 ANOVA test
$$H_0: \rho_{XY} = 0 \qquad H_1: \rho_{XY} \neq 0$$
If $H_0$ is true, then it can be shown that the sample statistic F has an F distribution with $(k-1)$ degrees of freedom in the numerator and $(n-k)$ degrees of freedom in the denominator:
$$F = \frac{\text{Explained variation}/(k-1)}{\text{Unexplained variation}/(n-k)} \sim F_{k-1,\,n-k}$$
Note the relationship between F and $R^2$:
$$R^2 = \frac{\text{Explained variation}}{\text{Total variation}}$$
$$F = \frac{\text{Explained variation}/(k-1)}{\text{Unexplained variation}/(n-k)} = \frac{\text{Explained variation}/(k-1)}{\left(\text{Total variation} - \text{Explained variation}\right)/(n-k)} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)}$$
Note that if either the numerator or the denominator degrees of freedom for the F critical value are not available in the tables, then we need to interpolate. Just use linear interpolation intuitively.
TODO: lots left out about rho
4.3 Two-variable case
TODO: lots left out about the relationship between F and the t statistic in the
two variable case.
TODO: see p.25 in topic 13
TODO: ANOVA table
5 Multiple regression
The k-variable multiple linear regression model:
$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_K X_K + \varepsilon$$
Individual instances of this model can be written as:
$$Y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \cdots + \beta_K X_{iK} + \varepsilon_i$$
Sample multiple regression equation:
$$Y_i = b_1 + b_2 X_{i2} + b_3 X_{i3} + \cdots + b_K X_{iK} + e_i$$
Estimated multiple regression model:
$$\hat{Y}_i = b_1 + b_2 X_{i2} + b_3 X_{i3} + \cdots + b_K X_{iK}$$
We choose $b_1, b_2, \ldots, b_K$ to minimize RSS, which is a quadratic function of the parameters. Let the computer do it.
5.1 Assumptions
Assumptions of the multiple regression model (pretty much the same as for two variables):
1. Y is an approximate linear function of $X_2, X_3, \ldots, X_K$.
2. The expected value of the error term is zero: $E[\varepsilon_i] = 0$
3. The population variance of the error term is constant: $V[\varepsilon_i] = \sigma^2$
4. The error terms are statistically independent: $E[\varepsilon_i \varepsilon_j] = 0 \quad (i \neq j)$
5. The explanatory variables $X_2, X_3, \ldots, X_K$ are linearly independent.
6. The explanatory variables are statistically independent of the error term.
7. The error terms are normally distributed: $\varepsilon \sim \text{Normal}$

Note that assumptions 2, 3, 4 and 7 jointly imply that the error terms are normally and independently distributed with a mean of zero and variance $\sigma^2$:
$$\varepsilon \sim \text{N.I.D.}[0, \sigma^2]$$
OLS estimates of the regression coefficients in a multiple regression model are BLUE.
5.2 Variance of estimated parameters
It can be shown that $S^2_\varepsilon$ is an unbiased estimator for $\sigma^2_\varepsilon$:
$$S^2_\varepsilon = \frac{1}{n-k}\sum_{i=1}^{n} e_i^2 = \frac{1}{n-k}\sum_{i=1}^{n}\left(Y_i - \hat{Y}_i\right)^2$$
It can be shown that:
$$t_i = \frac{b_i - \beta_i}{se(b_i)} \sim t_{n-k}$$
$$100(1-\alpha)\% \text{ C.I. for } \beta_i = b_i \pm t_{n-k,\,\alpha/2}\, se(b_i)$$
Coefficient of determination $R^2$, where $r_{Y,\hat{Y}}$ is the sample correlation coefficient between the actual values of Y and the predicted values $\hat{Y}$:
$$R^2 = r^2_{Y,\hat{Y}} \qquad 0 \leq R^2 \leq 1$$
5.3 F test for significance of model
The F test is used to test the joint significance of the parameters, except $\beta_1$:
$$H_0: \beta_2 = 0 \text{ and } \beta_3 = 0 \ldots \text{ and } \beta_K = 0$$
$$H_1: \beta_2 \neq 0 \text{ or } \beta_3 \neq 0 \ldots \text{ or } \beta_K \neq 0$$
If $H_0$ is true, then none of the $(k-1)$ variables is significant in determining Y.
If $H_1$ is true, then at least one of the variables is significant in determining Y.
Under $H_0$:
$$F = \frac{\text{Explained variation}/(k-1)}{\text{Unexplained variation}/(n-k)} = \frac{R^2/(k-1)}{(1-R^2)/(n-k)} \sim F_{k-1,\,n-k}$$
Note that this is a test of whether $R^2$ is significantly greater than 0.
F does not have to be a huge number for the model to be significant: in large samples you can get perfectly good and useful models with very small $R^2$, provided F exceeds its critical value.
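A sketch of the F test for joint significance in a multiple regression (illustrative only; the data are simulated, and scipy is assumed for the p-value):

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n, k = 50, 3                                      # 50 observations; intercept + 2 slopes
X2 = rng.normal(size=n)
X3 = rng.normal(size=n)
Y = 1.0 + 0.8 * X2 + 0.3 * X3 + rng.normal(scale=1.0, size=n)

A = np.column_stack([np.ones(n), X2, X3])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
e = Y - A @ b

R2 = 1 - (e ** 2).sum() / ((Y - Y.mean()) ** 2).sum()
F = (R2 / (k - 1)) / ((1 - R2) / (n - k))
p_value = stats.f.sf(F, k - 1, n - k)             # right-tail probability under H0
print(R2, F, p_value)                             # small p-value -> jointly significant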
5.4 Interpretation of the F test
The F test is a test of the overall significance of the multiple regression model in explaining the variability of Y.
If $H_0$ is rejected, then $b_2, b_3, \ldots, b_K$ are said to be jointly significant. (At least one of $\beta_2, \beta_3, \ldots, \beta_K$ is likely to be non-zero.)
Alternately, if $H_0$ is rejected then $X_2, X_3, \ldots, X_K$ are said to be jointly significant in explaining the variability of Y. (At least one of $X_2, X_3, \ldots, X_K$ is a statistically significant factor in determining Y.)
Also, if $H_0$ is rejected, we can conclude on the basis of the sample evidence that the Sample Coefficient of Determination is significantly greater than zero.
6 Reporting a regression model
6.1 Two variable case
Note that only one of the s.e. and t rows is required: divide the coefficient by one of them to obtain the other. Economics/finance papers usually report just the t row; statistics usually reports the s.e.

$\hat{Y}$ = $b_1$ + $b_2 X$
(s.e.)      se($b_1$)   se($b_2$)
(t)         $t_{b_1}$   $t_{b_2}$
n = ...     $R^2$ = ...
6.2 Pro-forma report
The estimated linear model explains 81.41% of the variation in the Number of Houses Sold in terms of the variation in the Mortgage Interest Rate.
Both the estimated slope coefficient, b2, and the estimated intercept, b1, are strongly significant. (At the 5% significance level, the critical value is 2.306; at the 1% level it is 3.355.)
The Mortgage Interest Rate is a highly significant variable in determining the Number of Houses Sold.
The estimated slope coefficient, -8.1310, indicates that if the Mortgage Interest Rate falls by 1 unit (e.g. from 13% to 12%), the Number of Houses Sold in one month will increase by 813, on average.
Equivalently, it indicates that if the Mortgage Interest Rate is increased by 1 unit (e.g. from 13% to 14%), the Number of Houses Sold in one month will fall by 813, on average.
If the Mortgage Interest Rate is zero, the estimated model indicates, on average, that the Number of Houses Sold in one month (measured in 100s) is 127.34. That is, if the Mortgage Interest Rate is zero, then the actual Number of Houses Sold in one month is estimated or predicted to be 12,734.
However, it is not sensible to consider a housing loans market where the Mortgage Interest Rate is zero. It is also the case that zero is well below the observed range of sample values for the Mortgage Interest Rate on which the analysis is based. We have no sample information on the Number of Houses Sold when the Mortgage Interest Rate is below 10.5%. In these circumstances, interpretation of the estimated intercept is not meaningful.
The plot of the Number of Houses Sold against the Mortgage Interest Rate suggests that a curvilinear relationship exists between these two variables, with a negative but marginally increasing slope.
However, the plot of the residuals against the explanatory variable is inconclusive. The scatter of points may be consistent with a random scatter, which would suggest that the assumptions that are the basis of regression analysis are reasonably valid in this case, or they may be consistent with a nonrandom scatter, which would indicate that the assumptions underlying the regression analysis are not valid in this case.
With such a small sample, only 10 observations, caution has to be exercised in examining the graphical evidence of these plots.
Taken together, these two plots place in doubt the validity of the assumptions on which regression analysis is based for the linear model.
On theoretical grounds, a linear model is unsatisfactory because it suggests that at high mortgage interest rates the Number of Houses Sold in one month would be negative. For this reason, together with the evidence from the first plot in particular, i.e. the scatter diagram of the Number of Houses Sold against the Mortgage Interest Rate, a nonlinear relationship should be investigated.
6.3 Multiple regression

$\hat{Y}$ = $b_1$ + $b_2 X_2$ + $b_3 X_3$ + ... + $b_k X_k$
(s.e.)      se($b_1$)   se($b_2$)   se($b_3$)   ...   se($b_k$)
(t)         $t_{b_1}$   $t_{b_2}$   $t_{b_3}$   ...   $t_{b_k}$
n = ...     $R^2$ = ...     F = ...
7 Heteroscedasticity
We assume that the variance of the error term is constant:
$$V[\varepsilon_i] = \sigma^2$$
This is called the assumption of homoscedasticity. If this assumption is correct, the error terms are said to be homoscedastic; otherwise they are said to be heteroscedastic.
Heteroscedasticity: the variance of the error term is not constant:
$$V[\varepsilon_i] = \sigma^2_i$$
The consequences of heteroscedasticity are:
1. OLS estimates of the regression coefficients are inefficient (but remain unbiased).
2. Statistical inference is invalid (hypothesis tests, confidence intervals and prediction intervals).
7.1 BPG (Koenker) Test
Consider the population regression model:
$$Y_i = \beta_1 + \beta_2 X_{i2} + \beta_3 X_{i3} + \cdots + \beta_k X_{ik} + \varepsilon_i$$
The null hypothesis is that the error terms are homoscedastic. The alternate hypothesis is that they are heteroscedastic (i.e. the population variance of at least one error term differs from the population variance of at least one other error term).
Test procedure:
1. Estimate the regression model using OLS:
   $$\hat{Y}_i = b_1 + b_2 X_{i2} + b_3 X_{i3} + \cdots + b_k X_{ik}$$
2. Construct the residuals from the estimated regression model, and square them:
   $$e_i^2 = \left(Y_i - \hat{Y}_i\right)^2$$
3. Use OLS to estimate the following auxiliary regression model against the (k - 1) auxiliary variables:
   $$e_i^2 = \alpha_1 + \alpha_2 X_{i2} + \alpha_3 X_{i3} + \cdots + \alpha_k X_{ik} + v_i$$
4. Construct the BPG (Koenker) test statistic from the coefficient of determination $R^2$ of the auxiliary regression:
   $$LM = nR^2$$
5. Under the null hypothesis of homoscedastic error terms in the original regression model, the BPG (Koenker) statistic has a $\chi^2$ distribution with (k - 1) degrees of freedom:
   $$LM = nR^2 \sim \chi^2_{k-1}$$
6. For a given significance level $\alpha$, compare the LM test statistic with the critical value from the $\chi^2_{k-1}$ distribution, with a proportion $\alpha$ in the right-hand tail of the distribution.

NOTE: The original BPG test and the BPG (Koenker) test are asymptotic tests; they are not valid in small samples.
Under BPG, the error terms are assumed to be normally distributed. The Koenker variant does not depend on normality of the error terms (we say it is robust to the assumption of normality). Hence we prefer the Koenker variant.
The BPG/BPG (Koenker) approach can also be used to test for other forms of heteroscedasticity, not considered in this course.
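A sketch of the BPG (Koenker) procedure described above (illustrative only; the data are simulated so that the error variance grows with one regressor, and scipy is assumed for the chi-squared p-value):

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, k = 100, 3
X2 = rng.uniform(1, 10, size=n)
X3 = rng.uniform(1, 10, size=n)
# error standard deviation grows with X2, so the errors are heteroscedastic by design
Y = 2.0 + 0.5 * X2 + 0.4 * X3 + rng.normal(scale=0.3 * X2, size=n)

A = np.column_stack([np.ones(n), X2, X3])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
e2 = (Y - A @ b) ** 2                            # squared residuals

# auxiliary regression of e^2 on the same explanatory variables
a, *_ = np.linalg.lstsq(A, e2, rcond=None)
fit = A @ a
R2_aux = 1 - ((e2 - fit) ** 2).sum() / ((e2 - e2.mean()) ** 2).sum()

LM = n * R2_aux                                  # BPG (Koenker) statistic
p_value = stats.chi2.sf(LM, df=k - 1)
print(LM, p_value)                               # small p-value -> reject homoscedasticity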
7.2 White's test
White (1980) proposed a more robust test for heteroscedasticity, which depends on a larger number of explanatory variables in the auxiliary regression model. This results in a loss of degrees of freedom, which is a disadvantage.
White's test is not considered in this course.
8 Auto-correlation
Autocorrelation, like heteroscedasticity, is a violation of the regression assumptions: specifically, of the assumption that the error terms are statistically independent. It is very common in time series data. First-order autocorrelation is represented by the model:
$$\varepsilon_t = \rho\varepsilon_{t-1} + v_t$$
1. $\rho$ is a constant
2. $-1 < \rho < 1$
3. $v_t \sim N[0, \sigma^2_v]$
4. The values of $v_t$ are statistically independent: $E[v_t v_s] = 0 \quad (t \neq s)$

Note that $\rho$ is the population correlation coefficient between $\varepsilon_t$ and $\varepsilon_{t-1}$.
Therefore, if $\rho = 0$ then $\varepsilon_t$ and $\varepsilon_{t-1}$ are uncorrelated, so
$$\varepsilon_t = v_t \sim N[0, \sigma^2]$$
However, if $\rho \neq 0$ then $\varepsilon_t$ and $\varepsilon_{t-1}$ are correlated and the assumptions of the regression model don't hold.
In this case the errors are called first-order auto-correlated. Second and higher orders do exist, where error terms depend on more than one previous error term. Higher orders are useful for time-series (e.g. quarterly) data:
$$\varepsilon_t = \rho_1\varepsilon_{t-1} + \rho_2\varepsilon_{t-2} + v_t$$
The consequences of autocorrelated errors are as mentioned above:
1. OLS estimates of the regression coefficients are inefficient.
2. The estimator of the variance of the error terms is invalid, and therefore statistical inference is invalid:
   (a) Hypothesis tests invalid
   (b) Confidence intervals invalid
   (c) Prediction intervals invalid
8.1 Durbin-Watson test
Tests for 1st-order auto-correlation. This is an older test which is widely available, and gradually being replaced in the literature. Despite this, lots of the existing econometric and statistical literature includes references to it.
Assumptions for the DW test are very similar to those of regression:
$$\varepsilon_t = \rho\varepsilon_{t-1} + v_t \qquad -1 < \rho < 1 \qquad v_t \sim N[0, \sigma^2_v] \qquad E[v_t v_s] = 0 \quad (t \neq s)$$
Hypotheses (NB: the alternative can also be one-sided, > or <):
$$H_0: \rho = 0 \qquad H_1: \rho \neq 0$$
Test statistic:
$$d = DW = \frac{\sum_{t=2}^{n}\left(e_t - e_{t-1}\right)^2}{\sum_{t=1}^{n} e_t^2}$$
Note that $\varepsilon_t = \rho\varepsilon_{t-1} + v_t$ resembles a regression model. You could just regress $e_t$ on $e_{t-1}$, i.e. $e_t = \hat{\rho}\, e_{t-1} + v_t$, to give an OLS estimate for $\rho$. Unfortunately the assumptions for the t-test break down, so we use the DW test as a proxy instead.
The D-W statistic is approximately related to $\hat{\rho}$ in large samples:
$$d \approx 2(1 - \hat{\rho})$$
If $\hat{\rho} = 1$ then $d \approx 0$, and so on. Approximately, $0 < d < 4$.
Note the distribution of d changes, and depends on n and $k'$. You have to look up two critical values. If d lies outside both critical values, then there is definitely a problem. Between the critical values the test is inconclusive, but in reality this usually indicates a problem.
$$0 < d_L < d_U < 2 < (4 - d_U) < (4 - d_L) < 4$$
Procedure: reject $H_0$ if $d < d_L$ or $d > (4 - d_L)$.
If $d_L < d < d_U$ or $(4 - d_U) < d < (4 - d_L)$ then the test is inconclusive.
Tables give the lower-tail values (for positive $\rho$); if the upper tail is needed, subtract the table value from 4.
The degrees of freedom $k'$ is the number of explanatory variables, excluding the constant:
$$k' = k - 1$$
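A sketch computing the D-W statistic from OLS residuals (illustrative only; the errors are simulated as AR(1) with $\rho = 0.7$, so the statistic should fall well below 2):

import numpy as np

rng = np.random.default_rng(2)
n = 80
X = np.linspace(1, 10, n)

# build AR(1) errors with rho = 0.7
eps = np.zeros(n)
v = rng.normal(scale=1.0, size=n)
for t in range(1, n):
    eps[t] = 0.7 * eps[t - 1] + v[t]

Y = 1.0 + 2.0 * X + eps
A = np.column_stack([np.ones(n), X])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
e = Y - A @ b

d = ((e[1:] - e[:-1]) ** 2).sum() / (e ** 2).sum()
print(d, 2 * (1 - 0.7))          # d should be roughly 2(1 - rho) = 0.6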
8.2 Breusch-Godfrey (BG) Test for 1st-order Autocorrelation
Population regression function:
$$Y_t = \beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \cdots + \beta_k X_{tk} + \varepsilon_t \qquad (t = 1, 2, \ldots, n)$$
Estimated regression equation:
$$\hat{Y}_t = b_1 + b_2 X_{t2} + b_3 X_{t3} + \cdots + b_k X_{tk} \qquad e_t = Y_t - \hat{Y}_t$$
If first-order autocorrelation occurs, i.e. $\rho \neq 0$, then:
$$\varepsilon_t = \rho\varepsilon_{t-1} + v_t \qquad -1 < \rho < 1 \qquad v_t \sim \text{n.i.d.}[0, \sigma^2_v]$$
NB: in this situation there is no problem with bias in the coefficient estimates (although they are inefficient), BUT the estimate of the error variance is biased. So all bets are off with inference.
If autocorrelation exists, then:
$$e_t = Y_t - \hat{Y}_t = (\beta_1 + \beta_2 X_{t2} + \beta_3 X_{t3} + \cdots + \beta_k X_{tk} + \varepsilon_t) - (b_1 + b_2 X_{t2} + b_3 X_{t3} + \cdots + b_k X_{tk})$$
$$= (\beta_1 - b_1) + (\beta_2 - b_2)X_{t2} + (\beta_3 - b_3)X_{t3} + \cdots + (\beta_k - b_k)X_{tk} + (\rho\varepsilon_{t-1} + v_t)$$
Note that the above is a regression equation. Remember that there is no bias in the regression coefficient estimates, so each coefficient difference here is approximately zero.
We can also write it as:
$$e_t = \alpha_1 + \alpha_2 X_{t2} + \alpha_3 X_{t3} + \cdots + \alpha_k X_{tk} + \rho\varepsilon_{t-1} + v_t$$
Since $\varepsilon_{t-1}$ is not known, estimate it with $e_{t-1}$. The resulting auxiliary regression equation:
$$e_t \approx \alpha_1 + \alpha_2 X_{t2} + \alpha_3 X_{t3} + \cdots + \alpha_k X_{tk} + \rho e_{t-1} + v_t$$
Steps to conduct the test:
1. Perform the regression with the equation
   $$\hat{Y}_t = b_1 + b_2 X_{t2} + b_3 X_{t3} + \cdots + b_k X_{tk}$$
2. Calculate the OLS residuals:
   $$e_t = Y_t - \hat{Y}_t$$
3. Perform a regression on the auxiliary equation:
   $$e_t \approx \alpha_1 + \alpha_2 X_{t2} + \alpha_3 X_{t3} + \cdots + \alpha_k X_{tk} + \rho e_{t-1} + v_t$$
4. Calculate the test statistic from the coefficient of determination $R^2$ of the auxiliary regression, noting that we lose one observation:
   $$LM = NR^2 \qquad N = (n-1)$$
5. Test against
   $$LM = NR^2 \sim \chi^2_1$$
The BG test is a two-tailed test of the following hypothesis:
$$H_0: \rho = 0 \qquad H_1: \rho \neq 0$$
Compare with D-W, which is a one-sided test.
Note that this is a large-sample test. It may not be valid (or may have a high error rate) in small samples.
NB: you can test for pth-order autocorrelation with $LM = (N - p)R^2 \sim \chi^2_p$, slightly different to our version, but there is no appreciable difference in large samples. Not considered in this course.
Pro-forma conclusion (when the test rejects): The sample evidence is consistent with the existence of a serious problem of first-order autocorrelation in the error term.
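A sketch of the BG steps above (illustrative only; the data are simulated with first-order autocorrelation built in, and scipy is assumed for the chi-squared p-value):

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 100
X2 = rng.normal(size=n)

eps = np.zeros(n)
v = rng.normal(size=n)
for t in range(1, n):
    eps[t] = 0.6 * eps[t - 1] + v[t]            # rho = 0.6 by construction

Y = 1.0 + 1.5 * X2 + eps
A = np.column_stack([np.ones(n), X2])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)
e = Y - A @ b

# auxiliary regression of e_t on the regressors and e_{t-1}; one observation is lost
Aux = np.column_stack([np.ones(n - 1), X2[1:], e[:-1]])
a, *_ = np.linalg.lstsq(Aux, e[1:], rcond=None)
resid = e[1:] - Aux @ a
R2_aux = 1 - (resid ** 2).sum() / ((e[1:] - e[1:].mean()) ** 2).sum()

LM = (n - 1) * R2_aux                            # N = n - 1
p_value = stats.chi2.sf(LM, df=1)
print(LM, p_value)                               # small p-value -> first-order autocorrelation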
8.3 Durbin-Watson versus Breusch-Godfrey test
The Durbin-Watson test is only valid for 1st-order autocorrelation if:
1. The regression model includes an intercept.
2. The explanatory variables are fixed in repeated samples, i.e. non-stochastic.
3. The error terms do not follow a pth-order autocorrelation pattern where p > 1.
4. The error terms are normally distributed.
5. The population regression model doesn't include any lagged values of the dependent variable as explanatory variables.

If these assumptions are satisfied, then D-W works even for small samples. The BG test is only valid in large samples.
The BG test remains valid even if the explanatory variables include lagged values of the dependent variable. It can also be extended to test for pth-order autocorrelation, where p > 1.
9 Multicollinearity
In the multiple regression model we assume that the explanatory variables $X_2, X_3, \ldots, X_K$ are linearly independent: we can't express any variable as an exact linear combination of the others.
$$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \cdots + \beta_K X_K + \varepsilon$$
If this assumption is violated, the explanatory variables are said to be exactly collinear or exactly multicollinear. In multiple regression where this happens exactly or approximately, it is called multicollinearity (used interchangeably with collinearity).
Two variables are exactly collinear if the correlation coefficient between them is 1 or -1. (And you won't even be able to estimate the model.) If the correlation coefficient is 0, they are not collinear.
Assume 3 explanatory variables, $X_2, X_3, X_4$, and regress $X_2$ against $X_3$ and $X_4$ (a sketch of this check follows after the list below):
1. $R^2 = 1$: exactly collinear.
2. $0 < R^2 < 1$: a linear association exists, but it is only approximate.
3. $R^2 = 0$: no linear relationship exists. But this very rarely happens.

Multicollinearity is a potential problem in all regression models based on economic and financial data.
Multicollinearity is a problem with the data, not with the model.
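A sketch of the auxiliary-regression check described above (illustrative only; the data are simulated so that $X_2$ is nearly a linear combination of $X_3$ and $X_4$):

import numpy as np

rng = np.random.default_rng(4)
n = 60
X3 = rng.normal(size=n)
X4 = rng.normal(size=n)
X2 = 2.0 * X3 - 1.0 * X4 + rng.normal(scale=0.1, size=n)   # nearly collinear by design

A = np.column_stack([np.ones(n), X3, X4])
c, *_ = np.linalg.lstsq(A, X2, rcond=None)
resid = X2 - A @ c
R2_aux = 1 - (resid ** 2).sum() / ((X2 - X2.mean()) ** 2).sum()

print(R2_aux)       # close to 1 -> X2 is approximately a linear combination of X3 and X4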
9.1 Consequences
If exact multicollinearity exists, the regression coefficients cannot be estimated (the calculation involves division by zero).
If the explanatory variables are closely related but the collinearity is not exact:
1. The regression coefficients can be estimated, but they have large standard errors. (The coefficients aren't biased; they're in fact still B.L.U.E.)
2. t-tests of significance may be misleading, because they lack power. Relevant explanatory variables may appear to be statistically insignificant.

Perfect multicollinearity: a serious problem.
Near-perfect multicollinearity: if the degree is high, then it is likely to be a serious problem.
10 Qualitative Analysis: Dummy Variables
Dummy variables take the values {0, 1}. They are used to assess how economic models respond to structural change.
You can estimate a model using two different sets of observations.
Model with an Additive Dummy Variable:
$$Y = \beta_1 + \beta_2 X_2 + \delta D + \varepsilon_i \qquad D = \begin{cases} 0 \\ 1 \end{cases}$$
This effectively yields two separate, parallel regression models with different intercepts:
$$Y = \beta_1 + \beta_2 X_2 + \varepsilon_i \qquad \text{when } D = 0$$
$$Y = (\beta_1 + \delta) + \beta_2 X_2 + \varepsilon_i \qquad \text{when } D = 1$$
Model with a Multiplicative Dummy Variable:
$$Y = \beta_1 + \beta_2 X + \delta DX + \varepsilon_i$$
The two resulting models have the same intercept, but different slopes:
$$Y = \beta_1 + \beta_2 X + \varepsilon_i \qquad \text{when } D = 0$$
$$Y = \beta_1 + (\beta_2 + \delta)X + \varepsilon_i \qquad \text{when } D = 1$$
And of course you can mix and match the two types (a sketch of the additive case follows below):
$$Y = \beta_1 + \beta_2 X + \delta_1 D + \delta_2 DX + \varepsilon_i$$
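A sketch of fitting the additive-dummy model (illustrative only; the data are invented): the dummy coefficient estimates the shift in the intercept between the two groups.

import numpy as np

rng = np.random.default_rng(5)
n = 60
X2 = rng.uniform(0, 10, size=n)
D = (np.arange(n) >= 30).astype(float)               # 0 for the first group, 1 for the second
Y = 2.0 + 1.5 * X2 + 3.0 * D + rng.normal(size=n)    # true intercept shift delta = 3

A = np.column_stack([np.ones(n), X2, D])
b1, b2, delta = np.linalg.lstsq(A, Y, rcond=None)[0]
print(b1, b2, delta)     # intercept is b1 when D = 0 and b1 + delta when D = 1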
Dummy variables are extremely useful for modelling structural change at some point during the sample period (e.g. a strike, liberalization of the dollar). The significance of these changes can be assessed statistically:
$$H_0: \delta_1 = 0 \qquad H_1: \delta_1 > 0$$
Dummy Variable Trap: including unnecessary dummy variables in the model can lead to perfect multicollinearity, which renders the model unestimable. For example, including variables for both male and female alongside the constant.
You can get away with two dummy variables that are perfect complements by eliminating the regression constant. All of these models are fine:
$$Y = \beta_1 + \beta_2 X_2 + \delta_m D_m + \varepsilon_i$$
$$Y = \beta_1 + \beta_2 X_2 + \delta_f D_f + \varepsilon_i$$
$$Y = \beta_2 X_2 + \delta_m D_m + \delta_f D_f + \varepsilon_i$$
where $D_f$ and $D_m$ are indicators for biological sex. It would not be possible, however, to include the constant and both dummy variables.
10.1 Time-Series Analysis
You can use dummy variables to model summer/winter, or quarters. Remember not to specify all of them, or you get perfect collinearity: for summer/winter, use 1 dummy; for quarters, use 3 dummies.
Data is often spiky, with e.g. seasonal peaks. Very often you need to be aware of trends irrespective of seasonal variations; that is, you want seasonally adjusted values.
Note that if there is a seasonal variation and the specification ignores that variation (i.e. a specification error), you'll almost certainly get a high D-W value and a significant $\hat{\rho}$.
You might model quarterly data as follows:
$$Y = \beta_1 + \beta_2\,\text{TIME} + \delta_1 Q_1 + \delta_2 Q_2 + \delta_3 Q_3 + \varepsilon_i$$
where $Q_1$ is 1 in the September quarter and 0 otherwise, and so on. We omit $Q_4$ or we'll get perfect multicollinearity. (Or, suppress the intercept.)
You still need to examine the correlation between the dummy variables; it's entirely possible to get low t-values because of high correlation. If there are no low t-values, then there's no multicollinearity problem.