
Econometrics [EM2008]

Lecture 3
Specification error in the k-variable model

Irene Mammi

irene.mammi@unive.it

Academic Year 2018/2019

outline

- specification error in k-variable linear regression
  - specification error
  - model evaluation
  - tests of parameter constancy
  - tests of structural change
  - dummy variables

- References:
  - Johnston, J. and J. DiNardo (1997), Econometric Methods, 4th Edition, McGraw-Hill, New York, Chapter 4.

specification error

- if any of the underlying assumptions are wrong, there is a specification error
- the specification of the linear model centers on the error vector $u$ and the $X$ matrix
- recall the assumptions:

$$y = X\beta + u$$
$$u_i \text{ are iid } (0, \sigma^2), \quad i = 1, \dots, n$$
or
$$u_i \text{ are iid } N(0, \sigma^2), \quad i = 1, \dots, n$$
$$E(X_{it} u_s) = 0 \quad \text{for all } i = 1, \dots, k \text{ and } t, s = 1, \dots, n$$
$X$ is nonstochastic with full column rank $k$

possible problems with u

1. the $u_i$ are white noise but not normally distributed: this does not destroy the BLUE property of OLS, but inference procedures are only asymptotically valid
2. $E(uu') = \mathrm{diag}[\sigma_1^2, \dots, \sigma_n^2]$: the assumption of homoskedasticity is violated. The variance-covariance matrix of $u$ is still diagonal, but its diagonal elements are no longer equal
3. $E(u_t u_{t-s}) \neq 0$ $(s \neq 0)$: the errors are pairwise correlated

possible problems with X

1. exclusion of relevant variables
2. inclusion of irrelevant variables
3. incorrect functional form: we may have an appropriate list of variables but have embedded them in an incorrect functional form. E.g. we use the equation
   $$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + u$$
   when the correct specification would be
   $$Y = \beta_1 + \beta_2 X_2 + \beta_3 X_3 + \gamma_2 X_2^2 + \gamma_3 X_3^2 + \delta (X_2 X_3) + u$$
4. the $X$ matrix has less than full column rank: this precludes estimation of a unique $b$ vector
5. nonzero correlations between the regressors and the errors
6. nonstationary variables
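
To make item 4 concrete, the following is a minimal NumPy sketch on simulated data. The regressors, coefficient values and the choice of a redundant column equal to `x2 + x3` are all illustrative assumptions, not taken from the lecture. It shows that two different coefficient vectors reproduce exactly the same fitted values when the columns of $X$ are linearly dependent.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)

# Hypothetical regressor matrix: the last column equals x2 + x3,
# so the four columns are linearly dependent.
X = np.column_stack([np.ones(n), x2, x3, x2 + x3])
y = 1.0 + 2.0 * x2 + 3.0 * x3 + rng.normal(size=n)

# Column rank is 3 although X has 4 columns.
rank = np.linalg.matrix_rank(X)

# lstsq returns one least-squares solution (the minimum-norm one) ...
b_min, *_ = np.linalg.lstsq(X, y, rcond=None)
# ... but adding any multiple of (0, 1, 1, -1) leaves X b unchanged,
# because x2 + x3 - (x2 + x3) = 0.
b_other = b_min + np.array([0.0, 1.0, 1.0, -1.0])

same_fit = np.allclose(X @ b_min, X @ b_other)
print(rank, same_fit)
```

Since both vectors give the same residuals, least squares cannot choose between them; this is the sense in which no unique $b$ exists.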

possible problems with β

- the implicit assumption for the model $y = X\beta + u$ is that the $\beta$ vector is constant over all actual or possible sample observations
- there may be structural breaks in coefficients
- the effects may be heterogeneous across individuals

tests of parameter constancy

the Chow forecast test

- if the parameter vector is constant, out-of-sample predictions will have specified probabilities of lying within bounds calculated from the sample data
- "large" prediction errors cast doubt on the constancy hypothesis
- instead of using all the sample observations for estimation, divide the sample of $n$ observations into $n_1$ observations to be used for estimation and $n_2 = n - n_1$ observations to be used for testing

tests of parameter constancy (cont.)

- the test of predictive accuracy, or Chow test, is as follows:

1. estimate the OLS vector from the $n_1$ observations, obtaining
   $$b_1 = (X_1' X_1)^{-1} X_1' y_1$$
   where $X_i$, $y_i$ $(i = 1, 2)$ indicate the partitioning of the data into $n_1$, $n_2$ observations
2. use $b_1$ to obtain a prediction of the $y_2$ vector, namely
   $$\hat{y}_2 = X_2 b_1$$
3. obtain the vector of prediction errors and analyze its sampling distribution under the null hypothesis of parameter constancy

tests of parameter constancy (cont.)

- the vector of prediction errors is
  $$d = y_2 - \hat{y}_2 = y_2 - X_2 b_1$$
- if the equation $y = X\beta + u$, with $E(uu') = \sigma^2 I$, holds for both sets of data, the vector of prediction errors may be written as
  $$d = y_2 - X_2 b_1 = u_2 - X_2 (b_1 - \beta)$$
- thus $E(d) = 0$, and it may be shown that the variance-covariance matrix for $d$ is
  $$\mathrm{var}(d) = E(dd') = \sigma^2 I_{n_2} + X_2 \cdot \mathrm{var}(b_1) \cdot X_2' = \sigma^2 \left[ I_{n_2} + X_2 (X_1' X_1)^{-1} X_2' \right]$$
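
The quantities on this slide can be computed directly. The sketch below uses simulated data (sample sizes, coefficients and seed are illustrative assumptions) to build $d$ and the matrix $I_{n_2} + X_2 (X_1'X_1)^{-1} X_2'$. It then forms the standard quadratic-form version of the Chow forecast statistic, $d'[I_{n_2} + X_2 (X_1'X_1)^{-1} X_2']^{-1} d / n_2$ divided by $s^2 = e_1'e_1/(n_1 - k)$. That expression is not written on the slides; it is the textbook counterpart of the RSS-based formula.

```python
import numpy as np

rng = np.random.default_rng(2)
n1, n2, k = 30, 8, 3

# Simulated data with a constant coefficient vector (illustrative values).
X1 = np.column_stack([np.ones(n1), rng.normal(size=(n1, k - 1))])
X2 = np.column_stack([np.ones(n2), rng.normal(size=(n2, k - 1))])
beta = np.array([1.0, 2.0, -1.0])
y1 = X1 @ beta + rng.normal(size=n1)
y2 = X2 @ beta + rng.normal(size=n2)

# b_1 = (X_1'X_1)^{-1} X_1'y_1 and prediction errors d = y_2 - X_2 b_1
b1 = np.linalg.solve(X1.T @ X1, X1.T @ y1)
d = y2 - X2 @ b1

# var(d) / sigma^2 = I_{n2} + X_2 (X_1'X_1)^{-1} X_2'
V = np.eye(n2) + X2 @ np.linalg.solve(X1.T @ X1, X2.T)

# s^2 from the estimation sample stands in for the unknown sigma^2
e1 = y1 - X1 @ b1
s2 = float(e1 @ e1) / (n1 - k)

# Quadratic form in d, scaled so that it is F(n2, n1 - k) under H0
F = float(d @ np.linalg.solve(V, d)) / n2 / s2
print(F)
```

Using `np.linalg.solve` rather than an explicit inverse is a numerical convenience only; the statistic is the same.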

tests of parameter constancy (cont.)

- there is an alternative way to derive the test of predictive accuracy
- suppose we allow for the possibility of a different coefficient vector in the forecast period so that the complete model could be rewritten as
  $$y_1 = X_1 \beta + u_1$$
  $$y_2 = X_2 \alpha + u_2 = X_2 \beta + X_2 (\alpha - \beta) + u_2 = X_2 \beta + \gamma + u_2$$
  where $\gamma = X_2 (\alpha - \beta)$. If $\gamma = 0$, then $\alpha = \beta$, and the coefficient vector is constant over the estimation and forecast periods
- the resultant test statistic for $\gamma = 0$ is
  $$F = \frac{(e_*' e_* - e_1' e_1)/n_2}{e_1' e_1/(n_1 - k)} \sim F(n_2, n_1 - k)$$

tests of parameter constancy (cont.)

- the Chow test may thus be implemented as follows:

1. using the $n_1$ observations, regress $y_1$ on $X_1$ and obtain the RSS, $e_1' e_1$
2. fit the same regression using all $(n_1 + n_2)$ observations and obtain the restricted RSS, $e_*' e_*$
3. substitute in the $F$ statistic and reject the null of parameter constancy if $F$ exceeds the relevant critical value
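
The three steps can be sketched in a few lines of NumPy. The helper names `rss` and `chow_forecast_F`, the simulated data and the size of the artificial shift are assumptions made for illustration. The comparison with a critical value is left out, since it only requires a table or a routine for the $F(n_2, n_1 - k)$ distribution.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

def chow_forecast_F(y, X, n1):
    """Chow forecast F statistic; the first n1 rows form the estimation sample."""
    n, k = X.shape
    n2 = n - n1
    rss_1 = rss(y[:n1], X[:n1])   # step 1: e_1'e_1
    rss_all = rss(y, X)           # step 2: restricted RSS e_*'e_*
    return ((rss_all - rss_1) / n2) / (rss_1 / (n1 - k))   # step 3

rng = np.random.default_rng(0)
n, n1 = 60, 45
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
y_const = X @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n)

# Same data, with an intercept shift added in the last n2 observations.
y_break = y_const.copy()
y_break[n1:] += 5.0

F_const = chow_forecast_F(y_const, X, n1)
F_break = chow_forecast_F(y_break, X, n1)
print(F_const, F_break)
```

With a shift this large relative to the error variance, the second statistic should be far above the first, which is the pattern the test is designed to detect.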

tests of parameter constancy (cont.)
the Ramsey RESET test

- Ramsey argues that various specification errors (omitted variables, incorrect functional form, correlation between $X$ and $u$) give rise to a nonzero mean for the $u$ vector
- thus the null and alternative hypotheses are
  $$H_0 : u \sim N(0, \sigma^2 I)$$
  $$H_1 : u \sim N(\mu, \sigma^2 I), \quad \mu \neq 0$$
- the test of $H_0$ is based on an augmented regression
  $$y = X\beta + Z\alpha + u$$
- the test for specification error is then a test of $\alpha = 0$
- Ramsey's suggestion is that $Z$ should contain powers of the predicted values of the dependent variable

tests of parameter constancy (cont.)

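- using the second, third and fourth powers gives
  $$Z = \begin{bmatrix} \hat{y}^2 & \hat{y}^3 & \hat{y}^4 \end{bmatrix}$$
  where $\hat{y} = Xb$ and $\hat{y}^2 = \begin{bmatrix} \hat{y}_1^2 & \hat{y}_2^2 & \cdots & \hat{y}_n^2 \end{bmatrix}'$ etc. The first power, $\hat{y}$, is not included since it is an exact linear combination of the columns of $X$. Its inclusion would make the regressor matrix $\begin{bmatrix} X & Z \end{bmatrix}$ have less than full rank.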
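
A minimal sketch of the RESET procedure follows. The function names `ols` and `reset_F`, the simulated data and the form of the misspecification (an omitted squared term) are illustrative assumptions. The statistic is the usual $F$ test of $\alpha = 0$ in the augmented regression, computed from restricted and unrestricted RSS.

```python
import numpy as np

def ols(y, X):
    """Return the OLS coefficients and the residual sum of squares."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return b, float(e @ e)

def reset_F(y, X, powers=(2, 3, 4)):
    """F statistic for alpha = 0 in y = X beta + Z alpha + u, Z = powers of y_hat."""
    n, k = X.shape
    b, rss_r = ols(y, X)                        # restricted: no Z
    y_hat = X @ b
    Z = np.column_stack([y_hat ** p for p in powers])
    _, rss_u = ols(y, np.column_stack([X, Z]))  # augmented regression
    q = Z.shape[1]
    return ((rss_r - rss_u) / q) / (rss_u / (n - k - q))

rng = np.random.default_rng(3)
n = 200
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
u = rng.normal(size=n)
y_lin = 1.0 + 2.0 * x + u                   # linear model is correct
y_quad = 1.0 + 2.0 * x + 1.5 * x**2 + u     # squared term omitted from X

F_lin = reset_F(y_lin, X)
F_quad = reset_F(y_quad, X)

# Including the first power of y_hat adds no new column space.
b_lin, _ = ols(y_lin, X)
rank_with_first_power = np.linalg.matrix_rank(np.column_stack([X, X @ b_lin]))
print(F_lin, F_quad, rank_with_first_power)
```

The last line illustrates the remark about the first power: stacking $\hat{y}$ next to $X$ leaves the rank at $k$.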

tests of structural change

test of one structural change

- three alternative (equivalent) ways to carry out the test
- let $y_i$, $X_i$ $(i = 1, 2)$ indicate the appropriate partitioning of the data
- the unrestricted model may be written as
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} X_1 & 0 \\ 0 & X_2 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + u, \qquad u \sim N(0, \sigma^2 I)$$
  where $\beta_1$ and $\beta_2$ are $k$-vectors of, say, peacetime and wartime coefficients respectively
- the null hypothesis of no structural break is
  $$H_0 : \beta_1 = \beta_2$$

tests of structural change (cont.)

- first approach: straightforward application of the test for linear restrictions $(R\beta = r)$: the null hypothesis defines $R = \begin{bmatrix} I_k & -I_k \end{bmatrix}$ and $r = 0$

tests of structural change (cont.)

- second approach: a test of linear restrictions may also be formulated in terms of an unrestricted RSS and a restricted RSS: in this case, the null hypothesis gives the restricted model as
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \beta + u$$
- denoting the RSS from the restricted model as $e_*' e_*$ and the RSS from the unrestricted model as $e'e$, the test of the null is given by
  $$F = \frac{(e_*' e_* - e'e)/k}{e'e/(n - 2k)} \sim F(k, n - 2k)$$

tests of structural change (cont.)

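- third approach: consider an alternative setup of the unrestricted model,
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} X_1 & 0 \\ X_2 & X_2 \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 - \beta_1 \end{bmatrix} + u$$
- now to test $H_0$ simply test the joint significance of the last $k$ regressors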
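
The equivalence of the three approaches can be checked numerically. The sketch below uses simulated data with illustrative sample sizes and coefficients. For the first approach it uses the standard quadratic form $(Rb - r)'[R(X'X)^{-1}R']^{-1}(Rb - r)/k$ divided by $s^2$, which is assumed here to be the "test for linear restrictions" the slides refer to.

```python
import numpy as np

def ols(y, X):
    """Return the OLS coefficients and the residual sum of squares."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return b, float(e @ e)

rng = np.random.default_rng(1)
n1, n2, k = 40, 50, 3
n = n1 + n2
X1 = np.column_stack([np.ones(n1), rng.normal(size=(n1, k - 1))])
X2 = np.column_stack([np.ones(n2), rng.normal(size=(n2, k - 1))])
y1 = X1 @ np.array([1.0, 2.0, -1.0]) + rng.normal(size=n1)
y2 = X2 @ np.array([1.5, 1.0, -1.0]) + rng.normal(size=n2)
y = np.concatenate([y1, y2])

# Unrestricted model: block-diagonal regressor matrix.
X_u = np.block([[X1, np.zeros((n1, k))], [np.zeros((n2, k)), X2]])
b_u, rss_u = ols(y, X_u)
s2 = rss_u / (n - 2 * k)

# Approach 1: R = [I_k, -I_k], r = 0.
R = np.hstack([np.eye(k), -np.eye(k)])
Rb = R @ b_u
V = R @ np.linalg.inv(X_u.T @ X_u) @ R.T
F_linear = float(Rb @ np.linalg.solve(V, Rb)) / k / s2

# Approach 2: restricted (pooled) versus unrestricted RSS.
_, rss_r = ols(y, np.vstack([X1, X2]))
F_rss = ((rss_r - rss_u) / k) / s2

# Approach 3: alternative setup, joint significance of the last k regressors.
X_alt = np.block([[X1, np.zeros((n1, k))], [X2, X2]])
b_alt, rss_alt = ols(y, X_alt)
F_alt = ((rss_r - rss_alt) / k) / (rss_alt / (n - 2 * k))

print(F_linear, F_rss, F_alt)
```

In the third setup the last $k$ coefficients estimate $\beta_2 - \beta_1$ directly, so `b_alt[k:]` coincides with `b_u[k:] - b_u[:k]`.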

tests of structural change (cont.)
tests of slope coefficients

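- frequently one does not wish to impose restrictions on the intercept term
- partition the $X$ matrices as
  $$X_1 = \begin{bmatrix} i_1 & X_1^* \end{bmatrix} \qquad X_2 = \begin{bmatrix} i_2 & X_2^* \end{bmatrix}$$
  where $i_1$, $i_2$ are $n_1$- and $n_2$-vectors of ones, and the $X_i^*$ are matrices of the $k - 1$ regressor variables
- the conformable partitioning of the $\beta$ vectors is
  $$\beta_1' = \begin{bmatrix} \alpha_1 & \beta_1^{*\prime} \end{bmatrix} \qquad \beta_2' = \begin{bmatrix} \alpha_2 & \beta_2^{*\prime} \end{bmatrix}$$
- the null hypothesis is now
  $$\beta_1^* = \beta_2^*$$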

tests of structural change (cont.)
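- the unrestricted model is
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* & 0 \\ 0 & i_2 & 0 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \beta_1^* \\ \beta_2^* \end{bmatrix} + u$$
- the restricted model is
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* \\ 0 & i_2 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \beta^* \end{bmatrix} + u$$
- the test can be based on the RSS from these two regressions
- an alternative formulation of the unrestricted model is
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* & 0 \\ i_2 & i_2 & X_2^* & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 - \alpha_1 \\ \beta_1^* \\ \beta_2^* - \beta_1^* \end{bmatrix} + u$$
- a test of the joint significance of the last $k - 1$ regressors is a test of the null hypothesis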


tests of structural change (cont.)
test of intercepts

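- a meaningful test of differential intercepts assumes common regression slopes
- the unrestricted model is again
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* \\ 0 & i_2 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \beta^* \end{bmatrix} + u$$
- the restricted model is
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & X_1^* \\ i_2 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha \\ \beta^* \end{bmatrix} + u$$
- contrasting RSS between the two models provides a test of equality of intercepts, given equal regression slopes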

tests of structural change (cont.)

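- the alternative setup of the unrestricted model is
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* \\ i_2 & i_2 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 - \alpha_1 \\ \beta^* \end{bmatrix} + u$$
- now a test of the significance of the second regressor tests the conditional hypothesis that the intercepts are equal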

tests of structural change (cont.)
summary
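- there is a hierarchy of three models, namely,

  I (common parameters):
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & X_1^* \\ i_2 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha \\ \beta^* \end{bmatrix} + u$$

  II (differential intercepts, common slope vector):
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* \\ 0 & i_2 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \beta^* \end{bmatrix} + u$$

  III (differential intercepts, differential slope vectors):
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} i_1 & 0 & X_1^* & 0 \\ 0 & i_2 & 0 & X_2^* \end{bmatrix} \begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \beta_1^* \\ \beta_2^* \end{bmatrix} + u$$

- application of OLS to each model will yield a residual sum of squares, RSS, with associated degrees of freedom, respectively, of $n - k$, $n - k - 1$, and $n - 2k$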

tests of structural change (cont.)
- the test statistics for various hypotheses are as follows

  $H_0 : \alpha_1 = \alpha_2$ (test of differential intercepts)
  $$F = \frac{RSS_1 - RSS_2}{RSS_2/(n - k - 1)} \sim F(1, n - k - 1)$$

  $H_0 : \beta_1^* = \beta_2^*$ (test of differential slope vectors)
  $$F = \frac{(RSS_2 - RSS_3)/(k - 1)}{RSS_3/(n - 2k)} \sim F(k - 1, n - 2k)$$

  $H_0 : \beta_1 = \beta_2$ (test of differential parameters, intercept and slopes)
  $$F = \frac{(RSS_1 - RSS_3)/k}{RSS_3/(n - 2k)} \sim F(k, n - 2k)$$
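
The hierarchy of models I, II and III and the three statistics can be sketched as follows. The simulated data, the helper `rss` and the variable names are illustrative assumptions; only the formulas come from the slides.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    return float(e @ e)

rng = np.random.default_rng(4)
n1, n2, k = 40, 60, 3            # k counts the intercept plus k - 1 slopes
n = n1 + n2
X1s = rng.normal(size=(n1, k - 1))   # X_1*
X2s = rng.normal(size=(n2, k - 1))   # X_2*
y1 = 1.0 + X1s @ np.array([2.0, -1.0]) + rng.normal(size=n1)
y2 = 3.0 + X2s @ np.array([2.0, 0.5]) + rng.normal(size=n2)
y = np.concatenate([y1, y2])

d1 = np.concatenate([np.ones(n1), np.zeros(n2)])   # [i_1; 0]
d2 = 1.0 - d1                                      # [0; i_2]
Xs = np.vstack([X1s, X2s])
Xs1 = Xs * d1[:, None]                             # [X_1*; 0]
Xs2 = Xs * d2[:, None]                             # [0; X_2*]

rss1 = rss(y, np.column_stack([np.ones(n), Xs]))      # model I,   df n - k
rss2 = rss(y, np.column_stack([d1, d2, Xs]))          # model II,  df n - k - 1
rss3 = rss(y, np.column_stack([d1, d2, Xs1, Xs2]))    # model III, df n - 2k

F_intercepts = (rss1 - rss2) / (rss2 / (n - k - 1))
F_slopes = ((rss2 - rss3) / (k - 1)) / (rss3 / (n - 2 * k))
F_all = ((rss1 - rss3) / k) / (rss3 / (n - 2 * k))
print(F_intercepts, F_slopes, F_all)
```

Because the models are nested, the RSS values are ordered as `rss1 >= rss2 >= rss3`, and `rss3` equals the sum of the RSS from two separate regressions on each subsample.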

dummy variables

examples

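- the last $n_2$ columns of the regressor matrix in the augmented model
  $$\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} = \begin{bmatrix} X_1 & 0 \\ X_2 & I_{n_2} \end{bmatrix} \begin{bmatrix} \beta \\ \gamma \end{bmatrix} + \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}$$
  take the form
  $$\begin{bmatrix} 0 \\ I_{n_2} \end{bmatrix}$$
  where the $0$ matrix is of order $n_1 \times n_2$, and $I_{n_2}$ is the identity matrix of order $n_2$: each $n$-vector column is a dummy variable that equals one for a single observation in the forecast period and zero elsewhere. The effect of the dummies is to exclude the last $n_2$ observations from the estimation of the $\beta$ vector; the estimated coefficients of the dummies are the forecast errors for the last $n_2$ points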
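
This property is easy to verify numerically. In the sketch below (simulated data; sizes and coefficients are illustrative assumptions) the augmented regression is fitted by OLS and compared with the regression on the first $n_1$ observations alone.

```python
import numpy as np

rng = np.random.default_rng(5)
n1, n2, k = 25, 5, 2
X1 = np.column_stack([np.ones(n1), rng.normal(size=n1)])
X2 = np.column_stack([np.ones(n2), rng.normal(size=n2)])
y1 = X1 @ np.array([1.0, 0.5]) + rng.normal(size=n1)
y2 = X2 @ np.array([1.0, 0.5]) + rng.normal(size=n2)
y = np.concatenate([y1, y2])

# Augmented regressor matrix [[X_1, 0], [X_2, I_{n2}]]
X_aug = np.block([[X1, np.zeros((n1, n2))], [X2, np.eye(n2)]])
coef, *_ = np.linalg.lstsq(X_aug, y, rcond=None)
b_aug, gamma_hat = coef[:k], coef[k:]

# OLS on the first n1 observations only, and its forecast errors
b1, *_ = np.linalg.lstsq(X1, y1, rcond=None)
d = y2 - X2 @ b1

print(np.allclose(b_aug, b1), np.allclose(gamma_hat, d))
```

Each dummy absorbs the residual of its own observation, so the last $n_2$ points contribute nothing to the estimate of $\beta$.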

dummy variables (cont.)

- a second type of dummy takes the form
  $$d_2 = \begin{bmatrix} 0 \\ i_2 \end{bmatrix}$$
  as in the model of structural change

dummy variables (cont.)

Figure 1: Regressions with dummy variables

dummy variables (cont.)

seasonal dummies

- we may want to allow for a seasonal shift in a relation, e.g. by specifying quarterly dummies such as
  $$Q_{it} = \begin{cases} 1 & \text{if observation } t \text{ is in quarter } i \\ 0 & \text{otherwise} \end{cases} \qquad i = 1, \dots, 4$$
- for the four quarters of each year these dummies are

  Q1  Q2  Q3  Q4
   1   0   0   0
   0   1   0   0
   0   0   1   0
   0   0   0   1

dummy variables (cont.)
- the relationship may then be written as
  $$Y_t = \alpha_1 Q_{1t} + \cdots + \alpha_4 Q_{4t} + x_t' \beta + u_t$$
- an alternative specification is
  $$Y_t = \alpha_1 + \gamma_2 Q_{2t} + \gamma_3 Q_{3t} + \gamma_4 Q_{4t} + x_t' \beta + u_t$$
- comparing coefficients of the dummies in the two equations gives
  $$\gamma_2 = \alpha_2 - \alpha_1 \qquad \gamma_3 = \alpha_3 - \alpha_1 \qquad \gamma_4 = \alpha_4 - \alpha_1$$
  which shows that the $\gamma$'s measure differential intercepts, with respect to $\alpha_1$
- the hypothesis of interest is usually
  $$H_0 : \alpha_1 = \alpha_2 = \alpha_3 = \alpha_4$$
- alternatively the null hypothesis may be expressed as
  $$H_0 : \gamma_2 = \gamma_3 = \gamma_4 = 0$$
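
The relation between the two parameterizations also holds exactly for the OLS estimates, since both regressor matrices span the same column space. The following sketch uses simulated quarterly data; the coefficient values, the single regressor `x` and the sample length are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 48                                   # 12 years of quarterly data
quarter = np.arange(n) % 4               # 0, 1, 2, 3, 0, 1, ...
Q = (quarter[:, None] == np.arange(4)).astype(float)   # columns Q1, ..., Q4
x = rng.normal(size=n)
y = Q @ np.array([1.0, 1.5, 0.5, 2.0]) + 0.8 * x + rng.normal(scale=0.3, size=n)

# Specification 1: four seasonal dummies, no common intercept
c1, *_ = np.linalg.lstsq(np.column_stack([Q, x]), y, rcond=None)
alpha, beta_1 = c1[:4], c1[4]

# Specification 2: common intercept plus Q2, Q3, Q4
c2, *_ = np.linalg.lstsq(np.column_stack([np.ones(n), Q[:, 1:], x]), y, rcond=None)
alpha_1, gamma, beta_2 = c2[0], c2[1:4], c2[4]

# gamma_i = alpha_i - alpha_1 for i = 2, 3, 4
print(np.allclose(gamma, alpha[1:] - alpha[0]))
```

The slope estimate on `x` is identical in the two fits; only the way the seasonal intercepts are expressed changes.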
dummy variables (cont.)

qualitative dummies

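- suppose interest centers on an earnings function such as
  $$\text{Income} = f(\text{sex}, \text{race}, \text{education}, \text{age})$$
- the first two variables are qualitative and may be represented by dummy variables, e.g. for sex,
  $$S_1 = \begin{cases} 1 & \text{if male} \\ 0 & \text{otherwise} \end{cases} \qquad \text{and} \qquad S_2 = \begin{cases} 1 & \text{if female} \\ 0 & \text{otherwise} \end{cases}$$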

dummy variables (cont.)
- typical entries for the $S$ (sex) and $R$ (race) dummies would be

  S1  S2  R1  R2  R3
   0   1   1   0   0
   1   0   0   1   0

- as a different example, let $E_1$ be the dummy for dropouts, and $E_2$, $E_3$, $E_4$ be the dummies for the highest diploma awarded
- modeling income just as a function of educational level, we have
  $$Y = \alpha_1 E_1 + \alpha_2 E_2 + \alpha_3 E_3 + \alpha_4 E_4 + u$$
- the expected level of income, conditional on educational level, is
  $$E(Y \mid E_i = 1) = \alpha_i \qquad i = 1, \dots, 4$$
- suppressing $E_1$, an alternative specification is
  $$Y = \alpha_1 + \gamma_2 E_2 + \gamma_3 E_3 + \gamma_4 E_4 + u$$
dummy variables (cont.)
- the $\gamma$s measure the marginal increment in expected income for a diploma over the no-diploma level
- to measure the stepwise marginal increments, let the same dummies take the value one if a person has the relevant diploma, irrespective of whether she also has one or more higher diplomas, and let $E_1$ be the pre-high school diploma dummy
- the equation to be fitted is
  $$Y = \alpha_1 + \delta_2 E_2 + \delta_3 E_3 + \delta_4 E_4 + u$$
  and the expected values are
  $$E(Y \mid \text{pre-HS diploma}) = \alpha_1$$
  $$E(Y \mid \text{HS diploma}) = \alpha_1 + \delta_2$$
  $$E(Y \mid \text{bachelor diploma}) = \alpha_1 + \delta_2 + \delta_3$$
  $$E(Y \mid \text{graduate diploma}) = \alpha_1 + \delta_2 + \delta_3 + \delta_4$$
- the $\delta$s provide estimates of the marginal increment from one level to the next higher level
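
The two codings can be compared on simulated data. In the sketch below, `level` (0 = pre-HS, 1 = HS, 2 = bachelor, 3 = graduate), the income means and the noise scale are hypothetical values chosen for illustration. `E` holds the mutually exclusive "highest diploma" dummies and `C` the cumulative dummies described above.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 200
level = np.arange(n) % 4                       # every level appears 50 times
means = np.array([20.0, 30.0, 45.0, 60.0])     # hypothetical expected incomes
y = means[level] + rng.normal(scale=5.0, size=n)

# Exclusive coding: E_j = 1 only if j is the highest diploma (j = 2, 3, 4)
E = (level[:, None] == np.arange(1, 4)).astype(float)
# Cumulative coding: dummy j = 1 if the person holds diploma j or a higher one
C = (level[:, None] >= np.arange(1, 4)).astype(float)

ones = np.ones(n)
cg, *_ = np.linalg.lstsq(np.column_stack([ones, E]), y, rcond=None)
alpha_g, gamma = cg[0], cg[1:]
cd, *_ = np.linalg.lstsq(np.column_stack([ones, C]), y, rcond=None)
alpha_d, delta = cd[0], cd[1:]

# gamma_j is the increment over the base level; the deltas are the steps,
# so their running sum reproduces the gammas.
print(np.allclose(np.cumsum(delta), gamma))
```

Both regressions reproduce the four group means; they differ only in whether each coefficient is measured against the base level or against the level immediately below.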
