Monique de Haan
(moniqued@econ.uio.no)
Lecture outline
Autocorrelation
Autoregressions
Cross section data is data collected for multiple entities at one point in time
Panel data is data collected for multiple entities at multiple points in time
Time series data is data collected for a single entity at multiple points in time
Quarterly data on the inflation and unemployment rates in the U.S. from 1957 to 2005
Figure: Quarterly time series data on inflation for the U.S. from 1957 to 2005 (vertical axis: inflation, %)
Figure: Quarterly time series data on unemployment for the U.S. from 1957 to 2005 (vertical axis: unemployment rate; horizontal axis: time, 1957q1 to 2005q1)
In general, $Y_{t-j}$ is called the $j$th lag and, similarly, $Y_{t+j}$ is the $j$th future value
(2) forecasting
Models that are useful for forecasting need not have a causal
interpretation!
$\frac{Y_t - Y_{t-1}}{Y_{t-1}} = \frac{\Delta Y_t}{Y_{t-1}}$
Instead, we often use the "logarithmic growth" or "log-difference":
$\text{inflation}_t = 100 \times 4 \times (\ln(Y_t) - \ln(Y_{t-1}))$
Stata:
tsset time
gen ln_CPI=ln(CPI)
gen ln_CPI_1stlag=ln(L1.CPI)
gen inflation=400*( ln(CPI) - ln(L1.CPI) )
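The same log-difference calculation can be sketched outside Stata. The snippet below is a minimal Python illustration, not the lecture's own code. The function name `annualized_inflation` and the CPI values are hypothetical and chosen only to show the arithmetic.

```python
import math

def annualized_inflation(cpi):
    """Return 400 * (ln(CPI_t) - ln(CPI_{t-1})) for every quarter after the first."""
    return [400.0 * (math.log(cur) - math.log(prev))
            for prev, cur in zip(cpi, cpi[1:])]

# Hypothetical quarterly CPI levels (illustration only): 1% growth per quarter.
cpi = [100.0, 101.0, 102.01]
inflation = annualized_inflation(cpi)

# One observation is lost because the first quarter has no lag.
print(len(inflation), [round(x, 2) for x in inflation])
```

A quarterly growth rate of 1% maps to an annualized rate of roughly 4%, which is why the factor is 400 rather than 100.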
Autocorrelation
In time series data, $Y_t$ is typically correlated with $Y_{t-j}$; this is called
autocorrelation or serial correlation
$\hat{\rho}_j = \frac{\widehat{\mathrm{Cov}}(Y_t, Y_{t-j})}{\widehat{\mathrm{Var}}(Y_t)}$
j start-up observations are lost in constructing these sample statistics
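As a rough illustration of the sample autocorrelation, the Python sketch below divides the lag-j sample autocovariance by the sample variance. It assumes the full-sample mean and a divisor of T in both terms, which is one common convention and may differ slightly from the exact definition used in the lecture. The function name and the data are hypothetical.

```python
def sample_autocorrelation(y, j):
    """j-th sample autocorrelation: lag-j sample autocovariance over sample variance."""
    T = len(y)
    mean = sum(y) / T
    var = sum((v - mean) ** 2 for v in y) / T
    # Only T - j cross products exist: the first j observations have no j-th lag.
    cov = sum((y[t] - mean) * (y[t - j] - mean) for t in range(j, T)) / T
    return cov / var

# Hypothetical series, for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
rho1 = sample_autocorrelation(y, 1)
print(round(rho1, 3))
```

The loop starting at `t = j` makes the loss of the j start-up observations explicit.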
Stationarity
When predicting the future of a time series, a good place to start is the
immediate past.
$Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$
$\hat{Y}_{T+1|T} = \hat{\beta}_0 + \hat{\beta}_1 Y_T$
Forecast error $= Y_{T+1} - \hat{Y}_{T+1|T}$
Predicted values and residuals are calculated for the observations in the sample used to estimate
the regression.
Forecasts and forecast errors are calculated for some date beyond the data set used to estimate
the regression.
             |               Robust
 d_inflation |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 d_inflation |
         L1. |  -.2380471   .0965017    -2.47   0.015    -.4285431    -.047551
$\widehat{\Delta \text{inf}}_{2005q1|2004q4} = 0.017 - 0.238 \times \Delta \text{inf}_{2004q4} = -0.43$
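The one-step-ahead forecast and its forecast error can be sketched in a few lines of Python. The intercept 0.017 and the rounded slope -0.238 are the values used in the forecast equation of the slide. The last observed change in inflation and the realized value below are hypothetical placeholders, not data from the lecture, and the function name is an assumption.

```python
def ar1_forecast(b0, b1, y_last):
    """One-step-ahead AR(1) forecast: b0 + b1 * Y_T."""
    return b0 + b1 * y_last

b0, b1 = 0.017, -0.238   # estimated intercept and AR(1) coefficient (rounded)
d_inf_last = 1.0         # hypothetical last observed change in inflation
d_inf_next = 0.5         # hypothetical realized value one quarter later

forecast = ar1_forecast(b0, b1, d_inf_last)
forecast_error = d_inf_next - forecast   # actual minus forecast
print(round(forecast, 3), round(forecast_error, 3))
```

With a negative slope, a rise in inflation last quarter leads the model to predict a partial reversal this quarter.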
Forecasts are uncertain and the Root Mean Squared Forecast Error
(RMSFE) is a measure of forecast uncertainty.
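The RMSFE is the square root of the mean squared forecast error. One simple way to approximate it, sketched below in Python, is to average squared forecast errors from past forecasts. The error values are hypothetical and the helper name `rmsfe` is an assumption; the lecture may estimate the RMSFE differently.

```python
import math

def rmsfe(forecast_errors):
    """Square root of the average squared forecast error."""
    return math.sqrt(sum(e ** 2 for e in forecast_errors) / len(forecast_errors))

# Hypothetical forecast errors (actual minus forecast), for illustration only.
errors = [1.0, -1.0, 2.0, -2.0]
value = rmsfe(errors)
print(round(value, 3))
```

Because the errors are squared, large misses in either direction raise the RMSFE equally.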
AR(p) model
             |               Robust
 d_inflation |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 d_inflation |
         L1. |  -.2579426   .0925934    -2.79   0.006    -.4407471   -.0751381
         L2. |  -.3220312   .0805465    -4.00   0.000    -.4810518   -.1630106
         L3. |   .1576089   .0841017     1.87   0.063    -.0084307    .3236484
         L4. |  -.0302511   .0930471    -0.33   0.746    -.2139512     .153449
$\widehat{\Delta \text{inf}}_{2005q1|2004q4} = 0.41$
. test L2.d_inflation=L3.d_inflation=L4.d_inflation=0
( 1) L2.d_inflation - L3.d_inflation = 0
( 2) L2.d_inflation - L4.d_inflation = 0
( 3) L2.d_inflation = 0
F( 3, 167) = 6.71
Prob > F = 0.0003
One approach is to start with a model with many lags and to perform a
hypothesis test on the final lag.
Delete the final lag if it is insignificant and perform a hypothesis test on the
new final lag; continue until all included lags are significant.
A drawback of this approach is that it can produce too large a model.
1st term $\ln\left[\frac{SSR(p)}{T}\right]$: always decreasing in p
2nd term $(p+1)\frac{\ln(T)}{T}$ (BIC) or $(p+1)\frac{2}{T}$ (AIC): always increasing in p.
AIC estimates more lags (larger p) than the BIC for T > 7.4, since then $\ln(T) > 2$ and the BIC penalty on each extra lag is larger.
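The trade-off between the two terms can be made concrete with a small Python sketch. The SSR values and the sample size T = 172 below are hypothetical, chosen only so that the two criteria disagree. The function names `bic` and `aic` are assumptions for this illustration.

```python
import math

def bic(ssr, T, p):
    """ln(SSR(p)/T) + (p + 1) * ln(T) / T"""
    return math.log(ssr / T) + (p + 1) * math.log(T) / T

def aic(ssr, T, p):
    """ln(SSR(p)/T) + (p + 1) * 2 / T"""
    return math.log(ssr / T) + (p + 1) * 2.0 / T

T = 172  # hypothetical number of observations
# Hypothetical sums of squared residuals for AR(p), p = 0..4 (decreasing in p).
ssr_by_p = {0: 500.0, 1: 470.0, 2: 430.0, 3: 420.0, 4: 419.0}

best_bic = min(ssr_by_p, key=lambda p: bic(ssr_by_p[p], T, p))
best_aic = min(ssr_by_p, key=lambda p: aic(ssr_by_p[p], T, p))
print(best_bic, best_aic)
```

In this made-up example the BIC stops at two lags, while the lighter AIC penalty lets a third lag in.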
             |               Robust
 d_inflation |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
 d_inflation |
         L1. |  -.2380471   .0965017    -2.47   0.015    -.4285431    -.047551
                  |               Robust
      d_inflation |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
------------------+----------------------------------------------------------------
      d_inflation |
              L1. |  -.4685035   .0771115    -6.08   0.000    -.6207425   -.3162645
              L2. |  -.4251441   .0821394    -5.18   0.000    -.5873096   -.2629787
unemployment_rate |
              L1. |  -2.243865   .4020402    -5.58   0.000    -3.037602   -1.450129
              L2. |   2.044221   .3875693     5.27   0.000     1.279054    2.809388
This does not mean that we have estimated the causal effect of X on Y!!
. test L1.unemployment=L2.unemployment=0
( 1) L.unemployment_rate - L2.unemployment_rate = 0
( 2) L.unemployment_rate = 0
F( 2, 167) = 16.13
Prob > F = 0.0000
Nonstationarity: trends
Deterministic trend: $Y_t = \beta_0 + \alpha t + u_t$
series is a nonrandom function of time
Figure: two panels, each plotting Y against Time (0 to 100)
The value of the series tomorrow is its value today plus an unpredictable
change.
An extension of the random walk model is the random walk with drift
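A short simulation can make the two processes concrete. The Python sketch below builds a random walk and a random walk with drift from the same sequence of shocks. The starting value, the drift of 0.5, the standard normal shocks, and the function name are all assumptions made for illustration.

```python
import random

def random_walk(shocks, y0=0.0, drift=0.0):
    """Build Y_t = drift + Y_{t-1} + u_t from a list of shocks u_t."""
    path = [y0]
    for u in shocks:
        path.append(drift + path[-1] + u)
    return path

random.seed(0)  # fixed seed so the sketch is reproducible
shocks = [random.gauss(0.0, 1.0) for _ in range(100)]

walk = random_walk(shocks)                        # no drift
walk_with_drift = random_walk(shocks, drift=0.5)  # hypothetical drift of 0.5

# With identical shocks, the two paths differ only by the accumulated drift.
print(round(walk_with_drift[-1] - walk[-1], 6))
```

Feeding both paths the same shocks isolates the effect of the drift term, which adds a fixed amount every period.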
$1 - \beta_1 z - \beta_2 z^2 - \dots - \beta_p z^p = 0$
In the special case of an AR(1): $1 - \beta_1 z = 0$ gives $z = \frac{1}{\beta_1}$,
which is bigger than 1 in absolute value if $|\beta_1| < 1$
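The root condition can be checked numerically. The Python sketch below evaluates the AR polynomial at a given z and computes the single root of an AR(1). The coefficient values are hypothetical, and the helper names are assumptions for this illustration.

```python
def ar_polynomial(betas, z):
    """Evaluate 1 - b1*z - b2*z^2 - ... - bp*z^p at z."""
    return 1.0 - sum(b * z ** (k + 1) for k, b in enumerate(betas))

def ar1_root(beta1):
    """Root of 1 - beta1*z = 0."""
    return 1.0 / beta1

# Hypothetical AR(1) coefficient of 0.5: the root is 2, outside the unit circle.
root = ar1_root(0.5)
print(root, ar_polynomial([0.5], root))

# A random walk (beta1 = 1) has a root exactly at z = 1, a unit root.
print(ar_polynomial([1.0], 1.0))
```

For an AR(p), z = 1 solves the equation exactly when the coefficients sum to one, which `ar_polynomial(betas, 1.0)` makes easy to check.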
$H_0: \beta_1 = 1$ vs $H_1: \beta_1 < 1$ in $Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$
The test can be performed by subtracting $Y_{t-1}$ from both sides
of the equation and estimating
$\Delta Y_t = \beta_0 + \delta Y_{t-1} + u_t$
$H_0: \delta = \beta_1 - 1 = 0$ vs $H_1: \delta < 0$
   inflation |
         L1. |  -.1643553   .0415311    -3.96   0.000    -.2463383   -.0823722
Reject if the DF t-statistic (the t-statistic testing $\delta = 0$) is less than
the specified critical value in the table of DF critical values. This is a 1-sided test of the null hypothesis
of a unit root (random walk trend) vs. the alternative that the series is stationary.
DF = -3.96; this is more negative than -2.86, so we reject the null
hypothesis of a stochastic trend at a 5% significance level.
Often an AR(1) model does not capture all the serial correlation in $Y_t$,
and we should include more lags and estimate an AR(p) model.
$H_0: \delta = 0$ vs $H_1: \delta < 0$
in the regression
   inflation |
         L1. |  -.1134169   .0422344    -2.69   0.008    -.1968029    -.030031
 d_inflation |
         L1. |  -.1864426   .0805144    -2.32   0.022    -.3454068   -.0274783
         L2. |  -.2563879    .081463    -3.15   0.002    -.4172251   -.0955507
         L3. |   .1990491   .0793514     2.51   0.013     .0423811    .3557171
         L4. |   .0099994   .0779921     0.13   0.898    -.1439849    .1639837
DF = -2.69; this is less negative than -2.86, so we do not reject the null
hypothesis of a stochastic trend at a 5% significance level.
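The decision rule can be sketched in Python using the coefficient and standard error reported on lagged inflation in the two regressions. The critical value -2.86 is the 5% value quoted in the slides. The function names are assumptions for this illustration.

```python
def df_statistic(coef, std_err):
    """t-statistic on the lagged level: coefficient divided by its standard error."""
    return coef / std_err

def reject_unit_root(t_stat, critical_value=-2.86):
    """One-sided test: reject only if the statistic is below the critical value."""
    return t_stat < critical_value

# Coefficient and standard error on L1.inflation without lagged differences.
df = df_statistic(-0.1643553, 0.0415311)
# Coefficient and standard error on L1.inflation with four lagged differences.
adf = df_statistic(-0.1134169, 0.0422344)

print(round(df, 2), reject_unit_root(df))
print(round(adf, 2), reject_unit_root(adf))
```

The two specifications lead to opposite conclusions, which shows how much the result depends on including enough lagged differences.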
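$\Delta Y_t = \beta_0 + \delta Y_{t-1} + \gamma_1 \Delta Y_{t-1} + \gamma_2 \Delta Y_{t-2} + \dots + \gamma_p \Delta Y_{t-p} + u_t$
(a) $Y_t = \beta_0 + \beta_1 Y_{t-1} + u_t$ (intercept only)
(b) $Y_t = \beta_0 + \alpha t + \beta_1 Y_{t-1} + u_t$ (intercept and time trend)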
And we have to use the critical values in the second row:
The best way to deal with a trend is to transform the series such that it does
not have a trend.
If the series $Y_t$ has a stochastic trend, then the first difference of the
series, $\Delta Y_t$, does not have a stochastic trend.
$Y_t = \beta_0 + Y_{t-1} + u_t$
$\Delta Y_t = \beta_0 + u_t$
This is the reason that in the beginning of the lecture we estimated ARs
with $\Delta \text{inflation}_t$.
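The effect of differencing can be checked on a small constructed example. The Python sketch below builds a random walk with drift from hypothetical shocks and confirms that its first difference equals the drift plus the shock in every period. The drift, shocks, starting value, and function name are all assumptions made for illustration.

```python
def first_difference(y):
    """Return Y_t - Y_{t-1} for every t after the first observation."""
    return [cur - prev for prev, cur in zip(y, y[1:])]

beta0 = 0.5                       # hypothetical drift
shocks = [1.0, -2.0, 0.5, 0.0]    # hypothetical shocks u_t

# Build Y_t = beta0 + Y_{t-1} + u_t from a hypothetical starting value.
y = [10.0]
for u in shocks:
    y.append(beta0 + y[-1] + u)

dy = first_difference(y)
print(dy)
print([beta0 + u for u in shocks])
```

The level series accumulates every past shock, while the differenced series contains only the current one, so the stochastic trend is gone.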
Figure: Inflation (%) and change in inflation plotted against time, 1957q1 to 2005q1