
Univariate Time Series - 2
Methods of Economic Investigation
Lecture 19

Last Time

Concepts that are useful

Stationarity
Ergodicity
Ergodic Theorem
Autocovariance Generating Function
Lag Operators

Time Series Processes

AR(p)
MA(q)

Today's Class

Building up to estimation:

Wold Decomposition
Estimating with exogenous, serially correlated errors
Testing for lag length

Refresher

Stationarity: some persistence, but not too much
Ergodicity: persistence dies out entirely over some finite period of time
Square summability (assumption for MA processes): the MA parameters satisfy $\sum_{j=0}^{\infty} \theta_j^2 < \infty$
Invertibility (assumption for AR processes): the AR lag polynomial $a(L) = \prod_j (1 - \lambda_j L)$ has roots satisfying $|\lambda_j| < 1$
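
As a quick check of the invertibility condition, here is a minimal numpy sketch; the AR(2) coefficients are illustrative assumptions, not values from the lecture. It factors $a(L) = \prod_j (1 - \lambda_j L)$ and verifies $|\lambda_j| < 1$.

```python
import numpy as np

# Illustrative AR(2) lag polynomial a(L) = 1 - 1.1 L + 0.3 L^2
a = [1.0, -1.1, 0.3]                     # coefficients of a(z), increasing powers

z_roots = np.polynomial.polynomial.polyroots(a)   # roots of a(z) = 0
lambdas = 1.0 / z_roots                  # a(L) = prod_j (1 - lambda_j L)

print(np.sort(np.abs(lambdas)))          # [0.5, 0.6] for these coefficients
print(np.all(np.abs(lambdas) < 1))       # True -> the AR polynomial is invertible
```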

ARMA Processes

In general we can have a process with both AR and MA components.

A general ARMA(p, q) process in our lag-function notation looks like: $a(L)x_t = b(L)\varepsilon_t$

For example, we may have an ARMA(2, 1):
$x_t - (\phi_1 x_{t-1} + \phi_2 x_{t-2}) = \varepsilon_t + \theta_1 \varepsilon_{t-1}$
$(1 - \phi_1 L - \phi_2 L^2)\, x_t = (1 + \theta_1 L)\, \varepsilon_t$

If the process is invertible, then we can rewrite this as: $x_t = a(L)^{-1} b(L)\, \varepsilon_t$
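
As a sketch of how this looks in practice, the block below simulates an ARMA(2, 1) and reads off the MA($\infty$) weights of $a(L)^{-1}b(L)$; the parameter values and the use of statsmodels' ArmaProcess are assumptions for illustration, not lecture material.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# (1 - phi1 L - phi2 L^2) x_t = (1 + theta1 L) eps_t, illustrative parameters
phi1, phi2, theta1 = 0.5, 0.2, 0.4

ar = np.array([1.0, -phi1, -phi2])   # a(L) coefficients
ma = np.array([1.0, theta1])         # b(L) coefficients

proc = ArmaProcess(ar, ma)
print(proc.isstationary, proc.isinvertible)  # root conditions from the refresher

x = proc.generate_sample(nsample=500)        # one simulated realization
psi = proc.arma2ma(lags=10)                  # MA(inf) weights of a(L)^{-1} b(L)
```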

Why Focus on ARMA Processes?

They delimit the range of processes (invertible AR lag polynomials, square-summable MA lag polynomials) for which we can rely on convergence theorems.

Any time series that is covariance stationary has a linear ARMA representation.

Information Sets

At time t-n:
Everything at time t-n and before is known
Everything at time t is unknown
Call this information set $\Omega_{t-n}$

Define $E_{t-n}(\varepsilon_t) = E[\varepsilon_t \mid \Omega_{t-n}]$
This is distinct from $E[\varepsilon_t]$ because we know the previous values of the $\varepsilon$'s up until $t-n$
For example, suppose $n = 1$ and $\varepsilon_t = \rho\varepsilon_{t-1} + \eta_t$:
$E(\varepsilon_t) = 0$ for all $t$, so it is a mean-zero process, but
$E_{t-1}(\varepsilon_t) = \rho\varepsilon_{t-1}$
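
A small simulation sketch of this distinction, using the AR(1) example above with $\rho = 0.8$ chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, T = 0.8, 100_000                    # rho is an illustrative assumption

eta = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + eta[t]   # eps_t = rho * eps_{t-1} + eta_t

print(eps.mean())                        # ~0: unconditionally mean zero
slope = np.cov(eps[1:], eps[:-1])[0, 1] / np.var(eps[:-1])
print(slope)                             # ~0.8: E_{t-1}(eps_t) = rho * eps_{t-1}
```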

Recalling the CEF

Define the linear conditional expectation function CEF(a | b), which is the linear projection, i.e. the fitted values of a regression of a on b, i.e. $a = b\beta$.

This is distinct from the general expectations operator in that it imposes a linear form on the conditional expectation function.
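
In code, the CEF in this linear-projection sense is just the fitted values from least squares; a minimal sketch with simulated data (all values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
b = rng.standard_normal((1000, 3))                    # conditioning variables
a = b @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(1000)

beta_hat, *_ = np.linalg.lstsq(b, a, rcond=None)      # regression of a on b
cef = b @ beta_hat                                    # CEF(a | b): the fitted values
```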

Wold Decomposition Theorem - 1

Formally, the Wold Decomposition Theorem says that any mean-zero weakly stationary process $\{x_t\}$ can be represented in the form

$$x_t = \sum_{j=0}^{\infty} \theta_j \varepsilon_{t-j} + \eta_t$$

This comes with some properties for each term.

Wold Decomposition Theorem - 2

Where $\varepsilon_t \equiv x_t - \mathrm{CEF}(x_t \mid x_{t-1}, x_{t-2}, \ldots, x_0)$.

Properties of $\varepsilon_t$:
$\mathrm{CEF}(\varepsilon_t \mid x_{t-1}, x_{t-2}, \ldots, x_0) = 0$, $E(\varepsilon_t x_{t-j}) = 0$, $E(\varepsilon_t) = 0$,
$E(\varepsilon_t^2) = \sigma^2$ for all $t$, and $E(\varepsilon_t \varepsilon_s) = 0$ for all $t \neq s$

The MA polynomial is invertible
The parameters $\theta_j$ are square summable
$\{\theta_j\}$ and $\{\varepsilon_s\}$ are unique

$\eta_t$ is linearly deterministic, i.e. $\eta_t = \mathrm{CEF}(\eta_t \mid x_{t-1}, x_{t-2}, \ldots)$

A Note on the Wold Decomposition

Many of the properties come directly from our assumption that the process is weakly stationary.

While the theorem is stated for mean-zero processes, remember that we can de-mean our data, so most processes can be represented in this format.

Uses of the Wold Form

This theorem is extremely useful because it brings time-series processes back to our standard OLS model. Notice, though, that we've relaxed some of the conditions needed for the Gauss-Markov theorem to hold.

The Wold MA($\infty$) representation is unique: if two time series have the same Wold representation, then they are the same time series. This is true only up to second moments and in linear forecasting.
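
One way to see the "back to OLS" point is to build a Wold-style representation empirically: project $x_t$ on its own lags, treat the residuals as the Wold shocks, and invert the fitted AR polynomial into MA weights. The sketch below assumes an AR(10) approximation and simulated data; the lag length and data-generating process are illustrative choices, not from the lecture.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma2ma

rng = np.random.default_rng(2)
# Simulated covariance-stationary series (an MA(2), chosen for illustration)
x = np.convolve(rng.standard_normal(2200), [1.0, 0.6, 0.3], mode="valid")[:2000]

res = AutoReg(x, lags=10, trend="c").fit()     # linear projection on own lags
eps = res.resid                                # Wold shocks: linear forecast errors
ar_poly = np.r_[1.0, -res.params[1:]]          # fitted a(L), constant dropped
theta = arma2ma(ar_poly, np.array([1.0]), lags=15)   # Wold MA weights, theta_0 = 1
```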

Emphasis on Linearity

Although $\mathrm{CEF}(\varepsilon_t \mid x_{t-j}) = 0$, we can have $E(\varepsilon_t \mid x_{t-j}) \neq 0$ under nonlinear projections (see the sketch after this slide).

If the true $x_t$ is not generated by linear combinations of past $x_t$ plus a shock, then the Wold shocks (the $\varepsilon$'s) will be different from the true shocks.

The uniqueness result only states that the Wold representation is the unique linear representation, where the shocks are linear forecast errors.
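
A minimal sketch of this point with a deliberately nonlinear example (the data-generating process is an assumption for illustration): the linear projection of the shock on x is zero even though its conditional expectation is not.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)
eps = x**2 - 1                              # mean-zero shock built nonlinearly from x

# Linear projection coefficient of eps on x is ~0 (since E[x^3] = 0) ...
print(np.cov(eps, x)[0, 1] / np.var(x))     # ~0: the linear CEF is flat

# ... but E[eps | x] = x^2 - 1 is far from zero away from the origin
print(eps[np.abs(x) > 2].mean())            # clearly positive
```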

Estimating with Serially Correlated Errors

Suppose that we have: $Y_t = X_t\beta + \varepsilon_t$, with
$E[\varepsilon_t \mid x_t] = 0$, $E[\varepsilon_t^2 \mid x_t] = \sigma^2$, and
$E[\varepsilon_t \varepsilon_{t-k}] = \gamma_k \neq 0$ for $k \neq 0$, so the error covariance matrix $E[\varepsilon\varepsilon']$ is no longer $\sigma^2 I$.

We could consistently estimate $\beta$, but our standard errors would be incorrect, making it difficult to do inference.

This is just like the heteroskedasticity problem we have already seen with random effects:
use feasible GLS to estimate the weights and then re-estimate by OLS on the transformed data to obtain efficient estimates and correct standard errors (see the sketch below).
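
A sketch of this strategy using statsmodels; the AR(1) error structure and all parameter values are assumptions for illustration. OLS with HAC standard errors fixes inference only, while GLSAR's iterated feasible GLS re-weights the data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 500
x = rng.standard_normal(T)

eps = np.zeros(T)                       # AR(1) errors, rho = 0.7 assumed
for t in range(1, T):
    eps[t] = 0.7 * eps[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + eps

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                               # consistent beta, wrong SEs
hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 8})   # corrected SEs
fgls = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10) # feasible GLS, AR(1) weights

print(ols.bse, hac.bse, fgls.bse)
```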

Endogenous Lagged Regressors

It may be that either the dependent variable or the regressors should enter the estimating equation in lagged values too.

Suppose we were estimating:
$Y_t = \beta_0 X_t + \beta_1 X_{t-1} + \ldots + \beta_k X_{t-k} + \varepsilon_t$

We think that these X's are correlated with Y up to some lag length k
We think these X's are correlated with each other (e.g. some underlying AR process)
But we're not sure how many lags to include

Naive Test

Include lags going very far back, $r \gg k$; test whether the longest lag coefficient $\beta_r = 0$. If it is not significant, drop that lag and keep going (a sketch of this procedure follows below).

Problems:

Practically, the longer the lags you take, the more data you make unusable, because the earliest observations do not have enough prior time periods to construct the lags.

The procedure doesn't allow you to include lag t-6 but exclude lag t-3.

The theoretical issue is that we will reject the null 5 percent of the time even when it is true (or whatever the significance level of the test is).
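
A sketch of the naive sequential procedure, mostly to make its mechanics concrete; naive_lag_test is a hypothetical helper, not lecture code.

```python
import numpy as np
import statsmodels.api as sm

def naive_lag_test(y, x, r, alpha=0.05):
    """Hypothetical helper: drop the longest lag while it is insignificant."""
    while r > 0:
        # Regressors x_t, x_{t-1}, ..., x_{t-r}; the first r observations are lost.
        lags = np.column_stack([x[r - j : len(x) - j] for j in range(r + 1)])
        res = sm.OLS(y[r:], sm.add_constant(lags)).fit()
        if res.pvalues[-1] < alpha:     # longest lag significant: stop here
            return r, res
        r -= 1                          # otherwise drop it and re-test
    return 0, None
```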

More Sophisticated Testing

We can be a bit more sophisticated by comparing restricted and unrestricted models.

Define $p_{max}$ as some long lag length greater than the expected relevant lag length.
In general we do not test at $p_{max}$ itself, but as before, as $p \to p_{max}$ the usable sample size decreases.
Define $\hat\varepsilon_j$ as the residuals from $Y_t = \beta_0 X_t + \beta_1 X_{t-1} + \ldots + \beta_j X_{t-j} + \varepsilon_t$, and let $N$ be the sample size.
We could then imagine choosing the lag length to minimize the penalized, logged sum of squared residuals:

$$\min_{0 \le j \le p_{max}} \; \log\!\left(\frac{\hat\varepsilon_j'\hat\varepsilon_j}{N}\right) + (j+1)\,\frac{c(N)}{N}$$

Cost Functions

Intuition: $c(\cdot)$ is a penalty for adding additional parameters.

We thus try to pick the best specification, using the cost function to penalize the inclusion of extra but irrelevant lags (a sketch implementing both criteria follows below).

Akaike (AIC): $c(n) = 2$
The AIC criterion is not well-founded in theory and will be biased in finite samples; the bias tends to overstate the true lag length.

Bayesian (BIC): $c(n) = \log(n)$
The BIC will converge to the true $p$.
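
A sketch implementing the criterion above for both penalties; the helper name and the fixed estimation sample are my assumptions (statsmodels also reports AIC/BIC directly).

```python
import numpy as np
import statsmodels.api as sm

def ic_lag_length(y, x, pmax, penalty):
    """Pick j in [0, pmax] minimizing log(e'e / N) + (j + 1) * penalty(N) / N."""
    best_j, best_crit = 0, np.inf
    for j in range(pmax + 1):
        # Fix the estimation sample at t = pmax, ..., T-1 so criteria are comparable.
        lags = np.column_stack([x[pmax - i : len(x) - i] for i in range(j + 1)])
        e = sm.OLS(y[pmax:], sm.add_constant(lags)).fit().resid
        N = len(e)
        crit = np.log(e @ e / N) + (j + 1) * penalty(N) / N
        if crit < best_crit:
            best_j, best_crit = j, crit
    return best_j

aic_choice = lambda y, x, pmax: ic_lag_length(y, x, pmax, lambda n: 2.0)        # Akaike
bic_choice = lambda y, x, pmax: ic_lag_length(y, x, pmax, lambda n: np.log(n))  # Bayesian
```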

Return to Likelihood Ratio Tests

The minimization problem is just a likelihood ratio test. To see this, compare lag length $j$ to lag length $k > j$: the criterion prefers the longer lag exactly when the LR statistic exceeds a constant determined by the penalty (sketched below).

Note that the per-parameter penalty $c(N)/N$ is declining in $N$.
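
A sketch of the algebra the slide points to, writing $SSR_j = \hat\varepsilon_j'\hat\varepsilon_j$ for the shorter-lag (restricted) model:

```latex
\[
  \log\frac{SSR_j}{N} + (j+1)\frac{c(N)}{N}
  \;>\;
  \log\frac{SSR_k}{N} + (k+1)\frac{c(N)}{N}
  \quad\Longleftrightarrow\quad
  \underbrace{N \log\frac{SSR_j}{SSR_k}}_{\text{LR statistic}} \;>\; (k-j)\,c(N)
\]
```

So the criterion selects the longer lag length exactly when the likelihood ratio statistic exceeds the constant $(k-j)\,c(N)$ set by the penalty.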

Next Time

Multivariate Time Series

Testing for Unit Roots


Cointegration

Returning to Causal Effects

Impulse Response Functions


Forecasting
