
Univariate Time Series - 2
Methods of Economic Investigation
Lecture 19

Last Time

Concepts that are useful

Stationarity
Ergodicity
Ergodic Theorem
Autocovariance Generating Function
Lag Operators

Time Series Processes

AR(p)
MA(q)

Today's Class

Building up to estimation:

Wold Decomposition
Estimating with exogenous, serially correlated errors
Testing for lag length

Refresher

Stationarity: some persistence, but not too much
Ergodicity: persistence dies out entirely over some finite period of time
Square summability (assumption for MA processes): the MA parameters satisfy $\sum_{j=0}^{\infty} \theta_j^2 < \infty$
Invertibility (assumption for AR processes): the AR lag polynomial $a(L) = \prod_j (1 - \lambda_j L)$ has roots satisfying $|\lambda_j| < 1$
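
As a quick check of the invertibility condition, here is a minimal numpy sketch; the AR(2) coefficients are illustrative assumptions, not values from the lecture. It factors $a(L) = \prod_j (1 - \lambda_j L)$ and verifies $|\lambda_j| < 1$.

```python
import numpy as np

# Illustrative AR(2) lag polynomial a(L) = 1 - 1.1 L + 0.3 L^2
a = [1.0, -1.1, 0.3]                     # coefficients of a(z), increasing powers

z_roots = np.polynomial.polynomial.polyroots(a)   # roots of a(z) = 0
lambdas = 1.0 / z_roots                  # a(L) = prod_j (1 - lambda_j L)

print(np.sort(np.abs(lambdas)))          # [0.5, 0.6] for these coefficients
print(np.all(np.abs(lambdas) < 1))       # True -> the AR polynomial is invertible
```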

ARMA Processes

In general we can have a process with both AR and MA components.

A general ARMA(p, q) process in our lag-function notation looks like: $a(L)x_t = b(L)\varepsilon_t$

For example, we may have an ARMA(2, 1):
$x_t - (\phi_1 x_{t-1} + \phi_2 x_{t-2}) = \varepsilon_t + \theta_1 \varepsilon_{t-1}$
$(1 - \phi_1 L - \phi_2 L^2)\, x_t = (1 + \theta_1 L)\, \varepsilon_t$

If the process is invertible, then we can rewrite this as: $x_t = a(L)^{-1} b(L)\, \varepsilon_t$
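
As a sketch of how this looks in practice, the block below simulates an ARMA(2, 1) and reads off the MA($\infty$) weights of $a(L)^{-1}b(L)$; the parameter values and the use of statsmodels' ArmaProcess are assumptions for illustration, not lecture material.

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# (1 - phi1 L - phi2 L^2) x_t = (1 + theta1 L) eps_t, illustrative parameters
phi1, phi2, theta1 = 0.5, 0.2, 0.4

ar = np.array([1.0, -phi1, -phi2])   # a(L) coefficients
ma = np.array([1.0, theta1])         # b(L) coefficients

proc = ArmaProcess(ar, ma)
print(proc.isstationary, proc.isinvertible)  # root conditions from the refresher

x = proc.generate_sample(nsample=500)        # one simulated realization
psi = proc.arma2ma(lags=10)                  # MA(inf) weights of a(L)^{-1} b(L)
```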

Why Focus on ARMA Processes?

They delimit the range of processes (invertible AR lag polynomials, square-summable MA lag polynomials) for which we can rely on convergence theorems.

Any time series that is covariance stationary has a linear ARMA representation.

Information Sets

At time t-n:
Everything at time t-n and before is known
Everything at time t is unknown
Call this information set $\Omega_{t-n}$

Define $E_{t-n}(\varepsilon_t) = E[\varepsilon_t \mid \Omega_{t-n}]$
This is distinct from $E[\varepsilon_t]$ because we know the previous values of the $\varepsilon$'s up until $t-n$
For example, suppose $n = 1$ and $\varepsilon_t = \rho\varepsilon_{t-1} + \eta_t$:
$E(\varepsilon_t) = 0$ for all $t$, so it is a mean-zero process, but
$E_{t-1}(\varepsilon_t) = \rho\varepsilon_{t-1}$
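
A small simulation sketch of this distinction, using the AR(1) example above with $\rho = 0.8$ chosen purely for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
rho, T = 0.8, 100_000                    # rho is an illustrative assumption

eta = rng.standard_normal(T)
eps = np.zeros(T)
for t in range(1, T):
    eps[t] = rho * eps[t - 1] + eta[t]   # eps_t = rho * eps_{t-1} + eta_t

print(eps.mean())                        # ~0: unconditionally mean zero
slope = np.cov(eps[1:], eps[:-1])[0, 1] / np.var(eps[:-1])
print(slope)                             # ~0.8: E_{t-1}(eps_t) = rho * eps_{t-1}
```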

Recalling the CEF

Define the linear conditional expectation function CEF(a | b), which is the linear projection, i.e. the fitted values of a regression of a on b, i.e. $a = b\beta$.

This is distinct from the general expectations operator in that it imposes a linear form on the conditional expectation function.
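
In code, the CEF in this linear-projection sense is just the fitted values from least squares; a minimal sketch with simulated data (all values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)
b = rng.standard_normal((1000, 3))                    # conditioning variables
a = b @ np.array([1.0, -2.0, 0.5]) + rng.standard_normal(1000)

beta_hat, *_ = np.linalg.lstsq(b, a, rcond=None)      # regression of a on b
cef = b @ beta_hat                                    # CEF(a | b): the fitted values
```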

Wold Decomposition Theorem - 1

Formally, the Wold Decomposition Theorem says that any mean-zero weakly stationary process $\{x_t\}$ can be represented in the form

$$x_t = \sum_{j=0}^{\infty} \theta_j \varepsilon_{t-j} + \eta_t$$

This comes with some properties for each term.

Wold Decomposition Theorem - 2

Where $\varepsilon_t \equiv x_t - \mathrm{CEF}(x_t \mid x_{t-1}, x_{t-2}, \ldots, x_0)$.

Properties of $\varepsilon_t$:
$\mathrm{CEF}(\varepsilon_t \mid x_{t-1}, x_{t-2}, \ldots, x_0) = 0$, $E(\varepsilon_t x_{t-j}) = 0$, $E(\varepsilon_t) = 0$,
$E(\varepsilon_t^2) = \sigma^2$ for all $t$, and $E(\varepsilon_t \varepsilon_s) = 0$ for all $t \neq s$

The MA polynomial is invertible
The parameters $\theta_j$ are square summable
$\{\theta_j\}$ and $\{\varepsilon_s\}$ are unique

$\eta_t$ is linearly deterministic, i.e. $\eta_t = \mathrm{CEF}(\eta_t \mid x_{t-1}, x_{t-2}, \ldots)$

A Note on the Wold Decomposition

Many of the properties come directly from our assumption that the process is weakly stationary.

While the theorem is stated for mean-zero processes, remember that we can de-mean our data, so most processes can be represented in this format.

Uses of the Wold Form

This theorem is extremely useful because it brings time-series processes back to our standard OLS model. Notice, though, that we've relaxed some of the conditions needed for the Gauss-Markov theorem to hold.

The Wold MA($\infty$) representation is unique: if two time series have the same Wold representation, then they are the same time series. This is true only up to second moments and in linear forecasting.
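
One way to see the "back to OLS" point is to build a Wold-style representation empirically: project $x_t$ on its own lags, treat the residuals as the Wold shocks, and invert the fitted AR polynomial into MA weights. The sketch below assumes an AR(10) approximation and simulated data; the lag length and data-generating process are illustrative choices, not from the lecture.

```python
import numpy as np
from statsmodels.tsa.ar_model import AutoReg
from statsmodels.tsa.arima_process import arma2ma

rng = np.random.default_rng(2)
# Simulated covariance-stationary series (an MA(2), chosen for illustration)
x = np.convolve(rng.standard_normal(2200), [1.0, 0.6, 0.3], mode="valid")[:2000]

res = AutoReg(x, lags=10, trend="c").fit()     # linear projection on own lags
eps = res.resid                                # Wold shocks: linear forecast errors
ar_poly = np.r_[1.0, -res.params[1:]]          # fitted a(L), constant dropped
theta = arma2ma(ar_poly, np.array([1.0]), lags=15)   # Wold MA weights, theta_0 = 1
```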

Emphasis on Linearity

Although $\mathrm{CEF}(\varepsilon_t \mid x_{t-j}) = 0$, we can have $E(\varepsilon_t \mid x_{t-j}) \neq 0$ under nonlinear projections (see the sketch after this slide).

If the true $x_t$ is not generated by linear combinations of past $x_t$ plus a shock, then the Wold shocks (the $\varepsilon$'s) will be different from the true shocks.

The uniqueness result only states that the Wold representation is the unique linear representation, where the shocks are linear forecast errors.
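
A minimal sketch of this point with a deliberately nonlinear example (the data-generating process is an assumption for illustration): the linear projection of the shock on x is zero even though its conditional expectation is not.

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.standard_normal(100_000)
eps = x**2 - 1                              # mean-zero shock built nonlinearly from x

# Linear projection coefficient of eps on x is ~0 (since E[x^3] = 0) ...
print(np.cov(eps, x)[0, 1] / np.var(x))     # ~0: the linear CEF is flat

# ... but E[eps | x] = x^2 - 1 is far from zero away from the origin
print(eps[np.abs(x) > 2].mean())            # clearly positive
```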

Estimating with Serially Correlated Errors

Suppose that we have: $Y_t = X_t\beta + \varepsilon_t$, with
$E[\varepsilon_t \mid x_t] = 0$, $E[\varepsilon_t^2 \mid x_t] = \sigma^2$, and
$E[\varepsilon_t \varepsilon_{t-k}] = \gamma_k \neq 0$ for $k \neq 0$, so the error covariance matrix $E[\varepsilon\varepsilon']$ is no longer $\sigma^2 I$.

We could consistently estimate $\beta$, but our standard errors would be incorrect, making it difficult to do inference.

This is just like the heteroskedasticity problem we have already seen with random effects:
use feasible GLS to estimate the weights and then re-estimate by OLS on the transformed data to obtain efficient estimates and correct standard errors (see the sketch below).
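
A sketch of this strategy using statsmodels; the AR(1) error structure and all parameter values are assumptions for illustration. OLS with HAC standard errors fixes inference only, while GLSAR's iterated feasible GLS re-weights the data.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
T = 500
x = rng.standard_normal(T)

eps = np.zeros(T)                       # AR(1) errors, rho = 0.7 assumed
for t in range(1, T):
    eps[t] = 0.7 * eps[t - 1] + rng.standard_normal()
y = 1.0 + 2.0 * x + eps

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                               # consistent beta, wrong SEs
hac = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 8})   # corrected SEs
fgls = sm.GLSAR(y, X, rho=1).iterative_fit(maxiter=10) # feasible GLS, AR(1) weights

print(ols.bse, hac.bse, fgls.bse)
```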

Endogenous Lagged Regressors

It may be that either the dependent variable or the regressors should enter the estimating equation in lagged values too.

Suppose we were estimating:
$Y_t = \beta_0 X_t + \beta_1 X_{t-1} + \ldots + \beta_k X_{t-k} + \varepsilon_t$

We think that these X's are correlated with Y up to some lag length k
We think these X's are correlated with each other (e.g. some underlying AR process)
But we're not sure how many lags to include

Naive Test

Include lags going very far back, $r \gg k$; test whether the longest lag coefficient $\beta_r = 0$. If it is not significant, drop that lag and keep going (a sketch of this procedure follows below).

Problems:

Practically, the longer the lags you take, the more data you make unusable, because the earliest observations do not have enough prior time periods to construct the lags.

The procedure doesn't allow you to include lag t-6 but exclude lag t-3.

The theoretical issue is that we will reject the null 5 percent of the time even when it is true (or whatever the significance level of the test is).
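
A sketch of the naive sequential procedure, mostly to make its mechanics concrete; naive_lag_test is a hypothetical helper, not lecture code.

```python
import numpy as np
import statsmodels.api as sm

def naive_lag_test(y, x, r, alpha=0.05):
    """Hypothetical helper: drop the longest lag while it is insignificant."""
    while r > 0:
        # Regressors x_t, x_{t-1}, ..., x_{t-r}; the first r observations are lost.
        lags = np.column_stack([x[r - j : len(x) - j] for j in range(r + 1)])
        res = sm.OLS(y[r:], sm.add_constant(lags)).fit()
        if res.pvalues[-1] < alpha:     # longest lag significant: stop here
            return r, res
        r -= 1                          # otherwise drop it and re-test
    return 0, None
```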

More Sophisticated Testing

We can be a bit more sophisticated by comparing restricted and unrestricted models.

Define $p_{max}$ as some long lag length greater than the expected relevant lag length.
In general we do not test at $p_{max}$ itself, but as before, as $p \to p_{max}$ the usable sample size decreases.
Define $\hat\varepsilon_j$ as the residuals from $Y_t = \beta_0 X_t + \beta_1 X_{t-1} + \ldots + \beta_j X_{t-j} + \varepsilon_t$, and let $N$ be the sample size.
We could then imagine choosing the lag length to minimize the penalized, logged sum of squared residuals:

$$\min_{0 \le j \le p_{max}} \; \log\!\left(\frac{\hat\varepsilon_j'\hat\varepsilon_j}{N}\right) + (j+1)\,\frac{c(N)}{N}$$

Cost Functions

Intuition: $c(\cdot)$ is a penalty for adding additional parameters.

We thus try to pick the best specification, using the cost function to penalize the inclusion of extra but irrelevant lags (a sketch implementing both criteria follows below).

Akaike (AIC): $c(n) = 2$
The AIC criterion is not well-founded in theory and will be biased in finite samples; the bias tends to overstate the true lag length.

Bayesian (BIC): $c(n) = \log(n)$
The BIC will converge to the true $p$.
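
A sketch implementing the criterion above for both penalties; the helper name and the fixed estimation sample are my assumptions (statsmodels also reports AIC/BIC directly).

```python
import numpy as np
import statsmodels.api as sm

def ic_lag_length(y, x, pmax, penalty):
    """Pick j in [0, pmax] minimizing log(e'e / N) + (j + 1) * penalty(N) / N."""
    best_j, best_crit = 0, np.inf
    for j in range(pmax + 1):
        # Fix the estimation sample at t = pmax, ..., T-1 so criteria are comparable.
        lags = np.column_stack([x[pmax - i : len(x) - i] for i in range(j + 1)])
        e = sm.OLS(y[pmax:], sm.add_constant(lags)).fit().resid
        N = len(e)
        crit = np.log(e @ e / N) + (j + 1) * penalty(N) / N
        if crit < best_crit:
            best_j, best_crit = j, crit
    return best_j

aic_choice = lambda y, x, pmax: ic_lag_length(y, x, pmax, lambda n: 2.0)        # Akaike
bic_choice = lambda y, x, pmax: ic_lag_length(y, x, pmax, lambda n: np.log(n))  # Bayesian
```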

Return to Likelihood Ratio Tests

The minimization problem is just a likelihood ratio test. To see this, compare lag length $j$ to lag length $k > j$: the criterion prefers the longer lag exactly when the LR statistic exceeds a constant determined by the penalty (sketched below).

Note that the per-parameter penalty $c(N)/N$ is declining in $N$.
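
A sketch of the algebra the slide points to, writing $SSR_j = \hat\varepsilon_j'\hat\varepsilon_j$ for the shorter-lag (restricted) model:

```latex
\[
  \log\frac{SSR_j}{N} + (j+1)\frac{c(N)}{N}
  \;>\;
  \log\frac{SSR_k}{N} + (k+1)\frac{c(N)}{N}
  \quad\Longleftrightarrow\quad
  \underbrace{N \log\frac{SSR_j}{SSR_k}}_{\text{LR statistic}} \;>\; (k-j)\,c(N)
\]
```

So the criterion selects the longer lag length exactly when the likelihood ratio statistic exceeds the constant $(k-j)\,c(N)$ set by the penalty.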

Next Time

Multivariate Time Series

Testing for Unit Roots


Cointegration

Returning to Causal Effects

Impulse Response Functions


Forecasting
