
Introduction to Time Series Analysis. Lecture 11.

Peter Bartlett
Last lecture: Forecasting.
1. The innovations representation.
2. Recursive method: Innovations algorithm.
1
Introduction to Time Series Analysis. Lecture 11.
1. Review: Forecasting.
2. Example: Innovations algorithm for forecasting an MA(1)
3. Linear prediction based on the infinite past
4. The truncated predictor
2
Review: One-step-ahead linear prediction
$$X^n_{n+1} = \phi_{n1} X_n + \phi_{n2} X_{n-1} + \cdots + \phi_{nn} X_1,$$
$$\Gamma_n \phi_n = \gamma_n, \qquad P^n_{n+1} = E\left(X_{n+1} - X^n_{n+1}\right)^2 = \gamma(0) - \gamma_n' \Gamma_n^{-1} \gamma_n,$$
$$\Gamma_n = \begin{pmatrix} \gamma(0) & \gamma(1) & \cdots & \gamma(n-1) \\ \gamma(1) & \gamma(0) & \cdots & \gamma(n-2) \\ \vdots & \vdots & \ddots & \vdots \\ \gamma(n-1) & \gamma(n-2) & \cdots & \gamma(0) \end{pmatrix},$$
$$\phi_n = (\phi_{n1}, \phi_{n2}, \ldots, \phi_{nn})', \qquad \gamma_n = (\gamma(1), \gamma(2), \ldots, \gamma(n))'.$$
3
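As an aside (not part of the original slides), the prediction equations above are easy to solve numerically. Below is a minimal sketch in Python, assuming the autocovariances $\gamma(0), \ldots, \gamma(n)$ are given as a list; the function name `one_step_predictor` is made up for this example.

```python
import numpy as np

def one_step_predictor(gamma, x):
    """Best linear one-step-ahead predictor X^n_{n+1} from X_1, ..., X_n.

    gamma : autocovariances [gamma(0), gamma(1), ..., gamma(n)]
    x     : observed values [X_1, ..., X_n]
    Returns the forecast X^n_{n+1} and its mean squared error P^n_{n+1}.
    """
    n = len(x)
    # Gamma_n is the n x n Toeplitz matrix with (i, j) entry gamma(|i - j|).
    Gamma_n = np.array([[gamma[abs(i - j)] for j in range(n)] for i in range(n)])
    gamma_n = np.asarray(gamma[1:n + 1])        # (gamma(1), ..., gamma(n))'
    phi_n = np.linalg.solve(Gamma_n, gamma_n)   # Gamma_n phi_n = gamma_n
    forecast = phi_n @ np.asarray(x)[::-1]      # phi_{n1} X_n + ... + phi_{nn} X_1
    mse = gamma[0] - gamma_n @ phi_n            # gamma(0) - gamma_n' Gamma_n^{-1} gamma_n
    return forecast, mse
```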
Review: The innovations representation
Write the best linear predictor as
$$X^n_{n+1} = \theta_{n1}\underbrace{\left(X_n - X^{n-1}_n\right)}_{\text{innovation}} + \theta_{n2}\left(X_{n-1} - X^{n-2}_{n-1}\right) + \cdots + \theta_{nn}\left(X_1 - X^0_1\right).$$
The innovations are uncorrelated:
$$\mathrm{Cov}\left(X_j - X^{j-1}_j,\; X_i - X^{i-1}_i\right) = 0 \quad \text{for } i \neq j.$$
We'll see that this is useful for estimation.
4
Introduction to Time Series Analysis. Lecture 11.
1. Review: Forecasting.
2. Example: Innovations algorithm for forecasting an MA(1)
3. Linear prediction based on the infinite past
4. The truncated predictor
5
Example: Innovations algorithm for forecasting an MA(1)
Suppose that we have an MA(1) process $\{X_t\}$ satisfying
$$X_t = W_t + \theta_1 W_{t-1}.$$
Given $X_1, X_2, \ldots, X_n$, we wish to compute the best linear forecast of $X_{n+1}$, using the innovations representation,
$$X^0_1 = 0, \qquad X^n_{n+1} = \sum_{i=1}^{n} \theta_{ni}\left(X_{n+1-i} - X^{n-i}_{n+1-i}\right).$$
6
Example: Innovations algorithm for forecasting an MA(1)
An aside: The linear predictions are in the form
$$X^n_{n+1} = \sum_{i=1}^{n} \theta_{ni} Z_{n+1-i}$$
for uncorrelated, zero mean random variables $Z_i$. In particular,
$$X_{n+1} = Z_{n+1} + \sum_{i=1}^{n} \theta_{ni} Z_{n+1-i},$$
where $Z_{n+1} = X_{n+1} - X^n_{n+1}$ (and all the $Z_i$ are uncorrelated).
This is suggestive of an MA representation. Why isn't it an MA?
7
Example: Innovations algorithm for forecasting an MA(1)
$$\theta_{n,n-i} = \frac{1}{P^i_{i+1}}\left(\gamma(n-i) - \sum_{j=0}^{i-1} \theta_{i,i-j}\,\theta_{n,n-j}\,P^j_{j+1}\right),$$
$$P^0_1 = \gamma(0), \qquad P^n_{n+1} = \gamma(0) - \sum_{i=0}^{n-1} \theta^2_{n,n-i}\,P^i_{i+1}.$$
The algorithm computes $P^0_1 = \gamma(0)$, $\theta_{1,1}$ (in terms of $\gamma(1)$); $P^1_2$, $\theta_{2,2}$ (in terms of $\gamma(2)$), $\theta_{2,1}$; $P^2_3$, $\theta_{3,3}$ (in terms of $\gamma(3)$), etc.
8
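As an illustration (not in the original slides), the recursions above translate directly into code. A minimal sketch, assuming the autocovariance function $\gamma(\cdot)$ is available as a Python callable; the function name `innovations` is made up here.

```python
import numpy as np

def innovations(gamma, n):
    """Innovations algorithm: theta[k][j] holds theta_{k,j} (j = 1, ..., k)
    and P[k] holds P^k_{k+1}, for k = 0, ..., n, given gamma(h) as a callable."""
    theta = [dict() for _ in range(n + 1)]
    P = np.zeros(n + 1)
    P[0] = gamma(0)                              # P^0_1 = gamma(0)
    for k in range(1, n + 1):
        for i in range(k):                       # theta_{k,k-i}, i = 0, ..., k-1
            s = sum(theta[i].get(i - j, 0.0) * theta[k].get(k - j, 0.0) * P[j]
                    for j in range(i))
            theta[k][k - i] = (gamma(k - i) - s) / P[i]
        # P^k_{k+1} = gamma(0) - sum_i theta_{k,k-i}^2 P^i_{i+1}
        P[k] = gamma(0) - sum(theta[k][k - i] ** 2 * P[i] for i in range(k))
    return theta, P
```

Since $\gamma(k-i)$ vanishes for an MA(q) unless $k - i \leq q$, at most $q$ of the $\theta_{k,j}$ are nonzero at each level, as the next slides show for $q = 1$.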
Example: Innovations algorithm for forecasting an MA(1)
$$\theta_{n,n-i} = \frac{1}{P^i_{i+1}}\left(\gamma(n-i) - \sum_{j=0}^{i-1} \theta_{i,i-j}\,\theta_{n,n-j}\,P^j_{j+1}\right).$$
For an MA(1), $\gamma(0) = \sigma^2(1 + \theta_1^2)$, $\gamma(1) = \theta_1\sigma^2$.
Thus: $\theta_{1,1} = \gamma(1)/P^0_1$;
$\theta_{2,2} = 0$, $\theta_{2,1} = \gamma(1)/P^1_2$;
$\theta_{3,3} = \theta_{3,2} = 0$, $\theta_{3,1} = \gamma(1)/P^2_3$, etc.
Because $\gamma(n-i) \neq 0$ only for $i = n-1$, only $\theta_{n,1} \neq 0$.
9
Example: Innovations algorithm for forecasting an MA(1)
For the MA(1) process $\{X_t\}$ satisfying
$$X_t = W_t + \theta_1 W_{t-1},$$
the innovations representation of the best linear forecast is
$$X^0_1 = 0, \qquad X^n_{n+1} = \theta_{n1}\left(X_n - X^{n-1}_n\right).$$
More generally, for an MA(q) process, we have $\theta_{ni} = 0$ for $i > q$.
10
Example: Innovations algorithm for forecasting an MA(1)
For the MA(1) process $\{X_t\}$,
$$X^0_1 = 0, \qquad X^n_{n+1} = \theta_{n1}\left(X_n - X^{n-1}_n\right).$$
This is consistent with the observation that
$$X_{n+1} = Z_{n+1} + \sum_{i=1}^{n} \theta_{ni} Z_{n+1-i},$$
where the uncorrelated $Z_i$ are defined by $Z_t = X_t - X^{t-1}_t$ for $t = 1, \ldots, n+1$.
Indeed, as $n$ increases, $P^n_{n+1} \to \mathrm{Var}(W_t)$ (recall the recursion for $P^n_{n+1}$), and $\theta_{n1} = \gamma(1)/P^{n-1}_n \to \theta_1$.
11
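A quick numerical check of these limits (added for illustration), using the hypothetical `innovations` sketch from earlier with the MA(1) autocovariances $\gamma(0) = \sigma^2(1+\theta_1^2)$, $\gamma(1) = \theta_1\sigma^2$, $\gamma(h) = 0$ for $h > 1$; the parameter values are arbitrary.

```python
theta1, sigma2 = 0.6, 1.0

def gamma_ma1(h):
    # Autocovariance of X_t = W_t + theta1 W_{t-1} with Var(W_t) = sigma2.
    if h == 0:
        return sigma2 * (1 + theta1 ** 2)
    if h == 1:
        return theta1 * sigma2
    return 0.0

theta, P = innovations(gamma_ma1, 50)
# Only theta_{n,1} is nonzero; theta_{n,1} -> theta1 and P^n_{n+1} -> Var(W_t):
print(theta[50][1], max(abs(v) for j, v in theta[50].items() if j > 1))
print(P[50])   # close to sigma2
```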
Recall: Forecasting an AR(p)
For the AR(p) process $\{X_t\}$ satisfying
$$X_t = \sum_{i=1}^{p} \phi_i X_{t-i} + W_t,$$
$$X^0_1 = 0, \qquad X^n_{n+1} = \sum_{i=1}^{p} \phi_i X_{n+1-i}$$
for $n \geq p$. Then
$$X_{n+1} = \sum_{i=1}^{p} \phi_i X_{n+1-i} + Z_{n+1},$$
where $Z_{n+1} = X_{n+1} - X^n_{n+1}$.
The Durbin-Levinson algorithm is convenient for AR(p) processes.
The innovations algorithm is convenient for MA(q) processes.
12
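For completeness (not in the original slides), the AR(p) one-step forecast above is a single dot product once $n \geq p$; a minimal sketch:

```python
import numpy as np

def ar_one_step(phi, x):
    """One-step forecast X^n_{n+1} = sum_i phi_i X_{n+1-i} for an AR(p),
    assuming n = len(x) >= p = len(phi)."""
    phi = np.asarray(phi)
    p = len(phi)
    return phi @ np.asarray(x)[-1:-p - 1:-1]   # phi_1 X_n + ... + phi_p X_{n+1-p}
```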
Introduction to Time Series Analysis. Lecture 11.
1. Review: Forecasting.
2. Example: Innovations algorithm for forecasting an MA(1)
3. An aside: Innovations algorithm for forecasting an ARMA(p,q)
4. Linear prediction based on the infinite past
5. The truncated predictor
13
An aside: Forecasting an ARMA(p,q)
There is a related representation for an ARMA(p,q) process, based on the
innovations algorithm. Suppose that $\{X_t\}$ is an ARMA(p,q) process:
$$X_t = \sum_{j=1}^{p} \phi_j X_{t-j} + W_t + \sum_{j=1}^{q} \theta_j W_{t-j}.$$
Consider the transformed process (C. F. Ansley, Biometrika 66: 59–65, 1979)
$$Z_t = \begin{cases} X_t/\sigma & \text{if } t = 1, \ldots, m, \\ \phi(B)X_t/\sigma & \text{if } t > m, \end{cases}$$
where $m = \max(p, q)$.
If p > 0, this is not stationary. However, there is a more general version of
the innovations algorithm, which is applicable to nonstationary processes.
14
An aside: Forecasting an ARMA(p,q)
Let $\theta_{n,j}$ be the coefficients obtained from the application of the innovations
algorithm to this process $Z_t$. This gives the representation
$$X^n_{n+1} = \begin{cases} \displaystyle\sum_{j=1}^{n} \theta_{nj}\left(X_{n+1-j} - X^{n-j}_{n+1-j}\right) & n < m, \\[2ex] \displaystyle\sum_{j=1}^{p} \phi_j X_{n+1-j} + \sum_{j=1}^{q} \theta_{nj}\left(X_{n+1-j} - X^{n-j}_{n+1-j}\right) & n \geq m. \end{cases}$$
For a causal, invertible $\{X_t\}$:
$$E\left(X_n - X^{n-1}_n - W_n\right)^2 \to 0, \qquad \theta_{nj} \to \theta_j, \qquad P^n_{n+1} \to \sigma^2.$$
Notice that this illustrates one way to simulate an ARMA(p,q) process
exactly. Why?
15
Introduction to Time Series Analysis. Lecture 11.
1. Review: Forecasting.
2. Example: Innovations algorithm for forecasting an MA(1)
3. An aside: Innovations algorithm for forecasting an ARMA(p,q)
4. Linear prediction based on the infinite past
5. The truncated predictor
16
Linear prediction based on the infinite past
So far, we have considered linear predictors based on n observed values of
the time series:
$$X^n_{n+m} = P(X_{n+m} \mid X_n, X_{n-1}, \ldots, X_1).$$
What if we have access to all previous values, $X_n, X_{n-1}, X_{n-2}, \ldots$?
Write
$$\tilde{X}_{n+m} = P(X_{n+m} \mid X_n, X_{n-1}, \ldots) = \sum_{i=1}^{\infty} \alpha_i X_{n+1-i}.$$
17
Linear prediction based on the infinite past
$$\tilde{X}_{n+m} = P(X_{n+m} \mid X_n, X_{n-1}, \ldots) = \sum_{i=1}^{\infty} \alpha_i X_{n+1-i}.$$
The orthogonality property of the optimal linear predictor implies
$$E\left[\left(\tilde{X}_{n+m} - X_{n+m}\right)X_{n+1-i}\right] = 0, \qquad i = 1, 2, \ldots$$
Thus, if $\{X_t\}$ is a zero-mean stationary time series, we have
$$\sum_{j=1}^{\infty} \alpha_j \gamma(i-j) = \gamma(m-1+i), \qquad i = 1, 2, \ldots$$
18
Linear prediction based on the infinite past
If $\{X_t\}$ is a causal, invertible, linear process, we can write
$$X_{n+m} = \sum_{j=1}^{\infty} \psi_j W_{n+m-j} + W_{n+m}, \qquad W_{n+m} = \sum_{j=1}^{\infty} \pi_j X_{n+m-j} + X_{n+m}.$$
In this case,
$$\begin{aligned} \tilde{X}_{n+m} &= P(X_{n+m} \mid X_n, X_{n-1}, \ldots) \\ &= P(W_{n+m} \mid X_n, \ldots) - \sum_{j=1}^{\infty} \pi_j P(X_{n+m-j} \mid X_n, \ldots) \\ &= -\sum_{j=1}^{m-1} \pi_j P(X_{n+m-j} \mid X_n, \ldots) - \sum_{j=m}^{\infty} \pi_j X_{n+m-j}. \end{aligned}$$
19
Linear prediction based on the infinite past
$$\tilde{X}_{n+m} = -\sum_{j=1}^{m-1} \pi_j P(X_{n+m-j} \mid X_n, \ldots) - \sum_{j=m}^{\infty} \pi_j X_{n+m-j}.$$
That is,
$$\tilde{X}_{n+1} = -\sum_{j=1}^{\infty} \pi_j X_{n+1-j},$$
$$\tilde{X}_{n+2} = -\pi_1 \tilde{X}_{n+1} - \sum_{j=2}^{\infty} \pi_j X_{n+2-j},$$
$$\tilde{X}_{n+3} = -\pi_1 \tilde{X}_{n+2} - \pi_2 \tilde{X}_{n+1} - \sum_{j=3}^{\infty} \pi_j X_{n+3-j}.$$
The invertible (AR($\infty$)) representation gives the forecasts $\tilde{X}_{n+m}$.
20
Linear prediction based on the infinite past
To compute the mean squared error, we notice that
$$\begin{aligned} \tilde{X}_{n+m} = P(X_{n+m} \mid X_n, X_{n-1}, \ldots) &= \sum_{j=1}^{\infty} \psi_j P(W_{n+m-j} \mid X_n, X_{n-1}, \ldots) + P(W_{n+m} \mid X_n, X_{n-1}, \ldots) \\ &= \sum_{j=m}^{\infty} \psi_j W_{n+m-j}. \end{aligned}$$
$$E\left(X_{n+m} - P(X_{n+m} \mid X_n, X_{n-1}, \ldots)\right)^2 = E\left(\sum_{j=0}^{m-1} \psi_j W_{n+m-j}\right)^2 = \sigma^2_w \sum_{j=0}^{m-1} \psi_j^2.$$
21
Linear prediction based on the infinite past
That is, the mean squared error of the forecast based on the infinite history
is given by the initial terms of the causal (MA($\infty$)) representation:
$$E\left(X_{n+m} - \tilde{X}_{n+m}\right)^2 = \sigma^2_w \sum_{j=0}^{m-1} \psi_j^2.$$
In particular, for $m = 1$, the mean squared error is $\sigma^2_w$.
22
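As a concrete instance (added for illustration), take the MA(1) process from earlier, for which $\psi_0 = 1$, $\psi_1 = \theta_1$, and $\psi_j = 0$ for $j \geq 2$:
$$E\left(X_{n+1} - \tilde{X}_{n+1}\right)^2 = \sigma^2_w, \qquad E\left(X_{n+m} - \tilde{X}_{n+m}\right)^2 = \sigma^2_w\left(1 + \theta_1^2\right) = \gamma(0) \quad \text{for } m \geq 2.$$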
The truncated forecast
For large n, truncating the infinite-past forecasts gives a good
approximation:
$$\tilde{X}_{n+m} = -\sum_{j=1}^{m-1} \pi_j \tilde{X}_{n+m-j} - \sum_{j=m}^{\infty} \pi_j X_{n+m-j},$$
$$\tilde{X}^n_{n+m} = -\sum_{j=1}^{m-1} \pi_j \tilde{X}^n_{n+m-j} - \sum_{j=m}^{n+m-1} \pi_j X_{n+m-j}.$$
The approximation is exact for AR(p) when $n \geq p$, since $\pi_j = 0$ for $j > p$.
In general, it is a good approximation if the $\pi_j$ converge quickly to 0.
23
Example: Forecasting an ARMA(p,q) model
Consider an ARMA(p,q) model:
$$X_t - \sum_{i=1}^{p} \phi_i X_{t-i} = W_t + \sum_{i=1}^{q} \theta_i W_{t-i}.$$
Suppose we have $X_1, X_2, \ldots, X_n$, and we wish to forecast $X_{n+m}$.
We could use the best linear prediction, $X^n_{n+m}$.
For an AR(p) model (that is, $q = 0$), we can write down the coefficients $\phi_n$.
Otherwise, we must solve a linear system of size n.
If n is large, the truncated forecasts $\tilde{X}^n_{n+m}$ give a good approximation. To
compute them, we could compute $\pi_i$ and truncate.
There is also a recursive method, which takes time $O((n+m)(p+q))$...
24
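To make the "compute $\pi_i$ and truncate" route concrete (a sketch not taken from the lecture): for a causal, invertible ARMA(p,q), matching coefficients in $\pi(B)\theta(B) = \phi(B)$ gives $\pi_j = -\phi_j - \sum_{k=1}^{\min(j,q)} \theta_k \pi_{j-k}$, with $\pi_0 = 1$ and $\phi_j = 0$ for $j > p$. The function name `pi_weights` is made up here.

```python
import numpy as np

def pi_weights(phi, theta, n_terms):
    """AR(infinity) coefficients pi_1, ..., pi_{n_terms} of an ARMA(p, q)
    written as X_t - sum_i phi_i X_{t-i} = W_t + sum_i theta_i W_{t-i}."""
    phi, theta = list(phi), list(theta)
    pi = [1.0]                                   # pi_0 = 1
    for j in range(1, n_terms + 1):
        phi_j = phi[j - 1] if j <= len(phi) else 0.0
        s = sum(theta[k - 1] * pi[j - k] for k in range(1, min(j, len(theta)) + 1))
        pi.append(-phi_j - s)
    return np.array(pi[1:])                      # pi_1, ..., pi_{n_terms}
```

The truncated forecast $\tilde{X}^n_{n+m}$ then follows from the formula on the previous slide, with these $\pi_j$ and the infinite sum cut off at $j = n+m-1$.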
Recursive truncated forecasts for an ARMA(p,q) model
$$\tilde{W}^n_t = 0 \quad \text{for } t \leq 0.$$
$$\tilde{X}^n_t = \begin{cases} 0 & \text{for } t \leq 0, \\ X_t & \text{for } 1 \leq t \leq n. \end{cases}$$
$$\tilde{W}^n_t = \tilde{X}^n_t - \phi_1 \tilde{X}^n_{t-1} - \cdots - \phi_p \tilde{X}^n_{t-p} - \theta_1 \tilde{W}^n_{t-1} - \cdots - \theta_q \tilde{W}^n_{t-q} \quad \text{for } t = 1, \ldots, n.$$
$$\tilde{W}^n_t = 0 \quad \text{for } t > n.$$
$$\tilde{X}^n_t = \phi_1 \tilde{X}^n_{t-1} + \cdots + \phi_p \tilde{X}^n_{t-p} + \theta_1 \tilde{W}^n_{t-1} + \cdots + \theta_q \tilde{W}^n_{t-q} \quad \text{for } t = n+1, \ldots, n+m.$$
25
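A direct translation of these recursions into Python (a sketch; the function name `truncated_forecast` and the zero-padding scheme are choices made for this example, not part of the lecture):

```python
import numpy as np

def truncated_forecast(phi, theta, x, m):
    """Recursive truncated forecasts X~^n_{n+1}, ..., X~^n_{n+m} for an ARMA(p, q)
    X_t = sum_i phi_i X_{t-i} + W_t + sum_i theta_i W_{t-i}, given x = (X_1, ..., X_n)."""
    p, q, n = len(phi), len(theta), len(x)
    pad = max(p, q)
    # Position pad + t - 1 holds time t; positions for t <= 0 stay zero.
    X = np.zeros(pad + n + m)
    W = np.zeros(pad + n + m)
    X[pad:pad + n] = x                           # X~^n_t = X_t for 1 <= t <= n
    for t in range(1, n + 1):                    # W~^n_t for t = 1, ..., n
        k = pad + t - 1
        W[k] = (X[k]
                - sum(phi[i] * X[k - 1 - i] for i in range(p))
                - sum(theta[i] * W[k - 1 - i] for i in range(q)))
    for t in range(n + 1, n + m + 1):            # X~^n_t for t = n+1, ..., n+m
        k = pad + t - 1
        X[k] = (sum(phi[i] * X[k - 1 - i] for i in range(p))
                + sum(theta[i] * W[k - 1 - i] for i in range(q)))
        # W~^n_t = 0 for t > n, so no W update is needed here.
    return X[pad + n:pad + n + m]                # the m forecasts
```

For an AR(p) (q = 0) this reproduces the exact best linear forecast once $n \geq p$, as noted earlier.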
Example: Forecasting an AR(2) model
Consider the following AR(2) model.
$$X_t + \frac{1}{1.21}\,X_{t-2} = W_t.$$
The zeros of the characteristic polynomial $z^2 + 1.21$ are at $\pm 1.1i$. We can
solve the linear difference equations $\psi_0 = 1$, $\phi(B)\psi_t = 0$ to compute the
MA($\infty$) representation:
$$\psi_t = \frac{1}{1.1^t}\cos\!\left(\frac{\pi t}{2}\right).$$
Thus, the m-step-ahead estimates have mean squared error
$$E\left(X_{n+m} - \tilde{X}_{n+m}\right)^2 = \sigma^2_w \sum_{j=0}^{m-1} \psi_j^2.$$
26
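The plots on the next slides can be reproduced with a short script. A rough sketch, assuming $\sigma^2_w = 1$ and reusing the hypothetical `truncated_forecast` helper from the previous slide:

```python
import numpy as np

rng = np.random.default_rng(0)
phi = np.array([0.0, -1.0 / 1.21])               # X_t = -X_{t-2}/1.21 + W_t

# MA(infinity) weights via psi_0 = 1, psi_t = phi_1 psi_{t-1} + phi_2 psi_{t-2};
# psi[t] agrees with 1.1**(-t) * cos(pi * t / 2).
psi = np.zeros(31)
psi[0] = 1.0
for t in range(1, 31):
    psi[t] = phi[0] * psi[t - 1] + (phi[1] * psi[t - 2] if t >= 2 else 0.0)

# Simulate a sample path (with a burn-in) and forecast m steps ahead.
n, m, burn = 20, 10, 100
w = rng.standard_normal(n + burn)
x = np.zeros(n + burn)
for t in range(2, n + burn):
    x[t] = phi[0] * x[t - 1] + phi[1] * x[t - 2] + w[t]
x = x[burn:]                                     # X_1, ..., X_n

forecasts = truncated_forecast(phi, [], x, m)
mse = np.cumsum(psi[:m] ** 2)                    # sigma_w^2 * sum_{j<m} psi_j^2
bands = 1.96 * np.sqrt(mse)                      # forecasts +/- bands: 95% intervals
```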
Example: Forecasting an AR(2) model
[Figure: the MA($\infty$) coefficients $\psi_i$ plotted against $i$ for the AR(2) model $X_t + 0.8264\,X_{t-2} = W_t$.]
27
Example: Forecasting an AR(2) model
[Figure: a sample path $X_t$ of the AR(2) model $X_t + 0.8264\,X_{t-2} = W_t$, plotted against $t$.]
28
Example: Forecasting an AR(2) model
[Figure: $X_t$ with one-step predictions and 95% prediction intervals for the AR(2) model $X_t + 0.8264\,X_{t-2} = W_t$.]
29
Example: Forecasting an AR(2) model
[Figure: $X_t$ with predictions and 95% prediction intervals for the AR(2) model $X_t + 0.8264\,X_{t-2} = W_t$.]
30
Introduction to Time Series Analysis. Lecture 11.
1. Review: Forecasting.
2. Example: Innovations algorithm for forecasting an MA(1)
3. Linear prediction based on the infinite past
4. The truncated predictor
31