Statistical Methods: Autocorrelation
Dr. Orlaith Burke
Contents
1 Model Identification
  1.1 The ACF
    1.1.1 Using the ACF for Model Identification
  1.2 The PACF
    1.2.1 AR(1) Process
    1.2.2 AR(p) Process
    1.2.3 MA(1) Process
    1.2.4 MA(q) Process
  1.3 The Sample PACF

2 Estimating the Model Parameters
  2.1 Estimating the Mean
    2.1.1 Estimating Drift
    2.1.2 Estimating Trend
  2.2 Method of Moments
    2.2.1 AR Models
    2.2.2 MA Models
    2.2.3 ARMA(1,1) Model
  2.3 Alternative Estimation Procedures
    2.3.1 Least Squares Estimation
    2.3.2 Maximum Likelihood Estimation
    2.3.3 Unconditional Least Squares
1 Model Identification
We have been introduced to ARIMA models and have examined in detail the
behaviour of various models. But all of our investigations so far have looked at
theoretical models for time series. Our next task is to consider how we might fit
these models to real time series data.
To begin, in this chapter we examine how to establish appropriate values for p, d and q when fitting an ARIMA(p, d, q) model to a set of data.
1.1 The ACF
The first technique we use to determine values for p, d and q is to compute the
sample ACF from the data and compare this to known properties of the ACF for
ARIMA models.
The sample autocorrelation function is defined as
$$r_k = \frac{\sum_{t=1}^{n-k} (Y_t - \bar{Y})(Y_{t+k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}. \tag{1.1}$$
The sample autocorrelation $r_k$ provides us with an estimate of the true autocorrelation $\rho_k$. In order to make proper statistical determinations about $\rho_k$ we need to know the sampling distribution of $r_k$. It can be shown that, for large $n$, the vector

$$\sqrt{n}\,\bigl(r_1 - \rho_1,\; r_2 - \rho_2,\; \dots,\; r_m - \rho_m\bigr) \tag{1.2}$$

approaches a joint normal distribution with zero means and with variance-covariance matrix $(c_{ij})$, where

$$c_{ij} = \sum_{k=-\infty}^{\infty} \bigl( \rho_{k+i}\rho_{k+j} + \rho_{k-i}\rho_{k+j} - 2\rho_i\rho_k\rho_{k+j} - 2\rho_j\rho_k\rho_{k+i} + 2\rho_i\rho_j\rho_k^2 \bigr). \tag{1.3}$$
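To make (1.3) concrete: for a white noise series every $\rho_k$ with $k \neq 0$ is zero, so $c_{kk} = 1$ and $\mathrm{Var}(r_k) \approx 1/n$. The following Monte Carlo check is a minimal sketch in Python with numpy; the helper name `sample_acf` is ours.

```python
import numpy as np

def sample_acf(y, k):
    """Sample autocorrelation r_k as defined in equation (1.1)."""
    d = y - y.mean()
    return np.sum(d[:-k] * d[k:]) / np.sum(d**2)

rng = np.random.default_rng(42)
n, reps = 200, 2000

# For white noise, (1.3) gives c_kk = 1, so Var(r_k) should be about 1/n.
r1 = np.array([sample_acf(rng.standard_normal(n), 1) for _ in range(reps)])
print(f"empirical Var(r_1) = {r1.var():.5f}; theory 1/n = {1/n:.5f}")
```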
[Figure: simulated AR(1) series $Y_t = 0.9\,Y_{t-1} + e_t$ of length 1000, with its sample ACF decaying slowly towards zero.]
[Figure: simulated AR(1) series $Y_t = 0.4\,Y_{t-1} + e_t$ of length 1000, with its sample ACF decaying rapidly towards zero.]
[Figure: simulated AR(1) series $Y_t = -0.7\,Y_{t-1} + e_t$ of length 1000, with its sample ACF alternating in sign as it decays.]
[Figure: simulated random walk $Y_t = Y_{t-1} + e_t$ of length 1000, with its sample ACF staying close to 1 and failing to decay.]
[Figure: simulated MA(1) series $Y_t = e_t - e_{t-1}$ of length 1000, with its sample ACF showing a single large negative spike at lag 1.]
[Figure: simulated MA(1) series $Y_t = e_t - 0.4\,e_{t-1}$ of length 1000, with its sample ACF showing a single non-zero spike at lag 1.]
1.1.1 Using the ACF for Model Identification
We can now begin to see how the ACF can be used to determine which model
best fits a set of data.
Firstly, we can see from the theoretical ACF and from the previous graphs that the ACF is very useful for identifying MA processes. This is because the first $q$ terms in the ACF of an MA(q) process are non-zero and the remaining terms are all zero.

Secondly, we have seen that the ACF of a non-stationary series behaves quite differently from that of a stationary series: its terms do not decay to zero as they do in the stationary case (cf. the plots above).

We have also seen that the ACF of a stationary AR process decays exponentially to zero.
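To see this identification logic on simulated data, here is a minimal sketch in Python (numpy and statsmodels assumed available; `statsmodels.tsa.stattools.acf` implements the sample ACF of equation (1.1)). The AR(1) autocorrelations should tail off gradually while the MA(1) autocorrelations should cut off after lag 1:

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(1)
n = 1000
e = rng.standard_normal(n + 1)

# AR(1): Y_t = 0.9 Y_{t-1} + e_t -- its ACF should decay exponentially.
ar = np.zeros(n)
for t in range(1, n):
    ar[t] = 0.9 * ar[t - 1] + e[t]

# MA(1): Y_t = e_t - 0.9 e_{t-1} -- its ACF should cut off after lag 1.
ma = e[1:] - 0.9 * e[:-1]

print("AR(1) sample ACF:", np.round(acf(ar, nlags=5), 2))
print("MA(1) sample ACF:", np.round(acf(ma, nlags=5), 2))
```

Sampling variability means the MA(1) autocorrelations beyond lag 1 will be small rather than exactly zero, which is why the sampling distribution of $r_k$ above matters.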
1.2 The PACF
In the previous section we saw how the ACF can be used to identify MA processes, clearly indicating the order $q$ of the process by the number of non-zero terms in the ACF. It would be very useful if we could similarly identify the order of an AR process. There is another correlation function which allows us to do precisely that: the Partial Autocorrelation Function, or PACF.

The PACF computes the correlation between two variables $y_t$ and $y_{t-k}$ after removing the effect of all intervening variables $y_{t-1}, y_{t-2}, \dots, y_{t-k+1}$. We can think of the PACF as a conditional correlation:
$$\phi_{kk} = \mathrm{Corr}(y_t,\, y_{t-k} \mid y_{t-1}, y_{t-2}, \dots, y_{t-k+1}). \tag{1.4}$$
Another way to think of the PACF is to consider regressing $y_t$ on the intervening variables $y_{t-1}, y_{t-2}, \dots, y_{t-k+1}$, giving the following least squares estimate:

$$\hat{y}_t = \beta_1 y_{t-1} + \beta_2 y_{t-2} + \dots + \beta_{k-1} y_{t-k+1}, \tag{1.5}$$

with prediction error

$$y_t - \hat{y}_t. \tag{1.6}$$

Consider also regressing $y_{t-k}$ on $y_{t-1}, y_{t-2}, \dots, y_{t-k+1}$. It can be shown (though we will not prove this result) that the least squares estimate of $y_{t-k}$ is

$$\hat{y}_{t-k} = \beta_1 y_{t-k+1} + \beta_2 y_{t-k+2} + \dots + \beta_{k-1} y_{t-1}, \tag{1.7}$$

with prediction error

$$y_{t-k} - \hat{y}_{t-k}. \tag{1.8}$$

The lag-$k$ partial autocorrelation is then the correlation between these two prediction errors:

$$\phi_{kk} = \mathrm{Corr}(y_t - \hat{y}_t,\; y_{t-k} - \hat{y}_{t-k}). \tag{1.9}$$

For $k = 1$ there are no intervening variables, so $\phi_{11} = \rho_1$. For $k = 2$ this definition gives

$$\phi_{22} = \mathrm{Corr}(y_t - \rho_1 y_{t-1},\; y_{t-2} - \rho_1 y_{t-1}), \tag{1.10}$$

which simplifies to

$$\phi_{22} = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2}. \tag{1.11}$$
1.2.1 AR(1) Process

For an AR(1) process we know that $\rho_k = \phi^k$, so

$$\phi_{11} = \rho_1 = \phi \tag{1.12}$$

and hence

$$\phi_{22} = \frac{\rho_2 - \rho_1^2}{1 - \rho_1^2} = \frac{\phi^2 - \phi^2}{1 - \phi^2} = 0. \tag{1.13}$$

The same argument shows that $\phi_{kk} = 0$ for all $k > 1$.
1.2.2 AR(p) Process

More generally, the PACF of an AR(p) process cuts off after lag $p$:

$$\phi_{kk} = 0 \quad \text{for } k > p. \tag{1.14}$$

The PACF therefore identifies the order of an AR process in just the way the ACF identifies the order of an MA process.
1.2.3 MA(1) Process

For an MA(1) process $\phi_{11} = \rho_1 = -\theta/(1 + \theta^2)$, and it can be shown that

$$\phi_{22} = \frac{-\theta^2}{1 + \theta^2 + \theta^4} \tag{1.15}$$

and, in general,

$$\phi_{kk} = -\frac{\theta^k (1 - \theta^2)}{1 - \theta^{2(k+1)}} \quad \text{for } k \geq 1. \tag{1.16}$$

It is clear from these equations that the PACF for an MA(1) process decays exponentially to zero.
1.2.4 MA(q) Process

For a general MA(q) process we can use the following Yule-Walker equations to solve for the PACF $\phi_{kk}$:

$$\rho_j = \phi_{k1} \rho_{j-1} + \phi_{k2} \rho_{j-2} + \dots + \phi_{kk} \rho_{j-k}, \quad j = 1, 2, \dots, k. \tag{1.17}$$

Note that here the autocorrelations $\rho_j$ are known and we are solving for the coefficients $\phi_{k1}, \phi_{k2}, \dots, \phi_{kk}$, of which only $\phi_{kk}$ is of interest.

We should note that these Yule-Walker equations are slightly different from those we have seen before: they allow us to solve for $\phi_{kk}$ for any stationary process, not just an AR(p) process. If, however, we are dealing with an AR(p) process then these equations match exactly those that we have already seen, with $\phi_{pp} = \phi_p$ and $\phi_{kk} = 0$ for $k > p$.
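Equation (1.17) translates directly into a small PACF calculator: for each $k$, build the $k \times k$ matrix of autocorrelations, solve the linear system for $\phi_{k1}, \dots, \phi_{kk}$, and keep only $\phi_{kk}$. A minimal numpy sketch (the function name `pacf_from_acf` is ours):

```python
import numpy as np

def pacf_from_acf(rho):
    """Solve the Yule-Walker system (1.17) for phi_kk, k = 1..len(rho).

    rho[j-1] holds the autocorrelation rho_j (true or sample)."""
    out = []
    for k in range(1, len(rho) + 1):
        # R has entries rho_{|i-j|}, with rho_0 = 1 on the diagonal.
        R = np.array([[1.0 if i == j else rho[abs(i - j) - 1]
                       for j in range(k)] for i in range(k)])
        out.append(np.linalg.solve(R, rho[:k])[-1])  # keep only phi_kk
    return np.array(out)

# AR(1) with phi = 0.6: rho_k = 0.6**k, so the PACF should be (0.6, 0, 0, ...).
print(np.round(pacf_from_acf(0.6 ** np.arange(1, 6)), 4))
```

Feeding in sample autocorrelations instead of true ones gives the sample PACF discussed in the next section.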
1.3 The Sample PACF
When presented with actual data we can use equation (1.17) to compute the sample PACF, replacing the true autocorrelations ($\rho$) by the sample autocorrelations ($r$). We will not go into any more detail here; as with the sample ACF, the behaviour of the sample PACF will be similar to that of the true PACF. We should not, however, expect complete agreement between the two, and we will therefore not be surprised to see some ripples or trends in the sample PACF which would not be present in the true PACF.
[Figure: sample PACFs of two simulated AR(1) series $Y_t = \phi Y_{t-1} + e_t$: a single significant spike at lag 1, with the remaining partial autocorrelations near zero.]
[Figure: sample PACF of a simulated MA(1) series: the partial autocorrelations decay gradually towards zero.]
2 Estimating the Model Parameters

In the last chapter we examined how to determine values for p, d and q, and hence how to identify an appropriate ARIMA(p, d, q) model for a set of data. In this chapter we consider how to estimate the parameters of the chosen model.
2.1 Estimating the Mean

When we first encountered the Box-Jenkins ARIMA model, the model contained a mean term $\mu$. Throughout this course so far we have set $\mu = 0$. Let us now allow the possibility that $\mu \neq 0$ and consider how we might estimate this non-zero mean.
Write the observed series as

$$X_t = \mu + Y_t \tag{2.1}$$

where $E(Y_t) = 0$, and we wish to estimate $\mu$ using the observed sample time series $x_1, x_2, \dots, x_n$.
2.1.1 Estimating Drift

Suppose that $\mu$ is a constant. Then the obvious way to estimate $\mu$ is to use the sample mean:

$$\bar{x} = \frac{1}{n} \sum_{t=1}^{n} x_t. \tag{2.2}$$
The precision of this estimate depends on the autocorrelation structure of the series. It can be shown that

$$\mathrm{Var}(\bar{x}) = \frac{\gamma_0}{n} \sum_{k=-n+1}^{n-1} \left(1 - \frac{|k|}{n}\right) \rho_k \tag{2.3}$$

$$= \frac{\gamma_0}{n} \left[ 1 + 2 \sum_{k=1}^{n-1} \left(1 - \frac{k}{n}\right) \rho_k \right]. \tag{2.4}$$
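As a quick numerical check of (2.4), the sketch below (Python with numpy; all names are ours) compares the formula with a Monte Carlo estimate of $\mathrm{Var}(\bar{x})$ for a stationary AR(1) series with $\phi = 0.7$, for which $\gamma_0 = 1/(1 - \phi^2)$ and $\rho_k = \phi^k$:

```python
import numpy as np

rng = np.random.default_rng(7)
phi, n, reps = 0.7, 100, 4000

gamma0 = 1.0 / (1.0 - phi**2)        # AR(1): gamma_0 = 1/(1 - phi^2)
k = np.arange(1, n)
var_formula = gamma0 / n * (1 + 2 * np.sum((1 - k / n) * phi**k))

# Monte Carlo: spread of the sample mean over many simulated series.
means = []
for _ in range(reps):
    y = np.empty(n)
    y[0] = np.sqrt(gamma0) * rng.standard_normal()   # stationary start
    for t in range(1, n):
        y[t] = phi * y[t - 1] + rng.standard_normal()
    means.append(y.mean())

print(f"formula (2.4): {var_formula:.4f}; Monte Carlo: {np.var(means):.4f}")
```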
For example, if the deviations $Y_t$ follow a random walk, so that $\mathrm{Cov}(Y_t, Y_s) = \min(t, s)\,\sigma^2$, then

$$\mathrm{Var}(\bar{x}) = \frac{1}{n^2} \left[ \sum_{t=1}^{n} t\,\sigma^2 + 2 \sum_{s=2}^{n} \sum_{t=1}^{s-1} t\,\sigma^2 \right] \tag{2.5}$$

$$= \sigma^2 (2n+1)\,\frac{n+1}{6n}. \tag{2.6}$$

Here the variance grows with the sample size $n$, so for a non-stationary series the sample mean is a very poor estimator of $\mu$.

2.1.2 Estimating Trend
Suppose now that the mean is no longer a constant but instead takes the form

$$\mu_t = \beta_0 + \beta_1 t. \tag{2.7}$$

We want to estimate the two parameters $\beta_0$ and $\beta_1$. One method is to use the ordinary least squares technique from regression. Here we minimise

$$Q(\beta_0, \beta_1) = \sum_{t=1}^{n} (x_t - \beta_0 - \beta_1 t)^2. \tag{2.8}$$
Minimising $Q$ gives the usual least squares estimators

$$\hat{\beta}_1 = \frac{\sum_{t=1}^{n} (x_t - \bar{x})(t - \bar{t})}{\sum_{t=1}^{n} (t - \bar{t})^2} \tag{2.9}$$

$$\hat{\beta}_0 = \bar{x} - \hat{\beta}_1\,\bar{t} \tag{2.10}$$

where

$$\bar{t} = \frac{1}{n} \sum_{t=1}^{n} t = \frac{n+1}{2}. \tag{2.11}$$
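To illustrate (2.8)-(2.10), the sketch below fits a linear trend to a simulated series with AR(1) noise; `np.polyfit` performs the same minimisation and is used as a cross-check (all other names are ours):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
t = np.arange(1, n + 1)

# Series with trend mu_t = 2 + 0.05 t plus AR(1) noise.
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.5 * noise[i - 1] + rng.standard_normal()
x = 2.0 + 0.05 * t + noise

# Closed-form estimators (2.9) and (2.10).
tbar, xbar = t.mean(), x.mean()
b1 = np.sum((x - xbar) * (t - tbar)) / np.sum((t - tbar) ** 2)
b0 = xbar - b1 * tbar

print(f"beta0_hat = {b0:.3f}, beta1_hat = {b1:.4f}")
print("polyfit agrees:", np.round(np.polyfit(t, x, 1)[::-1], 4))
```

Note that ordinary least squares ignores the autocorrelation in the noise; the point estimates are still sensible here, but their standard errors would need care.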
2.2 Method of Moments

The first estimation technique we consider is the method of moments: we equate theoretical moments to their sample counterparts and solve the resulting equations for the model parameters.
2.2.1 AR Models

Consider first the AR(1) model, for which $\rho_1 = \phi$. Replacing $\rho_1$ by $r_1$ gives the method of moments estimator

$$\hat{\phi} = r_1. \tag{2.12}$$

For the AR(2) model the Yule-Walker equations are

$$\rho_1 = \phi_1 + \rho_1 \phi_2 \tag{2.13}$$

$$\rho_2 = \rho_1 \phi_1 + \phi_2. \tag{2.14}$$

Replacing the true autocorrelations by their sample counterparts gives

$$r_1 = \phi_1 + r_1 \phi_2 \tag{2.15}$$

$$r_2 = r_1 \phi_1 + \phi_2, \tag{2.16}$$

and solving this pair of equations yields

$$\hat{\phi}_1 = \frac{r_1 (1 - r_2)}{1 - r_1^2} \tag{2.17}$$

$$\hat{\phi}_2 = \frac{r_2 - r_1^2}{1 - r_1^2}. \tag{2.18}$$

The exact same technique gives us the method of moments estimators of the $p$ parameters in an AR(p) model: we replace each of the $p$ autocorrelations $\rho_k$ by the corresponding sample autocorrelation $r_k$ in the Yule-Walker equations and then solve the resulting system of $p$ equations for $\hat{\phi}_1, \dots, \hat{\phi}_p$.
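A minimal sketch of the AR(2) case, equations (2.15)-(2.18), in Python (numpy and statsmodels assumed; `acf` supplies $r_1$ and $r_2$):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

rng = np.random.default_rng(11)
n = 2000
y = np.zeros(n)
for t in range(2, n):        # simulate AR(2) with phi1 = 0.5, phi2 = 0.3
    y[t] = 0.5 * y[t - 1] + 0.3 * y[t - 2] + rng.standard_normal()

r1, r2 = acf(y, nlags=2)[1:]
phi1_hat = r1 * (1 - r2) / (1 - r1**2)   # equation (2.17)
phi2_hat = (r2 - r1**2) / (1 - r1**2)    # equation (2.18)
print(f"phi1_hat = {phi1_hat:.3f}, phi2_hat = {phi2_hat:.3f}")
```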
2.2.2 MA Models

For the MA(1) model the method of moments begins from the relation

$$\rho_1 = \frac{-\theta_1}{1 + \theta_1^2}. \tag{2.19}$$

Now we replace $\rho_1$ by $r_1$ and then we solve this equation, which is quadratic in $\theta_1$, to get two roots:

$$\hat{\theta}_1 = \frac{-1 + \sqrt{1 - 4 r_1^2}}{2 r_1} \tag{2.20}$$

$$\hat{\theta}_1 = \frac{-1 - \sqrt{1 - 4 r_1^2}}{2 r_1}. \tag{2.21}$$

The roots are real only when $|r_1| \leq 0.5$, and only one of them corresponds to an invertible model.
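Since the two roots of the quadratic multiply to 1, the invertible solution is always the one smaller in magnitude, which makes the selection mechanical. A sketch (the function name `ma1_method_of_moments` is ours):

```python
import numpy as np

def ma1_method_of_moments(r1):
    """Solve r1 = -theta/(1 + theta^2); return the invertible root.

    Assumes r1 != 0; real roots require |r1| <= 0.5."""
    if abs(r1) > 0.5:
        raise ValueError("no real solution: |r1| > 0.5")
    disc = np.sqrt(1 - 4 * r1**2)
    roots = [(-1 + disc) / (2 * r1), (-1 - disc) / (2 * r1)]
    return min(roots, key=abs)   # the invertible root has |theta| <= 1

print(ma1_method_of_moments(-0.4))   # theta_hat = 0.5 when r1 = -0.4
```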
2.2.3 ARMA(1,1) Model

For the ARMA(1,1) model the autocorrelation function is

$$\rho_k = \frac{(1 - \theta_1 \phi_1)(\phi_1 - \theta_1)}{1 - 2 \theta_1 \phi_1 + \theta_1^2}\, \phi_1^{k-1}. \tag{2.22}$$

Note that the ratio of consecutive autocorrelations isolates $\phi_1$:

$$\frac{\rho_2}{\rho_1} = \phi_1, \tag{2.23}$$

so replacing the true autocorrelations by their sample versions gives

$$\hat{\phi}_1 = \frac{r_2}{r_1}. \tag{2.24}$$

Substituting $\hat{\phi}_1$ into the expression for $\rho_1$ then gives

$$r_1 = \frac{(1 - \theta_1 \hat{\phi}_1)(\hat{\phi}_1 - \theta_1)}{1 - 2 \theta_1 \hat{\phi}_1 + \theta_1^2}. \tag{2.25}$$

We can now solve this quadratic equation for $\theta_1$, remembering that of the two solutions thus obtained we only keep the invertible solution.
2.3 Alternative Estimation Procedures

We have just seen that the method of moments can fail to estimate the parameters in a Moving Average model. We now note that there are several alternative estimation procedures that can be used to estimate the parameters in an ARIMA model. The first is based on the principle of least squares, the second uses maximum likelihood methods, and the third, unconditional least squares, is a compromise between these two. We will not go into the detail of any of these estimation procedures here, but in the following we provide a brief introduction to the basic ideas.
2.3.1 Least Squares Estimation

The AR(1) model

$$Y_t - \mu = \phi (Y_{t-1} - \mu) + e_t \tag{2.26}$$

can be viewed as a regression with predictor variable $Y_{t-1}$ and response variable $Y_t$. The least squares estimators of the parameters are then obtained by minimising the conditional sum of squares

$$S_*(\phi, \mu) = \sum_{t=2}^{n} \bigl[ (Y_t - \mu) - \phi (Y_{t-1} - \mu) \bigr]^2. \tag{2.27}$$

Setting $\partial S_*/\partial \mu = 0$ and solving gives

$$\hat{\mu} = \frac{\sum_{t=2}^{n} Y_t - \phi \sum_{t=2}^{n} Y_{t-1}}{(n-1)(1 - \phi)}. \tag{2.28}$$

For large $n$ both sums in the numerator are approximately $(n-1)\bar{Y}$, so

$$\hat{\mu} \approx \frac{\bar{Y} - \phi \bar{Y}}{1 - \phi} = \bar{Y}. \tag{2.29}$$

And substituting $\bar{Y}$ for $\mu$ we find that when dealing with large samples the estimator for $\phi$ is given by

$$\hat{\phi} = \frac{\sum_{t=2}^{n} (Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=2}^{n} (Y_{t-1} - \bar{Y})^2}, \tag{2.30}$$

which is approximately the sample autocorrelation $r_1$.

In the AR(p) model we again find that for large sample sizes

$$\hat{\mu} = \bar{Y}. \tag{2.31}$$
For example, in the AR(2) model the large-sample least squares procedure again leads to the equations

$$r_1 = \phi_1 + r_1 \phi_2 \tag{2.32}$$

$$r_2 = r_1 \phi_1 + \phi_2 \tag{2.33}$$

which we solve to obtain $\hat{\phi}_1$ and $\hat{\phi}_2$, exactly as in the method of moments.
Suppose now we are trying to estimate the parameter in the MA(1) model:

$$Y_t = e_t - \theta_1 e_{t-1}. \tag{2.34}$$

Provided the model is invertible we can rewrite this as

$$e_t = Y_t + \theta_1 e_{t-1}, \tag{2.35}$$

so that, conditional on the starting value $e_0 = 0$, the errors $e_t$ can be computed recursively from the data and the conditional sum of squares

$$S_*(\theta_1) = \sum_{t=1}^{n} e_t^2 \tag{2.36}$$

can be minimised numerically with respect to $\theta_1$.
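A sketch of this recursion in Python, with a crude grid search standing in for a proper optimiser (all names are ours):

```python
import numpy as np

def css_ma1(y, theta):
    """Conditional sum of squares (2.36): e_t = y_t + theta*e_{t-1}, e_0 = 0."""
    e, s = 0.0, 0.0
    for yt in y:
        e = yt + theta * e
        s += e * e
    return s

rng = np.random.default_rng(5)
eps = rng.standard_normal(1001)
y = eps[1:] - 0.6 * eps[:-1]         # MA(1) with theta_1 = 0.6

grid = np.linspace(-0.99, 0.99, 399)
theta_hat = grid[np.argmin([css_ma1(y, th) for th in grid])]
print(f"theta_hat = {theta_hat:.3f}")   # should be close to 0.6
```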
2.3.2 Maximum Likelihood Estimation

Returning to the AR(1) model

$$Y_t - \mu = \phi_1 (Y_{t-1} - \mu) + e_t, \tag{2.37}$$

if we assume that the error terms $e_t$ are iid normal white noise then we can show that the likelihood function is given by

$$L(\phi_1, \mu, \sigma^2) = (2 \pi \sigma^2)^{-n/2} (1 - \phi_1^2)^{1/2} \exp\left[ -\frac{S(\phi_1, \mu)}{2 \sigma^2} \right] \tag{2.38}$$
where

$$S(\phi_1, \mu) = \sum_{t=2}^{n} \bigl[ (Y_t - \mu) - \phi_1 (Y_{t-1} - \mu) \bigr]^2 + (1 - \phi_1^2)(Y_1 - \mu)^2. \tag{2.39}$$

The term $S(\phi_1, \mu)$ is called the unconditional sum of squares function and should be compared with the previous conditional sum of squares. Clearly they are related as follows:

$$S(\phi_1, \mu) = S_*(\phi_1, \mu) + (1 - \phi_1^2)(Y_1 - \mu)^2. \tag{2.40}$$
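To make these formulas concrete, the sketch below evaluates the log of the likelihood (2.38) over a grid of $\phi_1$ values, with $\mu$ fixed at $\bar{Y}$ and $\sigma^2$ fixed at its true value of 1 purely for simplicity; a full implementation would estimate $\sigma^2$ as well (all names are ours):

```python
import numpy as np

def uncond_ss(y, phi, mu):
    """Unconditional sum of squares, via the relation (2.40)."""
    d = y - mu
    s_star = np.sum((d[1:] - phi * d[:-1]) ** 2)   # conditional SS (2.27)
    return s_star + (1 - phi**2) * d[0] ** 2

def ar1_loglik(y, phi, mu, sigma2=1.0):
    """Log of the AR(1) likelihood (2.38)."""
    n = len(y)
    return (-0.5 * n * np.log(2 * np.pi * sigma2)
            + 0.5 * np.log(1 - phi**2)
            - uncond_ss(y, phi, mu) / (2 * sigma2))

rng = np.random.default_rng(9)
n, phi_true = 500, 0.6
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.standard_normal()

grid = np.linspace(-0.95, 0.95, 381)
phi_hat = grid[np.argmax([ar1_loglik(y, p, y.mean()) for p in grid])]
print(f"phi_hat = {phi_hat:.3f}")   # should be close to 0.6
```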
2.3.3 Unconditional Least Squares

In previous sections we have seen that when estimating parameters by least squares we minimise the conditional sum of squares $S_*(\phi_1, \mu)$. We have also just seen that the maximum likelihood estimators are obtained by maximising $L(\phi_1, \mu, \sigma^2)$, and that $L(\phi_1, \mu, \sigma^2)$ is a function of the unconditional sum of squares $S(\phi_1, \mu)$. An alternative estimation procedure is, instead of maximising the full likelihood function $L(\phi_1, \mu, \sigma^2)$, simply to minimise the unconditional sum of squares $S(\phi_1, \mu)$. This procedure is then a compromise between the conditional least squares estimates and the full maximum likelihood procedure.
We will finish our discussion here and will not investigate any of these estimation
procedures further.