Time Series Analysis: C5 ARIMA (Box-Jenkins) Models

Part II TIME SERIES ANALYSIS C5 ARIMA (Box-Jenkins) Models
Angel A. Juan & Carles Serrat - UPC 2007/2008
2.5.1: Introduction to ARIMA models
Recall that stationary processes The Autoregressive Integrated Moving Average (ARIMA) vary about a fixed level, and models, or Box-Jenkins methodology, are a class of linear nonstationary processes have no natural constant mean level. models that is capable of representing stationary as well as nonstationary time series. The ACF and PACF associated to the TS
ARIMA models rely heavily on autocorrelation patterns in data both ACF and PACF are used to select an initial model.
are matched with the theoretical autocorrelation pattern associated with a particular ARIMA model.
The Box-Jenkins methodology uses an iterative approach:

1. An initial model is selected, from a general class of ARIMA models, based on an examination of the TS and an examination of its autocorrelations for several time lags The chosen model is then checked against the historical data to see whether it accurately describes the series: the model fits well if the residuals are generally small, randomly distributed, and contain no useful information. If the specified model is not satisfactory, the process is repeated using a new model designed to improve on the original one. Once a satisfactory model is found, it can be used for forecasting.
2.
3.
4.
2.5.2: Autoregressive Models AR(p)

A pth-order autoregressive model, or AR(p), takes the form:
Yt 0 1Yt 1 2Yt 2 ... pYt p t
An AR(p) model is a regression model with lagged values of the dependent variable in the independent variable positions, hence the name autoregressive model.
Yt response variable at time t Yt k observation (predictor variable) at time t k
i regression coefficients to be estimated t error term at time t
Autoregressive models are appropriate for stationary time series, and the coefficient 0 is related to the constant level of the series.
Theoretical behavior of the ACF and PACF for AR(1) and AR(2) models:
AR(1)
ACF 0
AR(2)
ACF 0
PACF = 0 for lag > 1
PACF = 0 for lag > 2
2.5.3: Moving Average Models MA(q)

A qth-order moving average model, or MA(q), takes the form:
Yt t 1 t 1 2 t 2 ... q t q
An MA(q) model is a regression model with the dependent variable, Yt, depending on previous values of the errors rather than on the variable itself.
Yt response variable at time t
constant mean of the process i regression coefficients to be estimated t k error in time period t - k
The term Moving Average is historical and should not be confused with the moving average smoothing procedures.
MA models are appropriate for stationary time series. The weights i do not necessarily sum to 1 and may be positive or negative.
Theoretical behavior of the ACF and PACF for MA(1) and MA(2) models:
MA(1) ACF = 0 for lag > 1; PACF 0
MA(2) ACF = 0 for lag > 2; PACF 0
2.5.4: ARMA(p,q) Models

A model with autoregressive terms can be combined with a model having moving average terms to get an ARMA(p,q) model:
Yt 0 1Yt 1 2Yt 2 ... pYt p t 1 t 1 2 t 2 ... q t q
Note that:
ARMA(p,0) = AR(p) ARMA(0,q) = MA(q)
In practice, the values of p and q each rarely exceed 2.
ARMA(p,q) models can describe a wide variety of behaviors for stationary time series. Theoretical behavior of the ACF and PACF for autoregressivemoving average processes:
ACF AR(p) MA(q) ARMA(p,q)

Die out
PACF
Cut off after the order p of the process Die out
In this context
Cut off after the order q of the process Die out
Die out means tend to zero gradually
Die out
Cut off means disappear or is zero
2.5.5: Building an ARIMA model (1/2)

The first step in model identification is to determine whether the series is stationary. It is useful to look at a plot of the series along with the sample ACF. If the series is not stationary, it can often be converted to a stationary series by differencing: the original series is replaced by a series of differences and an ARMA model is then specified for the differenced series (in effect, the analyst is modeling changes rather than levels) Models for nonstationary series are called Autoregressive Integrated Moving Average models, or ARIMA(p,d,q), where d indicates the amount of differencing. Once a stationary series has been obtained, the analyst must identify the form of the model to be used by comparing the sample ACF and PACF to the theoretical ACF and PACF for the various ARIMA models. Principle of parsimony: all things being equal, simple models are preferred to complex models Once a tentative model has been selected, the parameters for that model are estimated using least squares estimates.
A nonstationary TS is indicated if the series appears to grow or decline over time and the sample ACF fail to die out rapidly.
In some cases, it may be necessary to difference the differences before stationary data are obtained.
Note that: ARIMA(p,0,q) = ARMA(p,q) By counting the number of significant sample autocorrelations and partial autocorrelations, the orders of the AR and MA parts can be determined.
Advice: start with a model containing few rather than many parameters. The need for additional parameters will be evident from an examination of the residual ACF and PACF.
2.5.5: Building an ARIMA model (2/2)

Before using the model for forecasting, it must be checked for adequacy. Basically, a model is adequate if the residuals cannot be used to improve the forecasts, i.e.,
Many of the same residual plots that are useful in regression analysis can be developed for the residuals from an ARIMA model (histogram, normal probability plot, time sequence plot, etc.)
The residuals should be random and normally distributed

The individual residual autocorrelations should be small. Significant residual autocorrelations at low lags or seasonal lags suggest the model is inadequate
After an adequate model has been found, forecasts can be made. Prediction intervals based on the forecasts can also be constructed. As more data become available, it is a good idea to monitor the forecast errors, since the model must need to be reevaluated if:
The magnitudes of the most recent errors tend to be consistently larger than previous errors, or The recent forecast errors tend to be consistently positive or negative
In general, the longer the forecast lead time, the larger the prediction interval (due to greater uncertainty)
Seasonal ARIMA (SARIMA) models contain:

Regular AR and MA terms that account for the correlation at low lags
Seasonal AR and MA terms that account for the correlation at the seasonal lags
In addition, for nonstationary seasonal series, an additional seasonal difference is often required
2.5.6: ARIMA with Minitab Ex. 1 (1/4)
290 280 270 260

Index
File: PORTFOLIO_INVESTMENT.MTW
Stat > Time Series >
Time Series Plot of Index
A consulting corporation wants to try the Box-Jenkins technique for forecasting the Transportation Index of the Dow Jones.
The series show an upward trend.
250 240 230 220 210 1 6 12 18 24 30 36 Index 42 48 54 60

1,0 0,8 0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0
(with 5% significance limits for the autocorrelations)
Autocorrelation Function for Index
The first several autocorrelations are persistently large and trailed off to zero rather slowly a trend exists and this time series is nonstationary (it does not vary about a fixed level)
Idea: to difference the data to see if we could eliminate the trend and create a stationary series.
Autocorrelation
8 9 Lag
10
11
12
13
14
15
16

First order differences.
5 4
Time Series Plot of Diff1
Diff1
A plot of the differenced data appears to vary about a fixed level.
3 2 1 0 -1 -2 -3 -4 1 6 12 18 24 30 36 Index 42 48 54 60
Comparing the autocorrelations with their error limits, the only significant autocorrelation is at lag 1. Similarly, only the lag 1 partial autocorrelation is significant. The PACF appears to cut off after lag 1, indicating AR(1) behavior. The ACF appears to cut off after lag 1, indicating MA(1) behavior we will try: ARIMA(1,1,0) and ARIMA(0,1,1)
A constant term in each model will be included to allow for the fact that the series of differences appears to vary about a level greater than zero.
(with 5% significance limits for the autocorrelations) 1,0 0,8 0,6

Autocorrelation
Autocorrelation Function for Diff1
(with 5% significance limits for the partial autocorrelations) 1,0 0,8

Partial Autocorrelation
Partial Autocorrelation Function for Diff1
0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0
0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 1 2 3 4 5 6 7 8 9 Lag 10 11 12 13 14 15 16
8 9 Lag
10
11
12
13
14
15
16

ARIMA(1,1,0) ARIMA(0,1,1)
The LBQ statistics are not significant as indicated by the large pvalues for either model.

(with 5% significance limits for the autocorrelations) 1,0 0,8 0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 1 2 3 4 5 6 7 8 9 Lag 10 11 12 13 14 15 16
Autocorrelation Function for RESI1
Autocorrelation
(with 5% significance limits for the autocorrelations) 1,0 0,8
Autocorrelation
Finally, there is no significant residual autocorrelation for the ARIMA(1,1,0) model. The results for the ARIMA(0,1,1) are similar.
0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 1 2 3 4 5 6 7 8 9 Lag 10 11 12 13 14 15 16
Therefore, either model is adequate and provide nearly the same one-step-ahead forecasts.
File: READINGS.MTW
Stat > Time Series >
Time Series Plot of Readings
110 100 90 80
Readings
A consulting corporation wants to try the Box-Jenkins technique for forecasting a process.
The time series of readings appears to vary about a fixed level of around 80, and the autocorrelations die out rapidly toward zero the time series seems to be stationary.
70 60 50 40 30 20 1 7 14 21 28 35 42 Index 49 56 63 70
The first sample ACF coefficient is significantly different form zero. The autocorrelation at lag 2 is close to significant and opposite in sign from the lag 1 autocorrelation. The remaining autocorrelations are small. This suggests either an AR(1) model or an MA(2) model. The first PACF coefficient is significantly different from zero, but none of the other partial autocorrelations approaches significance, This suggests an AR(1) or ARIMA(1,0,0)
(with 5% significance limits for the autocorrelations) 1,0 0,8

Partial Autocorrelation
1,0 0,8 0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0
Autocorrelation Function for Readings
Partial Autocorrelation Function for Readings
(with 5% significance limits for the partial autocorrelations)
0,6
Autocorrelation
0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 2 4 6 8 10 Lag 12 14 16 18
10 Lag
12
14
16
18

AR(1) = ARIMA(1,0,0) A constant term is included in both models to allow for the fact that the readings vary about a level other than zero.
MA(2) = ARIMA(0,0,2)
Both models appear to fit the data well. The estimated coefficients are significantly different from zero and the mean square (MS) errors are similar.
Lets take a look at the residuals ACF

(with 5% significance limits for the autocorrelations) 1,0 0,8 0,6
Autocorrelation
0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 2 4 6 8 10 Lag 12 14 16 18
(with 5% significance limits for the autocorrelations) 1,0
Autocorrelation
Finally, there is no significant residual autocorrelation for the ARIMA(1,0,0) model. The results for the ARIMA(0,0,2) are similar.
0,8 0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 2 4 6 8 10 Lag 12 14 16 18
Therefore, either model is adequate and provide nearly the same threestep-ahead forecasts. Since the AR(1) model has two parameters (including the constant term) and the MA(2) model has three parameters, applying the principle of parsimony we would use the simpler AR(1) model to forecast future readings.

Time Series Analysis: C5 ARIMA (Box-Jenkins) Models

Caricato da

Informazioni sul documento

Descrizione originale:

Titolo originale

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Time Series Analysis: C5 ARIMA (Box-Jenkins) Models

Caricato da

Copyright:

Formati disponibili

Part II TIME SERIES ANALYSIS C5 ARIMA (Box-Jenkins) Models

Angel A. Juan & Carles Serrat - UPC 2007/2008

2.5.1: Introduction to ARIMA models

The Box-Jenkins methodology uses an iterative approach:

2.5.2: Autoregressive Models AR(p)

Yt response variable at time t Yt k observation (predictor variable) at time t k

i regression coefficients to be estimated t error term at time t

PACF = 0 for lag > 1

PACF = 0 for lag > 2

2.5.3: Moving Average Models MA(q)

Yt response variable at time t

MA(1) ACF = 0 for lag > 1; PACF 0

MA(2) ACF = 0 for lag > 2; PACF 0

2.5.4: ARMA(p,q) Models

In practice, the values of p and q each rarely exceed 2.

ACF AR(p) MA(q) ARMA(p,q)

Cut off after the order q of the process Die out

Die out means tend to zero gradually

Cut off means disappear or is zero

2.5.5: Building an ARIMA model (1/2)

2.5.5: Building an ARIMA model (2/2)

The residuals should be random and normally distributed

Seasonal ARIMA (SARIMA) models contain:

2.5.6: ARIMA with Minitab Ex. 1 (1/4)

290 280 270 260

The series show an upward trend.

250 240 230 220 210 1 6 12 18 24 30 36 Index 42 48 54 60

(with 5% significance limits for the autocorrelations)

Autocorrelation Function for Index

2.5.6: ARIMA with Minitab Ex. 1 (2/4)

Time Series Plot of Diff1

A plot of the differenced data appears to vary about a fixed level.

(with 5% significance limits for the autocorrelations) 1,0 0,8 0,6

Autocorrelation Function for Diff1

(with 5% significance limits for the partial autocorrelations) 1,0 0,8

Partial Autocorrelation Function for Diff1

0,6 0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0

0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 1 2 3 4 5 6 7 8 9 Lag 10 11 12 13 14 15 16

2.5.6: ARIMA with Minitab Ex. 1 (3/4)

2.5.6: ARIMA with Minitab Ex. 1 (4/4)

Autocorrelation Function for RESI1

(with 5% significance limits for the autocorrelations) 1,0 0,8

Autocorrelation Function for RESI2

2.5.7: ARIMA with Minitab Ex. 2 (1/3)

(with 5% significance limits for the autocorrelations) 1,0 0,8

Autocorrelation Function for Readings

Partial Autocorrelation Function for Readings

(with 5% significance limits for the partial autocorrelations)

0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 2 4 6 8 10 Lag 12 14 16 18

2.5.7: ARIMA with Minitab Ex. 2 (2/3)

Lets take a look at the residuals ACF

2.5.7: ARIMA with Minitab Ex. 2 (3/3)

Autocorrelation Function for RESI1

0,4 0,2 0,0 -0,2 -0,4 -0,6 -0,8 -1,0 2 4 6 8 10 Lag 12 14 16 18

(with 5% significance limits for the autocorrelations) 1,0

Autocorrelation Function for RESI2

Potrebbero piacerti anche