
SPSS Trends 13.0

For more information about SPSS software products, please visit our Web site at http://www.spss.com or contact

SPSS Inc.
233 South Wacker Drive, 11th Floor
Chicago, IL 60606-6412
Tel: (312) 651-3000
Fax: (312) 651-3668

SPSS is a registered trademark and the other product names are the trademarks of SPSS Inc. for its proprietary computer software. No material describing such software may be produced or distributed without the written permission of the owners of the trademark and license rights in the software and the copyrights in the published materials.

The SOFTWARE and documentation are provided with RESTRICTED RIGHTS. Use, duplication, or disclosure by the Government is subject to restrictions as set forth in subdivision (c)(1)(ii) of The Rights in Technical Data and Computer Software clause at 52.227-7013. Contractor/manufacturer is SPSS Inc., 233 South Wacker Drive, 11th Floor, Chicago, IL 60606-6412.

General notice: Other product names mentioned herein are used for identification purposes only and may be trademarks of their respective companies.

TableLook is a trademark of SPSS Inc.
Windows is a registered trademark of Microsoft Corporation.
DataDirect, DataDirect Connect, INTERSOLV, and SequeLink are registered trademarks of DataDirect Technologies.
Portions of this product were created using LEADTOOLS © 1991-2000, LEAD Technologies, Inc. ALL RIGHTS RESERVED. LEAD, LEADTOOLS, and LEADVIEW are registered trademarks of LEAD Technologies, Inc.
Sax Basic is a trademark of Sax Software Corporation. Copyright © 1993-2004 by Polar Engineering and Consulting. All rights reserved.
Portions of this product were based on the work of the FreeType Team (http://www.freetype.org).
A portion of the SPSS software contains zlib technology. Copyright © 1995-2002 by Jean-loup Gailly and Mark Adler. The zlib software is provided "as is," without express or implied warranty.
A portion of the SPSS software contains Sun Java Runtime libraries. Copyright © 2003 by Sun Microsystems, Inc. All rights reserved. The Sun Java Runtime libraries include code licensed from RSA Security, Inc. Some portions of the libraries are licensed from IBM and are available at http://oss.software.ibm.com/icu4j/.

SPSS Trends 13.0
Copyright © 2004 by SPSS Inc. All rights reserved.
Printed in the United States of America.

No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior written permission of the publisher.

1 2 3 4 5 6 7 8 9 0   07 06 05 04

ISBN 1-56827-352-5

Preface

SPSS 13.0 is a comprehensive system for analyzing data. The Trends optional add-on module provides the additional analytic techniques described in this manual. The Trends add-on module must be used with the SPSS 13.0 Base system and is completely integrated into that system.
Installation

To install the Trends add-on module, run the License Authorization Wizard using the authorization code that you received from SPSS Inc. For more information, see the installation instructions supplied with the SPSS Base system.
Compatibility

SPSS is designed to run on many computer systems. See the installation instructions that came with your system for specific information on minimum and recommended requirements.
Serial Numbers

Your serial number is your identification number with SPSS Inc. You will need this serial number when you contact SPSS Inc. for information regarding support, payment, or an upgraded system. The serial number was provided with your Base system.
Customer Service

If you have any questions concerning your shipment or account, contact your local office, listed on the SPSS Web site at http://www.spss.com/worldwide. Please have your serial number ready for identification.

Training Seminars

SPSS Inc. provides both public and onsite training seminars. All seminars feature hands-on workshops. Seminars will be offered in major cities on a regular basis. For more information on these seminars, contact your local office, listed on the SPSS Web site at http://www.spss.com/worldwide.
Technical Support

The services of SPSS Technical Support are available to registered customers. Customers may contact Technical Support for assistance in using SPSS or for installation help for one of the supported hardware environments. To reach Technical Support, see the SPSS Web site at http://www.spss.com, or contact your local office, listed on the SPSS Web site at http://www.spss.com/worldwide. Be prepared to identify yourself, your organization, and the serial number of your system.
Additional Publications

Additional copies of SPSS product manuals may be purchased directly from SPSS Inc. Visit the SPSS Web Store at http://www.spss.com/estore, or contact your local SPSS office, listed on the SPSS Web site at http://www.spss.com/worldwide. For telephone orders in the United States and Canada, call SPSS Inc. at 800-543-2185. For telephone orders outside of North America, contact your local office, listed on the SPSS Web site.

The SPSS Statistical Procedures Companion, by Marija Norušis, has been published by Prentice Hall. A new version of this book, updated for SPSS 13.0, is planned. The SPSS Advanced Statistical Procedures Companion, also based on SPSS 13.0, is forthcoming. The SPSS Guide to Data Analysis for SPSS 13.0 is also in development. Announcements of publications available exclusively through Prentice Hall will be available on the SPSS Web site at http://www.spss.com/estore (select your home country, and then click Books).
Tell Us Your Thoughts

Your comments are important. Please let us know about your experiences with SPSS products. We especially like to hear about new and interesting applications using the SPSS system. Please send e-mail to suggest@spss.com or write to SPSS Inc., Attn.: Director of Product Planning, 233 South Wacker Drive, 11th Floor, Chicago, IL 60606-6412.
About This Manual

This manual documents the graphical user interface for the procedures included in the Trends add-on module. Illustrations of dialog boxes are taken from SPSS for Windows. Dialog boxes in other operating systems are similar. Detailed information about the command syntax for features in this module is provided in the SPSS Command Syntax Reference, available from the Help menu.
Contacting SPSS

If you would like to be on our mailing list, contact one of our offices, listed on our Web site at http://www.spss.com/worldwide.

Contents

Part I: User's Guide

1 Overview
    Time Series Analysis
        Reasons for Analyzing Time Series
        A Model-Building Strategy
    How Trends Can Help
        Model Identification
        Parameter Estimation
        Diagnosis

2 Trends Procedures Common Features
    Defining Time Series Data
    Data Transformations
    New Series Names
    Historical and Validation Periods
    Reusing Models
    Changing Settings with Command Syntax
    Performance Considerations

3 Exponential Smoothing
    Custom Exponential Smoothing Models
    Exponential Smoothing Parameters
    Saving Predicted Values and Residuals
    EXSMOOTH Command Additional Features

4 Autoregression
    Autoregression Options
    Saving Predicted Values and Residuals
    Autoregression Performance and Embedded Missing Data
    AREG Command Additional Features

5 ARIMA
    ARIMA Options
    Saving Predicted Values and Residuals
    ARIMA Performance and Embedded Missing Data
    ARIMA Command Additional Features

6 Seasonal Decomposition
    Seasonal Decomposition Save
    SEASON Command Additional Features

7 Spectral Plots
    SPECTRA Command Additional Features

Part II: Examples

8 Exponential Smoothing
    Using Exponential Smoothing to Predict Future Sales
        Preliminaries
        Understanding Your Data
        Building and Analyzing Exponential Smoothing Models
        Testing the Predictive Ability of the Model
        Using the Model to Predict Future Sales
        Summary
    Related Procedures
    Recommended Readings

9 Autoregression
    Determining Significant Predictors in the Presence of Autocorrelation
        Preliminaries
        Analysis from Ordinary Least-Squares Regression
        Applying Autoregression to the Problem
        Summary
    Related Procedures

10 ARIMA
    The ARIMA Model
        Autoregression (ARIMA)
        Differencing (ARIMA)
        Moving-Average (ARIMA)
        Seasonal Orders
    Steps in Using ARIMA
        Identification
        Estimation
        Diagnosis
    Using Seasonal ARIMA with Predictors to Model Catalog Sales
        Preliminaries
        Plotting the Catalog Sales Series
        Identifying a Model
        Creating the Model
        Model Diagnosis
        Adding Predictors to the Model
        Testing the Predictive Ability of the Model
        Summary
    Using Intervention Analysis to Determine Change in Market Share
        Plotting the Market Share Series
        Intervention Analysis Strategy
        Identifying a Model
        Determining the Intervention Period
        Creating Intervention Variables
        Running the Analysis
        Model Diagnosis
        Assessing the Intervention
        Summary
    Related Procedures
    Recommended Readings

11 Seasonal Decomposition
    Removing Seasonality from Sales Data
        Preliminaries
        Determining and Setting the Periodicity
        Running the Analysis
        Understanding the Output
        Summary
    Related Procedures

12 Spectral Plots
    Using Spectral Plots to Verify Expectations about Periodicity
        Running the Analysis
        Understanding the Periodogram and Spectral Density
        Summary
    Related Procedures

Appendices
    A Durbin-Watson Significance Tables
    B Guide to ACF/PACF Plots

Bibliography

Index

Part I: User's Guide

Chapter 1
Overview

SPSS Trends performs comprehensive forecasting and time series analyses. Its graphical user interface, comprehensive manual, and online Help system ensure that you will find Trends easy to use. The range of analytical techniques available in Trends extends from simple, basic tools to more sophisticated types of analysis. These include:
Plots. With facilities in the SPSS Base system, you can easily produce a variety of series and autocorrelation plots that you can enhance using the SPSS Chart Editor.

Smoothing. You can use simple but efficient smoothing techniques that can yield high-quality forecasts with a minimum of effort.

Decomposition. You can break down a series into its components, saving the seasonal factors and the trend, cycle, and error components automatically, ready to use in further analysis.

Regression. You can build regression models using a variety of techniques, including those in the SPSS Base system, such as ordinary least-squares regression and curve fitting. Trends adds a special facility for regression with autocorrelated errors.

ARIMA Modeling. You can apply the powerful techniques of ARIMA modeling in a fully interactive environment in which identification, estimation, and diagnosis quickly lead you to the best model.

Spectral Analysis. You can examine a time series as a combination of periodic cycles of various lengths.


Time Series Analysis


A time series is a set of observations obtained by measuring a single variable regularly over a period of time. In a series of inventory data, for example, the observations might represent daily inventory levels for several months. A series showing the market share of a product might consist of weekly market share taken over a few years. A series of total sales figures might consist of one observation per month for many years. What each of these examples has in common is that some variable was observed at regular, known intervals over a certain length of time. Thus, the form of the data for a typical time series is a single sequence or list of observations representing measurements taken at regular intervals.
Table 1-1
Daily inventory time series

Time    Week    Day          Inventory level
t1      1       Monday       160
t2      1       Tuesday      135
t3      1       Wednesday    129
t4      1       Thursday     122
t5      1       Friday       108
t6      2       Monday       150
...     ...     ...          ...
t60     12      Friday       120

Reasons for Analyzing Time Series


One of the most important reasons for doing time series analysis is to forecast future values of the series. The parameters of the model that explained the past values may also predict whether and how much the next few values will increase or decrease. The ability to make such predictions successfully is obviously important to any business or scientific field.

Another reason for analyzing time series data is to evaluate the effect of some event that intervenes and changes the normal behavior of a series. Intervention analysis examines the pattern of a time series before and after the occurrence of such an event. The goal is to see if the outside event had a significant impact on the series pattern.


If it did, there is a significant upward or downward shift in the values of the series after the occurrence of the event. For this reason, such series are called interrupted time series. Weekly numbers of automobile fatalities before and after a new seat belt law, monthly totals of armed robberies before and after a new gun law, and daily measurements of productivity before and after initiation of an incentive plan are examples of interrupted time series. What they have in common is a hypothetical interruption in their usual pattern after the specific time when some outside event occurred. Since the time of the outside event is known and the pattern before and after the event is observable, you can evaluate the impact of the interruption.

A Model-Building Strategy
No matter what the primary goal of the time series analysis, the approach basically starts with building a model that will explain the series. The most popular strategy for building a model is the one developed by Box and Jenkins (Box and Jenkins, 1976), who defined three major stages of model building: identification, estimation, and diagnostic checking. Although Box and Jenkins originally demonstrated the usefulness of this strategy specifically for ARIMA model building, the general principles can be extended to all model building.

Identification involves selecting a tentative model type with which to work. This tentative model type includes initial judgments about the number and kind of parameters involved and how they are combined. In making these judgments, you should be parsimonious. The methods usually employed at this stage include plotting the series and its autocorrelation function to find out whether the series shows any upward or downward trend, whether some sort of data transformation might simplify analysis, and whether any kind of seasonal pattern is apparent.

Estimation is the process of fitting the tentative model to the data and estimating its parameters. This stage usually involves using a computerized model-fitting routine to estimate the parameters and test them for significance. The estimated parameters can then be used to see how well they would have predicted the observed values. If the parameter estimates are unsatisfactory on statistical grounds, you return to the identification stage, since the tentative model could not satisfactorily explain the behavior of the series.

Diagnosis is the stage in which you examine how well the tentative model fits the data. Methods used at this stage include plots and statistics describing the residual, or error, series. This information tells you whether the model can be used with confidence or whether you should return to the first stage and try to identify a better model.

How Trends Can Help


SPSS Trends is designed to help you accomplish the goals of model identification, model parameter estimation, and model diagnosis.

Model Identification
The most useful tools for identifying a model are plots of the series itself or of various correlation functions. The SPSS Base system provides many plots that are helpful for analyzing time series, such as sequence charts and autocorrelation plots.
Plotting the Series. With the Sequence Charts procedure in the SPSS Base system, you can plot the values of your series horizontally or vertically. You have the option of plotting the series itself, a log transformation of the series, or the differences between adjacent (or seasonally adjacent) points in the series.

Plotting Correlation Functions. The Base system provides facilities for plotting correlation functions. As with the series plots, you can show the function itself, a log transformation of the function, or the differences between adjacent (or seasonally adjacent) points. Confidence limits are included on the plots, and the values and standard errors of the correlation function are displayed in the Viewer. The following facilities are available:

The Autocorrelations procedure displays and plots the autocorrelation function and the partial autocorrelation function among the values of a series at different lags. It also displays the Box-Ljung statistic and its probability level at each lag in the Viewer.

The Cross-Correlations procedure displays and plots the cross-correlation functions of two or more time series at different lags.


Parameter Estimation
SPSS Trends includes techniques for estimating the coefficients of your model. You can group these techniques loosely under the general areas of smoothing, regression methods, Box-Jenkins or ARIMA analysis, and decomposition of cyclic data into their component frequencies.
Smoothing. The Exponential Smoothing procedure uses exponential smoothing methods to estimate up to three parameters for a wide variety of common models. Forecasts and forecast error values for one or more time series are produced using the most recent data in the series, previous forecasts and their errors, and estimates of trend and seasonality. You can specify your own estimates for any of the parameters or let Trends find them for you. The output includes statistics arranged to help you evaluate the estimates.

Trends also includes the Seasonal Decomposition procedure, which lets you estimate multiplicative or additive seasonal factors for periodic time series. New series containing seasonally adjusted values, seasonal factors, trend and cycle components, and error components can be automatically added to your working data file so that you can perform further analyses.

Regression Methods. The Regression procedure in the SPSS Base system is useful when you want to analyze time series using ordinary least-squares regression. Additional procedures for regression methods include:

The Curve Estimation procedure, which is part of the Base system, fits selected curves to time series and produces forecasts, forecast error values, and confidence interval values. The curve is chosen from a variety of trend-regression models that assume that the observed series is some function of the passage of time.

The Autoregression procedure, which is part of Trends, allows you to estimate regression models reliably when the error from the regression is correlated between one time point and the next, a common situation in time series analysis. Autoregression offers two traditional methods (Prais-Winsten and Cochrane-Orcutt) as well as an innovative maximum-likelihood method that is able to handle missing data embedded in the series.
Box-Jenkins Analysis. The ARIMA procedure lets you estimate nonseasonal and seasonal univariate ARIMA models. You can include predictor variables in the model to evaluate the effect of some outside event or influence while estimating the coefficients of the ARIMA process. ARIMA produces maximum-likelihood estimates and can process time series with missing observations. It uses the traditional ARIMA model syntax, so you can describe your model just as it would be described in a book on ARIMA analysis. Summary statistics for the parameter estimates help you to evaluate the model. New series containing forecasts as well as their errors and confidence limits are automatically created.
Seasonal-Adjustment Methods. The Seasonal Decomposition procedure lets you estimate multiplicative or additive seasonal factors for periodic time series using the ratio-to-moving-average (Census I) method of seasonal decomposition. Seasonal Decomposition automatically creates new series in your working data file containing seasonally adjusted values, seasonal factors, trend and cycle components, and error components so that you can perform further analyses.

Frequency-Component Analysis. The Spectral Plots procedure lets you decompose a time series into its harmonic components, a set of regular periodic functions at different wavelengths or periods. By noting the prominent frequencies in this model-free analysis, you can detect features of a periodic or cyclic series that would be obscured by other methods. Spectral Plots provides statistics, plots, and methods of tailoring them for univariate and bivariate spectral analysis, including periodograms, spectral density estimates, gain and phase spectra, popular spectral windows for smoothing the periodogram, and optional user-defined filters. Plots can be produced by period, frequency, or both.

Diagnosis
The ability to diagnose how well the model fits the data is a vital part of time series analysis. Several facilities are available to assist you in evaluating models:

The automatic residual and confidence-interval series generated along with the forecasts help you to assess the accuracy of your models.

Standard errors and other statistics help you to judge the significance of the coefficients estimated for your model.

In regression analysis and elsewhere, you frequently need to determine whether the residuals from a model are normally distributed. The SPSS Base system offers Normal P-P and Normal Q-Q plots, which compare the observed values of a series against the values that would be observed if the series were normally distributed. They give you quick and effective visual checks for normality.

Chapter 2
Trends Procedures Common Features


Defining Time Series Data

A time series is a set of observations obtained by measuring a single variable regularly over a period of time. The form of the data for a typical time series is a single sequence or list of observations representing measurements taken at regular intervals. When you define time series data for use with SPSS Trends, each series corresponds to a separate variable. For example, to define a time series in the Data Editor, click the Variable View tab and enter a variable name in any blank row. Each observation in a time series corresponds to a case in SPSS (a row in the Data Editor).

A property of time series data is that the observations are taken at equally spaced time intervals. If you open a spreadsheet containing time series data, each series should be arranged in a column in the spreadsheet. If you already have a spreadsheet with time series arranged in rows, you can open it anyway and use Transpose on the Data menu to flip the rows into columns.

Data Transformations
A number of data transformation procedures provided in the SPSS Base system are useful in time series analysis. The Define Dates procedure (on the Data menu) generates date variables used to establish periodicity and to distinguish between historical, validation, and forecasting periods. Trends is designed to work with the variables created by the Define Dates procedure.


The Create Time Series procedure (on the Transform menu) creates new time series variables as functions of existing time series variables. It includes functions that use neighboring observations for smoothing, averaging, and differencing. For example, you can perform a difference transformation on a nonstationary series to produce a series that is stationary and suitable for analysis.

The Replace Missing Values procedure (on the Transform menu) replaces system- and user-missing values with estimates based on one of several methods. Missing data at the beginning or end of a series pose no particular problem; they simply shorten the useful length of the series. Gaps in the middle of a series (embedded missing data) can be a much more serious problem. In particular, Autoregression (other than the maximum-likelihood method), Exponential Smoothing, Seasonal Decomposition, and Spectral Plots cannot be run in the presence of embedded missing data.

See the SPSS Base User's Guide for detailed information concerning data transformations for time series. A brief syntax sketch of these transformations follows.
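The same transformations can also be run from a syntax window. The following is a minimal sketch, assuming a hypothetical monthly series named sales; check the DATE, CREATE, and RMV entries in the SPSS Command Syntax Reference for the exact forms.

* Define monthly dates starting in January 1990 to establish periodicity.
DATE YEAR 1990 MONTH.

* Create a first-differenced version of the hypothetical series sales.
CREATE sales_1 = DIFF(sales,1).

* Replace embedded missing values by linear interpolation.
RMV sales_i = LINT(sales).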

New Series Names


All of the Trends procedures, with the exception of Spectral Plots, can automatically generate new series containing such things as predicted values and residuals. Each procedure reports the names of any new series that it creates. The first three letters of the series name indicate the type of series:

fit contains the predicted value according to the current model.

err is a residual or error series. (Normally the fit series plus the err series equals the original series.)

ucl and lcl contain upper and lower confidence limits.

sep is the standard error of the fit or predicted series.

sas, saf, and stc are series components extracted by the Seasonal Decomposition procedure.


Historical and Validation Periods


It is often useful to divide your time series into a historical or estimation period and a validation period. You develop a model on the basis of the observations in the historical period and then test it to see how well it works in the validation period. When you are not sure which model to choose, this technique is sometimes more efficient than comparing models based on the entire sample. The facilities in the Select Cases dialog box (available through the Data menu) and the Save dialog box (available through the main dialog box for many procedures) make it easy to set aside part of your data for validation purposes.
Select Cases. Specifies a range of observations for analysis. The selection Based on time or case range allows you to specify a range of observations using date variables if you have attached them to your time series, or using observation numbers if you have not. You normally define a historical period in this way.

Save. Specifies a range of observations for forecasts or validation. Trends procedures that save new series containing such things as fit values and residuals allow you to predict values for observations past the end of the series being analyzed. To define a validation period, select the default Predict from estimation period through last case. Trends then uses the model developed from the historical period to forecast values through the validation period so that you can compare these forecasts to the actual data. Forecasts created in this way are n-step-ahead forecasts. For information on generating one-step-ahead forecasts, see "Forecasts" below.

Forecasts

Forecasts are ubiquitous in time series analysis, both real forecasts and the validation forecasts discussed above. It is often useful to distinguish between one-step-ahead forecasts and n-step-ahead forecasts. One-step-ahead forecasts use, and require, information in the time period immediately preceding the period being forecast, while n-step-ahead forecasts are based on older information. You can produce either type of forecast in Trends.

Real forecasts, that is, forecasts for observations beyond the end of existing series, are always n-step-ahead forecasts. To generate these forecasts, specify the forecast range in a Save dialog box using the Predict through alternative. Trends will automatically extend the series to allow room for the forecast observations. (This type of forecast can be generated by ARIMA and Exponential Smoothing and by Curve Estimation in the Base system.)

Validation forecasts can be either one- or n-step-ahead. To generate n-step-ahead validation forecasts, simply specify the historical period in the Select Cases dialog box and the validation period in the Save dialog box, as discussed above. If you need one-step-ahead validation forecasts, you must use a certain amount of SPSS command syntax (a sketch follows this list):

1. Specify the historical period in the Select Cases dialog box.

2. Estimate the model in which you are interested. Instead of executing it directly, click the Paste button to paste its command syntax into a syntax window.

3. Execute the command from the syntax window by clicking the Run button on the toolbar.

4. Go back to the Select Cases dialog box and specify both the historical and validation periods. Generally this means to select All cases.

5. Activate the syntax window and edit the command that you executed in step 3. Leave the command name (EXSMOOTH, ARIMA, or AREG), but replace all of its specifications with the single specification /APPLY FIT (for EXSMOOTH, use /APPLY). Then execute the command by clicking Run.

Trends generates a fit variable through both the historical and validation periods, based on the coefficients estimated in step 3 for the historical period.
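For concreteness, the one-step-ahead sequence might look like this in a syntax window for a hypothetical ARIMA model of a monthly series named sales; the date ranges and model order are illustrative only.

* Step 1: restrict the analysis to the historical period.
USE YEAR 1990 MONTH 1 THRU YEAR 1994 MONTH 12.

* Steps 2 and 3: estimate the model over the historical period.
ARIMA sales /MODEL=(0,1,1).

* Step 4: make the historical and validation periods both available.
USE ALL.

* Step 5: reapply the estimated coefficients to generate one-step-ahead
* fit values through the validation period.
ARIMA /APPLY FIT.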

Reusing Models
When you click OK or Paste in SPSS, current dialog box settings are saved. When you return to a dialog box you have used once, all of your previous specifications are still there (unless you have opened a different data file). This persistence of dialog box settings is especially helpful as you develop models for time series data, since you can selectively modify model specifications as needed:

You can change one or more specifications in the dialog box (or in any subdialog box) and repeat the analysis with the new specifications.

You can switch variables to repeat your analysis or chart with different variables but with the same specifications.


You can use the Select Cases facility to restrict the analysis to a range of cases, or to process all cases instead of restricting the analysis to a previously specified range. You can then repeat an analysis or chart with identical specifications but a different range of observations.

You can use a transformation procedure, such as Replace Missing Values, or edit data values in the Data Editor and then repeat an analysis or chart with identical specifications but modified data.
Reusing Command Syntax

If you are using command syntax instead of the dialog boxes, you can still reuse and selectively modify models using the APPLY subcommand. When it is used, the APPLY subcommand is usually the first specification after the name of the command, or after the name of a command and a series. It means "run this command as before, with the following changes." If you want to change any specifications from the previous model, continue the command with a slash and enter only those specifications that you want to add or change.

For commands that estimate coefficients that you can apply to prediction (ARIMA, AREG, and EXSMOOTH), you have the option of applying the coefficients estimated for a previous model to a new model, either as initial estimates or as final coefficients to be used in calculating predicted values and residuals. You can also apply specifications or coefficients from an earlier model, rather than from the previous specification of the same command, by specifying the model name. See the SPSS Command Syntax Reference for a complete discussion of the APPLY subcommand and models. A brief sketch of the pattern follows.
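A minimal sketch of the APPLY pattern; the series name sales and the model orders are illustrative only.

* Fit an initial nonseasonal model.
ARIMA sales /MODEL=(1,0,0).

* Rerun the same command, changing only the moving-average order.
ARIMA /APPLY /Q=1.

* Reuse the estimated coefficients as final values to compute fit and error series.
ARIMA /APPLY FIT.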

Changing Settings with Command Syntax


Several SPSS commands determine settings that affect the operation of Trends procedures. In particular, the TSET, USE, and PREDICT commands modify the operation of most subsequent analytical procedures in Trends. If you execute such commands from a syntax window and later execute Trends procedures from the dialog boxes, you cannot necessarily assume that the settings you established in the syntax window remain in effect. The following are areas where this might occur:

The Select Cases dialog box can generate a USE command.

The Save dialog box for any Trends procedure, and for Curve Estimation in the Base system, can generate a PREDICT command.

Trends dialog boxes routinely generate a TSET command to reflect settings that are specified in the dialog box. Never assume that your TSET specifications survive the use of a dialog box without inspecting the journal file for a TSET command generated by the dialog box. The existence and name of the journal file can be verified on the General tab in the Options dialog box (Edit menu).
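In a syntax window, these settings look roughly like the following; the date ranges are hypothetical.

* Limit analysis to a historical period.
USE YEAR 1990 THRU YEAR 1994.

* Forecast through the end of 1996.
PREDICT THRU YEAR 1996 MONTH 12.

* Keep only the most recently generated set of new series.
TSET NEWVAR=CURRENT.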

Performance Considerations
New Series

Many Trends procedures automatically generate new series. This facility can be a great aid, but not always. Possible difficulties with saving new series include:

Trends must read and write the entire file an extra time to add the new series.

Your file becomes larger, in some cases dramatically so, and subsequent processing therefore takes longer.

Most of the time, you do not need most of the new series. Merely keeping track of their names becomes a problem.

When you use procedures on the Time Series submenu of the Analyze menu, it's a good idea always to open the Save subdialog box and give a moment's thought to the creation of new variables. The default for these procedures is to add new variables permanently to your data file. If you are doing preliminary analysis and are not yet certain of the models you want to use, the Replace existing alternative for new variables gives you the benefits of residuals and predicted values but does not keep all of them around. Once you have settled on a model, you may want to go back into the Save subdialog box and choose the Add to file alternative.


General Techniques for Efficiency

For any iterative procedures in Trends, you may find it useful to:

Relax the convergence criteria for the procedure. These are specified in the Options dialog box for the specific procedure.

Perform exploratory analysis to determine the best model.

Restore the stricter convergence criteria for the final estimation of your coefficients.

The general point is that some estimation algorithms used in Trends require a lot of processing and will take a long time if you use them blindly. Take advantage of the interactive character of Trends. Loosen things up for speed while you are exploring your data, and then, when you are ready to estimate your final coefficients, exploit the full accuracy of the Trends algorithms.

Chapter 3
Exponential Smoothing

This procedure produces fit/forecast values and residuals for one or more time series, using an algorithm that smoothes out irregular components of time series data. A variety of models differing in trend (none, linear, or exponential) and seasonality (none, additive, or multiplicative) are available.
Example. Inventory-intensive businesses often employ statistical techniques for projecting future inventory. The Exponential Smoothing procedure can be used both to develop a model of the inventory time series and to produce fast forecasts based on that model.
Statistics. Initial seasonal indices, initial level parameter, initial trend parameter, error degrees of freedom. For each model: model parameters, sum of squared errors.

Data. The variables and any seasonal factors should be numeric.

Assumptions. The variables should not contain any embedded missing data.

Creating an Exponential Smoothing Model

E From the menus choose:
Analyze
  Time Series
    Exponential Smoothing

Figure 3-1
Exponential Smoothing dialog box

E Select one or more variables from the available list and move them into the Variables list. Note that the list includes only numeric variables.


E Select the type of model to use. These models differ in their trend and seasonal components.
Simple. The simple model assumes that the series has no trend and no seasonal variation.

Holt. The Holt model assumes that the series has a linear trend and no seasonal variation.

Winters. The Winters model assumes that the series has a linear trend and multiplicative seasonal variation (its magnitude increases or decreases with the overall level of the series). The Winters model is unavailable unless you have defined dates with the Define Dates dialog box, available from the Data menu.

Custom. A custom model allows you to specify the trend and seasonality components. For seasonal models, the series must contain at least four full seasons of data.

The following options are also available:

For models with seasonal components, you can specify a set of seasonal factors or allow Trends to determine them for you.


Click Save to specify how new variables are to be saved or to select options for predicting cases.

Click Parameters to set the parameter values for the model, or specify a grid search to determine the best-fitting parameter values.

Custom Exponential Smoothing Models


Figure 3-2
Exponential Smoothing Custom Model dialog box

Trend Component. Choices for the custom trend component include no trend, linear trend, exponential trend, and damped trend.

None. The level of the series does not show any overall linear trend.

Linear. The mean level of the series increases or decreases linearly (at a constant rate) with time.

Exponential. The mean level of the series increases or decreases exponentially with time.

Damped. The mean level of the series increases or decreases with time, but the rate of change declines; the trend is dying out.


Seasonal Component. Choices for the custom seasonal component include no seasonality, additive seasonality, and multiplicative seasonality.

None. The series does not show periodic variation at the periodicity defined for it.

Additive. The series has seasonal variation that is additive: the magnitude of seasonal variation does not depend on the overall level of the series.

Multiplicative. The series shows periodic variation at the periodicity defined for it, and the size of the periodic variation depends on the overall level of the series.


The Holt model in the main Exponential Smoothing dialog box is equivalent to selecting Linear for Trend Component and None for Seasonal Component in the Custom dialog box. The Winters model is equivalent to selecting Linear for Trend Component and Multiplicative for Seasonal Component.

Exponential Smoothing Parameters


Figure 3-3
Exponential Smoothing Parameters dialog box

Alpha. Exponential smoothing parameter that controls the relative weight given to recent observations, as opposed to the overall series mean. When alpha equals 1, the single most recent observation is used exclusively; when alpha equals 0, old observations count just as heavily as recent ones. Alpha is used for all models.
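The dialog box describes alpha only verbally. As a point of reference (this recurrence is the textbook form of simple exponential smoothing, not a formula quoted from this manual), the smoothed level S(t) of a series Y(t) is updated as

S(t) = alpha * Y(t) + (1 - alpha) * S(t-1)

so alpha = 1 reproduces the most recent observation, while alpha = 0 never moves the level away from its starting value.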
Gamma. Exponential smoothing parameter that controls the relative weight given to recent observations in estimating the series' present trend. It ranges from 0 to 1, with higher values giving more weight to recent values. Gamma is used only for exponential smoothing models with a linear or exponential trend, or with a damped trend and no seasonal component. It is not used for the simple model.

Delta. Exponential smoothing parameter that controls the relative weight given to recent observations in estimating the present seasonality. It ranges from 0 to 1, with values near 1 giving higher weight to recent values. Delta is used for all exponential smoothing models with a seasonal component. It is not used for the simple or Holt models.
Phi. Exponential smoothing parameter that controls the rate at which a trend is "damped," or reduced in magnitude over time. It ranges from 0 to 1 (but cannot equal 1), with values near 1 representing more gradual damping. Phi is used for exponential smoothing models with a damped trend. It is not used for the simple, Holt, or Winters models.
Value. Values for alpha, gamma, and delta can be any value between and including 0 and 1. For phi, the value can be any value between, but not including, 0 and 1.
Grid Search. The parameter is assigned a starting value in the Start box, an increment value in the By box, and an ending value in the Stop box. Enter these values after selecting Grid Search. The ending value must be greater than the starting value, and the increment value must be less than their difference.
Initial Values. You can specify the starting and trend values used in smoothing the series by selecting one of the following:

Automatic. The program calculates suitable starting and trend values from the data. This is usually desirable.

Custom. Enter a number in the Start text box and, for models with a trend, a number in the Trend text box. Poor choice of initial values can result in an inferior solution.
Display only 10 best models for grid search. The parameter value(s) and sum of squared errors (SSE) are displayed for only the 10 parameter combinations with the lowest SSE regardless of the number of parameter combinations tested. If this option is not selected, all tested parameter combinations are displayed.


Saving Predicted Values and Residuals


Figure 3-4
Exponential Smoothing Save dialog box

Create Variables. Allows you to choose how to treat new variables.

Add to file. The new series are saved as regular variables in your working data file. Variable names are formed from a three-letter prefix, an underscore, and a number.

Replace existing. The new series are saved as temporary variables in the working data file, and any existing temporary variables created by Trends commands are dropped. Variable names are formed from a three-letter prefix, a pound sign (#), and a number.

Do not create. The new series are not added to the working data file.

Predict Cases. Allows you to use the estimated model to predict values of the dependent variable:

Predict from estimation period through last case. Predicts values for all cases from the estimation period through the end of the file but does not create new cases. If you are analyzing a range of cases that start after the beginning of the file, cases prior to that range are not predicted. If no estimation period has been defined, all cases are used to predict values.

Predict through. Predicts values through the specified date, time, or observation number, based on the cases in the estimation period. This can be used to forecast values beyond the last case in the time series. The text boxes that are available for specifying the end of the prediction period depend on the currently defined date variables.

EXSMOOTH Command Additional Features


The SPSS command language also allows you to:

Specify initial seasonal factor estimates for seasonal models. Seasonal factors can be specified numerically by providing as many additive or multiplicative numbers as the seasonal periodicity.

See the SPSS Command Syntax Reference for complete syntax information.
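As an illustration only (the series name inventory and the factor values are hypothetical, and the subcommand forms should be verified against the SPSS Command Syntax Reference), a Winters model for a quarterly series with user-supplied multiplicative seasonal factors and a grid search over alpha might be specified as:

* Four multiplicative seasonal factors, one per quarter.
EXSMOOTH VARIABLES=inventory
 /MODEL=WINTERS
 /PERIOD=4
 /SEASFACT=(0.9 1.1 1.2 0.8)
 /ALPHA=GRID(0,1,0.1).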

Chapter 4
Autoregression

The Autoregression procedure estimates true regression coefficients from time series with first-order autocorrelated errors. It offers three algorithms. Two algorithms (Prais-Winsten and Cochrane-Orcutt) transform the regression equation to remove the autocorrelation. The third (maximum likelihood) uses the same algorithm that the ARIMA procedure uses for estimating autocorrelation. Maximum-likelihood (ML) estimation is more demanding computationally but gives better results, and it can tolerate missing data in the series.
Example. Is the consumption of alcoholic spirits in any given year related to real per-capita income and the price level of the spirits in question for that year? A standard regression analysis may not be valid in this case because the variables represent time series. Most time series have some trend, either up or down, and any two trending series will correlate simply because of the trends, regardless of whether they are causally related or not. Autoregression allows you to remove the autocorrelation inherent in many time series and ascertain any statistically significant relationships between dependent variables and candidate regressors.
Statistics. For the Prais-Winsten and Cochrane-Orcutt estimation methods: rho value with standard error, Durbin-Watson statistic, and mean squared error at each iteration; R, R², adjusted R², standard error of the estimate, analysis-of-variance table, and regression statistics for the ordinary least-squares and final Prais-Winsten or Cochrane-Orcutt estimates. For the maximum-likelihood method: rho, regression coefficients, adjusted sum of squares, and Marquardt constant at each iteration. For the final maximum-likelihood parameter estimates: regression statistics, correlation matrix, covariance matrix, residual sum of squares, adjusted residual sum of squares, residual variance, model standard error, log-likelihood, Akaike's information criterion, and Schwartz's Bayesian criterion.
Methods. Prais-Winsten, Cochrane-Orcutt, and exact maximum-likelihood (equivalent to an ARIMA(1,0,0) model).




Data. The dependent variable and any independent variables should be numeric.

Assumptions. Unless the maximum-likelihood method is used, the dependent variable and any independent variables should not contain any embedded missing data. The time series to be modeled should be stationary.

Stationary. A condition that must be met by the time series to which you fit an ARIMA model. Pure MA series will be stationary; however, AR and ARMA series might not be. A stationary series has a constant mean and a constant variance over time.
To Create an Autoregression Model
E From the menus choose:
Analyze
  Time Series
    Autoregression...

Figure 4-1
Autoregression dialog box

E Select a variable from the available list and move it into the Dependent list. Note that the list includes only numeric variables.


E Select one or more variables from the available list and move them into the Independent(s) list.

E Select one of the Method options to choose an estimation technique. Available methods are Exact maximum-likelihood, Cochrane-Orcutt, and Prais-Winsten.


Exact maximum-likelihood. An estimation technique that, for a given model and set of data, finds the parameter estimates that are "most likely" to have produced the observed data. This method can handle missing data within the series and can be used when one of the independent variables is the lagged dependent variable.

Cochrane-Orcutt. A simple and widely used procedure for estimating a regression equation whose errors follow a first-order autoregressive process. It cannot be used when a series contains embedded missing values.

Prais-Winsten. A generalized least-squares method for estimating a regression equation whose errors follow a first-order autoregressive process. It cannot be used when a series contains embedded missing values. Generally, the Prais-Winsten method is preferable to the Cochrane-Orcutt method.

The following options are also available:

Deselect Include constant in model if you do not want to estimate a constant term.

Click Save to specify how new variables are to be saved or to select options for predicting cases.

Click Options to select the initial value of the autoregressive parameter, set convergence criteria, or choose how to display parameters in the output.
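A minimal sketch of the corresponding AREG command, using hypothetical variable names patterned on the spirits example above; check the AREG entry in the SPSS Command Syntax Reference for the exact forms.

* Regression with first-order autocorrelated errors,
* estimated by exact maximum likelihood.
AREG consump WITH income price
 /METHOD=ML
 /CONSTANT.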

Autoregression Options
Figure 4-2
Autoregression Options dialog box


Initial value of autoregressive parameter. The value from which the iterative search for the optimal value of rho begins. You can specify any number less than 1 and greater than -1, although negative values of rho are uncommon in this procedure.

Convergence Criteria. Allows you to specify the criteria used to determine when iteration ceases.
Maximum iterations. The iterative algorithm stops after the specified number of iterations, even if the algorithm has not converged. You can specify a positive integer in this text box.

Sum of squares change. The iterative algorithm stops if the adjusted sum of squares does not decrease by 0.001% from one iteration to the next. You can choose a smaller or larger value for more or less precision in the parameter estimates. For greater precision, it may be necessary to increase the maximum iterations.
Display. Allows you to control the information that is printed in the output.

Initial and final parameters with iteration summary. Displays the initial and final parameter estimates, goodness-of-fit statistics, the number of iterations, and the reason that iteration terminated.

Initial and final parameters with iteration details. Displays the initial and final parameter estimates, the parameter estimates after each iteration, goodness-of-fit statistics, the number of iterations, and the reason that iteration terminated.

Final parameters only. Displays the final parameter estimates and goodness-of-fit statistics.


Saving Predicted Values and Residuals


Figure 4-3
Save dialog box

Create Variables. Allows you to choose how to treat new variables.

Add to file. The new series are saved as regular variables in your working data file. Variable names are formed from a three-letter prefix, an underscore, and a number.

Replace existing. The new series are saved as temporary variables in the working data file, and any existing temporary variables created by Trends commands are dropped. Variable names are formed from a three-letter prefix, a pound sign (#), and a number.

Do not create. The new series are not added to the working data file.

Confidence intervals. Displays confidence limits for the forecast. As long as the process remains the same, you should expect N% of the series to remain between the upper and lower confidence limits. Choose a level from the drop-down list.
Predict Cases. Allows you to use the estimated model to predict values of the dependent variable.

Predict from estimation period through last case. Predicts values for all cases from the estimation period through the end of the file but does not create new cases. If you are analyzing a range of cases that start after the beginning of the file, cases prior to that range are not predicted. If no estimation period has been defined, all cases are used to predict values.

Predict through. Predicts values through the specified date, time, or observation number, based on the cases in the estimation period. This can be used to forecast values beyond the last case in the time series. The text boxes that are available for specifying the end of the prediction period depend on the currently defined date variables.

Autoregression Performance and Embedded Missing Data


When you request maximum-likelihood estimation with the Autoregression procedure, Trends uses the same algorithms as in ARIMA. This means that Autoregression can process series with embedded missing data when maximum-likelihood estimation is requested, but it may take a while.

To reduce processing time when your series has embedded missing data, you can follow the same steps outlined for ARIMA. However, since the Autoregression Options dialog box does not have controls for initial values, you need to use command syntax to apply initial estimates the second time you run the Autoregression procedure (see step 3 of the process for improving the performance of the ARIMA procedure). The command is:
AREG /APPLY INITIAL.

Again, if processing time is not a consideration, you can simply use the Autoregression procedure directly on the series with embedded missing data. Alternatively, if you do not need the best-quality estimates, you can stop after identifying the model and estimating the coefficients for the series without missing data (see step 2 of the process for improving the performance of the ARIMA procedure) and get results as good as you would with most other packages.

AREG Command Additional Features


The SPSS command language also allows you to:

Use the final estimate of rho from a previous execution of Autoregression as the initial estimate for iteration.

Exercise more precise control over convergence criteria, as in the sketch below.
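For illustration, a minimal hedged sketch of such a command; the variable names sales and promo are hypothetical, and the subcommand names (METHOD, RHO, MXITER) are recalled from the AREG command rather than guaranteed:

* Prais-Winsten estimation with an explicit initial rho and iteration cap.
AREG sales WITH promo
  /METHOD=PW
  /RHO=0.5
  /MXITER=20.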

29 Autoregression

See the SPSS Command Syntax Reference for complete syntax information.

Chapter 5

ARIMA

This procedure estimates nonseasonal and seasonal univariate ARIMA (Autoregressive Integrated Moving Average) models (also known as Box-Jenkins models) with or without fixed regressor variables. The procedure produces maximum-likelihood estimates and can process time series with missing observations.
Example. You are in charge of quality control at a manufacturing plant and need to know if and when random fluctuations in product quality exceed their usual acceptable levels. You've tried modeling product quality scores with an exponential smoothing model but found, presumably because of the highly erratic nature of the data, that the model does little more than predict the overall mean and hence is of little use. ARIMA models are well suited for describing complex time series. After building an appropriate ARIMA model, you can plot the product quality scores along with the upper and lower confidence intervals produced by the model. Scores that fall outside of the confidence intervals may indicate a true decline in product quality.

Statistics. For each iteration: seasonal and nonseasonal lags (autoregressive and moving average), regression coefficients, adjusted sum of squares, and Marquardt constant. For the final maximum-likelihood parameter estimates: residual sum of squares, adjusted residual sum of squares, residual variance, model standard error, log-likelihood, Akaike's information criterion, Schwarz's Bayesian criterion, regression statistics, correlation matrix, and covariance matrix.
Data. The dependent variable and any independent variables should be numeric.

Assumptions. The time series to be modeled should be stationary.

Stationary. A condition that must be met by the time series to which you fit an ARIMA model. Pure MA series will be stationary; however, AR and ARMA series might not be. A stationary series has a constant mean and a constant variance over time.



Obtaining an ARIMA Analysis


E From the menus choose:
Analyze
  Time Series
    ARIMA...

Figure 5-1 ARIMA dialog box

E Select one variable from the available list and move it into the Dependent list. Note that the list includes only numeric variables.

E Specify values for the parameters of the ARIMA model: Autoregressive, Difference, and Moving Average. These parameters are commonly referred to as p, d, and q, respectively.

Autoregressive. The number of autoregressive parameters in the model. Specify a non-negative integer. Each parameter measures the independent effect of values with a specified lag. Thus, an autoregressive order of 2 means that a series value is affected by the preceding two values (independently of one another).


Difference. d is the number of times the series must be differenced to make it stationary. Specify a non-negative integer. Enter 0 if the process is already stationary.

Moving Average. q is the order of moving average of the process. Specify a non-negative integer. Enter 0 for an autoregressive process, 1 for a first-order moving average, 2 for a second-order moving average, etc.
Transform. To analyze the dependent variable in a logarithmic scale, select one of the alternatives on the drop-down list. If you select a log transformation, ARIMA transforms the predicted values (fit) and confidence intervals (lcl and ucl) that it creates back into the original metric but leaves the residuals (err) in the log metric for diagnostic purposes.

Seasonal. The Seasonal column contains text boxes in which you can specify the corresponding parameters (sp, sd, and sq) of the process at seasonal lags. These values can be 0 or a positive integer, usually 1. The Seasonal boxes are unavailable if no periodicity is defined. The current periodicity is displayed at the bottom of the dialog box.

Optionally, you can:

Select one or more independent variables and move them into the Independent(s) list. Note that the list includes only numeric variables.

Deselect Include constant in model if you do not want to estimate a constant term.

Click Save to specify how new variables are to be saved or to select options for predicting cases.

Click Options to select convergence criteria, set initial values for the model, select the forecasting method, or choose how to display parameters in the output.
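The same specification can be written in command syntax. A hedged sketch, assuming a monthly series named quality and a hypothetical regressor promo; the MODEL subcommand takes the nonseasonal (p,d,q) triple, the seasonal (sp,sd,sq) triple, the periodicity, and an optional log-transformation keyword:

* Seasonal ARIMA(0,1,1)(0,1,1) with period 12, fit in the log metric.
ARIMA quality WITH promo
  /MODEL=(0,1,1)(0,1,1) 12 LN.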


ARIMA Options
Figure 5-2 ARIMA Options dialog box

Convergence Criteria. Allows you to specify the criteria used to determine when iteration ceases.

Maximum iterations. The iterative algorithm stops after the specified number of iterations even if it has not converged. You can specify a positive integer in this text box.

Parameter change tolerance. Iteration stops if no parameter changes by more than 0.001 from one iteration to the next. You can choose a smaller or larger value for more or less precision in the parameter estimates. For greater precision, it may also be necessary to increase the maximum iterations.

Sum of squares change. The iterative algorithm stops if the adjusted sum of squares does not decrease by 0.001% from one iteration to the next. You can choose a smaller or larger value for more or less precision in the parameter estimates. For greater precision, it may be necessary to increase the maximum iterations.
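In command syntax, these criteria correspond to ARIMA subcommands; a hedged sketch with illustrative values (subcommand names MXITER, PAREPS, and SSQPCT as best recalled; verify in the reference):

* Raise the iteration cap and tighten both change tolerances.
ARIMA quality
  /MODEL=(1,1,1)
  /MXITER=40
  /PAREPS=0.0001
  /SSQPCT=0.0001.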


Initial Values for Estimation. Allows you to select how initial values for the model are determined.

Automatic. ARIMA chooses initial values.

Apply from previous model. The parameter estimates from the previous execution of ARIMA (in the same session) are used as initial estimates. This can save time if the data and model are similar to the last one used.
Forecasting Method. Allows you to specify the forecasting method to use.

Unconditional least squares. The forecasts are unconditional least squares forecasts. They are also called finite memory forecasts.

Conditional least squares. The forecasts are conditional least squares forecasts. They are also called infinite memory forecasts.

Use model constant for initialization. The forecasts are computed by assuming that the unobserved past errors are 0 and that the unobserved past values of the response series are equal to the mean.

Use beginning series values for initialization. The beginning series values are used to initialize the recursive conditional least squares forecasting algorithm.
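These choices map to the ARIMA FORECAST subcommand; the keyword names below (EXACT for unconditional least squares, CLS and AUTOINIT for the conditional variants) are recalled from the command reference and should be double-checked:

* Request unconditional (finite memory) least squares forecasts.
ARIMA quality
  /MODEL=(1,1,1)
  /FORECAST=EXACT.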


Display. Allows you to control the information that is printed in the output.

Initial and final parameters with iteration summary. Displays the initial and final parameter estimates, goodness-of-fit statistics, the number of iterations, and the reason that iteration terminated.

Initial and final parameters with iteration details. Displays the initial and final parameter estimates, the parameter estimates after each iteration, goodness-of-fit statistics, the number of iterations, and the reason that iteration terminated.

Final parameters only. Displays the final parameter estimates and goodness-of-fit statistics.


Saving Predicted Values and Residuals


Figure 5-3 Save dialog box

Create Variables. Allows you to choose how to treat new variables.

Add to file. The new series are saved as regular variables in your working data file. Variable names are formed from a three-letter prefix, an underscore, and a number.

Replace existing. The new series are saved as temporary variables in the working data file, and any existing temporary variables created by Trends commands are dropped. Variable names are formed from a three-letter prefix, a pound sign (#), and a number.

Do not create. The new series are not added to the working data file.

Confidence intervals. Displays confidence limits for the forecast. As long as the process remains the same, you should expect N% of the series to remain between the upper and lower confidence limits. Choose a level from the drop-down list.
Predict Cases. Allows you to use the estimated model to predict values of the dependent variable.

Predict from estimation period through last case. Predicts values for all cases from the estimation period through the end of the file but does not create new cases. If you are analyzing a range of cases that start after the beginning of the file, cases prior to that range are not predicted. If no estimation period has been defined, all cases are used to predict values.
Predict through. Predicts values through the specified date, time, or observation number, based on the cases in the estimation period. This can be used to forecast values beyond the last case in the time series. The text boxes that are available for specifying the end of the prediction period depend on the currently defined date variables.

ARIMA Performance and Embedded Missing Data


In the presence of embedded missing data, the SPSS Trends ARIMA procedure uses a technique called Kalman filtering, which requires considerably more calculation than the simpler technique used when no embedded missing data are present. Even a single embedded missing value increases ARIMA processing time greatly, in extreme cases by a factor of 10. If you want to use ARIMA on a series that contains embedded missing data, you can use the following procedure to reduce processing time:

1. Make a copy of the series with valid data interpolated in place of the embedded missing data.

2. Identify the correct model and estimate the coefficients for the series without missing data. ARIMA can use a much faster algorithm when no embedded missing data are present.

3. Once you have found the correct model, run ARIMA on the original series to get the best possible estimates for the coefficients, using Kalman filtering to handle the missing data. This time, open the ARIMA Options dialog box and select Apply from previous model for Initial Values for Estimation. This should reduce the number of iterations needed.

Most ARIMA packages allow only the first two steps. You can always stop there with Trends ARIMA too, but you have the option of using the Kalman filtering algorithm to get the best possible estimates.


Note that the results obtained by following the steps above are the same as the results you would obtain if you used the ARIMA procedure directly on the series with embedded missing data, without first estimating initial values from interpolated data. The only difference is processing time.
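A hedged sketch of the three-step workflow in syntax, assuming a series named quality; the LINT function of RMV is one way to interpolate linearly, and APPLY INITIAL reuses the prior estimates as starting values (verify subcommand forms in the reference):

* Step 1: copy the series, interpolating embedded missing values.
RMV /quality_i=LINT(quality).
* Step 2: identify and estimate the model on the complete copy.
ARIMA quality_i
  /MODEL=(1,1,1).
* Step 3: re-estimate on the original series, starting from the
  previous estimates.
ARIMA quality
  /APPLY INITIAL.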

ARIMA Command Additional Features


The SPSS command language also allows you to:

Specify constrained models in which autoregressive or moving average parameters (either regular or seasonal) are estimated only for specified orders.

Exercise more precise control over convergence criteria.

See the SPSS Command Syntax Reference for complete syntax information. A sketch of a constrained specification follows.
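As an illustration only, a hedged sketch of a constrained model that estimates autoregressive parameters at lags 1 and 4 while fixing the others to 0; the series name quality is hypothetical, and the exact forms of the P, D, and Q subcommands should be verified in the reference:

ARIMA quality
  /P=(1,4)
  /D=1
  /Q=0.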

Chapter 6

Seasonal Decomposition

The Seasonal Decomposition procedure decomposes a series into a seasonal component, a combined trend and cycle component, and an error component. The procedure is an implementation of the Census Method I, otherwise known as the ratio-to-moving-average method.
Example. A scientist is interested in analyzing monthly measurements of the ozone level at a particular weather station. The goal is to determine if there is any trend in the data. In order to uncover any real trend, the scientist first needs to account for the variation in readings due to seasonal effects. The Seasonal Decomposition procedure can be used to remove any systematic seasonal variations. The trend analysis is then performed on a seasonally adjusted series.

Statistics. The set of seasonal factors.

Data. The variables should be numeric.

Assumptions. The variables should not contain any embedded missing data. At least one periodic date component must be defined.

Estimating Seasonal Factors
E From the menus choose:
Analyze
  Time Series
    Seasonal Decomposition...


Figure 6-1 Seasonal Decomposition dialog box

E Select one or more variables from the available list and move them into the Variable(s) list. Note that the list includes only numeric variables.


Model. The Seasonal Decomposition procedure offers two different approaches for modeling the seasonal factors: multiplicative or additive.

Multiplicative. The seasonal component is a factor by which the seasonally adjusted series is multiplied to yield the original series. In effect, Trends estimates seasonal components that are proportional to the overall level of the series. Observations without seasonal variation have a seasonal component of 1.

Additive. The seasonal adjustments are added to the seasonally adjusted series to obtain the observed values. This adjustment attempts to remove the seasonal effect from a series in order to look at other characteristics of interest that may be "masked" by the seasonal component. In effect, Trends estimates seasonal components that do not depend on the overall level of the series. Observations without seasonal variation have a seasonal component of 0.
Moving Average Weight. The Moving Average Weight options allow you to specify how to treat the series when computing moving averages. These options are available only if the periodicity of the series is even. If the periodicity is odd, all points are weighted equally.


All points equal. Moving averages are calculated with a span equal to the periodicity and with all points weighted equally. This method is always used if the periodicity is odd.

Endpoints weighted by .5. Moving averages for series with even periodicity are calculated with a span equal to the periodicity plus 1 and with the endpoints of the span weighted by 0.5.

Optionally, you can:

Click Save to specify how new variables should be saved.

Seasonal Decomposition Save


Figure 6-2 Season Save dialog box

Create Variables. Allows you to choose how to treat new variables.

Add to file. The new series created by Seasonal Decomposition are saved as regular variables in your working data file. Variable names are formed from a three-letter prefix, an underscore, and a number.

Replace existing. The new series created by Seasonal Decomposition are saved as temporary variables in your working data file. At the same time, any existing temporary variables created by the Trends procedures are dropped. Variable names are formed from a three-letter prefix, a pound sign (#), and a number.

Do not create. The new series are not added to the working data file.

SEASON Command Additional Features


The SPSS command language also allows you to:

Specify any periodicity within the SEASON command rather than select one of the alternatives offered by the Define Dates procedure, as in the sketch below.
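A minimal hedged sketch, assuming a monthly series named ozone (a hypothetical name); the PERIOD subcommand sets the periodicity directly:

* Multiplicative decomposition with the periodicity given in-command.
SEASON ozone
  /MODEL=MULTIPLICATIVE
  /PERIOD=12.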


See the SPSS Command Syntax Reference for complete syntax information.

Chapter 7

Spectral Plots

The Spectral Plots procedure is used to identify periodic behavior in time series. Instead of analyzing the variation from one time point to the next, it analyzes the variation of the series as a whole into periodic components of different frequencies. Smooth series have stronger periodic components at low frequencies; random variation (white noise) spreads the component strength over all frequencies. Series that include missing data cannot be analyzed with this procedure.
Example. The rate at which new houses are constructed is an important barometer of the state of the economy. Data for housing starts typically exhibit a strong seasonal component. But are there longer cycles present in the data that analysts need to be aware of when evaluating current figures?

Statistics. Sine and cosine transforms, periodogram value, and spectral density estimate for each frequency or period component. When bivariate analysis is selected: real and imaginary parts of the cross-periodogram, cospectral density, quadrature spectrum, gain, squared coherency, and phase spectrum for each frequency or period component.
Plots. For univariate and bivariate analyses: periodogram and spectral density. For bivariate analyses: squared coherency, quadrature spectrum, cross amplitude, cospectral density, phase spectrum, and gain.
Data. The variables should be numeric.



Assumptions. The variables should not contain any embedded missing data. The time series to be analyzed should be stationary, and any non-zero mean should be subtracted out from the series.

Stationary. A condition that must be met by the time series to which you fit an ARIMA model. Pure MA series will be stationary; however, AR and ARMA series might not be. A stationary series has a constant mean and a constant variance over time.
Obtaining a Spectral Analysis
E From the menus choose:
Graphs
  Time Series
    Spectral...

Figure 7-1 Spectral Plots dialog box

E Select one or more variables from the available list and move them to the Variable(s) list. Note that the list includes only numeric variables.

E Select one of the Spectral Window options to choose how to smooth the periodogram in order to obtain a spectral density estimate. Available smoothing options are Tukey-Hamming, Tukey, Parzen, Bartlett, Daniell (Unit), and None.


Tukey-Hamming. The weights are W_k = 0.54 D_p(2 pi f_k) + 0.23 D_p(2 pi f_k + pi/p) + 0.23 D_p(2 pi f_k - pi/p), for k = 0, ..., p, where p is the integer part of half the span and D_p is the Dirichlet kernel of order p.

Tukey. The weights are W_k = 0.5 D_p(2 pi f_k) + 0.25 D_p(2 pi f_k + pi/p) + 0.25 D_p(2 pi f_k - pi/p), for k = 0, ..., p, where p is the integer part of half the span and D_p is the Dirichlet kernel of order p.

Parzen. The weights are W_k = (1/p)(2 + cos(2 pi f_k))(F_(p/2)(2 pi f_k))^2, for k = 0, ..., p, where p is the integer part of half the span and F_(p/2) is the Fejer kernel of order p/2.

Bartlett. The shape of a spectral window for which the weights of the upper half of the window are computed as W_k = F_p(2 pi f_k), for k = 0, ..., p, where p is the integer part of half the span and F_p is the Fejer kernel of order p. The lower half is symmetric with the upper half.

Daniell (Unit). The shape of a spectral window for which the weights are all equal to 1.

None. No smoothing. If this option is chosen, the spectral density estimate is the same as the periodogram.


Bivariate analysis first variable with each. If you have selected two or more variables, you can select this option to request bivariate spectral analyses. The first variable in the Variable(s) list is treated as the independent variable, and all remaining variables are treated as dependent variables. Each series after the first is analyzed with the first series, independently of other series named. Univariate analyses of each series are also performed.

Span. The range of consecutive values across which the smoothing is carried out. Generally, an odd integer is used. Larger spans smooth the spectral density plot more than smaller spans.

Center variables. Adjusts the series to have a mean of 0 before calculating the spectrum and to remove the large term that may be associated with the series mean.
Plot. Periodogram and spectral density are available for both univariate and bivariate analyses. All other choices are available only for bivariate analyses.

Periodogram. Unsmoothed plot of spectral amplitude (plotted on a logarithmic scale) against either frequency or period. Low-frequency variation characterizes a smooth series. Variation spread evenly across all frequencies indicates "white noise."


Spectral density. A periodogram that has been smoothed to remove irregular variation.

Cospectral density. The real part of the cross-periodogram, which is a measure of the correlation of the in-phase frequency components of two time series.

Quadrature spectrum. The imaginary part of the cross-periodogram, which is a measure of the correlation of the out-of-phase frequency components of two time series. The components are out of phase by pi/2 radians.

Cross amplitude. The square root of the sum of the squared cospectral density and the squared quadrature spectrum.

Gain. The quotient of dividing the cross amplitude by the spectral density for one of the series. Each of the two series has its own gain value.

Squared coherency. The product of the gains of the two series.

Phase spectrum. A measure of the extent to which each frequency component of one series leads or lags the other.


By frequency. All plots are produced by frequency, ranging from frequency 0 (the constant or mean term) to frequency 0.5 (the term for a cycle of two observations).

By period. All plots are produced by period, ranging from 2 (the term for a cycle of two observations) to a period equal to the number of observations (the constant or mean term). Period is displayed on a logarithmic scale.

SPECTRA Command Additional Features


The SPSS command language also allows you to:

Save computed spectral analysis variables to the working data file for later use.

Specify custom weights for the spectral window.

Produce plots by both frequency and period.

Print a complete listing of each value shown in the plot.

See the SPSS Command Syntax Reference for complete syntax information. A sketch of a bivariate analysis follows.
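For illustration, a hedged sketch of a bivariate spectral analysis of two hypothetical series, starts and permits; the WINDOW, CROSS, and PLOT subcommand forms are recalled from the SPECTRA command and should be verified against the reference:

SPECTRA VARIABLES=starts permits
  /CENTER
  /CROSS
  /WINDOW=TUKEY(7)
  /PLOT=P S BY FREQ.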

Part 2: Examples

Chapter 8

Exponential Smoothing

Using Exponential Smoothing to Predict Future Sales


A catalog company is interested in forecasting monthly sales of its men's clothing line. To this end, the company collected monthly sales of men's clothing for a 10-year period. This information is stored in catalog_seasfac.sav, found in the \tutorial\sample_files\ subdirectory of the directory in which you installed SPSS. Use the Exponential Smoothing procedure to predict monthly sales of men's clothing for the following year.

Preliminaries
In the examples that follow, it is more convenient to use variable names rather than variable labels.
E From the menus choose:
Edit
  Options...


Figure 8-1 Options dialog box

E Select Display names in the Variable Lists group. E Click OK.

Understanding Your Data


The first step in analyzing a time series is to plot it. Visual inspection of a time series can often be a powerful guide in choosing an appropriate exponential smoothing model. In particular:

Does the series have an overall trend? Does the trend appear constant, or does it appear to be dying out with time?

Does the series show seasonality? Do the seasonal fluctuations seem to grow with time, or do they appear constant over successive periods?


To obtain a plot of men's clothing sales over time:


E From the menus choose:
Graphs
  Sequence...

Figure 8-2 Sequence Charts dialog box

E Select men and move it into the Variables list. E Select date and move it into the Time Axis Labels list. E Click Time Lines.

Figure 8-3 Sequence Charts Time Axis Reference Lines dialog box

E Select Line at each change of. E Select YEAR_ and move it into the Reference Variable list.

These choices result in a vertical reference line at the start of each year, which is useful for identifying annual seasonality.
E Click Continue. E Click OK in the Sequence Charts dialog box.

53 Exponential Smoothing Figure 8-4 Sales of mens clothing (in U.S. dollars)

The series shows a global upward trend; that is, the series values tend to increase over time. The upward trend is seemingly constant, which indicates a linear trend. The series also has a distinct seasonal pattern with annual highs in December. This is easy to see because of the vertical reference lines positioned at the start of each year. The seasonal variations appear to grow with the upward series trend, which suggests multiplicative rather than additive seasonality.

Building and Analyzing Exponential Smoothing Models


Building a best-fit exponential smoothing model involves determining the model type (does the model need to include trend and/or seasonality?) and then obtaining the best-fit parameters for the chosen model.


The plot of men's clothing sales over time suggested a model with both a linear trend component and a multiplicative seasonality component. This implies a Winters model. First, however, we will explore a simple model (no trend and no seasonality) and then a Holt model (incorporates linear trend but no seasonality). This will give you practice in identifying when a model is not a good fit to the data, which is an essential skill in successful model building.

Building and Analyzing a Simple Model


To build an exponential smoothing model:
E From the menus choose:
Analyze
  Time Series
    Exponential Smoothing...

Figure 8-5 Exponential Smoothing dialog box

E Select men and move it into the Variables list. E Select Simple in the Model group. E Click Parameters.

Figure 8-6 Exponential Smoothing Parameters dialog box

E Select Grid Search in the General (Alpha) group. Leave the Start, Stop, and By text boxes with their default values of 0, 1, and 0.1, respectively.

The Grid Search option provides a convenient method for determining the best-fit model parameters by calculating goodness-of-fit measures for each of the grid values. The current selections result in values of alpha ranging from 0 to 1 in increments of 0.1. Leave the default choice for displaying only the 10 best-fit models. The analysis is still performed for each of the points in the grid, but the output is limited to displaying results for the 10 best models.
E Click Continue. E Click OK in the Exponential Smoothing dialog box.
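The equivalent grid search can be requested in command syntax. A hedged sketch; the MODEL keyword for the simple model and the GRID argument form are recalled from the EXSMOOTH command and should be verified:

EXSMOOTH VARIABLES=men
  /MODEL=SINGLE
  /ALPHA=GRID(0,1,.1).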

56 Chapter 8 Figure 8-7 Exponential smoothing, no trend or seasonality

The model output lists the 10 best-fitting values of alpha, along with the associated sums of squared errors (SSE) for each value. Each value of alpha corresponds to a different model, with models ranked according to their SSE value. A lower rank implies a smaller SSE and thus a model that gives a better fit to the data. The SSE measure of error is lowest when alpha is 0.1, indicating that an alpha of 0.1 gives the best fit to the data for this class of models. This low value of alpha indicates that the overall level of the series is best predicted when all observations have roughly equal weight.

To see how well the simple model fits the data, you'll need to reopen the Sequence Charts dialog box. A convenient shortcut for reopening previous dialog boxes is as follows:
E Click the Dialog Recall toolbar icon. E Select the entry for the desired dialog box, which in this case is Sequence Charts.

Figure 8-8 Sequence Charts dialog box

E Click Reset. This will restore the dialog box to its default settings, thus removing the timeline settings used previously.


E Select men and FIT_1 and move them into the Variables list. E Click OK.

Figure 8-9 Predictions from exponential smoothing, no trend or seasonality

Notice that the model, as expected, does not account for the observed seasonality. In addition, it predicts an initial downward trend in contrast to the data, which exhibits a continuing upward trend. Examining the autocorrelations and partial autocorrelations for the residuals of the best-fit simple model provides more quantitative insight than viewing the sequence charts. Significant structure in either of these correlation functions would imply that the underlying model is incomplete.
E From the menus choose:
Graphs
  Time Series
    Autocorrelations...

Figure 8-10 Autocorrelations dialog box

E Select the error variable ERR_1 associated with the fit from the simple model. E Click OK.

Figure 8-11 Residual autocorrelation plot for simple model

The autocorrelation function shows a significant peak at a lag of 12. This is no surprise, since the simple model doesn't account for seasonality and there is a strong annual seasonal component in the data.

Figure 8-12 Residual autocorrelation statistics for simple model

Notice also that the Box-Ljung statistic at a lag of 12 is statistically significant. In fact, the associated p value is 0 to the stated precision. This underscores the fact that the lag 12 autocorrelation is significant and represents structure not accounted for in the present model.

Figure 8-13 Residual partial autocorrelation plot for simple model

The partial autocorrelation function removes the indirect effect of all intervening lags, providing the best measure of a direct relationship between time series values separated by a given lag. The partial autocorrelation function for the simple model shows the same significant peak at a lag of 12 as the autocorrelation function. This provides definitive proof that the residuals of the simple model contain the structure of the annual seasonality of the time series.

Summary
Given the poor fit of the simple model to the data and the presence of significant structure in the model's residuals, you can conclude that the simple model is not a good choice for modeling the data. The initial plot of the time series suggested a model with both trend and seasonality, neither of which is present in the simple model. Next, we will use the Holt model to extend the present model with a trend component.


Building and Analyzing a Holt Model


Figure 8-14 Exponential Smoothing dialog box

E To build a Holt model, open the Exponential Smoothing dialog box. E Select Holt in the Model group. E Click Parameters.

Figure 8-15 Exponential Smoothing Parameters dialog box

E Select Grid Search in both the General (Alpha) and Trend (Gamma) groups. Leave the Start, Stop, and By text boxes with their default values.

The current grid search results in evaluating 66 models, one for each of the possible combinations of alpha and gamma: 11 values of alpha and 6 values of gamma.
E Click Continue.
E Click OK in the Exponential Smoothing dialog box.

Figure 8-16 Exponential smoothing, linear trend and no seasonality


The SSE measure of error is lowest when alpha is 0.1 and gamma is 0. The low values of alpha and gamma mean that all observations have roughly equal weight in modeling both the overall level of the series and the series trend. The SSE for the best-fitting Holt model is 3612687888, which is slightly better than the SSE for the best-fitting simple model of 3716115994. The Holt model then appears to give a better fit to the data than the simple model.
Figure 8-17 Sequence Charts dialog box

E To see how well the Holt model fits the data, open the Sequence Charts dialog box.
E Deselect the variable associated with the simple model (FIT_1) from the Variables list and select the variable corresponding to the Holt model (FIT_2).


E Click OK.

Figure 8-18 Predictions from exponential smoothing, linear trend and no seasonality

Notice that the model does a good job of capturing the trend component of the data. As expected, though, it does not account for the observed seasonality. As in the case of the simple model, examination of the autocorrelations and partial autocorrelations of the residuals provides more quantitative insight.

Figure 8-19 Autocorrelations dialog box

E Examine the autocorrelations of the residuals by opening the Autocorrelations dialog box.
E Deselect the variable associated with the simple model (ERR_1) from the Variables list and select the variable corresponding to the Holt model (ERR_2).
E Click OK.

Figure 8-20 Residual autocorrelation plot for Holt model

The autocorrelation function shows a significant peak at a lag of 12, just as in the case of the simple model. This is no surprise, since the Holt model (like the simple model) doesn't account for seasonality.

Figure 8-21 Residual partial autocorrelation plot for Holt model

The partial autocorrelation function for the Holt model exhibits the same significant peak at a lag of 12 as the autocorrelation function. This shows that the residuals of the Holt model, like those of the simple model, contain the structure of the annual seasonality of the time series.

Summary
Given the presence of significant structure in the residuals of the Holt model corresponding to the seasonality component, you can conclude that the Holt model, although better than the simple model, is not a good choice for modeling the data. Recall that the initial plot of men's clothing sales over time suggested a model incorporating a linear trend and multiplicative seasonality: a Winters model.


Building and Analyzing a Winters Model


Figure 8-22 Exponential Smoothing dialog box

E To build a Winters model, open the Exponential Smoothing dialog box. E Select Winters in the Model group. E Select Seasonal_Factors_Men and move it into the Seasonal Factors list. E Click Parameters.

Figure 8-23 Exponential Smoothing Parameters dialog box

E Select Grid Search in the General (Alpha), Trend (Gamma), and Seasonal (Delta) groups. Leave the Start, Stop, and By text boxes with their default values.
E Click Continue.
E Click OK in the Exponential Smoothing dialog box.

Figure 8-24 Exponential smoothing, linear trend and multiplicative seasonality


The SSE measure of error is lowest when alpha is 0.1, gamma is 0, and delta is 0. The low values of alpha, gamma, and delta mean that all observations have roughly equal weight in modeling the overall level of the series, the series trend, and the seasonality. The SSE for the best-fit Winters model is significantly less than the corresponding values for both the Holt (SSE of 3612687888) and simple (SSE of 3716115994) models. This is the first indication that the Winters model provides a substantially better fit to the data.
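In syntax, the Winters grid search might be sketched as follows; SEASFACT names the seasonal factor series, and the keyword forms are again recalled from the EXSMOOTH command rather than guaranteed:

EXSMOOTH VARIABLES=men
  /MODEL=WINTERS
  /SEASFACT=Seasonal_Factors_Men
  /ALPHA=GRID(0,1,.1)
  /GAMMA=GRID(0,1,.2)
  /DELTA=GRID(0,1,.2).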
Figure 8-25 Sequence Charts dialog box

E To see how well the Winters model fits the data, open the Sequence Charts dialog box.
E Deselect the variable associated with the Holt model (FIT_2) from the Variables list and select the variable corresponding to the Winters model (FIT_3).


E Click OK.

Figure 8-26 Predictions from exponential smoothing, linear trend and multiplicative seasonality

Notice that the model does a good job of capturing both the trend and the seasonality of the data. The data set covers a period of 10 years and includes 10 seasonal peaks occurring in December of each year. The 10 peaks present in the predicted results match up well with the 10 annual peaks in the real data. The results also underscore the limitations of the Exponential Smoothing procedure, since there is significant structure that is not accounted for. If you are primarily interested in modeling a long-term trend with seasonal variation, then exponential smoothing may be a good choice. To model a more complex structure, consider using the ARIMA procedure. An examination of the autocorrelations and partial autocorrelations of the residuals will provide stronger evidence for or against the Winters model.

Figure 8-27 Autocorrelations dialog box

E Examine the autocorrelations of the residuals by opening the Autocorrelations dialog box.
E Deselect the variable associated with the Holt model (ERR_2) from the Variables list and select the variable corresponding to the Winters model (ERR_3).


E Click OK.

Figure 8-28 Residual autocorrelation plot for Winters model

The Winters model has adequately described the annual seasonal component, so there is no longer a significant peak at a lag of 12 in the autocorrelation function; the peak is within the confidence bounds.

Figure 8-29 Residual autocorrelation statistics for Winters model

Notice also that the Box-Ljung statistic at a lag of 12 is no longer statistically significant.

Figure 8-30 Residual partial autocorrelation plot for Winters model

The partial autocorrelation function no longer has a significant peak at a lag of 12. The peak at a lag of 11, however, is outside the confidence limits and appears significant. This would suggest periodic behavior with an 11-month cycle. The Box-Ljung statistic at a lag of 11, however, isn't statistically significant. In addition, analysis of the catalog company's records shows no good business reason to expect an 11-month cycle. With reasonable confidence, you can then reject the lag 11 peak as irrelevant.

Summary of Model Building


Three models incorporating different assumptions about trend and seasonality have been considered, and the results clearly indicate a Winters model over a Holt or simple model. Not only does the best-fit Winters model give a good visual fit to the data, but the residuals, as analyzed through their autocorrelations and partial autocorrelations, show no significant structure.


Although the initial plot of the time series suggested linear trend and multiplicative seasonality, a more thorough analysis would also consider a custom model with linear trend and additive seasonality. The results of that analysis show, however, that the SSE for the best-fit Winters model is lower, giving us confidence that the Winters model is truly the best exponential smoothing model for our data.

Testing the Predictive Ability of the Model


You can judge the forecasting performance of your model by using holdouts. A holdout is a historical series point that is not used in the computation of the model parameters, thus removing its effect on the computation of forecasts. By forcing the model to predict values that you actually know, you get an idea of how well the model forecasts. This method can be illustrated by holding out the data from January, 1998, through December, 1998. The data prior to January, 1998, are used to build the model, and the model is then used to forecast sales in 1998. To perform a holdout analysis, first select the modeling period:
E From the menus choose:
Data
  Select Cases...

Figure 8-31 Select Cases dialog box

E Select Based on time or case range in the Select group.
E Click Range.

Figure 8-32 Select Cases Range dialog box

E Enter 1989 for the year associated with the first case and 1 for the month associated with the first case.
E Enter 1997 for the year associated with the last case and 12 for the month associated with the last case.

These choices will result in a model based on the period 01/1989 through 12/1997.
E Click Continue.
E Click OK in the Select Cases dialog box.

Figure 8-33 Exponential Smoothing dialog box

E Open the Exponential Smoothing dialog box. E Click OK.

This results in rerunning the Winters model using the data from 01/1989 to 12/1997 to determine the best-fit parameters. The analysis also includes predictions of sales of mens clothing during the holdout period (01/1998 to 12/1998) using the parameters from the best-fit model.
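In command syntax, the modeling and holdout periods are usually set with the USE and PREDICT commands before rerunning the procedure; a hedged sketch of the equivalent selection:

USE YEAR 1989 MONTH 1 THRU YEAR 1997 MONTH 12.
PREDICT THRU YEAR 1998 MONTH 12.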


Comparison of the model predictions for the holdout period with the actual data is best done by limiting the cases to the holdout period itself.
Figure 8-34 Select Cases dialog box

E Open the Select Cases dialog box. E Click Range.

Figure 8-35 Select Cases Range dialog box

E Enter 1998 for the year associated with the first case and 1 for the month associated with the first case.
E Enter 1998 for the year associated with the last case and 12 for the month associated with the last case.
E Click Continue.
E Click OK in the Select Cases dialog box.

Figure 8-36 Sequence Charts dialog box

E Open the Sequence Charts dialog box.
E Deselect the variable associated with the original Winters model (FIT_3) from the Variables list and select the variable corresponding to the Winters model with holdouts (FIT_4).
E Click OK.

Figure 8-37 Forecast from Winters model with holdouts

Other than the value for August, 1998, the forecast values show good agreement with the known values, indicating that the model has satisfactory predictive ability. A review of the company records shows that a 20%-off promotion was in effect for the mailing of August, 1998, which may explain why sales were unusually high for that month. You expect that the model will behave well in the absence of special events such as occasional promotions but may need to be modified if accounting for special events is important.

Using the Model to Predict Future Sales


You've determined that a Winters model is the best exponential smoothing model for your data and obtained satisfactory results regarding its predictive ability. Now it's time to use the model to predict future sales.

Figure 8-38 Select Cases dialog box

To predict future values of sales of men's clothing, first include all cases by performing the following steps:
E Open the Select Cases dialog box. E Select All Cases in the Select group. E Click OK.

Figure 8-39 Exponential Smoothing dialog box

E Open the Exponential Smoothing dialog box.
E Click Save.

Figure 8-40 Exponential Smoothing Save dialog box

E Select Add to file in the Create Variables group.

This causes the forecast values to be saved as regular variables in your working data file.
E Select Predict through in the Predict Cases group. E Enter 1999 for the year and 12 for the month.

E Click Continue.
E Click OK in the Exponential Smoothing dialog box.

Figure 8-41 Sequence Charts dialog box

E To see the sales predictions through 1999, open the Sequence Charts dialog box.
E Deselect the current variables and select the variable corresponding to the final prediction (FIT_5).
E Click Time Lines.

Figure 8-42 Sequence Charts Time Axis Reference Lines dialog box

E Select Line at Date. E Enter 1998 for the year and 12 for the month.

These choices result in a vertical reference line positioned at December, 1998, which acts to separate the future forecasts from the rest of the model predictions.
E Click Continue. E Click OK in the Sequence Charts dialog box.

Figure 8-43 Forecasts for sales of men's clothing

As expected, the forecast values appear to continue the trend and seasonal behavior of the series.
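The syntax counterpart of these steps, as a sketch, restores all cases and extends the prediction period before the procedure is rerun:

USE ALL.
PREDICT THRU YEAR 1999 MONTH 12.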

Summary
Using the Exponential Smoothing procedure, you have constructed a model for predicting future sales of men's clothing for a catalog company. Three models with various assumptions about trend and seasonality were considered, with the conclusion that a Winters model gave the best overall fit to the data. Analysis of the autocorrelations and partial autocorrelations of a model's residuals was seen to be a powerful tool in uncovering structure unaccounted for by an incomplete model, in this case, an annual seasonal component. As part of the model-building process, you used a grid-search method to determine the set of model parameters that gave the best fit, as measured by the sum of squared errors (SSE). A possible refinement of the present model would involve a more precise determination of the best-fit model parameters by using successively finer grids, for example, grids with steps of 0.01, 0.001, and so on.

Related Procedures
The Exponential Smoothing procedure is useful for modeling time series when a detailed understanding of the data is lacking, such as when a set of predictor variables is not available. It works well for time series that exhibit any combination of a slowly varying level, trend, or seasonal component of known periodicity.

When a set of predictor variables is available, consider using the Autoregression procedure. For more information, see Chapter 9.

For time series with a complex, nonrandom structure, with or without a set of predictor variables, use the ARIMA procedure. For more information, see Chapter 10.

To learn how to obtain the initial set of seasonal factors used for the Winters model, see Chapter 11.

To verify assumptions about the presence of a seasonal component, use the Spectral Plots procedure. For more information, see Chapter 12.

Recommended Readings
See the following texts for more information on exponential smoothing:

Gardner, E. S. 1985. Exponential smoothing: The state of the art. Journal of Forecasting, 4, 1-28.

Makridakis, S. G., S. C. Wheelwright, and R. J. Hyndman. 1997. Forecasting: Methods and Applications. New York: John Wiley & Sons.

Chapter 9

Autoregression
Determining Significant Predictors in the Presence of Autocorrelation

The management agency for an up-and-coming band is interested in determining the most significant predictors of sales of the band's most current CD, which was released just prior to a year-long tour. The agency collected a year's worth of data on weekly CD sales along with three possible predictor variables: number of performances per week, number of weekly hits to the band's Web site that included a download of a song clip from the CD, and number of promotional flyers mailed each week. This information is stored in band.sav, found in the \tutorial\sample_files\ subdirectory of the directory in which you installed SPSS. Use regression techniques to determine the best predictors of CD sales.

Preliminaries
In the examples that follow, it is more convenient to use variable names rather than variable labels.
E From the menus choose:
Edit
  Options...


Figure 9-1 Options dialog box

E Select Display names in the Variable Lists group. E Click OK.

Analysis from Ordinary Least-Squares Regression


Is the series of weekly CD sales amenable to analysis using ordinary least-squares regression? The only way to find out is to perform the analysis and examine the model residuals.


Running the Analysis


To run a linear regression analysis:
E From the menus choose:
Analyze
  Regression
    Linear...

Figure 9-2 Linear Regression dialog box

E Select sales for the Dependent variable. E Select performances, web, and flyers for the Independent variables. E Click Statistics.

Figure 9-3 Linear Regression Statistics dialog box

E Select Durbin-Watson in the Residuals group.
E Click Continue.
E Click Plots in the Linear Regression dialog box.

Figure 9-4 Linear Regression Plots dialog box

E Select Normal probability plot in the Standardized Residual Plots group. E Click Continue. E Click Save in the Linear Regression dialog box.

Figure 9-5 Linear Regression Save dialog box

E Select Unstandardized in the Residuals group. E Click Continue. E Click OK in the Linear Regression dialog box.


Coefficients
Figure 9-6 Coefficients table

The coefficients table shows the coefficients of the regression line along with their significance levels. It shows that none of the chosen predictors are statistically significant; that is, all of the values in the Sig. column are greater than 0.05.

Checking the Normality of the Residuals


Figure 9-7 Normal probability plot


The normal probability plot allows you to check the assumption of normality of the residuals. It shows the residuals on the horizontal axis and the expected value (if the residuals were normally distributed) on the vertical axis. If the residuals are normally distributed, the cases fall near the diagonal, as they do here.

Checking the Autocorrelation of the Residuals


The Durbin-Watson statistic is a measure of the first-order autocorrelation of the residuals and thus provides a check of the assumption of uncorrelated residuals. Values of this statistic range from 0 to 4, with values less than 2 indicating positively correlated residuals and values greater than 2 indicating negatively correlated residuals.
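For reference, the statistic is computed from the residuals as follows (a standard textbook definition, not specific to SPSS); because it is approximately 2(1 - r1), where r1 is the lag 1 autocorrelation of the residuals, values near 2 indicate an absence of first-order autocorrelation:

d = \frac{\sum_{t=2}^{T} (e_t - e_{t-1})^2}{\sum_{t=1}^{T} e_t^2} \approx 2(1 - r_1)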
Figure 9-8 Model summary table

The Durbin-Watson statistic is 0.773. From the Durbin-Watson Significance Tables in Appendix A, you find that this value is significant at the 1% level for concluding that the residuals show positive first-order autocorrelation. In addition to obtaining the Durbin-Watson statistic for the residuals, it's also a good idea to plot the partial autocorrelation function of the residuals. This will uncover the full structure of the autocorrelation of the residuals. First-order autocorrelation is detected as a significant peak at lag 1 in the partial autocorrelation function. To obtain the partial autocorrelations:
E From the menus choose:
Graphs
  Time Series
    Autocorrelations...

Figure 9-9 Autocorrelations dialog box

E Select RES_1 for the variable.
E Click OK.

Figure 9-10 Residual partial autocorrelation plot for linear regression model


The partial autocorrelation function shows a significant spike at a lag of 1, confirming the conclusions drawn from the Durbin-Watson statistic. The peak at a lag of 15 can be discounted as spurious, since there is no reason to expect autocorrelation at this lag number. The presence of first-order autocorrelated residuals violates the assumption of uncorrelated residuals that underlies the ordinary least-squares regression method. The failure of ordinary least squares is evident in the fact that none of the chosen predictors is statistically significant, not even the number of performances! The conclusion is that ordinary least-squares regression is not an appropriate tool for analyzing this time series.

Applying Autoregression to the Problem


The presence of first-order autocorrelated residuals suggests modeling the time series with one of the methods from the Autoregression procedure, which is specifically designed to handle such series. Of the available methods, the Prais-Winsten method is generally preferable to the Cochrane-Orcutt method. The more computationally complex maximum-likelihood method is necessary only when there are missing data within the time series or when one of the independent variables is the lagged dependent variable, neither of which is the case here. So the Prais-Winsten method should be fine for the present time series.

Running the Analysis


To run the Autoregression procedure:
E From the menus choose:
Analyze
  Time Series
    Autoregression...

Figure 9-11 Autoregression dialog box

E Select sales for the Dependent variable. E Select performances, web, and flyers for the Independent variables. E Select Prais-Winsten in the Method group. E Click OK.
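The equivalent command, as a minimal sketch (METHOD=PW requests Prais-Winsten; CO and ML select the Cochrane-Orcutt and maximum-likelihood methods):

AREG sales WITH performances web flyers
  /METHOD=PW.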

Coefficients
Figure 9-12 Regression coefficients table

101 Autoregression

The regression coefficients table shows the coefficients calculated from the Autoregression procedure. It shows that the variable representing promotional flyers is not statistically significant. Notice that the two variables representing performances and web hits are now statistically significant, whereas neither was significant according to the results of the ordinary least squares regression. This is a dramatic change and a direct result of the failure of ordinary least squares regression in the presence of first-order autocorrelated residuals.

Checking the Autocorrelation of the Residuals


Figure 9-13 Model fit summary table

The Durbin-Watson statistic is 1.756. From the Durbin-Watson Significance Tables in Appendix A, you find that this value is consistent with an absence of first-order autocorrelation in the residuals. Examination of the partial autocorrelation function will determine if the residuals show any other structure. To obtain the partial autocorrelations:
E From the menus choose:
Graphs
  Time Series
    Autocorrelations...

Figure 9-14 Autocorrelations dialog box

E Deselect RES_1 (residuals from ordinary least squares) from the Variables list and select ERR_1.
E Click OK.

Figure 9-15 Residual partial autocorrelation plot for autoregression model

The partial autocorrelation at a lag of 1 is not significant, in agreement with our findings from the Durbin-Watson statistic. In fact, none of the partial autocorrelations are significant. This confirms the absence of any structure in the residuals.

Rerunning the Analysis with the Significant Predictors


The results of an initial Autoregression analysis imply that the significant predictors are performances and web. This can be confirmed by rerunning the analysis with an appropriately modified regression model.

Figure 9-16 Autoregression dialog box

To rerun the Autoregression procedure:


E Open the Autoregression dialog box.
E Deselect flyers from the list of independent variables.
E Click OK.

Figure 9-17 Regression coefficients table

The regression coefficients table shows that by removing the nonsignificant predictor, the significance value of the variable representing Web hits has decreased from 0.041 to 0.027. A decrease in this significance value implies a stronger correlation between CD sales and Web hits.

Figure 9-18 Model fit summary table

The large value of R shown in the model fit summary table indicates a strong relationship between the observed and model-predicted values of the dependent variable. Likewise, the value of R2 shows that almost 50% of the variation in sales is explained by the model. Taken together, these values indicate a good fit of the model to the data. You conclude that the best predictors of weekly CD sales are the number of weekly performances and the number of weekly Web hits that include a song download. Promotional flyers seem to have little effect on CD sales.

Summary
Using the Autoregression procedure, you have determined the set of significant predictors from a set of candidate predictors for a time series exhibiting first-order autocorrelation. You also witnessed the failure of ordinary least-squares regression in the presence of autocorrelation: predictors characterized as nonsignificant, according to ordinary regression, were seen to be significant once the autocorrelation was properly handled.

Related Procedures
The Autoregression procedure is useful for performing regression analysis on series exhibiting first-order autocorrelation.

For autocorrelation due to seasonal effects, consider using the Seasonal Decomposition procedure to remove seasonality as a precursor to using the Autoregression procedure. See Chapter 11 for details.

For series with more complex structure than simply first-order autocorrelation, use the ARIMA procedure. See Chapter 10 for details.

Chapter 10

ARIMA

The ARIMA Model

There are three basic components to an ARIMA model: autoregression (AR), differencing or integration (I), and moving-average (MA). All three are based on the simple concept of random disturbances or shocks. Between two observations in a series, a disturbance occurs that somehow affects the level of the series. These disturbances can be mathematically described by ARIMA models, and each of the three types of processes has its own characteristic way of responding to a random disturbance.

In its simplest form, an ARIMA model is typically expressed as ARIMA(p,d,q), where p is the order of autoregression, d is the order of differencing (or integration), and q is the order of moving-average involved. These components are used to explain significant correlations found in the autocorrelation (ACF) and partial autocorrelation (PACF) plots and to handle trends.

Autoregression (ARIMA)
The first of the three processes included in ARIMA models is autoregression. In an autoregressive (AR) process, each value in a series is a linear function of the preceding value or values. In a first-order autoregressive process, only the single preceding value is used; in a second-order process, the two preceding values are used, and so on. These processes are commonly indicated by the notation AR(n) or ARIMA(n,0,0), where the number in parentheses indicates the order. An AR(1) or ARIMA(1,0,0) process has the following functional form:

Value(t) = Coefficient * Value(t-1) + disturbance(t)


where:

Value(t) = the value of the series at time t.

Coefficient = a value that indicates how strongly each value depends on the preceding value. The sign and magnitude of the coefficient are directly related to the sign and magnitude of the partial autocorrelation at lag 1. When the coefficient is greater than -1 and less than +1, the influence of earlier observations dies out exponentially.

disturbance(t) = the chance error associated with the series value at time t.

Conceptually, an autoregressive process is one with a memory, in that each value is correlated with all preceding values. In an AR(1) process, the current value is a function of the preceding value, which is a function of the one preceding it, and so on. Thus, each shock or disturbance to the system has a diminishing effect on all subsequent time periods. (In this respect, autoregressive forecasts are similar to those made with exponential smoothing. The algorithm used in ARIMA is quite different, however, from that used in exponential smoothing.)
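To make this concrete, a short worked illustration using standard AR(1) algebra (not specific to SPSS): with a coefficient of 0.5, a one-unit disturbance at time t is felt as 0.5 one period later, 0.25 two periods later, and so on:

\frac{\partial\,\mathrm{Value}_{t+k}}{\partial\,\mathrm{disturbance}_t} = \mathrm{Coefficient}^{\,k} = 0.5^k = 0.5,\ 0.25,\ 0.125,\ \ldots \quad (k = 1, 2, 3, \ldots)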

Differencing (ARIMA)
The differencing or integration component of an ARIMA model tries, through differencing, to make a series stationary. Time series often reflect the cumulative effect of some process that is responsible for changes in the level of the series but is not responsible for the level itself. A series that measures the cumulative effect of something is called integrated. You can study an integrated series by looking at the changes, or differences, from one observation to the next. When a series wanders, the difference from one observation to the next is often small. Thus, the differences of even a wandering series often remain fairly constant. This steadiness, or stationarity, of the differences is highly desirable from a statistical point of view. The standard shorthand for integrated models, or models that need to be differenced, is I(1) or ARIMA(0,1,0). Occasionally you will need to look at differences of the differences; such models are termed I(2) or ARIMA(0,2,0). Differencing beyond the second or third order is rare. Usually, when a series exhibits such extreme trends, it is not stationary due to a nonconstant variance. Applying a log or square root transformation to the series before estimating the model will generally stabilize the variance.


An integration of order 1 is equivalent to an order 1 autoregression in which the coefficient equals 1.0. You could try to simply ignore the integration component and let the software estimate AR coefficients near 1.0, but this is not recommended. Estimates of other parameters are generally better when an appropriate integration component is specified rather than when coefficients near 1.0 are left to the autoregression.

Moving-average (ARIMA)
The moving-average (MA) component of an ARIMA model tries to predict future values of the series based on deviations from the series mean observed for previous values. In a moving-average process, each value is determined by the weighted average of the current disturbance and one or more previous disturbances. The order of the moving-average process specifies how many previous disturbances are averaged into the new value. In the standard notation, an MA(n) or ARIMA(0,0,n) process uses n previous disturbances along with the current one. An MA(1) or ARIMA(0,0,1) process has the functional form:

Value(t) = Coefficient * disturbance(t-1) + disturbance(t)

where:

Value(t) = the value of the series at time t.
Coefficient = a term that indicates how strongly each value depends on the preceding disturbance terms. The sign and magnitude of the coefficient are directly related to the sign and magnitude of the autocorrelation at lag 1.
disturbance(t) = the chance error associated with the series value at time t.

The difference between an autoregressive process and a moving-average process is subtle but important. Each value in a moving-average series is a weighted average of the most recent random disturbances, while each value in an autoregression is a weighted average of the recent values of the series. Since these values in turn are weighted averages of the previous ones, the effect of a given disturbance in an autoregressive process dwindles as time passes. In a moving-average process, a disturbance affects the system for a finite number of periods (the order of the moving-average) and then abruptly ceases to affect it. In practical terms, MA processes are more useful for modeling short-term fluctuations, while AR processes are more useful for modeling longer-term effects.

Seasonal Orders
The full notation for an ARIMA model is ARIMA(p,d,q)(P,D,Q), where P, D, and Q are the seasonal AR, I, and MA components. Seasonal components work just like their nonseasonal counterparts, but they skip over the seasonal interval. For example, if you have monthly data, a nonseasonal order 1 AR process would model December's value based on November's value, while a seasonal order 1 AR process would model December's value based on the previous December's value.

Steps in Using ARIMA


Since the three types of random processes in ARIMA models are closely related, there is no algorithm that can determine the correct model. Instead, there is a model-building procedure that allows you to construct the best possible model for a series. This procedure consists of three steps (identification, estimation, and diagnosis), which you repeat until your model is satisfactory.

Identification
The first and most subjective step is the identification of the processes underlying the series. You must determine the three integers p, d, and q, representing respectively the number of autoregressive orders, the number of differencing orders, and the number of moving-average orders of the ARIMA model. For a seasonal model, you must also specify the seasonal counterparts to these parameters. The identification process for the autoregressive and moving-average components requires a stationary series. A stationary series has the same mean and variance throughout. Autoregressive and moving-average processes are inherently stationary, whereas integrated series typically are not.


If a series is not stationary, you must transform it until you obtain one that is stationary. The most common transformation is differencing, which replaces each value in the series by the difference between that value and the preceding value (for seasonal differencing, "preceding" means the value one seasonal lag prior to the current value). Differencing is necessary when the mean is not stationary. Logarithmic and square-root transformations are useful when the variance is not stationary, such as when there is more short-term variation with large series values than with small series values. Once you have obtained a stationary series, you know the second ARIMA parameter, d: it is simply the number of times you had to difference the series to make it stationary. Next you must identify p and q, the orders of autoregression and moving-average. Pure autoregressive and moving-average processes have characteristic signatures in the autocorrelation and partial autocorrelation functions. AR(p) models have exponentially declining values of the ACF (possibly with alternating positive and negative values) and have precisely p spikes in the first p values of the PACF. MA(q) models have precisely q spikes in the first q values of the ACF and exponentially declining values of the PACF. If the ACF declines very slowly, you need to take differences before identifying the AR and MA components. Mixed AR and MA models have more complex ACF and PACF patterns. Identifying them often takes several cycles of identification-estimation-diagnosis. For plots of the theoretical ACF and PACF functions for the most common AR and MA models, see Appendix B.

Estimation
The Trends ARIMA procedure estimates the coefficients of the model you have tentatively identified. You supply the parameters p, d, and q, and ARIMA performs the iterative calculations needed to determine maximum-likelihood coefficients and adds new series to your file representing the fit or predicted value, the error (residual), and the confidence limits for the fit. You use these new series in the next step, the diagnosis of your model.


Diagnosis
Diagnosing an ARIMA model is a crucial part of the model-building process and involves analyzing the model residuals. A residual is the difference, or error, between the observed value and the model-predicted value. A large residual means that the model did a poor job of fitting that particular point. If the model is a good fit for the series, the residuals should be random. The following checks are essential:

The autocorrelation function and partial autocorrelation function of the residual series should not be significantly different from 0. One or two high-order correlations may exceed the 95% confidence level by chance, but if the first- or second-order correlation is large, you have probably incorrectly specified the model.

The residuals should be without pattern. A common test for this is the Box-Ljung Q statistic, also called the modified Box-Pierce statistic. You should look at Q at a lag of about one-quarter of the sample size (but no more than 50). This statistic should not be significant. If the autocorrelation at a particular lag exceeds the confidence level but the Box-Ljung statistic at that lag isn't significant, then you can ignore the autocorrelation as a chance occurrence.

Preliminaries
In the examples that follow, it is more convenient to use variable names rather than variable labels.
E From the menus choose: Edit > Options...

Figure 10-1 Options dialog box

E Select Display names in the Variable Lists group. E Click OK.

Using Seasonal ARIMA with Predictors to Model Catalog Sales


A catalog company, interested in developing a forecasting model, has collected data on monthly sales of men's clothing along with several series that might be used to explain some of the variation in sales. Possible predictors include the number of catalogs mailed and the number of pages in the catalog, the number of phone lines open for ordering, the amount spent on print advertising, and the number of customer service representatives. This information is collected in catalog_seasfac.sav, found in the \tutorial\sample_files\ subdirectory of the directory in which you installed SPSS.


Are any of the predictors useful for forecasting? Is a model with predictors really better than one without? Use the ARIMA procedure to create forecasting models with and without predictors, and see if there is a significant difference in predictive ability.

Plotting the Catalog Sales Series


The first step in the model-building process is to plot the series and look for any evidence that the mean or variance is not stationary. Remember, the ARIMA procedure assumes that your original series is stationary. To obtain a plot of men's clothing sales over time:
E From the menus choose: Graphs > Sequence...
Figure 10-2 Sequence Charts dialog box

E Select men and move it into the Variables list. E Select date and move it into the Time Axis Labels list.

E Click OK.
Figure 10-3 Sales of men's clothing (in U.S. dollars)

The series shows a global upward trend, making it clear that the level of the series is not stationary. Some degree of differencing will be necessary to stabilize the series level. The variance of the series appears stationary. The series also exhibits numerous peaks, many of which appear to be equally spaced. This suggests the presence of a periodic component to the time series. Given the seasonal nature of sales, with highs typically occurring during the holiday season, you shouldn't be surprised to find an annual seasonal component to the data.
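If you prefer command syntax to the dialogs, a sequence chart like this one can be requested with the TSPLOT command. This is a sketch; the syntax pasted from your own dialogs is the authoritative form:

  TSPLOT VARIABLES=men
   /ID=date.

The /ID subcommand supplies the variable used to label the time axis.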


Identifying a Model
You've established that the series has a trend, so some amount of differencing will be required to obtain a stationary series. The likely presence of a seasonal component means that seasonal differencing may be needed. A plot of the autocorrelation function will tell if seasonal differencing is required. If there is a slow decrease in autocorrelations separated by the seasonal interval (for example, a separation of 12 for annual seasonality), then seasonal differencing is necessary to stabilize the series. To obtain a plot of the autocorrelation function:
E From the menus choose: Graphs > Time Series > Autocorrelations...
Figure 10-4 Autocorrelations dialog box

E Select men and move it into the Variables list.

To allow for an investigation of the need for seasonal differencing, the scope of the ACF plot has to be extended beyond the default of 16 lags.
E Click Options.

Figure 10-5 Autocorrelations Options dialog box

E Type 24 in the Maximum Number of Lags text box. E Click Continue. E Click OK in the Autocorrelations dialog box.
Figure 10-6 Autocorrelation plot for men


The autocorrelation function exhibits significant peaks at lags 1 and 2 as well as significant peaks at lags 12 and 24. Since each data point represents one month, the lag 12 and 24 peaks confirm the presence of an annual seasonal component. The small drop in the ACF at lag 24 relative to the value at lag 12 reflects the fact that the series level is not stationary and indicates that seasonal differencing is necessary. Nonseasonal differencing may also be necessary but will be easier to detect once the series has been seasonally differenced.
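For reference, the ACF plot just examined corresponds roughly to the following command syntax (a sketch, with all transformations left off):

  ACF VARIABLES=men
   /MXAUTO=24.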
Figure 10-7 Sequence Charts dialog box

E Open the Sequence Charts dialog box. E Select Seasonally difference in the Transform group and leave the text box value at its default of 1.
E Click OK.

Figure 10-8 Seasonally differenced plot for men

Seasonally differencing the data once stabilizes the series level. Notice that the mean of the differenced series appears to be 0. The global upward trend, present in the original series, has been removed. The ACF plot of the seasonally differenced series will show if additional differencing is required.

Figure 10-9 Autocorrelations dialog box

E Open the Autocorrelations dialog box. E Select Seasonally difference in the Transform group and leave the text box value at its default of 1.
E Click OK.

Figure 10-10 Seasonally differenced autocorrelation plot for men

Seasonal differencing has removed the slow decay of the ACF over seasonal lags, and there is no evidence that further differencing, either seasonal or nonseasonal, is required. The conclusion is that one order of seasonal differencing is sufficient for stabilizing the series. Now determine any autoregressive and/or moving-average orders needed to model the series. The strong seasonality of the data suggests that seasonal ARIMA orders are present. An effective approach for isolating seasonal orders is to examine the ACF and PACF plots at the seasonal lags, ignoring, for the moment, the correlations at nonseasonal lags.

Figure 10-11 Autocorrelations Options dialog box

E Open the Autocorrelations dialog box. E Click Options. E Type 48 in the Maximum Number of Lags text box. E Select Display autocorrelations at periodic lags. E Click Continue. E Click OK in the Autocorrelations dialog box.
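A rough command-syntax equivalent of these selections follows; the SEASONAL subcommand is assumed here to be what restricts the display to periodic lags:

  ACF VARIABLES=men
   /SDIFF=1
   /MXAUTO=48
   /SEASONAL.
  PACF VARIABLES=men
   /SDIFF=1
   /MXAUTO=48
   /SEASONAL.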

Figure 10-12 Seasonally differenced partial autocorrelation plot at seasonal lags only

The PACF plot shows a significant peak at a lag of 12, followed by evidence of a tail extending beyond lag 48.

Figure 10-13 Seasonally differenced autocorrelation plot at seasonal lags only

The ACF plot shows a significant peak at a lag of 12 without strong evidence of a substantial tail. The characteristic ACF and PACF patterns produced by seasonal processes are the same as those shown for nonseasonal processes in Appendix B, except that the patterns occur in the first few seasonal lags rather than the first few lags. The spikes in the ACF/PACF plots at the first seasonal lag (lag 12), coupled with a tail in the PACF plot, indicate a seasonal moving-average ARIMA component of order 1. Given that you've already identified a seasonal differencing component of order 1, this suggests that an ARIMA(0,0,0)(0,1,1) model may be most appropriate for this series.

Creating the Model


The general ARIMA model includes a constant term, whose interpretation depends on the model you are using: In MA models, the constant is the mean level of the series.


In AR(1) models, the constant is a trend parameter. When a series has been differenced, the above interpretations apply to the differences. You've determined that a candidate model is ARIMA(0,0,0)(0,1,1), which is an MA model of a differenced series. Therefore, the constant term will represent the mean level of the differences. Since you know that the mean level of the differences is about 0 for the series of men's clothing sales, the constant term in the ARIMA model should be 0. The Trends implementation of ARIMA lets you suppress the estimation of the constant term. This speeds up the computation, simplifies the model, and yields slightly smaller standard errors of the other estimates. To build an ARIMA model:
E From the menus choose: Analyze > Time Series > ARIMA...
Figure 10-14 ARIMA dialog box

E Select men for the Dependent variable.

E Type 1 in the Difference text box in the Seasonal column. E Type 1 in the Moving Average text box in the Seasonal column. E Deselect Include constant in model. E Click OK.
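In command syntax, this model specification corresponds approximately to:

  ARIMA men
   /MODEL=(0,0,0)(0,1,1) 12 NOCONSTANT.

Here (0,0,0) gives the nonseasonal orders, (0,1,1) 12 gives the seasonal orders and the periodicity, and NOCONSTANT suppresses the constant term.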

Model Diagnosis
Diagnosing an ARIMA model is a crucial part of the model-building process and involves verifying that the residuals are random. The most direct evidence of random residuals is the absence of significant values of the Box-Ljung Q statistic at lags of about one quarter of the sample size. Since the current sample size is 120, you should analyze values in the region of the lag 30 statistic.
Figure 10-15 Autocorrelations dialog box

E Open the Autocorrelations dialog box. E Deselect men from the Variables list and select ERR_1. E Deselect Seasonally difference in the Transform group. E Click Options.

Figure 10-16 Autocorrelations Options dialog box

E Type 36 in the Maximum Number of Lags text box. E Deselect Display autocorrelations at periodic lags. E Click Continue. E Click OK in the Autocorrelations dialog box.
Figure 10-17 Residual autocorrelation statistics for ARIMA(0,0,0)(0,1,1)

None of the Box-Ljung values in the vicinity of lag 30 are significant. This confirms that the residuals for the ARIMA(0,0,0)(0,1,1) model are random, which also means that no essential components have been omitted from the model.
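A syntax sketch of this residual check (the Box-Ljung statistics appear in the table that accompanies the ACF output):

  ACF VARIABLES=err_1
   /MXAUTO=36.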


Adding Predictors to the Model


You've determined that an ARIMA(0,0,0)(0,1,1) model does a good job of capturing the structure of the time series; however, the model is based only on the series itself and doesn't incorporate information about the possible predictor series included with the original data set. Can you build a better forecasting model by treating sales of men's clothing as a dependent variable and treating variables, such as the number of catalogs mailed and the number of phone lines open for ordering, as independent variables? ARIMA treats these predictor, or independent, variables much like predictor variables in regression analysis: it estimates the coefficients for them that best fit the data.
Figure 10-18 ARIMA dialog box

To build an ARIMA model with predictors:


E Open the ARIMA dialog box. E Select mail, page, phone, print, and service for the Independent variables. E Click OK.

Figure 10-19 Parameter estimates table

The parameter estimates table provides estimates of the model parameters and associated significance values, including both the AR and MA orders as well as any predictors. Notice that the parameter representing the seasonal moving-average component (labeled SMA1) is significant. This is expected, since you've already determined that it should be part of the model. The variable representing the number of pages in a catalog is not significant. In fact, the only significant predictors are the number of catalogs mailed and the number of phone lines open for ordering.
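In syntax form, the run with all five candidate predictors is roughly:

  ARIMA men WITH mail page phone print service
   /MODEL=(0,0,0)(0,1,1) 12 NOCONSTANT.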

Testing the Predictive Ability of the Model


Is the model with predictors really better than the one without predictors? You can test the predictive ability of a model by using holdouts. A holdout is a historical series point that is not used in the computation of the model parameters, thus removing its effect on the computation of forecasts. By forcing the model to predict values you actually know, you get an idea of how well the model forecasts. This method can be illustrated by holding out the data from January, 1998, through December, 1998. The data prior to January, 1998, are used to build the model, and the model is then used to forecast sales in 1998. To perform a holdout analysis, first select the modeling period:
E From the menus choose: Data > Select Cases...

Figure 10-20 Select Cases dialog box

E Select Based on time or case range in the Select group. E Click Range.
Figure 10-21 Select Cases Range dialog box

E Type 1989 for the year associated with the first case and 1 for the month associated with the first case.
E Type 1997 for the year associated with the last case and 12 for the month associated with the last case.

These choices will result in a model based on the period 01/1989 through 12/1997.
E Click Continue. E Click OK in the Select Cases dialog box.
Figure 10-22 ARIMA dialog box

E Open the ARIMA dialog box. E Deselect the nonsignificant predictors (page, print, and service) from the list of independent variables.

Note that you can always view the variable label associated with a variable name by clicking on the variable, right-clicking, and selecting Variable Information.
E Click OK.

This results in rerunning the ARIMA procedure, with the significant predictors, using the data from 01/1989 to 12/1997 to determine the best-fit parameters. The analysis also includes predictions of sales of men's clothing during the holdout period (01/1998 to 12/1998), using the parameters from the best-fit model. You now need to rerun the ARIMA procedure for a model with no predictors.
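One way to express the holdout fit with the significant predictors in command syntax is with the Trends USE and PREDICT commands, which set the estimation and forecast periods. This is a sketch; the date specifications assume the monthly date variables created by Define Dates:

  USE YEAR 1989 MONTH 1 THRU YEAR 1997 MONTH 12.
  PREDICT THRU YEAR 1998 MONTH 12.
  ARIMA men WITH mail phone
   /MODEL=(0,0,0)(0,1,1) 12 NOCONSTANT.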
Figure 10-23 ARIMA dialog box

E Open the ARIMA dialog box. E Deselect mail and phone from the list of independent variables so that the Independent(s) list is empty.


E Click OK.

This results in rerunning the ARIMA procedure, with no predictors, using the data from 01/1989 to 12/1997 to determine the best-fit parameters. The results are best viewed after modifying labels for the fit variables created by the ARIMA procedure. To modify variable labels:
E Click on the Variable View tab in the Data Editor window.
E Click on the Label column in the row for FIT_3 and enter the text Fit from ARIMA with predictors.
E Click on the Label column in the row for FIT_4 and enter the text Fit from ARIMA without predictors.

Comparison of the model predictions for the holdout period with the actual data is best done by limiting the cases to the holdout period itself.
Figure 10-24 Select Cases dialog box

E Open the Select Cases dialog box. E Click Range.

Figure 10-25 Select Cases Range dialog box

E Type 1998 for the year associated with the first case and 1 for the month associated with the first case.
E Type 1998 for the year associated with the last case and 12 for the month associated with the last case.


E Click Continue. E Click OK in the Select Cases dialog box.

Figure 10-26 Sequence Charts dialog box

To plot the actual data along with the predictions from the ARIMA models:
E Open the Sequence Charts dialog box. E Add FIT_3 (ARIMA with predictors) and FIT_4 (ARIMA without predictors) to the list of variables.
E Deselect Seasonally difference in the Transform group. E Click OK.

Figure 10-27 Forecasts for sales of men's clothing: models with and without predictor series

It is clear from the plot that the ARIMA model with predictors fits the actual data much better than the model without predictors.

Summary
You have learned how to build a seasonal ARIMA model using the autocorrelation and partial autocorrelation functions to identify the ARIMA orders. A number of candidate predictor variables were added to the model and evaluated based on their statistical significance. The final model, keeping only significant predictors, was compared to the model with no predictors. Results clearly showed that the model with predictors did a better job of explaining the variance of the data.


Using Intervention Analysis to Determine Change in Market Share


The retail grocery market in a medium-sized metropolitan area is dominated by two supermarket chains: Nortons and EdMart. Nortons was recently purchased by a large national grocery chain that then introduced its own brand of products, most of which sell for substantially less than the name-brand products offered at EdMart. For a number of years, EdMart has maintained about a 5% edge in market share over Nortons, primarily due to its superior customer service. During the first two months of ownership, the new parent company of Nortons launched an aggressive campaign advertising its own product line. The result was a rapid and dramatic increase in market share. Was the increase in market share solely at the expense of EdMart's share, or is some of the increase due to losses by the small mom-and-pop groceries that make up the rest of the local market? Monthly market share data for Nortons and EdMart is stored in stores.sav. The data consists of the six years prior to the buyout and the two years following the buyout. Use intervention analysis in the context of an ARIMA model to analyze the effect of the buyout on market share. Before developing an intervention model, you should examine the market share time series to get a feel for the effect of the buyout.

Plotting the Market Share Series


To obtain a plot of the market share time series:
E From the menus choose: Graphs > Sequence...

Figure 10-28 Sequence Charts dialog box

E Select nortons and edmart and move them into the Variables list. E Select MONTH_ and move it into the Time Axis Labels list. E Click OK.

Figure 10-29 Market shares of EdMart and Nortons

The plot of the market share data clearly shows the 5% advantage enjoyed by EdMart for roughly the first six years of data. The impact of the buyout is evident in the sharp drop of the EdMart market share and the sharp rise of the Nortons market share occurring near the six-year mark. Other than the level shift caused by the buyout, both series appear to have a constant level as well as a constant variance, indicative of stationary series.

Intervention Analysis Strategy


The impact of the Nortons buyout on the market share series is called an intervention. The basic strategy of intervention analysis is:

Develop a model for the series before the intervention.
Add one or more dummy variables that represent the timing of the intervention.
Reestimate the model, including the new dummy variables, for the entire series.
Interpret the coefficients of the dummy variables as measures of the effect of the intervention.

This strategy will be implemented for both the Nortons and EdMart data. As a first step, you need to develop a model for each series prior to the intervention. In this case, the intervention period begins in the 73rd month of data, when Nortons was purchased by the national chain and the aggressive ad campaign began.

Identifying a Model
Choosing a good ARIMA model involves looking at the series to decide whether a transformation, log or square root, is necessary to stabilize the series and then looking at plots of the autocorrelation function (ACF) and partial autocorrelation function (PACF) to determine the ARIMA orders. The plot of the market shares showed that, other than a one-time change in level, both series are stationary. No transformations of the data then appear necessary. To determine the ARIMA orders from the autocorrelation functions, you first need to limit the cases to the period prior to the intervention, that is, the first 72 cases.
E From the menus choose: Data > Select Cases...

Figure 10-30 Select Cases dialog box

E Select Based on time or case range in the Select group. E Click Range.
Figure 10-31 Select Cases Range dialog box

E Type 1 for the first case. E Type 72 for the last case.

E Click Continue. E Click OK in the Select Cases dialog box.

Since there is no reason to assume different underlying processes for the two market share series, you only need to look at the autocorrelations and partial autocorrelations for one of them, say nortons.
E From the menus choose: Graphs > Time Series > Autocorrelations...
Figure 10-32 Autocorrelations dialog box

E Select nortons and move it into the Variables list. E Click OK.

Figure 10-33 Autocorrelation plot for Nortons

The autocorrelation function shows a single significant peak at a lag of 1.

Figure 10-34 Partial autocorrelation plot for Nortons

The partial autocorrelation function shows a significant peak at a lag of 1 accompanied by a tail. From Appendix B, you find that the spikes in the ACF/PACF plots at lag 1 with a tail in the PACF plot indicate a moving-average ARIMA component of order 1, or an ARIMA(0,0,1) model.

Determining the Intervention Period


You've determined that the series prior to the buyout of Nortons follow an ARIMA(0,0,1) model. Now you must have a way of accounting for the change in market share due to the intervention. The first task is to determine the period during which the market share series showed significant level changes. A plot of the market series before and after the buyout should provide the answer. It's best, however, to limit the cases in order to gain a clearer picture of the intervention period.

Figure 10-35 Select Cases dialog box

E Open the Select Cases dialog box. E Select Based on time or case range in the Select group. E Click Range.
Figure 10-36 Select Cases Range dialog box

E Type 60 for the first case.

E Click Continue. E Click OK in the Select Cases dialog box.

This will limit the cases to the range from 60 (a year prior to the intervention) to the end of the series.
Figure 10-37 Sequence Charts dialog box

To plot the selected points:


E Open the Sequence Charts dialog box.

Recall that the national chain launched an aggressive campaign advertising its budget product line in the first two months after the buyout. It may prove useful to visually delineate the end of the ad campaign on the market share plot.
E Click Time Lines.

Figure 10-38 Sequence Charts Time Axis Reference Lines dialog box

E Select Line at date. E Type 74 for the month.

This will result in a vertical line at month 74.


E Click Continue. E Click OK in the Sequence Charts dialog box.

Figure 10-39 Plot of pre- and post-intervention market shares

The plot makes it clear that both series reach their new levels by month 74. The intervention period is thus given by the two months of the ad campaign, months 73 and 74.

Creating Intervention Variables


Both market share series are characterized by a statistically constant level prior to the intervention, followed by a statistically constant level after the end of the intervention period. The intervention simply causes the edmart series to drop by a fixed value and the nortons series to increase by a possibly different fixed value. A constant shift in the level of a series can be modeled with a variable that is 0 until some point in the series and 1 thereafter. If the coefficient of the variable is positive, the variable acts to increase the level of the series, and if the coefficient is negative, the variable acts to decrease the level of the series. Such variables are referred to as dummy variables, and this particular type of dummy variable is referred to as a step function because it abruptly steps up from a value of 0 to a value of 1 and then remains at 1. So, qualitatively, the drop in the edmart series can be modeled by a step function with a negative coefficient, and the rise in the nortons series can be modeled by a step function with a positive coefficient. The only complication in the present case is that the series change levels over a two-month period. This requires the use of two step functions, one to model the level change in month 73 and one to model the change in month 74.
Figure 10-40 Select Cases dialog box

Before creating the dummy variables, restore all cases.


E Open the Select Cases dialog box. E Select All cases in the Select group. E Click OK.


To create the step variables:


E From the menus choose: Transform > Compute...
Figure 10-41 Compute Variable dialog box

E In the Target Variable text box, type step73. E Click the Numeric Expression text box and type MONTH_>=73. E Click OK.

This creates a variable that has a value of 1 for cases in which it is true that MONTH_ is greater than or equal to 73 and a value of 0 for all other cases. Now, repeat this process, creating another variable, step74, with the expression MONTH_>=74.
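Both step variables can also be created with a short syntax job:

  COMPUTE step73 = (MONTH_ >= 73).
  COMPUTE step74 = (MONTH_ >= 74).
  EXECUTE.

Each logical expression evaluates to 1 when it is true and 0 otherwise, which is exactly the step-function behavior required.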


Running the Analysis


You have determined that the series prior to the intervention follows an ARIMA(0,0,1) model, and you've created two dummy variables to model the intervention. You're now ready to run the full ARIMA analysis using the two dummy variables as predictors. ARIMA treats these predictors much like predictor variables in regression analysis: it estimates the coefficients for them that best fit the data. You'll use the same two predictor variables, step73 and step74, for both the edmart series and the nortons series. Build the intervention model for nortons first.
E From the menus choose: Analyze > Time Series > ARIMA...
Figure 10-42 ARIMA dialog box

E Select nortons for the Dependent variable. E Select step73 and step74 for the Independent variables.

E Type 1 in the Moving Average text box. E Click OK.

To repeat the analysis for edmart:


E Open the ARIMA dialog box. E Deselect nortons for the Dependent variable and choose edmart. E Click OK.
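In syntax form, the two intervention runs correspond roughly to:

  ARIMA nortons WITH step73 step74
   /MODEL=(0,0,1) CONSTANT.
  ARIMA edmart WITH step73 step74
   /MODEL=(0,0,1) CONSTANT.

Each run appends fit, error, and confidence-limit series (FIT_n, ERR_n, LCL_n, UCL_n) to the working data file; the error series are used in the diagnosis below.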

Model Diagnosis
Diagnosing an ARIMA model is a crucial part of the model-building process and involves analyzing the model residuals. If the model is a good fit for the series, the residuals should be random.
Figure 10-43 Autocorrelations dialog box

To analyze the model residuals for Nortons:


E Open the Autocorrelations dialog box. E Deselect nortons from the Variables list and select ERR_1.

E Click OK.
Figure 10-44 Residual autocorrelation plot for ARIMA(0,0,1)

The autocorrelation function shows no significant values.

Figure 10-45 Residual partial autocorrelation plot for ARIMA(0,0,1)

The partial autocorrelation function shows no significant values.

Figure 10-46 Residual autocorrelation statistics for ARIMA(0,0,1)

The Box-Ljung statistic has no significant values. This is consistent with the hypothesis that the residuals are random. Coupled with the results from the ACF and PACF plots, you conclude that the model provides a good fit to the data. You can analyze the residuals for EdMart in the same way, by opening the Autocorrelations dialog box, deselecting ERR_1, and selecting ERR_2 as the variable. The results show that the residuals for the EdMart model are random, confirming that the ARIMA(0,0,1) model with two step functions is a good fit to the data.

Assessing the Intervention


You've constructed a model that's a good fit to the data and are now in a position to analyze what the model has to say about the intervention. You expect positive coefficients for both predictor variables in the Nortons model and negative coefficients in the EdMart model. The sum of the Nortons coefficients will represent the total increase in the Nortons market share over the two-month period, and the sum of the EdMart coefficients will represent the total decrease in the EdMart market share during that period.

Figure 10-47 Parameter estimates table for the Nortons series

First, look at the parameter estimates table for the Nortons model. The coefficient for the dummy variable step73 is 1.610. This means that the Nortons market share increased by about 1.6% in month 73. Likewise, the coefficient for step74 indicates an additional increase of about 1.8% in month 74, on top of the existing level. So, the Nortons market share increased by about 3.4% during the two-month ad campaign and then remained at that new higher level.
Figure 10-48 Parameter estimates table for the EdMart series

Now examine the parameter estimates table for the EdMart model. The coefficient for the dummy variable step73 is -1.668. This means that the EdMart market share fell by about 1.7% in month 73. Likewise, the coefficient for step74 indicates an additional drop of about 0.7% in month 74. In all, then, the EdMart market share dropped by about 2.4% during the two-month ad campaign. You conclude that about 70% of the Nortons gain in market share (2.4% of the total 3.4% increase) came at the expense of EdMart; the remaining 30% is due to losses felt by the small mom-and-pop groceries.

Summary
Using intervention analysis in the context of ARIMA modeling, you have analyzed an abrupt shift in market share in a local market dominated by two competitors. The pre-intervention series was determined to follow a first-order moving-average process, which is uniquely suited to analysis with an ARIMA model. Modeling the intervention period with step functions allowed you to make detailed statements about the gains and losses of market share experienced by the two competitors.

Related Procedures
The ARIMA procedure is useful for developing complex models of time series behavior. If you suspect that the time series is well described by a trend and/or single seasonal component and you're not interested in including predictor variables, you may want to consider the computationally simpler Exponential Smoothing procedure. For more information, see Chapter 8. If you are primarily interested in a regression analysis on a series exhibiting first-order autocorrelation, consider using the Autoregression procedure. For more information, see Chapter 9.

Recommended Readings
See the following texts for more information on ARIMA modeling:

Box, G. E. P., and G. C. Tiao. 1975. Intervention analysis with applications to economic and environmental problems. Journal of the American Statistical Association, 70:3, 70-79.
Box, G. E. P., and G. M. Jenkins. 1976. Time series analysis: Forecasting and control. San Francisco: Holden-Day.
Makridakis, S. G., S. C. Wheelwright, and R. J. Hyndman. 1997. Forecasting: Methods and Applications. New York: John Wiley & Sons.
McCleary, R., and R. A. Hay. 1980. Applied time series analysis for the social sciences. Beverly Hills, Calif.: Sage Publications.

Chapter 11: Seasonal Decomposition

Removing Seasonality from Sales Data


A catalog company is interested in modeling the upward trend of sales of its men's clothing line on a set of predictor variables such as the number of catalogs mailed and the number of phone lines open for ordering. To this end, the company collected monthly sales of men's clothing for a 10-year period. This information is collected in catalog.sav, found in the \tutorial\sample_files\ subdirectory of the directory in which you installed SPSS. To perform a trend analysis (for example, with an autoregression procedure), it's necessary to remove any seasonal variations present in the data. This is easily accomplished with the Seasonal Decomposition procedure.

Preliminaries
In the examples that follow, it is more convenient to use variable names rather than variable labels.
E From the menus choose: Edit > Options...


Figure 11-1 Options dialog box

E Select Display names in the Variable Lists group. E Click OK.

Determining and Setting the Periodicity


The Seasonal Decomposition procedure requires the presence of a periodic date component in the data file: for example, a yearly periodicity of 12 (months), a weekly periodicity of 7 (days), and so on. It's usually a good idea to plot your time series first, since viewing a time series plot often leads to a reasonable guess about the underlying periodicity.


To obtain a plot of men's clothing sales over time:


E From the menus choose: Graphs > Sequence...
Figure 11-2 Sequence Charts dialog box

E Select men and move it into the Variables list. E Select date and move it into the Time Axis Labels list. E Click OK.

Figure 11-3 Sales of men's clothing (in U.S. dollars)

The series exhibits a number of peaks, but they do not appear to be equally spaced. This suggests that if the series has a periodic component, it also has fluctuations that are not periodic, the typical case for real time series. Aside from the small-scale fluctuations, the significant peaks appear to be separated by more than a few months. Given the seasonal nature of sales, with typical highs during the December holiday season, it's probably a good guess that the time series has an annual periodicity. Also notice that the seasonal variations appear to grow with the upward series trend, suggesting that the seasonal variations may be proportional to the level of the series. This would imply a multiplicative rather than an additive model. Examining the autocorrelations and partial autocorrelations of a time series provides a more quantitative conclusion about the underlying periodicity.

E From the menus choose: Graphs > Time Series > Autocorrelations...
Figure 11-4 Autocorrelations dialog box

E Select men and move it into the Variables list. E Click OK.

Figure 11-5 Autocorrelation plot for men

The autocorrelation function shows a significant peak at a lag of 1 with a long exponential tail, a typical pattern for time series. The significant peak at a lag of 12 suggests the presence of an annual seasonal component in the data. Examination of the partial autocorrelation function will allow a more definitive conclusion.

Figure 11-6 Partial autocorrelation plot for men

The significant peak at a lag of 12 in the partial autocorrelation function confirms the presence of an annual seasonal component in the data. To set an annual periodicity:
E From the menus choose: Data > Define Dates...

Figure 11-7 Define Dates dialog box

E Select Years, months in the Cases Are list. E Enter 1989 for the year and 1 for the month. E Click OK.

This sets the periodicity to 12 and creates a set of date variables designed to work with Trends procedures.
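The Define Dates specification corresponds to the DATE command; a sketch:

  DATE YEAR 1989 MONTH.

This creates the YEAR_, MONTH_, and DATE_ variables and implies a periodicity of 12.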

Running the Analysis


To run the Seasonal Decomposition procedure:
E From the menus choose: Analyze > Time Series > Seasonal Decomposition...

Figure 11-8 Seasonal Decomposition dialog box

E Select men and move it into the Variables list. E Select Multiplicative in the Model group. E Click OK.
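A syntax sketch of the same run (SEASON picks up the periodicity of 12 set by Define Dates):

  SEASON men
   /MODEL=MULTIPLICATIVE.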

Understanding the Output


The Seasonal Decomposition procedure creates four new variables for each of the original variables analyzed by the procedure. By default, the new variables are added to the working data file. The new series have names beginning with the following prefixes:
SAF. Seasonal adjustment factors, representing seasonal variation. For the multiplicative model, the value 1 represents the absence of seasonal variation; for the additive model, the value 0 represents the absence of seasonal variation. Seasonal factors can be used as input to an exponential smoothing model.
SAS. Seasonally adjusted series, representing the original series with seasonal variations removed. Working with a seasonally adjusted series, for example, allows a trend component to be isolated and analyzed independent of any seasonal component.
STC. Smoothed trend-cycle component, a smoothed version of the seasonally adjusted series that shows both trend and cyclic components.


ERR. The residual component of the series for a particular observation.

For the present case, the seasonally adjusted series is the most appropriate, since it represents the original series with the seasonal variations removed.
Figure 11-9 Sequence Charts dialog box

To plot the seasonally adjusted series:


E Open the Sequence Charts dialog box. E Click Reset to clear any previous selections, and then select SAS_1 and move it into the Variables list.


E Click OK.

Figure 11-10 Seasonally adjusted series

The seasonally adjusted series shows a clear upward trend. A number of peaks are evident, but they appear at random intervals showing no evidence of an annual pattern.

Summary
Using the Seasonal Decomposition procedure, you have removed the seasonal component of a periodic time series to produce a series more suitable for trend analysis. Examination of the autocorrelations and partial autocorrelations of the time series was useful in determining the underlying periodicity (in this case, annual). The seasonal adjustment factors, given by the variable SAF_1, can be used as the set of seasonal factors that are optional input to an Exponential Smoothing procedure. For more information, see Building and Analyzing a Winters Model on p. 70 in Chapter 8.


Related Procedures
The Seasonal Decomposition procedure is useful for removing a single seasonal component from a periodic time series. You can use the seasonal adjustment factors, generated by the Seasonal Decomposition procedure, as the optional seasonal factors for the Exponential Smoothing procedure. For more information, see Chapter 8. To perform a more in-depth analysis of the periodicity of a time series than is provided by the partial correlation function, use the Spectral Plots procedure. For more information, see Chapter 12. For a regression analysis of the trend component of a time series, you can use the seasonally adjusted series or the smoothed trend-cycle component as the dependent variable in an Autoregression procedure. For more information, see Chapter 9.

Chapter 12: Spectral Plots

Using Spectral Plots to Verify Expectations about Periodicity


Time series representing retail sales typically have an underlying annual periodicity due to the usual peak in sales during the holiday season. Producing sales projections means building a model of the time series, which means identifying any periodic components. A plot of the time series may not always uncover the annual periodicity because time series contain random fluctuations that often mask the underlying structure. Monthly sales data for a catalog company is stored in catalog.sav, found in the \tutorial\sample_files\ subdirectory of the directory in which you installed SPSS. You expect the sales data to exhibit an annual periodicity and would like to confirm this before proceeding with sales projections. A plot of the time series shows many peaks with an irregular spacing, so any underlying periodicity is not clearly evident. Use the Spectral Plots procedure to identify any periodicity in the sales data.

Running the Analysis


To run the Spectral Plots procedure:
E From the menus choose: Graphs > Time Series > Spectral...


Figure 12-1 Spectral Plots dialog box

E Select Sales of Men's Clothing for the variable. E Select Spectral density in the Plot group. E Click OK.
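In command syntax, this run corresponds approximately to the following, where P requests the periodogram and S the spectral density estimate:

  SPECTRA VARIABLES=men
   /PLOT=P S.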


Understanding the Periodogram and Spectral Density


Figure 12-2 Periodogram

The plot of the periodogram shows a sequence of peaks that stand out from the background noise, with the lowest frequency peak at a frequency of just less than 0.1. You suspect that the data contain an annual periodic component, so consider the contribution that an annual component would make to the periodogram. Each of the data points in the time series represents a month, so an annual periodicity corresponds to a period of 12 in the current data set. Since period and frequency are reciprocals of each other, a period of 12 corresponds to a frequency of 1/12 or 0.083. So an annual component implies a peak in the periodogram at 0.083, which seems consistent with the presence of the peak just below a frequency of 0.1.

Figure 12-3 Univariate statistics table

The univariate statistics table contains the data points used to plot the periodogram. Notice that for frequencies of less than 0.1, the largest value in the Periodogram column occurs at a frequency of 0.08333, precisely what you expect to find if there is an annual periodic component. This confirms the identification of the lowest frequency peak with an annual periodic component. But what about the other peaks at higher frequencies?

Figure 12-4 Spectral density

The remaining peaks are best analyzed with the spectral density function, which is simply a smoothed version of the periodogram. Smoothing provides a means of eliminating the background noise from a periodogram, allowing the underlying structure to be more clearly isolated. The spectral density consists of five distinct peaks that appear to be equally spaced. The lowest frequency peak simply represents the smoothed version of the peak at 0.08333. To understand the significance of the four higher frequency peaks, remember that the periodogram is calculated by modeling the time series as the sum of cosine and sine functions. Periodic components that have the shape of a sine or cosine function (sinusoidal) show up in the periodogram as single peaks. Periodic components that are not sinusoidal show up as a series of equally spaced peaks of different heights, with the lowest frequency peak in the series occurring at the frequency of the periodic component. So the four higher frequency peaks in the spectral density simply tell us that the annual periodic component is not sinusoidal. You have now accounted for all of the discernible structure in the spectral density plot and conclude that the data contain a single periodic component with a period of 12 months.


Summary
Using the Spectral Plots procedure, you have confirmed the existence of an annual periodic component of a time series, and verified that no other significant periodicities are present. The spectral density was seen to be more useful than the periodogram for uncovering all of the underlying structure because it smoothes out the fluctuations caused by the non-periodic component of the data.

Related Procedures
The Spectral Plots procedure is useful for identifying the periodic components of a time series. To remove a periodic component from a time series (for instance, to perform a trend analysis), use the Seasonal Decomposition procedure. See Chapter 11 for details.

Appendix A: Durbin-Watson Significance Tables

The Durbin-Watson statistic tests the null hypothesis that the residuals from an ordinary least-squares regression are not autocorrelated against the alternative that the residuals follow an AR(1) process. The Durbin-Watson statistic ranges in value from 0 to 4. A value near 2 indicates non-autocorrelation, a value toward 0 indicates positive autocorrelation, and a value toward 4 indicates negative autocorrelation. Because of the dependence of any computed Durbin-Watson value on the associated data matrix, exact critical values of the Durbin-Watson statistic are not tabulated for all possible cases. Instead, Durbin and Watson established upper and lower bounds for the critical values. Typically, tabulated bounds are used to test the hypothesis of zero autocorrelation against the alternative of positive first-order autocorrelation, since positive autocorrelation is seen much more frequently in practice than negative autocorrelation. To use the table, you must cross-reference the sample size against the number of regressors, excluding the constant from the count of the number of regressors.

The conventional Durbin-Watson tables are not applicable when you do not have a constant term in the regression. Instead, you must refer to an appropriate set of Durbin-Watson tables. The conventional Durbin-Watson tables are also not applicable when a lagged dependent variable appears among the regressors. Durbin has proposed alternative test procedures for this case. Statisticians have compiled Durbin-Watson tables for some special cases, including:

Regressions with a full set of quarterly seasonal dummies.
Regressions with an intercept and a linear trend variable (CURVEFIT MODEL=LINEAR).
Regressions with a full set of quarterly seasonal dummies and a linear trend variable.



In addition to obtaining the Durbin-Watson statistic for residuals from REGRESSION, you should plot the ACF and PACF of the residuals series. The plots might suggest either that the residuals are random or that they follow some ARMA process. If the residuals resemble an AR(1) process, you can estimate an appropriate regression using the AREG procedure. If the residuals follow any ARMA process, you can estimate an appropriate regression using the ARIMA procedure.

In this appendix, we have reproduced two sets of tables. Savin and White (1977) present tables for sample sizes ranging from 6 to 200 and for 1 to 20 regressors for models in which an intercept is included. Farebrother (1980) presents tables for sample sizes ranging from 2 to 200 and for 0 to 21 regressors for models in which an intercept is not included.

Let's consider an example of how to use the tables. Take the case of a sample size of 69, a model with two regressors, and an intercept term. Assume that the value of the Durbin-Watson test statistic is 0.24878. We want to test the null hypothesis of zero autocorrelation in the residuals against the alternative that the residuals are positively autocorrelated at the 1% level of significance. If you examine the Savin and White tables, you will not find a row for sample size 69, so go to the next lowest sample size with a tabulated row, namely N=65. Since there are two regressors, find the column labeled k=2. Cross-referencing the indicated row and column, you will find that the printed bounds are dL = 1.377 and dU = 1.500. If the observed value of the test statistic is less than the tabulated lower bound, then you should reject the null hypothesis of non-autocorrelated errors in favor of the hypothesis of positive first-order autocorrelation. Since 0.24878 is less than 1.377, we reject the null hypothesis. If the test statistic value were greater than dU, we would not reject the null hypothesis. A third outcome is also possible. If the test statistic value lies between dL and dU, the test is inconclusive. In this context, you might err on the side of conservatism and not reject the null hypothesis.

For models with an intercept, if the observed test statistic value is greater than 2, then you want to test the null hypothesis against the alternative hypothesis of negative first-order autocorrelation. To do this, compute the quantity 4 - d and compare this value with the tabulated values of dL and dU as if you were testing for positive autocorrelation.

When the regression does not contain an intercept term, refer to Farebrother's tabulated values of the minimal bound instead of Savin and White's lower bound dL. In this instance, the upper bound is the conventional bound dU found in the Savin and White tables. To test for positive first-order autocorrelation, use Farebrother's Positive Serial Correlation tables. To test for negative first-order autocorrelation, use Farebrother's Negative Serial Correlation tables. To continue with our example, had we run a regression with no intercept term, we would cross-reference N equals 65 and k equals 2 in Farebrother's table. The tabulated 1% minimal bound is 1.348.
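As noted at the start of this appendix, the Durbin-Watson statistic itself is available from REGRESSION, and the residuals can be saved for ACF and PACF plots. A minimal sketch, using hypothetical variable names y, x1, and x2:

  REGRESSION
   /DEPENDENT=y
   /METHOD=ENTER x1 x2
   /RESIDUALS=DURBIN
   /SAVE=RESID(res_1).
  ACF VARIABLES=res_1.
  PACF VARIABLES=res_1.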

Table A-1 Models with an intercept (from Savin and White)


Durbin-Watson Statistic: 1 Per Cent Significance Points of dL and dU
[Table: values of dL and dU at the 1% level for models with an intercept; rows give sample sizes n = 6 to 200, columns give k = 1 to 20 regressors.]

Table A-2 Models with an intercept (from Savin and White)


Durbin-Watson Statistic: 5 Per Cent Significance Points of dL and dU
[Table: values of dL and dU at the 5% level for models with an intercept; rows give sample sizes n = 6 to 200, columns give k = 1 to 20 regressors.]

Table A-3 Models with no intercept (from Farebrother): Positive serial correlation
Durbin-Watson One Per Cent Minimal Bound
[Table: 1% minimal bounds for testing positive serial correlation in models with no intercept; rows give sample sizes n = 2 to 200, columns give K = 0 to 21 regressors.]

Table A-4 Models with no intercept (from Farebrother): Positive serial correlation
Durbin-Watson Five Per Cent Minimal Bound
[Table: 5% minimal bounds for testing positive serial correlation in models with no intercept; rows give sample sizes n = 2 to 200, columns give K = 0 to 21 regressors.]

Table A-5 Models with no intercept (from Farebrother): Negative serial correlation
Durbin-Watson Ninety Five Per Cent Minimal Bound
[Table: 95% minimal bounds for testing negative serial correlation in models with no intercept; rows give sample sizes n = 2 to 200, columns give K = 0 to 21 regressors.]

Table A-6 Models with no intercept (from Farebrother): Negative serial correlation
Durbin-Watson Ninety Nine Per Cent Minimal Bound
[Table: 99% minimal bounds for testing negative serial correlation in models with no intercept; rows give sample sizes n = 2 to 200, columns give K = 0 to 21 regressors.]

Appendix B

Guide to ACF/PACF Plots

The plots shown here are those of pure or theoretical ARIMA processes. Here are some general guidelines for identifying the process:
- Nonstationary series have an ACF that remains significant for half a dozen or more lags, rather than quickly declining to zero. You must difference such a series until it is stationary before you can identify the process.
- Autoregressive processes have an exponentially declining ACF and spikes in the first one or more lags of the PACF. The number of spikes indicates the order of the autoregression.
- Moving average processes have spikes in the first one or more lags of the ACF and an exponentially declining PACF. The number of spikes indicates the order of the moving average.
- Mixed (ARMA) processes typically show exponential declines in both the ACF and the PACF.

At the identification stage, you do not need to worry about the sign of the ACF or PACF, or about the speed with which an exponentially declining ACF or PACF approaches zero. These depend upon the sign and actual value of the AR and MA coefficients. In some instances, an exponentially declining ACF alternates between positive and negative values.

ACF and PACF plots from real data are never as clean as the plots shown here. You must learn to pick out what is essential in any given plot. Always check the ACF and PACF of the residuals, in case your identification is wrong. Bear in mind that:
- Seasonal processes show these patterns at the seasonal lags (the multiples of the seasonal period).
- You are entitled to treat nonsignificant values as zero. That is, you can ignore values that lie within the confidence intervals on the plots. You do not have to ignore them, however, particularly if they continue the pattern of the statistically significant values.
- An occasional autocorrelation will be statistically significant by chance alone. You can ignore a statistically significant autocorrelation if it is isolated, preferably at a high lag, and if it does not occur at a seasonal lag.

Consult any text on ARIMA analysis for a more complete discussion of ACF and PACF plots.
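
The correlograms used at the identification stage can be requested in command syntax. Here is a minimal sketch, where series is a hypothetical variable name and the number of lags and degree of differencing are illustrative choices:

* Correlogram of the original series, out to 16 lags.
ACF VARIABLES=series
  /MXAUTO=16.
PACF VARIABLES=series
  /MXAUTO=16.

* If the ACF remains significant for many lags, difference
* the series and inspect the correlogram again.
ACF VARIABLES=series
  /DIFF=1.
PACF VARIABLES=series
  /DIFF=1.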
[Plot: ACF and PACF of ARIMA(0,0,1), θ > 0]

[Plot: ACF and PACF of ARIMA(0,0,1), θ < 0]

[Plot: ACF and PACF of ARIMA(0,0,2), θ1, θ2 > 0]

[Plot: ACF and PACF of ARIMA(1,0,0), φ > 0]

[Plot: ACF and PACF of ARIMA(1,0,0), φ < 0]

[Plot: ACF and PACF of ARIMA(1,0,1), φ < 0, θ > 0]

[Plot: ACF and PACF of ARIMA(2,0,0), φ1, φ2 > 0]

[Plot: ACF of ARIMA(0,1,0) (integrated series)]

Bibliography

Box, G. E. P., and G. M. Jenkins. 1976. Time series analysis: Forecasting and control. San Francisco: Holden-Day.
Box, G. E. P., and G. C. Tiao. 1975. Intervention analysis with applications to economic and environmental problems. Journal of the American Statistical Association, 70, 70-79.
Farebrother, R. W. 1980. The Durbin-Watson test for serial correlation when there is no intercept in the regression. Econometrica, 48, 1553-1563.
Gardner, E. S. 1985. Exponential smoothing: The state of the art. Journal of Forecasting, 4, 1-28.
Makridakis, S. G., S. C. Wheelwright, and R. J. Hyndman. 1997. Forecasting: Methods and Applications. New York: John Wiley & Sons.
McCleary, R., and R. A. Hay. 1980. Applied time series analysis for the social sciences. Beverly Hills, Calif.: Sage Publications.
Savin, N. E., and K. J. White. 1977. The Durbin-Watson test for serial correlation with extreme sample sizes or many regressors. Econometrica, 45, 1989-1996.


Index

ARIMA, 31, 34, 38
  and missing values, 37
  assumptions, 31
  autoregression component, 107
  constant term, 124
  convergence criteria, 34
  diagnosis, 112, 126, 152
  differencing/integration component, 108
  display options, 35
  estimation, 111, 124, 151
  forecast settings, 35
  identification, 110, 116, 140
  initial parameter values, 35
  models, 32, 107
  moving-average component, 109
  predictors, 128
  related procedures, 157
  seasonal orders, 110
  steps, 110
  transforming values, 32
autocorrelation plots, 201
Autoregression, 23, 25, 28
  and missing values, 28
  assumptions, 24
  Cochrane-Orcutt method, 24
  convergence criteria, 26
  display options, 26
  Exact maximum-likelihood, 24
  methods, 23, 24
  Prais-Winsten method, 24
  related procedures, 105
  rho value, 26
charts, 43
  spectral plots, 43
  time series, 43
Durbin-Watson statistic, 177
efficiency, 12
  creating new variables, 12
err variable, 8
Exponential Smoothing, 15, 17, 18, 20, 21
  assumptions, 15
  create variables, 20
  Holt model, 63
  initial parameter values, 19
  models, 15
  predict cases, 20
  related procedures, 90
  saving new variables, 20
  seasonal components, 17
  seasonal factor estimates, 15
  simple model, 54
  trend components, 17
  Winters model, 70
fit variable, 8
forecasting, 9
  n-step-ahead forecasts, 9
  one-step-ahead forecasts, 9
harmonic analysis, 43
historical period
  defining, 9
intervention analysis, 139
  creating dummy variables, 148
Kalman filtering, 37
lcl variable, 8
maximum-likelihood estimation, 23
  in Autoregression, 23
models
  reusing, 10
n-step-ahead forecasts, 9
one-step-ahead forecasts, 9
partial autocorrelation plots, 201
performance considerations, 12
  creating new variables, 12
  in ARIMA, 37
  in Autoregression, 28
plots
  autocorrelation, 201
  partial autocorrelation, 201
saf variable, 8
sas variable, 8
Save (Time Series), 20, 27, 36
Seasonal Decomposition, 39, 41
  assumptions, 39
  computing moving averages, 39
  create variables, 41
  models, 39
  new variables, 167
  periodic date component, 160
  related procedures, 170
  saving new variables, 41
sep variable, 8
Spectral Plots, 43, 46
  assumptions, 43
  bivariate spectral analysis, 45
  centering transformation, 45
  periodogram, 173
  related procedures, 176
  spectral density, 173
  spectral windows, 44
stc variable, 8
step function, 148
ucl variable, 8
validation period
  defining, 9
variables
  created by Trends, 8, 12
