Chapter 3

King Abdulaziz University Faculty of Engineering
Industrial Engineering Dept.
IE 436 Dynamic Forecasting

CHAPTER 3 Exploring Data Patterns and an Introduction to Forecasting Techniques

Cross-sectional data: collected at a single point in time.
Time series data: collected and recorded over successive increments of time.
(Page 62)

Exploring Time Series Data Patterns

Horizontal (stationary). Trend. Cyclical. Seasonal.

A Stationary Series
Its mean and variance remain constant over time.

The Trend
The long-term component that represents the growth or decline in the time series.

The Cyclical Component
The wavelike fluctuation around the trend.
FIGURE 3-2 Trend and Cyclical Components of an Annual Time Series Such as Housing Costs (Page 63)
[Plot of Cost against Year showing the trend line, a cyclical peak, and a cyclical valley.]

The Seasonal Component

A pattern of change that repeats itself year after year.
FIGURE 3-3 Electrical Usage for Washington Water Power Company, 1980-1991 (Page 64)
[Time series plot of seasonal data: Y against Index.]

Exploring Data Patterns with Autocorrelation Analysis

Autocorrelation: the correlation between a variable lagged one or more periods and itself.

$$ r_k = \frac{\sum_{t=k+1}^{n} (Y_t - \bar{Y})(Y_{t-k} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2}, \qquad k = 0, 1, 2, \ldots \qquad (3.1) $$

Where:
$r_k$ = autocorrelation coefficient for a lag of k periods
$\bar{Y}$ = mean of the values of the series
$Y_t$ = observation in time period t
$Y_{t-k}$ = observation at time period t-k

(Pages 64-65)

Autocorrelation Function (Correlogram)

A graph of the autocorrelations for various lags.

Computation of the lag 1 autocorrelation coefficient: Table 3-1 (Page 65)

Example 3.1
Data are presented in Table 3-1 (Page 65). Table 3-2 shows the computations that lead to the lag 1 autocorrelation coefficient. Figure 3-4 contains a scatter diagram of the pairs of observations $(Y_t, Y_{t-1})$. Using the totals from Table 3-2 and Equation 3.1:

$$ r_1 = \frac{\sum_{t=2}^{n} (Y_t - \bar{Y})(Y_{t-1} - \bar{Y})}{\sum_{t=1}^{n} (Y_t - \bar{Y})^2} = \frac{843}{1474} = 0.572 $$

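Equation 3.1 can be checked numerically. The following is a minimal Python sketch, not part of the original slides. The function name `autocorrelation` is illustrative, and the twelve values in `y` are assumed to be the Table 3-1 observations, which are not reproduced in these notes; they do reproduce the totals 843 and 1474 used in Example 3.1.

```python
def autocorrelation(y, k):
    """Lag-k autocorrelation coefficient r_k from Equation 3.1."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    # Numerator: products of deviations k periods apart (t = k+1, ..., n)
    num = sum(dev[t] * dev[t - k] for t in range(k, n))
    # Denominator: total sum of squared deviations (t = 1, ..., n)
    den = sum(d * d for d in dev)
    return num / den

# Assumed Table 3-1 values (the table itself is not shown in these notes)
y = [123, 130, 125, 138, 145, 142, 141, 146, 147, 157, 150, 160]
r1 = autocorrelation(y, 1)
print(round(r1, 3))  # 0.572
```

Calling the same function for k = 1, 2, 3, ... gives the values that a correlogram plots against the lag.
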
Autocorrelation Function (Correlogram) (Cont.)

Minitab instructions: Stat > Time Series > Autocorrelation

FIGURE 3-5 Correlogram or Autocorrelation Function for the Data Used in Example 3.1
[Plot of Autocorrelation (vertical axis, -1.0 to 1.0) against Lag.]

Questions to be Answered Using Autocorrelation Analysis

- Are the data random?
- Do the data have a trend?
- Are the data stationary?
- Are the data seasonal?

(Page 68)

Are the data random?

If a series is random:
- The successive values are not related to each other.
- Almost all the autocorrelation coefficients are not significantly different from zero.

Is an autocorrelation coefficient significantly different from zero?

- The autocorrelation coefficients of random data have an approximately normal sampling distribution.
- At a specified confidence level, a series can be considered random if the autocorrelation coefficients are within the interval $0 \pm t \cdot SE(r_k)$ (z instead of t for large samples).
- The following t statistic can be used: $t = \frac{r_k}{SE(r_k)}$
- Standard error of the autocorrelation at lag k:

$$ SE(r_k) = \sqrt{\frac{1 + 2\sum_{i=1}^{k-1} r_i^2}{n}} \qquad (3.2) $$

Where:
$r_i$ = the autocorrelation at time lag i
k = the time lag
n = the number of observations in the time series

Example 3.2 (Page 69)
A hypothesis test: is a particular autocorrelation coefficient significantly different from zero?

At significance level $\alpha = 0.05$, the critical values $\pm 2.2$ are the upper and lower t points for n - 1 = 11 degrees of freedom.
Decision rule: if t < -2.2 or t > 2.2, reject $H_0: \rho_k = 0$.
Note: t is given directly in the Minitab output under the heading T.

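Equation 3.2 and the decision rule can be sketched in Python as follows. The helper names `standard_error` and `t_statistic` are illustrative, not from the text. The value n = 12 follows from the n - 1 = 11 degrees of freedom, and r_1 = 0.572 is taken from Example 3.1 on the assumption that Example 3.2 tests the same series.

```python
import math

def standard_error(r, k, n):
    """SE(r_k) from Equation 3.2; r[i-1] holds the lag-i autocorrelation r_i."""
    return math.sqrt((1 + 2 * sum(ri ** 2 for ri in r[:k - 1])) / n)

def t_statistic(r, k, n):
    """t = r_k / SE(r_k)."""
    return r[k - 1] / standard_error(r, k, n)

r = [0.572]      # r_1 from Example 3.1 (assumed to be the series under test)
n = 12           # implied by n - 1 = 11 degrees of freedom
critical = 2.2   # two-sided 5% t value quoted in Example 3.2

t1 = t_statistic(r, 1, n)
reject = abs(t1) > critical
print(round(t1, 2), reject)  # 1.98 False
```

Under these assumptions t lies inside the interval from -2.2 to 2.2, so the hypothesis of zero autocorrelation at lag 1 is not rejected at the 0.05 level.
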
Is an autocorrelation coefficient different from zero? (Cont.)

The Modified Box-Pierce Q statistic (developed by Ljung and Box), LBQ

A portmanteau test: it tests whether a whole set of autocorrelation coefficients is significantly different from zero at once.

$$ Q = n(n+2) \sum_{k=1}^{m} \frac{r_k^2}{n-k} \qquad (3.3) $$

Where:
n = number of observations
k = the time lag
m = number of time lags to be considered
$r_k$ = kth autocorrelation coefficient (lagged k time periods)

The value of Q can be compared with the chi-square distribution with m degrees of freedom.

Example 3.3 (Page 70)

t    Yt      t    Yt      t    Yt      t    Yt
1    343     11   946     21   704     31   555
2    574     12   142     22   291     32   476
3    879     13   477     23   43      33   612
4    728     14   452     24   118     34   574
5    37      15   727     25   682     35   518
6    227     16   147     26   577     36   296
7    613     17   199     27   834     37   970
8    157     18   744     28   981     38   204
9    571     19   627     29   263     39   616
10   72      20   122     30   424     40   97

FIGURE 3-7 Autocorrelation Function for the Data Used in Example 3.3
[Plot of Autocorrelation (vertical axis, -1.0 to 1.0) against Lag.]

Lag    Corr     T       LBQ
1      -0.19    -1.21   1.57
2      -0.01    -0.04   1.58
3      -0.15    -0.89   2.53
4      0.10     0.63    3.04
5      -0.25    -1.50   6.13
6      0.03     0.16    6.17
7      0.17     0.95    7.63
8      -0.03    -0.15   7.67
9      -0.03    -0.18   7.73
10     0.02     0.12    7.75

The Q statistic for m = 10 time lags is Q = 7.75 (using Minitab).

The chi-square value is $\chi^2_{0.05} = 18.307$ (tested at the 0.05 significance level, degrees of freedom df = m = 10). Table B-4 (Page 527)

$Q < \chi^2_{0.05}$. Conclusion: the series is random.

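As a cross-check on Equation 3.3, the sketch below recomputes Q for the 40 observations of Example 3.3 in plain Python. The function names are illustrative. The data are the values tabulated for Example 3.3, reading the final observation as 97 (an assumption), and 18.307 is the chi-square value quoted from Table B-4. The result should be close to the Minitab LBQ value, up to rounding.

```python
def autocorrelation(y, k):
    """Lag-k autocorrelation coefficient r_k (Equation 3.1)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    return sum(dev[t] * dev[t - k] for t in range(k, n)) / sum(d * d for d in dev)

def ljung_box_q(y, m):
    """Modified Box-Pierce (Ljung-Box) Q statistic over lags 1..m (Equation 3.3)."""
    n = len(y)
    return n * (n + 2) * sum(autocorrelation(y, k) ** 2 / (n - k)
                             for k in range(1, m + 1))

y = [343, 574, 879, 728, 37, 227, 613, 157, 571, 72,
     946, 142, 477, 452, 727, 147, 199, 744, 627, 122,
     704, 291, 43, 118, 682, 577, 834, 981, 263, 424,
     555, 476, 612, 574, 518, 296, 970, 204, 616, 97]

q = ljung_box_q(y, 10)
chi_square_critical = 18.307  # 0.05 level, 10 degrees of freedom (Table B-4)
print(round(q, 2), "random" if q < chi_square_critical else "not random")
```
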
Do the Data Have a Trend?

If a series has a trend:
- A significant relationship exists between successive time series values.
- The autocorrelation coefficients are large for the first several time lags, and then gradually drop toward zero as the number of periods increases.
- The autocorrelation for time lag 1 is close to 1; for time lag 2 it is large but smaller than for time lag 1.

Example 3.4 (Page 72)
Data in Table 3-4 (Page 74)

Year   Yt       Year   Yt       Year   Yt       Year   Yt
1955   3307     1966   6769     1977   17224    1988   50251
1956   3556     1967   7296     1978   17946    1989   53794
1957   3601     1968   8178     1979   17514    1990   55972
1958   3721     1969   8844     1980   25195    1991   57242
1959   4036     1970   9251     1981   27357    1992   52345
1960   4134     1971   10006    1982   30020    1993   50838
1961   4268     1972   10991    1983   35883    1994   54559
1962   4578     1973   12306    1984   38828    1995   34925
1963   5093     1974   13101    1985   40715    1996   38236
1964   5716     1975   13639    1986   44282    1997   41296
1965   6357     1976   14950    1987   48440    1998   .

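The autocorrelation pattern of a trended series can be seen by computing the first few coefficients of the Table 3-4 series. This is a minimal Python sketch: the function name is illustrative, and only the 43 years with recorded values (1955-1997) are used, since the 1998 entry is blank in the table.

```python
def autocorrelation(y, k):
    """Lag-k autocorrelation coefficient r_k (Equation 3.1)."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    return sum(dev[t] * dev[t - k] for t in range(k, n)) / sum(d * d for d in dev)

# Table 3-4, 1955-1997
revenue = [3307, 3556, 3601, 3721, 4036, 4134, 4268, 4578, 5093, 5716, 6357,
           6769, 7296, 8178, 8844, 9251, 10006, 10991, 12306, 13101, 13639, 14950,
           17224, 17946, 17514, 25195, 27357, 30020, 35883, 38828, 40715, 44282, 48440,
           50251, 53794, 55972, 57242, 52345, 50838, 54559, 34925, 38236, 41296]

acf = [autocorrelation(revenue, k) for k in range(1, 6)]
print([round(r, 2) for r in acf])
```

The printed coefficients should start close to 1 and decline as the lag grows, which is the pattern expected of a trended series.
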
Data Differencing

A time series can be differenced to remove the trend and to create a stationary series.
See Figure 3-8 (Page 73) for differencing the data of Example 3.1.
See Figures 3-12 and 3-13 (Page 75).

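Differencing replaces each observation with its change from the previous period, that is $Y_t - Y_{t-1}$. A minimal Python sketch, using the first six values of Table 3-4 (1955-1960) purely as an illustration; the function name `difference` is not from the text:

```python
def difference(y, lag=1):
    """Return the differenced series y[t] - y[t - lag]."""
    return [y[t] - y[t - lag] for t in range(lag, len(y))]

revenue = [3307, 3556, 3601, 3721, 4036, 4134]  # Table 3-4, 1955-1960
print(difference(revenue))  # [249, 45, 120, 315, 98]
```

The differenced series has one fewer observation than the original, and its autocorrelations can be examined in the same way as those of the original series.
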
Are the Data Seasonal?

- For quarterly data: a significant autocorrelation coefficient will appear at time lag 4.
- For monthly data: a significant autocorrelation coefficient will appear at time lag 12.

Example 3.5 (Page 76)
Table 3-5: see Figures 3-14, 3-15 (Page 77)

Year   December 31   March 31   June 30   September 30
1994   147.6         251.8      273.1     249.1
1995   139.3         221.2      260.2     259.5
1996   140.5         245.5      298.8     287.0
1997   168.8         322.6      393.5     404.3
1998   259.7         401.1      464.6     497.7
1999   264.4         402.6      411.3     385.9
2000   232.7         309.2      310.7     293.0
2001   205.1         234.4      285.4     258.7
2002   193.2         263.7      292.5     315.2
2003   178.3         274.5      295.4     286.4
2004   190.8         263.5      318.8     305.5
2005   242.6         318.8      329.6     338.2
2006   232.1         285.6      291.0     281.4

FIGURE 3-14 Time Series Plot of Quarterly Sales for Coastal Marine for Example 3.5
[Time series graph titled "Quarterly Sales: 1995-2007": Quarterly Sales against Years.]

FIGURE 3-15 Autocorrelation Function for Quarterly Sales for Coastal Marine for Example 3.5
[Plot of Autocorrelation (vertical axis, -1.0 to 1.0) against Lag.]

Lag    Corr     T       LBQ
1      0.39     2.83    8.49
2      0.16     1.03    10.00
3      0.29     1.81    14.91
4      0.74     4.30    46.79
5      0.15     0.67    48.14
6      -0.15    -0.64   49.44
7      -0.05    -0.23   49.60
8      0.34     1.48    56.92
9      -0.18    -0.77   59.10
10     -0.43    -1.79   71.46
11     -0.32    -1.24   78.32
12     0.09     0.32    78.83
13     -0.35    -1.34   87.77

The autocorrelation coefficients at time lags 1 and 4 are significantly different from zero. Sales are seasonal on a quarterly basis.
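The conclusion can be reproduced from the Minitab T column. The Python sketch below flags the lags whose |T| exceeds a critical value. The threshold of 2.0 is an assumption (roughly the two-sided 0.05 value for a series of this length), not a figure quoted in the text.

```python
# T values for lags 1..13, copied from the Minitab output of Figure 3-15
t_values = [2.83, 1.03, 1.81, 4.30, 0.67, -0.64, -0.23,
            1.48, -0.77, -1.79, -1.24, 0.32, -1.34]
critical = 2.0  # assumed approximate two-sided 0.05 critical value

significant = [lag for lag, t in enumerate(t_values, start=1) if abs(t) > critical]
print(significant)  # [1, 4]
```

Lag 4 corresponds to one year of quarterly data, which is the lag at which a seasonal quarterly series is expected to show a significant coefficient.
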
Choosing a Forecasting Technique

Questions to be Considered:
- Why is a forecast needed?
- Who will use the forecast?
- What are the characteristics of the data?
- What time period is to be forecast?
- What are the minimum data requirements?
- How much accuracy is required?
- What will the forecast cost?

Choosing a Forecasting Technique (Cont.)

The Forecaster Should Accomplish the Following:
- Define the nature of the forecasting problem.
- Explain the nature of the data.
- Describe the properties of the techniques.
- Develop criteria for selection.

Choosing a Forecasting Technique (Cont.)

Factors Considered:
- Level of detail.
- Time horizon.
- Based on judgment or data manipulation.
- Management acceptance.
- Cost.

General considerations for choosing the appropriate method

Method: Judgment
Uses: Can be used in the absence of historical data (e.g., new product). Most helpful in medium- and long-term forecasts.
Considerations: Subjective estimates are subject to the biases and motives of estimators.

Method: Causal
Uses: Sophisticated method. Very good for medium- and long-term forecasts.
Considerations: Must have historical data. Relationships can be difficult to specify.

Method: Time series
Uses: Easy to implement. Works well when the series is relatively stable.
Considerations: Rely exclusively on past data. Most useful for short-term estimates.

Method                                           Pattern of Data   Time Horizon   Type of Model   Minimal Data Requirements
                                                                                                  Nonseasonal   Seasonal
Naive                                            ST, T, S          S              TS              1             -
Simple averages                                  ST                S              TS              30            -
Moving averages                                  ST                S              TS              4-20          -
Single exponential smoothing                     ST                S              TS              2             -
Linear (double) exponential smoothing (Holt's)   T                 S              TS              3             -
Quadratic exponential smoothing                  T                 S              TS              4             -
Seasonal exponential smoothing (Winters')        S                 S              TS              -             2 x s
Adaptive filtering                               S                 S              TS              -             5 x s
Simple regression                                T                 I              C               10            -
Multiple regression                              C, S              I              C               10 x V        -
Classical decomposition                          S                 S              TS              -             5 x s
Exponential trend models                         T                 I, L           TS              10            -
S-curve fitting                                  T                 I, L           TS              10            -
Gompertz models                                  T                 I, L           TS              10            -
Growth curves                                    T                 I, L           TS              10            -
Census X-12                                      S                 S              TS              -             6 x s
ARIMA (Box-Jenkins)                              ST, T, C, S       S              TS              24            3 x s
Leading indicators                               C                 S              C               24            -
Econometric models                               C                 S              C               30            -
Time series multiple regression                  T, S              I, L           C               -             6 x s

Pattern of data: ST, stationary; T, trended; S, seasonal; C, cyclical.
Time horizon: S, short term (less than three months); I, intermediate; L, long term.
Type of model: TS, time series; C, causal.
Seasonal: s, length of seasonality.
Variable: V, number of variables.

Measuring Forecast Error

Basic Forecasting Notation
$Y_t$ = actual value of a time series in time t
$\hat{Y}_t$ = forecast value for time period t
$e_t = Y_t - \hat{Y}_t$ = forecast error in time t (residual)

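Measuring Forecasting Error (Cont.)

The Mean Absolute Deviation:
$$ MAD = \frac{1}{n} \sum_{t=1}^{n} |Y_t - \hat{Y}_t| $$

The Mean Squared Error:
$$ MSE = \frac{1}{n} \sum_{t=1}^{n} (Y_t - \hat{Y}_t)^2 $$

The Root Mean Square Error:
$$ RMSE = \sqrt{\frac{1}{n} \sum_{t=1}^{n} (Y_t - \hat{Y}_t)^2} $$

The Mean Absolute Percentage Error:
$$ MAPE = \frac{1}{n} \sum_{t=1}^{n} \frac{|Y_t - \hat{Y}_t|}{|Y_t|} $$

The Mean Percentage Error:
$$ MPE = \frac{1}{n} \sum_{t=1}^{n} \frac{Y_t - \hat{Y}_t}{Y_t} $$

Equations (3.7 - 3.11)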
Used for:
- The measurement of a technique's usefulness or reliability.
- Comparison of the accuracy of two different techniques.
- The search for an optimal technique.

Example 3.6 (Page 83)
Evaluate the model using: MAD, MSE, RMSE, MAPE, and MPE.

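The five measures can be computed together. A minimal Python sketch follows. The function name `forecast_errors` and the four actual/forecast pairs are hypothetical values chosen so the arithmetic is easy to follow; they are not the data of Example 3.6.

```python
import math

def forecast_errors(actual, forecast):
    """Return MAD, MSE, RMSE, MAPE and MPE for paired actual/forecast values."""
    n = len(actual)
    e = [y - f for y, f in zip(actual, forecast)]  # residuals e_t = Y_t - forecast
    mad = sum(abs(x) for x in e) / n
    mse = sum(x * x for x in e) / n
    rmse = math.sqrt(mse)
    mape = sum(abs(x) / abs(y) for x, y in zip(e, actual)) / n
    mpe = sum(x / y for x, y in zip(e, actual)) / n
    return {"MAD": mad, "MSE": mse, "RMSE": rmse, "MAPE": mape, "MPE": mpe}

actual = [100, 200, 400, 500]    # hypothetical observations
forecast = [110, 180, 400, 450]  # hypothetical forecasts
m = forecast_errors(actual, forecast)
print(m["MAD"], m["MSE"])  # 20.0 750.0
```

For these values MAPE is 0.075 (7.5%) and MPE is 0.025 (2.5%). Because the error is defined as actual minus forecast, a positive MPE indicates forecasts that are, on average, below the actual values.
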
Empirical Evaluation of Forecasting Methods

Results of the forecast accuracy for a sample of 3003 time series (1997):
- Complex methods do not necessarily produce more accurate forecasts than simpler ones.
- Various accuracy measures (MAD, MSE, MAPE) produce consistent results.
- The performance of methods depends on the forecasting horizon and the kind of data analyzed (yearly, quarterly, monthly).

Determining the Adequacy of a Forecasting Technique

- Do the residuals indicate a random series? (Examine the autocorrelation coefficients of the residuals; there should be no significant ones.)
- Are they approximately normally distributed?
- Is the technique simple and understood by decision makers?
