
Expert Systems with Applications 36 (2009) 11108–11117

Contents lists available at ScienceDirect

Expert Systems with Applications


journal homepage: www.elsevier.com/locate/eswa

A hybrid simulation-adaptive network based fuzzy inference system for improvement of electricity consumption estimation

A. Azadeh a,*, M. Saberi b,1, A. Gitiforouz a, Z. Saberi c
a Department of Industrial Engineering and Center of Excellence for Intelligent Based Experimental Mechanics, College of Engineering, University of Tehran, Iran
b Department of Industrial Engineering, University of Tafresh, Iran
c Department of Industrial Engineering, Sharif University of Technology, Iran

Article info

Keywords:
Hybrid
Adaptive network based fuzzy inference system
Computer simulation
Improvement
Time series
Electricity consumption

Abstract
This paper presents a hybrid adaptive network based fuzzy inference system (ANFIS), computer simulation and time series algorithm to estimate and predict electricity consumption. The difficulty of modeling electricity consumption with conventional approaches such as time series is the reason for proposing the hybrid approach of this study. The algorithm is well suited to uncertain, ambiguous and complex estimation and forecasting. Computer simulation is developed to generate random variables for monthly electricity consumption. Various structures of ANFIS are examined and the preferred model is selected for estimation by the proposed algorithm. Finally, the preferred ANFIS and time series models are selected by the Granger–Newbold test. Monthly electricity consumption in Iran from 1995 to 2005 is considered as the case of this study. The superiority of the proposed algorithm is shown by comparing its results with genetic algorithm (GA) and artificial neural network (ANN). This is the first study that uses a hybrid ANFIS computer simulation for improvement of electricity consumption estimation.
© 2009 Elsevier Ltd. All rights reserved.

Significance
This is the first study that presents a hybrid simulation-adaptive network fuzzy inference system (ANFIS) for improvement of electricity consumption estimation. The unique features of the proposed algorithm are twofold. First, ANFIS is well suited to complex and uncertain data because it is composed of both ANN and fuzzy systems. Second, Monte Carlo simulation is used to generate input variables, whereas conventional methods use deterministic data. The superiority of the proposed algorithm is shown by comparing its results with time series, genetic algorithm (GA) and ANN.
1. Introduction
There have been several studies on ANN and neuro-fuzzy models in different cases (Baylar, Hanbay, & Ozpolat, 2008; Çaydaş, Hasçalık, & Ekici, 2009; Dogantekin, Yilmaz, Dogantekin, Avci, & Sengur, 2008; Huang, Kang, Chu, Chien, & Chang, 2009; Khajeh, Modarress, & Rezaee, 2008; Subasi, Serdar Yilmaz, & Binici, 2008; Wang & Chen, 2008).2 An ANN is configured for a specific application, such as pattern recognition, function approximation or data classification, in different areas of science. Time series modeling is one of the main applications. Many researchers have shown that ANNs are comparable or superior to conventional methods for estimating functions (Azadeh, Ghaderi, Tarverdian, & Saberi, 2006; Azadeh, Ghaderi, & Sohrabkhani, 2007; Azadeh, Ghaderi, Tarverdian, & Saberi, 2007; Chiang, Urban, & Baldridge, 1996; Hill, O'Connor, & Remus, 1996; Hwarng, 2001; Indro, Jiang, Patuwo, & Zhang, 1999; Jhee & Lee, 1993; Kohzadi, Boyd, Kermanshahi, & Kaastra, 1996; Stern, 1996; Tang, Almeida, & de Fishwick, 1991; Tang & Fishwick, 1993). Because a neuro-fuzzy system combines an ANN with a fuzzy system, it has the benefits of both models and is selected here instead of ANN (Werbos, 1974). Some neuro-fuzzy systems are well known by their short names, for example ANFIS (Jang, 1993), DENFIS (Kasabov, 2002), SANFIS (Wang & Lee, 2002) and FLEXNFIS (Rutkowski & Cpalka, 2003). ANFIS is used in the present study as one of the tools of the algorithm.

* Corresponding author.
E-mail address: aazadeh@ut.ac.ir (A. Azadeh).
1 Member of Young Researcher Club of Azad University of Tafresh.
2 As there are numerous papers, only some ESWA papers are cited.
0957-4174/$ - see front matter © 2009 Elsevier Ltd. All rights reserved.
doi:10.1016/j.eswa.2009.02.081

One of the main objectives of this research is to combine conventional time series concepts with ANFIS. We show that these concepts help improve ANFIS performance. The concepts are preprocessing (to make the process covariance stationary), post processing (to recover the original data) and principal component analysis (for input selection). The literature shows that traditional concepts have rarely been combined with ANFIS to model time series. Although data preprocessing is considered in some studies, the covariance stationary concept in data preprocessing is ignored (Aznarte et al., 2007; Gareta, Romeo, & Gil, 2006; Jain & Kumar, 2007; Karunasinghea & Liong, 2006; Nayak, Sudheer, Rangan, & Ramasastri, 2004; Niskaa, Hiltunen, Karppinen, Ruuskanen, & Kolehmainena, 2004; Oliveira & Meira, 2006; Tseng, Yu, & Tzeng, 2002).
Another aim is to use computer simulation as an overlapping approach with ANFIS. This study uses computer simulation to generate random variables to be used in ANFIS, whereas previous studies only use available raw data for ANFIS. Moreover, the integration of ANFIS and simulation is proposed as an alternative forecasting approach in this study and it is compared with ANFIS and time series.
Computer simulation has excellent capabilities such as proper description of system behavior, scenario analysis and forecasting. Numerous studies have been conducted in the domain of ANFIS or computer simulation; however, this is the first study that integrates ANFIS and simulation for forecasting electricity consumption. In addition, in the context of forecasting, simulation is an attractive tool because it allows generation of random variables, which can define the complex behavior of input data for forecasting problems. Simulation can help in modeling past data for the electricity demand process at relatively low cost.
Electricity, as an energy resource with an ever-growing role in the world economy and multi-purpose applications in production and consumption, has gained special attention. With the development of societies and the growth of economic activities, electricity has a growing effect on corporations and their services. Corporations use electricity as a production factor, and families rely on electricity directly or indirectly. Thus, energy consumption determines their economic welfare and that of society. A major application is the estimation of electricity consumption, which reveals the consumption growth in the forthcoming years.
This paper is organized as follows. In the next section, ANFIS is described. After introducing one of the best-known conventional models (the ARIMA model), the importance of data preprocessing is explained and different data preprocessing methods are proposed. An algorithm3 for estimation and prediction with ANFIS is also developed in Section 5. The results are shown in Section 6.

3 In this algorithm, three basic models are selected, of which two are ANFIS models and the last is a time series model. The two ANFIS models are the ANFIS model with preprocessing (ANFISW) and the ANFIS model without preprocessing (ANFISWO). The most suitable ANFIS model is then compared with the time series model with the aid of the Granger–Newbold test. The preferred time series model is selected from linear or nonlinear models. The McLeod–Li test is used for this.

2. The hybrid algorithm
The ANFIS model with preprocessed data (ANFISW) and the ANFIS model without preprocessed data (ANFISWO) are considered to determine the impact of preprocessing on ANFIS. Moreover, the raw data are simulated by computer simulation to identify their probability distribution, and the mean of the probability distribution is then used as input data for ANFIS. This is repeated for each month. The advantage of the simulation-based approach is that it shows whether the stochastic nature of the data has any impact on future demand estimation. In addition to ANFIS, the best time series model for the data set is also identified. This is done to compare the best ANFIS model against the best conventional time series model.
This algorithm has the following general basic steps:
Step 1: Collect the data set for all available previous periods. Then, the stationarity assumption should be studied for both the ANFISW model and the time series models. If the models are not covariance stationary, the most suitable preprocessing method should be selected and applied to the model. In addition, simulated data must be generated with the computer simulation approach and the above process must be applied to them as well.
Step 2: Divide the data into two sets, one for estimating the models, called the train data set, and the other for evaluating the validity of the estimated model, called the test data set. Usually the train data set contains 70–90% of all data and the remaining data are used for the test data set (Zhang and Hu, 1998).
Step 3: Run and estimate all models. Input variables for ANFIS models can be selected using the autocorrelation function (ACF). However, in most heuristic methods, input variables are selected experimentally or by trial and error (Aznarte et al., 2007; Box & Jenkins, 1970; Cybenko, 1989; Gareta et al., 2006; Hwarng, 2001; Jain & Kumar, 2007; Karunasinghea & Liong, 2006; Nayak et al., 2004; Niskaa et al., 2004; Oliveira & Meira, 2006; Palmer, Montano, & Sese, 2006; Rumelhart & McClelland, 1986; Schiffmann, Joost, & Werner, 1992; Tseng et al., 2002; Zhang & Hu, 1998; Zhang & Qi, 2005). The importance of the ACF approach becomes clear when the difficulty and imprecision of the trial and error method are considered. Unsystematic input selection is the cause of its lack of precision. Even if all the previous lag combinations are used, the trial and error method is time-consuming. For example, if all combinations are selected from the most recent 12 lags, the number of combinations is:

$$\sum_{i=1}^{12} \binom{12}{i} = 2^{12} - 1 = 4095$$

The ACF approach, in contrast, yields only a few candidate combinations for the model inputs. For the time series model, the preferred ARIMA model is selected. Input variables can be selected using the ACF and the partial autocorrelation function (PACF). The McLeod–Li test is then applied to the result of the selected model. The result of this test shows whether a nonlinear time series model must be constructed.
Step 4: Post process the estimated data in the models for which the data were preprocessed.
Step 5: The reliability of each model is evaluated in this step. The Granger–Newbold test is used to compare the models. First, the four fuzzy models (ANFISW, ANFISWO, ANFISIMW and ANFISIMWO) are compared with each other to study the impact of preprocessing on ANFIS. The most suitable ANFIS is called ANFIS*. The preferred ARIMA model is selected from the plausible models with the aid of the Granger–Newbold test and is called L1. The nonlinearity of the process is determined by the McLeod–Li test, which is applied to the result of L1. If this test indicates a nonlinear process, a preferred nonlinear model is identified and used (defined as NL). Finally, the Granger–Newbold test is used to select either ANFIS* or the time series model. The main elements of the proposed algorithm are described next.
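The size of the trial and error search space mentioned in Step 3 can be checked by direct enumeration. The following is a minimal sketch, not part of the original study; the names `LAGS`, `non_empty` and `enumerated` are illustrative. It counts every candidate input set drawn from the 12 most recent lags: there are 2^12 = 4096 subsets when the empty selection is included and 4095 non-empty ones.

```python
from itertools import combinations
from math import comb

LAGS = range(1, 13)  # the 12 most recent lags (illustrative name)

# Closed form: number of non-empty subsets of the 12 lags.
non_empty = sum(comb(12, i) for i in range(1, 13))

# Brute-force check: enumerate every candidate input set of every size.
enumerated = sum(1 for size in range(1, 13) for _ in combinations(LAGS, size))

# Including the empty selection gives the full 2**12 subsets.
print(non_empty, enumerated, 2 ** 12)
```

Each of these candidate sets would require a separate model fit under trial and error, which is why an ACF-guided shortlist is attractive.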

2.1. ANFIS
The fuzzy inference process referred to so far is known as Mamdani's fuzzy inference method, the most common methodology (Mamdani & Assilian, 1975). The so-called Takagi–Sugeno method of fuzzy inference was introduced in 1985 (Sugeno, 1985). It is similar to the Mamdani method in many respects. The first two parts of the fuzzy inference process, fuzzifying the inputs and applying the fuzzy operator, are exactly the same. The main difference between the Mamdani and Takagi–Sugeno methods is that the Takagi–Sugeno output membership functions are either linear or constant. A typical rule in a Takagi–Sugeno fuzzy model has the form:

If Input 1 = x and Input 2 = y, then Output is z = ax + by + c

For a zero-order Sugeno model, the output level z is a constant (a = b = 0).
A Takagi–Sugeno system is suited for modeling nonlinear systems by interpolating between multiple linear models. Such models are a promising alternative to econometric models. ANFIS is configured for specific applications, such as signal processing, automatic control, information retrieval, database management, computer vision and data classification, through a learning process (e.g., Jang, 1993).
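To make the rule form concrete, the sketch below evaluates a small first-order Takagi–Sugeno system with two rules. It is only an illustration under stated assumptions: the rule parameters in `RULES` are invented, Gaussian membership functions are used for the inputs, the fuzzy AND is taken as a product, and the rule outputs are combined by a firing-strength weighted average, which is the usual Sugeno convention but is not spelled out in the text above.

```python
import math

def gauss(x, sigma, c):
    """Gaussian membership function with width sigma and center c."""
    return math.exp(-((x - c) ** 2) / (2.0 * sigma ** 2))

# Hypothetical rule base: each rule holds (sigma, center) for input 1,
# (sigma, center) for input 2, and the consequent coefficients (a, b, c).
RULES = [
    ((1.0, 0.0), (1.0, 0.0), (1.0, 1.0, 0.0)),   # "x low and y low"
    ((1.0, 2.0), (1.0, 2.0), (2.0, -1.0, 1.0)),  # "x high and y high"
]

def sugeno(x, y, rules=RULES):
    """First-order Takagi-Sugeno output for the crisp inputs (x, y)."""
    num = 0.0
    den = 0.0
    for (s1, c1), (s2, c2), (a, b, c) in rules:
        w = gauss(x, s1, c1) * gauss(y, s2, c2)  # firing strength (product AND)
        z = a * x + b * y + c                    # linear rule consequent
        num += w * z
        den += w
    return num / den                             # weighted average of rule outputs

print(sugeno(0.0, 0.0), sugeno(1.0, 1.0))
```

Because every consequent is linear, the overall output interpolates smoothly between the local linear models as the firing strengths change. ANFIS tunes the membership and consequent parameters from data rather than fixing them by hand as done here.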
2.2. Conventional time series models
Time series models are widely used to predict the future behavior of a variable from its past behavior. One of the best-known time series models is the Autoregressive Integrated Moving Average (ARIMA) model. The ARIMA model belongs to a family of flexible linear time series models that can be used to model many different types of seasonal as well as non-seasonal time series. In the most popular multiplicative form, the ARIMA model can be expressed as:

$$\Phi_p(L)\, y_t = \theta_q(L)\, e_t \qquad (1)$$

with

$$\Phi_p(L) = 1 - \Phi_1 L - \dots - \Phi_p L^p, \qquad \theta_q(L) = 1 - \theta_1 L - \dots - \theta_q L^q \qquad (2)$$

where s is the seasonal length, L is the backshift operator defined by $L^k y_t = y_{t-k}$ and $e_t$ is a white noise sequence with zero mean and constant variance. Eq. (1) is often referred to as the ARIMA(p, q) model, where p and q are the orders of the autoregressive and moving average terms, respectively.
Box and Jenkins (1970) proposed a set of effective model building strategies for identification, estimation, diagnostic checking and forecasting of ARIMA models (Werbos, 1974). In the identification stage, the sample autocorrelation function (ACF) is plotted. A slowly decaying autocorrelation function suggests non-stationary behavior. In such circumstances, Box and Jenkins recommend differencing the data. A common practice is to use a logarithmic transformation if the variance does not appear to be constant. After preprocessing, if needed, the ACF and PACF of the preprocessed data are examined to determine all plausible ARIMA models. A well-estimated model should be parsimonious, fit the data well, have residuals that approximate a white noise process and give good out-of-sample forecasts. The Box–Pierce Q-statistic can be used to test whether the residuals approximate a white noise process, so models whose residuals are not white noise are eliminated from consideration. Then, the parsimony and goodness of fit of the model are checked using the Akaike information criterion (AIC) and the Schwartz Bayesian criterion (SBC). Finally, the Granger–Newbold test is applied to compare the forecasting performance of the models (Niskaa et al., 2004). For more information about AIC, SBC, the Box–Pierce Q-statistic and the Granger–Newbold test, refer to Appendices I–III.
Some nonlinear time series models were also developed, mainly by Granger and Priestley. One of these nonlinear models is the bilinear model, whose first-order form is shown in Eq. (3):

$$X_t = a X_{t-1} + b Z_t + c Z_{t-1} X_{t-1} \qquad (3)$$

in which $Z_t$ is the stochastic process and a, b and c are the model parameters. Note that only the last term of the equation is nonlinear. Another type of nonlinear model is the threshold autoregressive (TAR) model, in which the parameters depend on the past values of the process. One example of such models is described by Eq. (4):

$$X_t = \begin{cases} a_1 X_{t-1} + Z^{(1)}_{t-1} & \text{if } X_{t-1} < d \\ a_2 X_{t-1} + Z^{(2)}_{t-1} & \text{if } X_{t-1} \ge d \end{cases} \qquad (4)$$

Furthermore, the proposed algorithm fits the best linear or nonlinear model to the data set. This is important because most studies assume that linear time series models such as ARIMA provide the best fit.
2.3. Data preprocessing
In time series methods, a covariance stationary4 process is one of the basic assumptions. Using preprocessed data is also useful in most heuristic methods, which requires investigating the stationarity assumption for the models (Zhang & Hu, 1998). If the models are not covariance stationary, the most suitable preprocessing method should be defined and applied. In forecasting models, a preprocessing method should be able to transform the preprocessed data back to its original scale (called post processing). Therefore, in time series forecasting, an appropriate preprocessing method should have two main properties: it should make the process stationary and it must have the post processing capability. The most useful preprocessing methods are studied in the following sections.

4 By definition, an ARIMA model is covariance stationary if it has a finite and time-invariant mean and covariance.

2.3.1. Normalization
There are different normalization algorithms, namely Min–Max normalization, Z-score normalization and sigmoid normalization. Min–Max normalization scales the numbers in a data set to improve the accuracy of the subsequent numeric computations. Tseng et al. (2002), Nayak et al. (2004), Niskaa et al. (2004), Karunasinghea and Liong (2006), Oliveira and Meira (2006), Gareta et al. (2006), Aznarte et al. (2007), and Jain and Kumar (2007) used this method to estimate time series functions using a heuristic approach.
If $x_{old}$, $x_{max}$ and $x_{min}$ are the original, maximum and minimum values of the raw data, respectively, and $x'_{max}$ and $x'_{min}$ are the maximum and minimum of the normalized data, respectively, then the normalization of $x_{old}$, called $x'_{new}$, can be obtained by the following transformation function:

$$x'_{new} = \frac{x_{old} - x_{min}}{x_{max} - x_{min}} \left( x'_{max} - x'_{min} \right) + x'_{min} \qquad (5)$$

In Z-score normalization the data are changed so that their mean and variance are 0 and 1, respectively. The transformation function used for this method is as follows, where std is the standard deviation of the raw data:

$$x_{new} = \frac{x_{old} - \text{mean}}{\text{std}} \qquad (6)$$

The sigmoidal normalization uses a sigmoid function to scale the data to the range [−1, 1]. The transformation function is as follows:

$$x_{new} = \frac{1 - e^{-\alpha}}{1 + e^{-\alpha}}, \qquad \alpha = \frac{x_{old} - \text{mean}}{\text{std}} \qquad (7)$$

2.3.2. The first difference method
The first step in the Box–Jenkins method is to transform the data so as to make it stationary. The difference method was proposed by Box and Jenkins (1970) and Werbos (1974). Tseng et al. (2002) also used this method to estimate time series functions using a heuristic approach. The following transformation is applied for the method:

$$y_t = x_t - x_{t-1} \qquad (8)$$
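The transformations introduced so far, together with the post processing step that returns differenced data to the original scale, can be sketched in a few lines. This is an illustrative sketch rather than the authors' implementation. The function names are hypothetical, and the Z-score uses the population standard deviation, which the text does not specify.

```python
import statistics

def min_max(xs, new_min=0.0, new_max=1.0):
    """Min-Max normalization of xs onto [new_min, new_max]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) * (new_max - new_min) + new_min for x in xs]

def z_score(xs):
    """Z-score normalization (population standard deviation assumed)."""
    mean = statistics.fmean(xs)
    std = statistics.pstdev(xs)
    return [(x - mean) / std for x in xs]

def first_difference(xs):
    """y_t = x_t - x_{t-1}; the result is one element shorter than xs."""
    return [b - a for a, b in zip(xs, xs[1:])]

def undo_first_difference(first_value, diffs):
    """Post processing: rebuild the original series from its first value."""
    out = [first_value]
    for d in diffs:
        out.append(out[-1] + d)
    return out

series = [1.0, 4.0, 9.0, 16.0]
diffs = first_difference(series)
print(min_max(series), diffs, undo_first_difference(series[0], diffs))
```

Note that differencing is invertible only if the first observation is kept, which is what makes it usable where post processing is required.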

For the first difference of the logarithm method, the transformation is adjusted as follows:

$$y_t = \log(x_t) - \log(x_{t-1}) \qquad (9)$$

2.4. The McLeod–Li test
McLeod and Li (1983) proposed a test to detect nonlinearity in time series data. The McLeod–Li test seeks to determine whether there are significant autocorrelations in the squared residuals from a linear equation (Zhang, 2001). To perform the test, the autocorrelation coefficients in the Ljung–Box statistic are replaced by the autocorrelation coefficients of the squared residuals. This statistic determines whether the squared residuals exhibit serial correlation. The Ljung–Box statistic is discussed in Appendix II.

2.5. The Granger–Newbold test
Granger and Newbold (1974) proposed a test to compare two time series models (Ghiassi, Saidane, & Zimbra, 2005). First, the elements $x_t$ and $z_t$ are formed as follows, where $er_{1t}$ and $er_{2t}$ are the forecast errors of models 1 and 2 in period t:

$$x_t = er_{1t} + er_{2t} \quad \text{and} \quad z_t = er_{1t} - er_{2t} \qquad (10)$$

Let $r_{xz}$ denote the sample correlation coefficient between $\{x_t\}$ and $\{z_t\}$. Granger and Newbold show that $r_{xz} / \sqrt{(1 - r_{xz}^2)/(H - 1)}$ has a t-distribution with $H - 1$ degrees of freedom, where H is the number of forecast errors. Thus, if $r_{xz}$ is statistically different from zero, model 1 has a larger mean square error (MSE) than model 2 if $r_{xz}$ is positive, and model 2 has a larger MSE than model 1 if $r_{xz}$ is negative.

2.6. Error estimation methods
There are four basic error estimation methods, which are listed below:

Mean absolute error (MAE)
Mean square error (MSE)
Root mean square error (RMSE)
Mean absolute percentage error (MAPE)

They can be calculated by the following equations, respectively, where $x_t$ is the actual value, $x'_t$ is the estimated value and n is the number of observations:

$$\mathrm{MAE} = \frac{\sum_{t=1}^{n} |x_t - x'_t|}{n}, \quad \mathrm{MSE} = \frac{\sum_{t=1}^{n} (x_t - x'_t)^2}{n}, \quad \mathrm{RMSE} = \sqrt{\frac{\sum_{t=1}^{n} (x_t - x'_t)^2}{n}}, \quad \mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \left| \frac{x_t - x'_t}{x_t} \right| \qquad (11)$$

All methods except MAPE are scale-dependent. MAPE is the most suitable method for estimating the relative error because the input data used for model estimation, the preprocessed data and the raw data have different scales (Azadeh, Ghaderi, & Sohrabkhani, 2007; Azadeh, Ghaderi, Tarverdian et al., 2007). Therefore, MAPE is used as the major reference in this study.

3. The case study
The proposed algorithm is applied to a set of 130 data points, which are the monthly consumption in Iran from April 1994 to February 2005. Detailed information can be obtained from Electric Power Industry in Iran (including transmission and distribution), published by the TAVANIR management organization (1992–2005). Because the simulated data are central to the algorithm, the simulation process is discussed first.

3.1. The simulated data
For this purpose, the distribution function of each month is fitted and then, using the fitted function for each month, the average value of that month is obtained. In this way, instead of using the deterministic value for each month, we use the average value of the probability distribution to which the real value belongs. When the distribution of each month is found, the square error and p-value of that distribution are also returned.
An example of the selected distribution functions for each month is shown in Table 1. The distribution functions are selected from a series of candidate distribution functions according to their p-values and square errors. Table 2 shows an example of such a selection for the period of 3/2001 to 6/2001 (see Fig. 1).

Table 1
Electricity consumption distribution for each month in 2001.
Date       Distribution
3/2001     Normal (285000, 20900)
4/2001     Triangular (297000, 332000, 536000)
5/2001     Normal (365000, 18300)
6/2001     Normal (403000, 14200)
7/2001     Normal (415000, 12300)
8/2001     Normal (387000, 19000)
9/2001     Normal (341000, 17200)
10/2001    Normal (325000, 11400)
11/2001    Normal (323000, 14700)
12/2001    Normal (323000, 10000)
1/2002     Normal (330000, 12000)
2/2002     Normal (285000, 20900)

Table 2
Distribution functions for 3/2001 to 6/2001.
Month     Distribution function   Square error   p-value
3/2001    Uniform                 0.077          0.045
3/2001    Erlang                  0.153          0.038
3/2001    Exponential             0.153          0.038
3/2001    Normal                  0.100          >0.15
3/2001    Triangular              0.099          <0.01
3/2001    Gamma                   0.160          <0.01
3/2001    Lognormal               0.249          <0.01
4/2001    Uniform                 0.138          >0.15
4/2001    Erlang                  0.225          <0.01
4/2001    Exponential             0.225          <0.01
4/2001    Normal                  0.082          >0.14
4/2001    Triangular              0.074          >0.15
4/2001    Gamma                   0.245          <0.01
4/2001    Lognormal               0.348          <0.01
5/2001    Uniform                 0.046          <0.01
5/2001    Erlang                  0.144          <0.01
5/2001    Exponential             0.144          <0.01
5/2001    Normal                  0.025          >0.14
5/2001    Triangular              0.002          >0.15
5/2001    Gamma                   0.150          <0.01
5/2001    Lognormal               0.257          <0.01
6/2001    Uniform                 0.123          <0.01
6/2001    Erlang                  0.225          <0.01
6/2001    Exponential             0.225          <0.01
6/2001    Normal                  0.093          >0.15
6/2001    Triangular              0.061          0.124
6/2001    Gamma                   0.233          <0.01
6/2001    Lognormal               0.308          <0.01

Fig. 1. The hybrid simulated ANFIS time series algorithm.

Fig. 2. The network to simulate electricity consumption average values.

Using these results, the average value for each month is simulated with Visual Slam (Enlin, 1995). The selected distribution function for each month is sampled 1000 times to reach steady state. An example of the Visual Slam network can be seen in Fig. 2. The outputs of the simulation are the average and standard deviation of daily consumption values. The upper and lower limits are then constructed as μ ± 3σ. Next, the daily results are multiplied by the number of days in the month to obtain the monthly consumption values.
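The simulation step described above can be approximated without a dedicated simulation package. The sketch below is an assumption-laden illustration, not the Visual Slam network of the study. It treats the two parameters of a Normal entry in Table 1 as the mean and standard deviation of daily consumption, draws 1000 samples with Python's `random` module, and then forms the μ ± 3σ limits and the monthly total. The function name `simulate_month` and its arguments are hypothetical.

```python
import random
import statistics

def simulate_month(mean, sd, days, n_runs=1000, seed=0):
    """Monte Carlo estimate of daily and monthly consumption for one month."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    draws = [rng.gauss(mean, sd) for _ in range(n_runs)]
    mu = statistics.fmean(draws)                   # simulated daily average
    sigma = statistics.stdev(draws)                # simulated daily std deviation
    return {
        "daily_mean": mu,
        "daily_sd": sigma,
        "lower": mu - 3.0 * sigma,                 # lower limit, mu - 3 sigma
        "upper": mu + 3.0 * sigma,                 # upper limit, mu + 3 sigma
        "monthly": mu * days,                      # daily average times days in month
    }

# Example with the Normal (285000, 20900) entry of Table 1 and a 31-day month.
result = simulate_month(285000.0, 20900.0, days=31)
print(round(result["daily_mean"]), round(result["monthly"]))
```

The simulated average, rather than the single recorded value, is what would then be passed to ANFIS as the input for that month.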
3.2. Step 1
It can be seen from Fig. 3a that the raw data have a trend. Removing the trend is needed for more precise estimation in time series methods and also for studying the impact of preprocessing on ANFIS, so all preprocessing methods are applied to both the ANFISW and ARIMA models. The best preprocessing method is then selected to convert the model to a covariance stationary process. The results of applying the preprocessing methods to the given data set are discussed in the next section.

3.3.1. Normalization
All three normalization methods are used to preprocess the data. However, as the normalized consumption data in Fig. 3b–d show, none of the normalization methods removes the trend of the data. Therefore, normalization is not suitable for preprocessing the data set.

3.3.2. The first difference method
The preprocessed data using the first difference method are shown in Fig. 3e. Although the first difference of the series seems to have a constant mean, the variance has an increasing pattern over time. Thus, the differenced series is not covariance stationary and cannot be used for data preprocessing as prescribed by the algorithm. It can be seen in Fig. 3f that the first difference of the logarithm is the most likely candidate for a covariance stationary process. It is therefore the most applicable preprocessing method for the data set.

Fig. 3. Raw and preprocessed data by different methods: (a) raw data; (b) data preprocessed by the Min–Max normalization method; (c) by the Z-score normalization method; (d) by the sigmoidal normalization method; (e) by the first difference method; (f) by the first difference of the logarithm method.

Fig. 4. The ACF and PACF chart for preprocessed data.

3.3. Step 2
The 130 rows of raw data are divided into a training set of 118 rows and a test set of 12 rows. Likewise, the 129 rows of preprocessed data (one observation is lost to differencing) are divided into a training set of 117 rows and a test set of 12 rows.
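The split in Step 2 is chronological: the last 12 months are held out and everything before them is used for training. A minimal sketch follows, with a hypothetical helper name `holdout_split` and placeholder integers standing in for the consumption series.

```python
def holdout_split(series, n_test=12):
    """Keep the last n_test observations for testing, the rest for training."""
    if not 0 < n_test < len(series):
        raise ValueError("n_test must be between 1 and len(series) - 1")
    return series[:-n_test], series[-n_test:]

raw = list(range(130))            # placeholder for the 130 raw observations
preprocessed = list(range(129))   # placeholder for the 129 differenced values

train_raw, test_raw = holdout_split(raw)
train_pre, test_pre = holdout_split(preprocessed)
print(len(train_raw), len(test_raw), len(train_pre), len(test_pre))
```

Keeping the test block at the end of the series avoids training on observations that come after the months being forecast.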

Table 3
Architecture of the 10 models and network errors.
Model   Number of terms for each linguistic variable   MAPE
1       1                                              0.036
2       2                                              0.025
3       3                                              0.045
4       4                                              0.021
5       5                                              0.015
6       6                                              0.025
7       7                                              0.035
8       8                                              0.039
9       9                                              0.025
10      10                                             0.026

Fig. 5. The comparison of ANFISW output with actual data (x-axis: test months, Feb 2004 to Jan 2005; y-axis: electricity consumption in kWh).

Fig. 6. 5th ANFIS model structure (input layer, input membership function layer, rule layer, output membership function layer, output layer).

Fig. 7. The comparison of ANFISWO output with actual data (x-axis: test months, Feb 2004 to Jan 2005; y-axis: electricity consumption in kWh).

Fig. 8. The comparison of ANFISIMW output with actual data (x-axis: test months, Feb 2004 to Jan 2005; y-axis: electricity consumption in kWh).

Fig. 9. The comparison of ANFISIMWO output with test actual data (x-axis: test months, Feb 2004 to Jan 2005; y-axis: electricity consumption in kWh).



Table 4
The Q-statistics, AIC, SBC and coefficient estimates.
AR (2)

ARIMA (1,1)

ARIMA (1,(1,12))

AR (2, MA(12))

0.009
0.201

0.008
0.235
0.174

0.009
0.616

0.003
0.235

0.008
0.122
0.076

0.879

0.149
0.757
1,3 (0.85)
7.2
(0.51)
11.5
(0.41)
648
640

15.6 (0.00)
25.005
(0.00)
31.35
(0.05)
623
620

Q(12)
(0.00)
AIC
SBC

8.94 (0.07)
16.04
(0.05)
21.4
(0.11)
632
625

3.4. Step 3
For fuzzy regression models, the ACF approach is used to select input variables among 12 lags. According to Fig. 4, y_t is a function of consumption at the 1st lag in the preprocessed data. Similarly, y_t is a function of consumption at the 1st and 12th lags in the raw data. Input variables for the simulated data are also selected with the ACF.
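A sketch of how lags could be shortlisted from the sample ACF is given below. It is illustrative only. The significance rule |r_k| > 1.96/√n is a common large-sample approximation and an assumption here, since the text does not state which cutoff was used when reading Fig. 4. The function names are hypothetical, and the demonstration series is a synthetic seasonal signal, not the consumption data.

```python
import math

def acf(xs, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of the series xs."""
    n = len(xs)
    mean = sum(xs) / n
    dev = [x - mean for x in xs]
    c0 = sum(d * d for d in dev)                  # lag-0 sum of squares
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(1, max_lag + 1)
    ]

def significant_lags(xs, max_lag=12):
    """Lags whose sample autocorrelation exceeds the approximate 95% bound."""
    bound = 1.96 / math.sqrt(len(xs))             # assumed cutoff, see lead-in
    return [k for k, r in enumerate(acf(xs, max_lag), start=1) if abs(r) > bound]

# Synthetic monthly series with a 12-period cycle (10 full years).
demo = [math.sin(2.0 * math.pi * t / 12.0) for t in range(120)]
print(significant_lags(demo))
```

Only the lags returned by such a screen would be tried as ANFIS inputs, instead of every subset of the 12 most recent lags.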
3.4.1. Estimation of electricity consumption by ANFISW
In order to get the best ANFIS for the estimation of electricity consumption, 10 models are tested to find the best architecture. The MAPE values of these models are shown in Table 3. Gaussian and bell-shaped membership functions are increasingly popular for specifying fuzzy sets because they are nonlinear and smooth and their derivatives are continuous, so gradient methods can easily be used to optimize their design parameters. In the present study, the Gaussian membership function5 is used as the term membership function. The 5th model has the lowest MAPE in Table 3 (0.015) and is therefore selected for estimating the electricity consumption.
The comparison of ANFISW output with the actual test data is shown in Fig. 5. Fig. 6 shows the structure of the fifth model.
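Since the architectures are ranked by MAPE, it may help to see the four error measures of Section 2.6 side by side. The sketch below uses hypothetical function names and made-up numbers, not values from the study.

```python
import math

def mae(actual, estimate):
    """Mean absolute error."""
    return sum(abs(a - e) for a, e in zip(actual, estimate)) / len(actual)

def mse(actual, estimate):
    """Mean square error."""
    return sum((a - e) ** 2 for a, e in zip(actual, estimate)) / len(actual)

def rmse(actual, estimate):
    """Root mean square error."""
    return math.sqrt(mse(actual, estimate))

def mape(actual, estimate):
    """Mean absolute percentage error, as a fraction (0.1 means 10%)."""
    return sum(abs((a - e) / a) for a, e in zip(actual, estimate)) / len(actual)

actual = [100.0, 200.0]       # illustrative values only
estimate = [110.0, 180.0]
print(mae(actual, estimate), mse(actual, estimate), mape(actual, estimate))
```

MAE, MSE and RMSE change if the series is rescaled, whereas MAPE does not, which is why it can be compared across raw and preprocessed data.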
3.4.2. Estimation of electricity consumption by ANFISWO
With a similar approach, the model with a MAPE value of 0.05 is selected for estimating the electricity consumption. The comparison of ANFISWO with actual data is shown in Fig. 7. Figs. 8 and 9 show the comparison of actual data with ANFISIMW and ANFISIMWO, respectively.
3.4.3. Estimation of electricity consumption by time series model
In order to find the preferred time series model, appropriate ARIMA models must be estimated. The preferred ARIMA model is selected with the aid of the Granger–Newbold test. The McLeod–Li test is then applied to the result of this model. The result of this test shows whether a nonlinear time series model must be used. The following sections show that the ARIMA model is sufficient for the case study. Moreover, previous research shows that linear time series models are well suited to this case study (Zhu, 1998).

Footnote 5: $\mathrm{Gauss}(x; \sigma, c) = e^{-\frac{(x-c)^2}{2\sigma^2}}$

0.885
2.02 (0.72)
8.5
(0.41)
12.1
645
642

components. A seasonal factor at lag 12 is incorporated because monthly data are available. Therefore, 5 models are considered for the training data set: AR(1), AR(2), ARIMA(1,1), ARIMA(1,(1,12)) and AR(2, MA(12)). Table 4 shows model information including Q-statistics, AIC, SBC and coefficient estimates.
The Box–Pierce Q-statistic test shows that AR(1) and AR(2) should be eliminated. As measured by AIC and SBC, ARIMA(1,1) and AR(2, MA(12)) do not fit the data as well as ARIMA(1,(1,12)). Also, the Granger–Newbold test shows that ARIMA(1,(1,12)) has better forecasting performance than ARIMA(1,1) and AR(2, MA(12)). The values of the t-statistic are 2.52 and 2.32, respectively.
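The Granger–Newbold comparison of two forecasting models can be sketched as below, in its usual textbook form: with forecast errors e1 and e2 over H periods, form x = e1 + e2 and z = e1 - e2, compute their sample correlation r_xz, and use t = r_xz / sqrt((1 - r_xz^2) / (H - 1)). A positive, significant r_xz indicates that model 1 has the larger mean squared error. The use of the ordinary Pearson coefficient, the function names and the error values are assumptions for illustration.

```python
import math

def pearson(u, v):
    """Ordinary sample (Pearson) correlation of two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((b - mv) ** 2 for b in v))
    return cov / (su * sv)

def granger_newbold(e1, e2):
    """Return (r_xz, t_stat) for forecast errors e1 (model 1) and e2 (model 2)."""
    h = len(e1)
    x = [a + b for a, b in zip(e1, e2)]
    z = [a - b for a, b in zip(e1, e2)]
    r_xz = pearson(x, z)
    t_stat = r_xz / math.sqrt((1 - r_xz ** 2) / (h - 1))
    return r_xz, t_stat

# Hypothetical forecast errors: model 1 is clearly less accurate than model 2.
e1 = [2.0, -1.5, 2.5, -2.0, 1.0, -3.0]
e2 = [0.5, -0.2, 0.4, -0.6, 0.1, 0.3]
r_xz, t_stat = granger_newbold(e1, e2)
print(r_xz, t_stat)
```

The statistic would be compared with a t distribution with H - 1 degrees of freedom; that lookup is left out to keep the sketch free of external libraries.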
3.4.3.2. The McLeod–Li test. Residuals of ARIMA(1,(1,12)) are used to run the McLeod–Li test. Examination of Table 5 shows that the nonlinearity condition is not satisfied. So, ARIMA(1,(1,12)) is called TM.
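A minimal sketch of the McLeod–Li idea follows: the Ljung–Box Q-statistic, Q = T(T+2) sum_k r_k^2 / (T - k), is applied to the squared residuals, and large values point to nonlinearity. The residual values are made up, the function names are hypothetical, and the 5% chi-square critical values are standard table values rather than figures from the paper. The last lines compare the Q values reported in Table 5 with those critical values.

```python
def acf(values, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    n = len(values)
    mean = sum(values) / n
    denom = sum((v - mean) ** 2 for v in values)
    return [
        sum((values[t] - mean) * (values[t - k] - mean) for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]

def mcleod_li_q(residuals, max_lag):
    """Ljung-Box Q-statistic computed on squared residuals."""
    squared = [e * e for e in residuals]
    t_obs = len(squared)
    r = acf(squared, max_lag)
    return t_obs * (t_obs + 2) * sum(
        r[k - 1] ** 2 / (t_obs - k) for k in range(1, max_lag + 1)
    )

# Hypothetical residuals, only to show the call.
resid = [0.5, -1.2, 0.3, 2.0, -0.7, 1.1, -0.4, 0.9, -1.5, 0.2]
q4 = mcleod_li_q(resid, 4)

# Standard 5% chi-square critical values for 4, 8 and 12 degrees of freedom.
CHI2_5PCT = {4: 9.488, 8: 15.507, 12: 21.026}
# Q values as reported in Table 5.
reported = {4: 1.4, 8: 7.3, 12: 11.2}
rejects = {s: q > CHI2_5PCT[s] for s, q in reported.items()}
print(q4, rejects)
```

None of the reported statistics exceeds its critical value, which agrees with the conclusion that a linear model is adequate.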
3.4.3.3. Nonlinear time series model. As mentioned, the ARIMA model is sufficient for our data and there is no need to identify an appropriate nonlinear time series model. The comparison of ARIMA(1,(1,12)) output with actual test data is shown in Fig. 10.

Table 5
The McLeod–Li test results.

                        Q(4)          Q(8)        Q(12)
Ljung–Box Q-statistic   1.4 (0.846)   7.3 (0.5)   11.2 (0.51)

[Fig. 10 plot residue: vertical axis from 0 to 20,000,000; horizontal axis Feb(2004) to Jan(2005)]

3.4.3.1. ARIMA model. To find the best time series model, the preprocessed approach is used. Fig. 4 shows the ACF and PACF charts. The theoretical ACF of a pure MA(q) process cuts off to zero at lag q, and the theoretical ACF of an AR(1) model decays geometrically. Examination of Fig. 4 suggests that neither of these specifications seems appropriate for the electricity consumption. The ACF does not decay geometrically and it is suggestive of an AR(2) process or a process with both autoregressive and moving average

4.66 (0.33)
11.642
(0.17)
18.13
(0.501)
639
631

Electricity Consumption(Kwh)

A0
A1
A2
B1
B2
Q(4)
Q(8)

AR (1)

Test months
Actual data
ARIMA Monthly Electricity
Consumption Estimation
Fig. 10. The comparison of ARIMA output with actual data.

A. Azadeh et al. / Expert Systems with Applications 36 (2009) 11108–11117

Table 6
The MAPE estimation of the models.

        ARIMA model   ANFISW model   ANFISWO model   ANFISIMW model   ANFISIMWO model
MAPE    0.103         0.015          0.05            0.019            0.07

[Fig. 11 plot residue: electricity consumption estimation, vertical axis from 0 to 20,000,000; horizontal axis Feb(2005) to Jan(2006)]
Fig. 11. Monthly electrical energy consumption prediction (2005–2006).

Table 7
The MAPE estimation of the proposed algorithm versus GA and ANN.

        Genetic algorithm   Artificial neural network   The proposed algorithm
MAPE    0.014               0.0156                      0.0155

3.5. Step 4
Since data are preprocessed for the ANFISW, ANFISIMW and ARIMA models, the estimated data obtained by these models should be postprocessed. Let ANFISx_i (i = 1, ..., 12) be the ANFISW output for preprocessed test data. ANFISx_i is postprocessed by this formula:

$$10^{\mathrm{ANFIS}x_i} \times x_{i-1} \qquad (24)$$

Postprocessing of the ARIMA output is similar to the above-mentioned case.

3.6. Step 5
ANFISWO (model 1) and ANFISW (model 2) are compared by the Granger–Newbold test. The value of the t-statistic (2.602) is statistically different from zero. Since r_xz (which is explained in Appendix I) is positive, ANFISW has better forecasting performance than the ANFISWO model. The same procedure is also performed to compare ANFISIMW with ANFISIMWO. Finally, the test results show that ANFIS* (ANFISW) has better forecasting performance than TM (ARIMA(1,(1,12))). Also, it can be seen from Table 6 that ANFIS* has the lowest MAPE, which shows the efficiency of ANFISW among the other models.

4. Comparison with other intelligent methods
With the aid of the ANFISW model, electricity consumption for the next 12 months is forecasted (Fig. 11). Table 7 shows the MAPE estimation for genetic algorithm (GA) and artificial neural network (ANN) versus the proposed algorithm (Azadeh, Ghaderi, & Sohrabkhani, 2007; Azadeh, Ghaderi, Tarverdian et al., 2007). Examination of this table shows that the proposed algorithm provides good estimation with respect to GA and ANN.

5. Conclusion
This paper presented a hybrid adaptive network fuzzy inference system (ANFIS), computer simulation and time series techniques to estimate and predict energy consumption. The difficulty of modeling electricity consumption with approaches such as time series is the reason for proposing the hybrid approach of this study. The algorithm is ideal for uncertain, ambiguous and complex estimation and forecasting. Computer simulation was developed to generate random variables for monthly electricity consumption. Various structures of ANFIS were examined and the preferred model was selected for estimation by the proposed algorithm. Finally, the preferred ANFIS and time series models were selected by the Granger–Newbold test. Monthly electricity consumption in Iran from 1995 to 2005 was considered as the case of this study. The superiority of the proposed algorithm is shown by comparing its results with genetic algorithm (GA) and artificial neural network (ANN).
This is the first study that presented a hybrid simulation-adaptive network fuzzy inference system (ANFIS) for improvement of electricity consumption estimation. The unique features of the proposed algorithm are twofold. First, ANFIS is ideal for complex and uncertain data because it is composed of both ANN and fuzzy systems. Second, Monte Carlo simulation is used to generate input variables, whereas the conventional methods use deterministic data. The superiority of the proposed algorithm is shown by comparing its results with time series, genetic algorithm (GA) and ANN. Furthermore, the MAPE estimation of GA and ANN versus the proposed algorithm showed the appropriateness of the proposed algorithm.

Appendix I. Akaike information criterion (AIC) and Schwartz Bayesian Criterion (SBC)
The two most commonly used model selection criteria are the AIC and the SBC. These criteria are used to select the most appropriate model. They have the following formulas:

$$\mathrm{AIC} = T \ln(\text{sum of squared residuals}) + 2n$$

$$\mathrm{SBC} = T \ln(\text{sum of squared residuals}) + n \ln(T)$$

where n = number of parameters estimated (p + q + possible constant term) and T = number of usable observations.
Ideally, the AIC and SBC will be as small as possible (note that both can be negative). As the fit of the model improves, the AIC and SBC will approach $-\infty$. Model A is said to fit better than model B if AIC (or SBC) for A is smaller than for B.
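The two criteria translate directly into code. The sketch below follows the formulas of this appendix; the function names and the sums of squared residuals are hypothetical values chosen only to show the comparison.

```python
import math

def aic(ssr, t_obs, n_params):
    """AIC = T * ln(sum of squared residuals) + 2n."""
    return t_obs * math.log(ssr) + 2 * n_params

def sbc(ssr, t_obs, n_params):
    """SBC = T * ln(sum of squared residuals) + n * ln(T)."""
    return t_obs * math.log(ssr) + n_params * math.log(t_obs)

# Hypothetical fits on T = 120 observations: model B adds one parameter
# and reduces the sum of squared residuals only slightly.
model_a = {"ssr": 50.0, "n": 2}
model_b = {"ssr": 49.8, "n": 3}
for label, m in (("A", model_a), ("B", model_b)):
    print(label, aic(m["ssr"], 120, m["n"]), sbc(m["ssr"], 120, m["n"]))
```

Because ln(T) exceeds 2 once T is at least 8, the SBC penalizes extra parameters more heavily than the AIC for samples of that size.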
Appendix II. Box–Pierce Q-statistic
The Q-statistic can be used to test whether a group of autocorrelations is significantly different from zero. Box and Pierce used the sample autocorrelations to form the Q-statistic, which has the following formula. Let there be T observations labeled $y_1, \ldots, y_T$. We can let $\bar{y}$ and $r_s$ be estimates of $\mu$ and $\rho_s$, respectively, where:

$$Q = T \sum_{k=1}^{s} r_k^2$$

$$r_k = \frac{\sum_{t=k+1}^{T} (y_t - \bar{y})(y_{t-k} - \bar{y})}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$$

$$\bar{y} = \frac{1}{T} \sum_{t=1}^{T} y_t$$

Under the null hypothesis that all values of $r_k = 0$, Q is asymptotically $\chi^2$ distributed with s degrees of freedom. The intuition behind the use of the statistic is that high sample autocorrelations lead to large values of Q. Certainly, a white noise process (in which all autocorrelations should be zero) would have a Q value of zero. If the calculated value of Q exceeds the appropriate value in a $\chi^2$ table, the null hypothesis of no significant autocorrelations can be rejected.
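The Q-statistic defined above can be computed with a few lines of code. This is a sketch under the definitions of this appendix; the function name and the tiny input series are illustrative only.

```python
def box_pierce_q(series, s):
    """Box-Pierce Q = T * sum of squared sample autocorrelations r_1 .. r_s."""
    t_obs = len(series)
    mean = sum(series) / t_obs
    denom = sum((y - mean) ** 2 for y in series)
    total = 0.0
    for k in range(1, s + 1):
        num = sum(
            (series[t] - mean) * (series[t - k] - mean) for t in range(k, t_obs)
        )
        total += (num / denom) ** 2
    return t_obs * total

# Tiny illustrative series: r_1 = 0.25, so Q = 4 * 0.25^2.
q1 = box_pierce_q([1, 2, 3, 4], 1)
print(q1)
```

The result would then be compared with a chi-square critical value with s degrees of freedom, for example 9.488 at the 5% level when s = 4.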
References
Azadeh, A., Ghaderi, S. F., Tarverdian, S., & Saberi, M. (2006). Integration of artificial
neural networks and genetic algorithm to predict electrical energy
consumption. In Proceeding of the 32nd annual conference of the IEEE industrial
electronics society IECON06. Paris, France: Conservatoire National des Arts and
Metiers.
Azadeh, A., Ghaderi, S. F., & Sohrabkhani, S. (2007). Forecasting electrical
consumption by integration of neural network, time series and ANOVA.
Applied Mathematics and Computation, 186(2), 1753–1761.
Azadeh, A., Ghaderi, S. F., Tarverdian, S., & Saberi, M. (2007). Integration of artificial
neural networks and genetic algorithm to predict electrical energy
consumption. Applied Mathematics and Computation, 2, 1731–1741.
Aznarte, J. L., Sanchez, J. M. B., Lugilde, D. N., Fernandez, C. D. L., Guardia, C. D., &
Sanchez, F. A. (2007). Forecasting airborne pollen concentration time series
with neural and neuro-fuzzy models. Expert Systems with Applications, 32(4),
1218–1225.
Baylar, A., Hanbay, D., & Ozpolat, E. (2008). An expert system for predicting aeration
performance of weirs by using ANFIS. ESWA, 35(3), 1214–1222.
Box, G. E. P., & Jenkins, G. M. (1970). Time series analysis: Forecasting and control. San
Francisco: Holden Day (Rev. ed. 1976).
Çaydaş, U., Hasçalık, A., & Ekici, S. (2009). An adaptive neuro fuzzy inference system
(ANFIS) model for wire EDM. Expert Systems with Applications, 36(3), 6135–6139.
Chiang, W. C., Urban, T. L., & Baldridge, G. W. (1996). A neural network approach to
mutual fund net asset value forecasting, Omega. International Journal of
Management Science, 24, 205–215.
Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function.
Mathematics of Control, Signals and Systems, 2, 303–314.
Dogantekin, E., Yilmaz, M., Dogantekin, A., Avci, E., & Sengur, A. (2008). A robust
technique based on invariant moments ANFIS for recognition of human
parasite eggs in microscopic images. ESWA, 35(3), 728–738.
Enlin, M. (1995). Use of principal component analysis in selecting plants for
fluoride. Chinese Journal of Ecology, 14(3).
Gareta, R., Romeo, L. M., & Gil, A. (2006). Forecasting of electricity prices with neural
networks. Energy Conversion and Management, 47, 1770–1778.
Ghiassi, M., Saidane, H., & Zimbra, D. K. (2005). A dynamic artificial neural network
model for forecasting time series events. International Journal of Forecasting, 21,
341–362.
Granger, C. W. J., & Newbold, P. (1974). Spurious regressions in econometrics.
Journal of Econometrics, 2, 111–120.
Hill, T., O'Connor, M., & Remus, W. (1996). Neural network models for time series
forecasts. Management Science, 42(7), 1082–1092.
Huang, Y.-R., Kang, Y., Chu, M. H., Chien, S. Y., & Chang, Y. P. (2009). Modified
recurrent neuro-fuzzy network for modeling ball-screw servomechanism by
using Chebyshev polynomial. Expert Systems with Applications, 36(3), 5317–5326.
Hwarng, H. B. (2001). Insights into neural-network forecasting of time series
corresponding to ARMA (p, q) structures. Omega, 29, 273–289.
Indro, D. C., Jiang, C. X., Patuwo, B. E., & Zhang, G. P. (1999). Predicting mutual fund
performance using artificial neural networks, Omega. International Journal of
Management Science, 27, 373–380.
Jain, A., & Kumar, A. M. (2007). Hybrid neural network models for hydrologic time
series forecasting. Applied Soft Computing, 7(2), 585–592.
Jang, J.-S. R. (1993). ANFIS: adaptive-network-based fuzzy inference system. IEEE
Transactions on Systems Man and Cybernetics, 23, 665–685.
Jhee, W. C., & Lee, J. K. (1993). Performance of neural networks in managerial
forecasting. Intelligent Systems in Accounting, Finance and Management, 2,
55–71.
Karunasinghea, D. S. K., & Liong, S. Y. (2006). Chaotic time series prediction with a
global model artificial neural network. Journal of Hydrology, 323, 92–105.
Kasabov, N. (2002). DENFIS: dynamic evolving neural-fuzzy inference system and
its application for time-series prediction. IEEE Transaction on Fuzzy Systems, 10,
144–154.
Khajeh, A., Modarress, H., & Rezaee, B. (2008). Application of adaptive neuro-fuzzy
inference system for solubility prediction of carbon dioxide in polymers. Expert
Systems with Applications, 36(3), 5728–5732.
Kohzadi, N., Boyd, M. S., Kermanshahi, B., & Kaastra, I. (1996). A comparison of
artificial neural network and time series models for forecasting commodity
prices. Neurocomputing, 10(3), 169–181.
Mamdani, E. H., & Assilian, S. (1975). An experiment in linguistic synthesis with
a fuzzy logic controller. International Journal of Man–Machine Studies, 7(1),
1–13.
McLeod, A. I., & Li, W. K. (1983). Diagnostic checking ARMA time series models using
squared-residual autocorrelations. Journal of Time Series Analysis, 4, 269–273
[Monti, A.C., 1994].
Nayak, P. C., Sudheer, K. P., Rangan, D. M., & Ramasastri, K. S. (2004). A neuro-fuzzy
computing technique for modeling hydrological time series. Journal of
Hydrology, 291, 52–66.
Niskaa, H., Hiltunen, T., Karppinen, A., Ruuskanen, J., & Kolehmainena, M. (2004).
Evolving the neural network model for forecasting air pollution time series.
Engineering Applications of Artificial Intelligence, 17, 159–167.
Oliveira, A. L. I., & Meira, S. R. L. (2006). Detecting novelties in time series through
neural networks forecasting with robust confidence intervals. Neurocomputing,
70(1–3), 79–92.
Palmer, A., Montano, J. J., & Sese, A. (2006). Designing an artificial neural network for
forecasting tourism time series. Tourism Management, 27, 781–790.
Rumelhart, D. E., & McClelland, J. L. (1986). Parallel distributed processing:
Explorations in the microstructure of cognition. Foundations (Vol. 1). Cambridge,
MA: MIT Press.
Rutkowski, L., & Cpalka, K. (2003). Flexible neuro-fuzzy systems. IEEE Transactions
on Neural Networks, 14, 554–574.
Schiffmann, W., Joost, M., & Werner, R. (1992). Optimization of the backpropagation
algorithm for training multilayer perceptrons. Technical Report 16/92, Institute
of Physics, University of Koblenz.
Stern, H. S. (1996). Neural networks in applied statistics. Technometrics, 38(3),
205–214.
Subasi, A., Serdar Yilmaz, A., & Binici, H. (2008). Prediction of early heat of hydration
of plain and blended cements using neuro-fuzzy modelling techniques. Expert
Systems with Applications, 36(3), 4940–4950.
Sugeno, M. (1985). Industrial applications of fuzzy control. Elsevier Science Pub. Co.
Tang, Z., Almeida, C., & de Fishwick, P. (1991). A time series forecasting using neural
networks vs. Box–Jenkins methodology. Simulation, 57(5), 303–310.
Tang, Z., & Fishwick, P. A. (1993). Back-propagation neural nets as models for time
series forecasting. ORSA Journal on Computing, 5(4), 374–385.
Tseng, F. M., Yu, H. C., & Tzeng, G. H. (2002). Combining neural network model with
seasonal time series ARIMA model. Technological Forecasting and Social Change,
69, 71–87.
Wang, W., & Chen, Z. (2008). A neuro-fuzzy based forecasting approach for rush
order control applications. ESWA, 35(1–2), 223–234.
Wang, J. S., & Lee, C. S. G. (2002). Self-adaptive neuro-fuzzy inference systems for
classification applications. IEEE Transactions on Fuzzy Systems, 10, 790–802.
Werbos, P. I. (1974). Beyond Regression: New tools for prediction and analysis in
the behavior sciences. Ph.D. Thesis, Harvard University, Cambridge, MA.
Zhang, G. P. (2001). An investigation of neural networks for linear time-series
forecasting. Computers and Operations Research, 28, 1183–1202.
Zhang, G., & Hu, M. Y. (1998). Neural network forecasting of the British Pound/US
Dollar exchange rate. Omega. International Journal of Management Science, 26,
495–506.
Zhang, G. P., & Qi, M. (2005). Neural network forecasting for seasonal and trend time
series. European Journal of Operational Research, 160, 501–514.
Zhu, J. (1998). Data envelopment analysis vs. principal component analysis: An
illustrative study of economic performance of Chinese cities. European Journal of
Operation Research, 111, 50–61.
