
Support Vector Machine Optimized with Genetic Algorithm for Short-term Load Forecasting

Lihong Ma¹, Shugong Zhou¹, Ming Lin²
¹ School of Economics and Management, Hebei University of Science and Technology, Shijiazhuang, Hebei, 050000, China
² School of Business Administration, North China Electric Power University, Baoding, Hebei, 071003, China
ziluolanmlh@126.com, sdlinm@163.com
Abstract
Accurate electric load forecasting has become an important management goal; however, electric load often exhibits nonlinear data patterns. Therefore, a forecasting approach with strong nonlinear mapping capability is essential. In recent decades, support vector machines (SVM) have been successfully employed to solve this problem. This paper demonstrates the feasibility of using SVM to forecast electricity load. Moreover, a genetic algorithm (GA) is employed to choose the parameters of the SVM model. A GA-SVM model for short-term load forecasting is therefore presented, and the experimental results show that the method improves both the accuracy and the convergence speed of SVM. Consequently, the model is practical and effective and provides an alternative for forecasting electricity load.
1. Introduction
Along with the recent privatization and deregulation of the electricity industry, accurate forecasting of electricity load has become one of the most important issues. Precise short-term load forecasting (STLF) results in cost savings and secure operating conditions, and allows utilities to commit their production resources so as to optimize energy prices and exchanges with vendors and clients. Motivated by this, a number of mathematical representations have been tested and their performances compared in the context of STLF.
Sadownik and Barbosa proposed dynamic nonlinear models for load forecasting [2]. The main disadvantage of these methods is that they become time-consuming to compute as the number of variables increases. In the recent decade, many studies have tried to apply artificial intelligence techniques to improve the accuracy of load forecasting. Knowledge-based expert systems (KBES) and artificial neural networks (ANNs) are the most popular representatives. The KBES approaches construct electric load forecasts by simulating the experience of system operators well versed in the processes of electricity generation, such as the work of Rahman and Bhatnagar. The application of ANNs to short-term load forecasting has also gained much attention. Dillon et al. used adaptive pattern recognition and self-organizing techniques for short-term load forecasting, and later used an adaptive neural network for the same task [7][8]. Park et al. applied a 3-layer back-propagation neural network to daily load forecasting problems. Novak applied radial basis function (RBF) neural networks to forecast electricity load [4]. Applications of ANN models hybridized with statistical methods or other intelligent approaches have also received much attention, for example hybrids with self-organizing maps, wavelet transforms, particle swarm optimization, and dynamic mechanisms [5][6][9].
Support vector machine (SVM) models, which are based on statistical learning theory, are a relatively new class of models for predicting values. They have recently been successfully employed to solve nonlinear regression and time-series problems, and have accordingly been applied to forecast electricity load. This paper demonstrates the feasibility of using SVM to forecast electricity load. Moreover, a genetic algorithm (GA) is employed to choose the parameters of the SVM model; that is, we develop an SVM model whose parameters are determined by GA. Examples of electricity load data are then used to illustrate the proposed GA-SVM model. Consequently, the GA-SVM model provides a promising alternative for forecasting electricity load.
In the following sections, we first briefly describe the essentials of SVM and GA and present the algorithm in which SVM is optimized with GA. Then a numerical example is given to illustrate the application of the scheme. Finally, we give a summary and outline future work.
2008 International Symposium on Knowledge Acquisition and Modeling
978-0-7695-3488-6/08 $25.00 © 2008 IEEE
DOI 10.1109/KAM.2008.67
2. Support vector machine and genetic algorithm
2.1. Support vector machine
SVM, proposed by Vapnik in 1995, was mainly used to find a separating hyperplane dividing two classes of data in a given data set. Let each entry of data be $\{x_i, y_i\}$, $i = 1, 2, \ldots, l$, $x_i \in R^d$, $y_i \in \{-1, +1\}$, where $x_i$ is the input data, $y_i$ represents the category, $l$ denotes the sample quantity, and $d$ is the input dimension. For any $x_i$ on the separating hyperplane, the condition $w \cdot x_i + b = 0$ should be satisfied. As usual, $f(x) = w \cdot x + b$ denotes the decision function, where $w$ is the normal vector of the hyperplane and $b$ is the bias. For any given entry of test data, if $f(x) > 0$ the entry is classified as $+1$; if $f(x) < 0$, it is classified as $-1$.
SVM can be classified as linear or nonlinear depending on the problem type. When the data can be separated into two types, the linear SVM finds the hyperplane with the maximum margin width $2/\|w\|$ by minimizing $\|w\|^2/2$ subject to

$y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, l$   (1)
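The decision rule above can be illustrated with a small sketch; the hyperplane $w$ and bias $b$ here are made-up values for illustration, not learned ones:

```python
# Sketch of the SVM decision rule f(x) = w·x + b with sign-based
# classification; w and b are illustrative values, not trained ones.
import numpy as np

w = np.array([1.0, 1.0])   # normal vector of the hyperplane
b = -1.0                   # bias

def classify(x):
    f = np.dot(w, x) + b   # decision function f(x) = w·x + b
    return +1 if f > 0 else -1

print(classify(np.array([2.0, 2.0])))   # f = 3 > 0  -> +1
print(classify(np.array([0.0, 0.0])))   # f = -1 < 0 -> -1
```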
To solve the above problem, the Lagrange optimization approach can be adopted to carry out the resolution more easily. Constraint (1) is incorporated through Lagrange multipliers, and the Lagrange function is expressed as

$L_p(w, b, \alpha) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{l} \alpha_i \left[ y_i (w \cdot x_i + b) - 1 \right]$   (2)
where the Lagrange multiplier $\alpha_i \ge 0$, $i = 1, 2, \ldots, l$, corresponds to each inequality in constraint (1). As such, the original problem of minimizing $\|w\|^2/2$ has been converted into minimizing $L_p$ under the constraint $\alpha_i \ge 0$. However, it is still difficult for the nonlinear SVM to find the optimal solutions. To deal with this situation, the Lagrange dual optimization problem is used to make the solution process easier; it is formulated as
Max  $L_D = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j (x_i \cdot x_j)$

Subject to  $\alpha_i \ge 0$, $i = 1, 2, \ldots, l$, and $\sum_{i=1}^{l} \alpha_i y_i = 0$   (3)
Once the $\alpha_i$ are found, the optimal $w^*$ and $b^*$ can be obtained, and therefore the decision function $f(x, \alpha, b)$ can be determined through Eq. (2). In addition, for the linear SVM to process non-separable data, Vapnik indicated that slack variables $\xi_i$ can be added to the constraints as

$y_i (w \cdot x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$   (4)
When errors occur in the classification of training data, $\xi_i$ is larger than 0. Therefore, a smaller $\sum_i \xi_i$ is preferred when determining the separating hyperplane. For this purpose, a cost parameter $C > 0$ is added to control the allowable error $\xi_i$. The objective function is changed from minimizing $\|w\|^2/2$ into

$\min_{w, b, \xi} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{l} \xi_i, \quad C > 0$   (5)
For simplification, Eq. (5) can be transformed into the dual problem

$L_D = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$   (6)
Based on the above descriptions, it is simple to use the linear SVM to separate two different categories of data if the data can be fully separated by a linear function; otherwise, a parameter $C$ is required to control the allowable errors. However, in the real world, not all data can be separated by a linear hyperplane. Boser, Guyon, and Vapnik compared the linear and nonlinear problems and found that if the original data are transferred to a feature space of higher dimension ($\Phi: R^d \to F$) through a mapping function $\Phi$, the linear classification can then be conducted within that space with better effect. If the data $(x_i, x_j)$ are transferred to the high-dimensional feature space, i.e. $(\Phi(x_i), \Phi(x_j))$, the corresponding terms in the dual problem (6) change accordingly. The dot product of $\Phi(x_i)$ and $\Phi(x_j)$ is defined by the kernel function $K(x_i, x_j)$; thus, the optimization by the linear or nonlinear SVM finally becomes

Max  $L_D = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j K(x_i, x_j)$

Subject to  $0 \le \alpha_i \le C$ and $\sum_{i=1}^{l} \alpha_i y_i = 0$   (7)

Vapnik indicated that if $K(x_i, x_j)$ is a symmetric positive definite function, the kernel function meets Mercer's condition, i.e.,

$\iint k(x_i, x_j)\, g(x_i)\, g(x_j)\, dx_i\, dx_j \ge 0, \quad \forall g \in L^2$   (8)

and there then exists an optimal solution to Eq. (7).
Hsu, Chang, and Lin indicated that there are generally four types of kernel functions, of which two are usually adopted in applications. One is the radial basis function (RBF): $K(x_i, x_j) = \exp(-\|x_i - x_j\|^2 / \sigma^2)$, where $\sigma^2$ is the bandwidth of the RBF kernel; the other is the polynomial kernel: $K(x_i, x_j) = (1 + x_i \cdot x_j)^d$, where $d \ge 2$ denotes the degree of the polynomial kernel. Smola performed experiments comparing the RBF and polynomial kernels, showing that the RBF kernel achieves better classification results; in addition, the polynomial kernel spends more time in training. In applications, the parameters should be carefully selected when SVM is used, because they define the structure of the higher-dimensional feature space and control the complexity and accuracy of the solutions.
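For concreteness, the two kernels can be evaluated directly; this is a small sketch in which the vectors and the parameters $\sigma^2$ and $d$ are arbitrary illustrative choices:

```python
# Evaluating the RBF and polynomial kernels on two arbitrary vectors;
# sigma2 and d are illustrative choices, not tuned values.
import numpy as np

def rbf_kernel(xi, xj, sigma2=1.0):
    # K(xi, xj) = exp(-||xi - xj||^2 / sigma^2)
    return np.exp(-np.sum((xi - xj) ** 2) / sigma2)

def poly_kernel(xi, xj, d=2):
    # K(xi, xj) = (1 + xi·xj)^d
    return (1.0 + np.dot(xi, xj)) ** d

xi = np.array([1.0, 0.0])
xj = np.array([0.0, 1.0])
print(rbf_kernel(xi, xj))   # exp(-2) ≈ 0.1353
print(poly_kernel(xi, xj))  # (1 + 0)^2 = 1.0
```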
The SVM classification can find the optimal solution of Eq. (7); however, it must overcome the time-consuming problem of processing huge data sets. The decomposition method, proposed by Chang, Hsu, and Lin in 2000, is used to improve the efficiency of large-scale data processing. Among such algorithms, the most frequently used techniques include sequential minimal optimization (SMO) and the Library for Support Vector Machines (LIBSVM).
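Such decomposition solvers are wrapped by common libraries; a brief sketch of a soft-margin RBF classifier as in Eqs. (5)-(7), using scikit-learn's SVC (which builds on LIBSVM), might look as follows. The toy data and parameter values are illustrative, not from the paper:

```python
# Sketch: soft-margin SVM with an RBF kernel, as in Eqs. (5)-(7).
# scikit-learn's SVC wraps LIBSVM; the toy clusters below are illustrative.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.5, (20, 2)),    # class -1 cluster
               rng.normal(+1, 0.5, (20, 2))])   # class +1 cluster
y = np.array([-1] * 20 + [+1] * 20)

# C controls the allowable error; gamma = 1/sigma^2 sets the RBF bandwidth.
clf = SVC(kernel="rbf", C=10.0, gamma=1.0).fit(X, y)

# sign(f(x)) with f(x) = sum_i alpha_i y_i K(x_i, x) + b gives the class.
print(clf.predict([[-1.0, -1.0], [1.0, 1.0]]))
```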
2.2 Genetic Algorithm
A genetic algorithm (GA) is used to solve global optimization problems. The procedure starts from a set of randomly created or selected candidate solutions, referred to as the population. Every individual in the population represents a possible solution and is referred to as a chromosome. Within every generation, a fitness function is used to evaluate the quality of each chromosome and determine its probability of surviving to the next generation; usually, chromosomes with larger fitness have a higher survival probability. Thus, the GA selects the chromosomes with larger fitness for reproduction, using operations like selection, crossover, and mutation to form a new group of chromosomes that are more likely to reach the goal. This reproduction continues from one generation to the next until the population converges on the individual with the best fitness for the goal function or the required number of generations is reached. The optimal solution is then determined.
GA coding strategies mainly fall into two camps: one recommends using the fewest digits for coding, such as binary codes; the other recommends real-valued coding for calculation convenience and accuracy. Binary codes are adopted for the decision variables in discrete problems; however, they can cause a conflict between accuracy and efficiency in problems with continuous variables, because the calculation burden increases quickly. The adoption of real-valued coding not only improves the accuracy but also significantly increases the efficiency in larger search spaces, so most practical applications adopt real-valued coding.
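The selection-crossover-mutation cycle described above can be sketched with a minimal real-valued GA; the fitness function, population size, and operator choices below are all illustrative assumptions, not the paper's settings:

```python
# Minimal sketch of a real-valued GA (selection, crossover, mutation),
# maximizing an illustrative fitness function; all settings are our own.
import random

random.seed(1)

def fitness(x):
    # Toy objective with its maximum at x = 3.
    return -(x - 3.0) ** 2

def evolve(pop, generations=100, mutation_rate=0.2):
    for _ in range(generations):
        # Selection: keep the fitter half of the population.
        pop = sorted(pop, key=fitness, reverse=True)[: len(pop) // 2]
        children = []
        while len(pop) + len(children) < 20:
            a, b = random.sample(pop, 2)
            child = (a + b) / 2.0                 # arithmetic crossover
            if random.random() < mutation_rate:   # Gaussian mutation
                child += random.gauss(0.0, 0.5)
            children.append(child)
        pop = pop + children
    return max(pop, key=fitness)

best = evolve([random.uniform(-10.0, 10.0) for _ in range(20)])
print(round(best, 1))  # should be close to 3.0
```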
3. GA-SVM model
As mentioned before, a kernel function is required in
SVM for transforming the training data. Therefore, there
are two parameters, C and
2
, required within the SVM
algorithm for accurate settings, since they are closely
related to the learning and predicting performance.
However, determining the values exactly is diffcult for
SVM. Tay and Cao suggested that C should range from
10 to 1000 and
2
from 1 to 100, so that the established
models can achieve much better results. Generally, to find
the best C and
2
a given parameter is first fixed, and then
within the value ranges another parameter is changed and
cross-comparison is made using the grid search algorithm.
This method was conducted with a series of selection sand
comparisons, and it will face the problems of lower
effciency and inferior accuracy when conducting a wide
rsearch.However,GA for reproduction could provide the
solution for this study. The schema of the GA-SVM
modle as Fig1. shows:
[Fig. 1: data input → GA (optimizes the parameters) → SVM → data output]
Fig. 1. The schema of the GA-SVM model
Following the above scheme of the proposed GA-SVM model, the operating procedure in this study is described as follows:
Step 1. Set the initial value ranges of $(C, \sigma^2)$ and construct an initial population.
Step 2. Randomize the initial population.
Step 3. Train the SVM model (5-fold cross-validation) using each pair of $(C, \sigma^2)$.
Step 4. Calculate the fitness values.
Step 5. If the stopping condition is satisfied, go to Step 8. If 50 generations have been carried out, go to Step 7.
Step 6. Perform selection, reproduction, crossover, and mutation operations to create a new population; go to Step 3 for the next generation.
Step 7. Repeat Steps 2-7.
Step 8. Find the optimal $(C^*, \sigma^{2*})$.
Step 9. Train the SVM model with $(C^*, \sigma^{2*})$.
Step 10. Apply the GA-SVM model to the load forecasting.
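The steps above can be sketched as a GA searching over $(C, \sigma^2)$ with 5-fold cross-validation as the fitness. The data set, population size, generation count, and operator details below are illustrative assumptions; only the parameter ranges follow Tay and Cao:

```python
# Sketch of the Step 1-10 loop: a GA searching (C, sigma^2) for an SVM,
# scored by 5-fold cross-validation. Ranges follow Tay and Cao
# (C in [10, 1000], sigma^2 in [1, 100]); everything else is illustrative.
import random
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

random.seed(0)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

def fitness(ind):
    C, sigma2 = ind
    clf = SVC(kernel="rbf", C=C, gamma=1.0 / sigma2)
    return cross_val_score(clf, X, y, cv=5).mean()   # Steps 3-4

# Step 1-2: random initial population within the suggested ranges.
pop = [(random.uniform(10, 1000), random.uniform(1, 100)) for _ in range(10)]
for gen in range(10):                        # Step 5: generation limit
    pop.sort(key=fitness, reverse=True)
    parents = pop[:5]                        # Step 6: selection
    children = []
    for _ in range(5):
        a, b = random.sample(parents, 2)
        C = (a[0] + b[0]) / 2 * random.uniform(0.8, 1.2)   # crossover + mutation
        s = (a[1] + b[1]) / 2 * random.uniform(0.8, 1.2)
        children.append((min(max(C, 10), 1000), min(max(s, 1), 100)))
    pop = parents + children
best = max(pop, key=fitness)                 # Step 8: optimal (C*, sigma^2*)
print(best)
```

Steps 9-10 would then retrain the SVM on the full training set with the optimal pair and apply it to the forecasting task.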
4. Forecasting simulation examples
To test the performance of the algorithm given above, we apply the model to daily load forecasting for a certain city, forecasting the maximum value, minimum value, and load coefficient of day j with the GA-SVM model and the SVM model respectively. The input samples adopt the load data and weather data of April 2007. MATLAB 7 is employed to normalize the sample data; to save limited space, this process is omitted. The forecast results produced by the GA-SVM algorithm and the SVM algorithm are compared in Tab. 1.
Tab. 1. Results of daily load forecasting with the SVM and GA-SVM algorithms

Time (h) | Actual load (MW) | SVM forecast (MW) | SVM error (%) | GA-SVM forecast (MW) | GA-SVM error (%)
0:00  | 1979.96 | 2008.39 |  1.44 | 1989.07 |  0.46
1:00  | 1885.96 | 1900.78 |  0.79 | 1876.47 | -0.50
2:00  | 1809.87 | 1851.95 |  2.32 | 1825.37 |  0.86
3:00  | 1745.07 | 1795.31 |  2.88 | 1766.11 |  1.21
4:00  | 1727.75 | 1809.19 |  4.71 | 1780.63 |  3.06
5:00  | 1795.47 | 1816.32 |  1.16 | 1788.09 | -0.41
6:00  | 1986.08 | 1970.28 | -0.80 | 1949.20 | -1.86
7:00  | 2107.32 | 2151.69 |  2.11 | 2139.02 |  1.50
8:00  | 2273.31 | 2274.10 |  0.03 | 2267.11 | -0.27
9:00  | 2375.77 | 2381.76 |  0.25 | 2379.77 |  0.17
10:00 | 2373.53 | 2430.11 |  2.38 | 2430.36 |  2.39
11:00 | 2513.55 | 2498.84 | -0.59 | 2502.28 | -0.45
12:00 | 2256.19 | 2265.43 |  0.41 | 2258.05 |  0.08
13:00 | 2221.74 | 2259.94 |  1.72 | 2252.30 |  1.38
14:00 | 2288.49 | 2294.50 |  0.26 | 2288.47 | -0.00
15:00 | 2231.96 | 2296.95 |  2.91 | 2291.01 |  2.65
16:00 | 2332.96 | 2362.83 |  1.28 | 2359.95 |  1.16
17:00 | 2432.98 | 2496.81 |  2.62 | 2500.15 |  2.76
18:00 | 2476.47 | 2566.14 |  3.62 | 2572.69 |  3.88
19:00 | 2700.95 | 2715.90 |  0.55 | 2729.41 |  1.05
20:00 | 2668.54 | 2609.26 | -2.22 | 2617.82 | -1.90
21:00 | 2523.47 | 2502.59 | -0.83 | 2506.20 | -0.68
22:00 | 2444.17 | 2347.57 | -3.95 | 2343.99 | -4.10
23:00 | 2179.32 | 2131.76 | -2.18 | 2118.17 | -2.81
It is obvious that the GA-SVM model has smaller deviations than the SVM. The average relative deviation of GA-SVM is 1.48% and the largest relative deviation is about 4%, which meets the basic requirement for load forecast deviation, whereas the forecast deviation of the plain SVM is larger. The results thus indicate that the algorithm combining SVM with GA performs better in load forecasting.
5. Conclusions
In this paper, a genetic algorithm (GA) is used to optimize the parameters of a support vector machine (SVM). We integrate GA and SVM to establish a model for load forecasting, exploiting their respective strengths in parameter evolution and in data training and classification. The hybrid algorithm takes full advantage of the global optimized search of GA and the local optimized search of SVM. The results suggest that the model presented above is highly accurate and improves predicting precision. Therefore, the proposed model can provide managers with an easy and effective way to forecast load.
References
[1] Cristianini, N., & Shawe-Taylor, J. (2000). An Introduction to Support Vector Machines. Cambridge: Cambridge University Press.
[2] K.G. Gross and F.D. Galiana, "Short term load forecasting", Proc. IEEE, Vol. 75, No. 12, pp. 1558-1573, 1987.
[3] Wang Zhiyong, Guo Chuangxin, Cao Yijia. A method for short term load forecasting integrating fuzzy-rough set with artificial neural network [J]. Proceedings of the CSEE, 2005, 25(19): 7-11.
[4] Novak B. Superfast auto-configuring artificial neural networks and their application to power systems. Electric Power Syst Res 1995; 35: 116.
[5] Vojislav K. Learning and Soft Computing: Support Vector Machines, Neural Networks and Fuzzy Logic Models. Massachusetts: The MIT Press; 2001.
[6] Cristianini N, Shawe-Taylor J. An Introduction to Support Vector Machines. Cambridge: Cambridge University Press; 2000.
[7] Dillon TS, Morsztyn K, Phua K. Short term load forecasting using adaptive pattern recognition and self-organizing techniques. In: Proceedings of the Fifth World Power System Computation Conference (PSCC-5), September 1975, Cambridge, paper 2.4/3, p. 115.
[8] Dillon TS, Sestito S, Leung S. Short term load forecasting using an adaptive neural network. J Elect Power Energy Syst 1991; 13(4): 186-92.
[9] Cercignani C. The Boltzmann Equation and Its Applications. Berlin: Springer-Verlag; 1988.
