Regional monthly runoff forecast in southern Canada using ANN, K-means, and L-moments techniques

Canadian Water Resources Journal / Revue canadienne
des ressources hydriques
ISSN: 0701-1784 (Print) 1918-1817 (Online) Journal homepage: http://www.tandfonline.com/loi/tcwr20
Carlos Escalante-Sandoval & Leonardo Amores-Rovelo
To cite this article: Carlos Escalante-Sandoval & Leonardo Amores-Rovelo (2017): Regional
monthly runoff forecast in southern Canada using ANN, K-means, and L-moments techniques,
Canadian Water Resources Journal / Revue canadienne des ressources hydriques, DOI:
10.1080/07011784.2017.1290552
To link to this article: http://dx.doi.org/10.1080/07011784.2017.1290552
Published online: 27 Apr 2017.
Carlos Escalante-Sandoval* and Leonardo Amores-Rovelo
Faculty of Engineering, National Autonomous University of Mexico, Mexico City, Mexico
(Received 8 January 2016; accepted 25 October 2016)
River runoff forecasting is necessary for numerous applications related to water use, including water supply management,
power generation and flooding protection measures. In this study, a regional model using an artificial neural network
(ANN) is proposed for monthly runoff forecasting, which considers stations linked to the network that belong to the
same homogeneous region, and are delimited using K-means (KM-ANN) and L-moments (LM-ANN) techniques. This
methodology was applied to a sample of 90 monthly runoff series in southern Canada. The results were compared to
those of a traditional neural network for a given site (ANNs) using statistical indices, such as root-mean-squared error
(RMSE), relative square error (RSE), mean absolute error (MAE), relative absolute error (RAE), the concordance index
(d) and the coefficient of determination (r2). The LM-ANN technique produced better forecasts in 56.7% of the analysed
stations, whereas the KM-ANN and ANN techniques produced better forecasts in 27.7% and 15.6% of the stations,
respectively. Thus, the results indicate that the regionalisation process improved the forecasts in 84.4% of the studied
cases, and the estimation uncertainty was reduced by an average of 31.8%, according to the RMSE, RSE, MAE and
RAE values. Therefore, its application is recommended in Canada, where it would be useful for the Integrated Water
Resources Management Program.
Le ruissellement de la prévision du ruissèlement de la rivière est nécessaire dans un grand nombre d’applications liées à
l’utilisation de l’eau, par exemple, l’approvisionnement en eau ou en électricité, et les mesures de protection contre les
inondations. Dans cette étude, un modèle régional de réseau de neurones (ANN) est proposé pour les prévisions de ruis-
sellement un mois à l’avance, qui considère les stations du réseau qui appartiennent à la même région homogène, qui
sont délimitées pour les K-means (KM-ANN) et L-moments (LM-ANN) techniques. Cette méthodologie a été appliquée
à un échantillon de 90 séries mensuelles de ruissellement de la région sud du Canada. Les résultats ont été comparés à
ceux d’un réseau de neurones traditionnel pour le site en utilisant des indices statistiques comme la racine de l’erreur
quadratique moyenne (RMSE), l’erreur quadratique relative (RSE), l’erreur absolue moyenne (MAE), l’erreur absolue
relative (RAE), l’indice de concordance (d) et le coefficient de détermination (r2). La technique LM-ANN produit de
meilleures prévisions dans 56.7% des stations analysées, alors que les techniques KM-ANN et ANN produisent de meil-
leures prévisions à 27.7% et 15.6% des stations, respectivement. Les résultats indiquent que le processus de régionalisa-
tion a amélioré les prévisions dans 84.4% des cas, et l’incertitude de l’estimation a été réduite en moyenne de 31.8%,
selon les valeurs de RMSE, RSE, MAE et RAE. Par conséquent, son application est recommandée au Canada, où il peut
bénéficier du Programme de gestion intégrée des ressources en eau.
Keywords: runoff forecast; regional ANN; homogeneous region; K-means; L-moments
*Corresponding author. Email: caes@unam.mx
© 2017 Canadian Water Resources Association

Introduction

Canada occupies 7% of the Earth’s surface and possesses 7% of the world’s renewable water supply. Water consumers and administrators understand that the climate varies both spatially and temporally. Annual precipitation in Canada ranges from less than 500 mm in the Prairies to up to 3500 mm along the Pacific Coast. Meanwhile, the mean annual runoff also ranges from 50 mm southeast of the Prairies to more than 2000 mm at the Pacific Coast (Pearse et al. 1985).

Canada has a long tradition of basin administration along with protection and conservation of water resources. The problems that must be overcome include producing the quality and quantity of water necessary to meet the growing demand, and accounting for water-related risks. These risks have important social and economic implications; the costs associated with floods and droughts in the country are constantly increasing, as are the potential problems between water users, including industry, agriculture, electric power generation and municipalities. All of these factors have transformed the perception that Canada is a water-rich country into a myth.

The agricultural sector plays an essential role in food production and has a critical role in water management. Agriculture consumes 70 to 80% of all available water;

approximately 85% of the water used in agriculture is dedicated to irrigation, predominantly in Western Canada, and the remaining 15% is used for cattle production.

Due to the spatial and temporal distributions of water in the country, it has been necessary to build canal systems and water storage infrastructure to bring water to users when and where it is required. Moreover, hydroelectric plants are still the main source of electric power in the country. The installed capacity is 76,000 MW, and approximately 475 plants produce an average of 355 TWh per year (CHA 2008).

Canadian culture is tightly linked to water, but as its population grows, so does competition for water usage. This makes it necessary to have an integral plan for water resource management.

The development and implementation of tools for the optimal planning of water resources requires the analysis of runoff generated by the basins for evaluating engineering and environmental problems, such as flood control, the real-time operation of reservoirs, hydroelectric generation, and quality control of water and river ecosystems. In particular, the precision of runoff forecasts is a key factor for the planning, design and operation of hydraulic infrastructure, and has profound socio-economic impacts at the regional level. However, runoff is among the more complex elements of the hydrologic cycle, not only due to the basin response factors but also due to the complexity of atmospheric processes, which make it a non-linear process. This latter condition indicates that non-linear techniques may be more appropriate than linear techniques for representing the linkage between runoff and climate.

The use of artificial neural networks (ANNs) in runoff forecasts is recent. Zealand et al. (1999) applied an ANN to produce a 7-day runoff forecast series of the Winnipeg River in northwestern Ontario, Canada, and compared the results to those from other, conventional approaches. The analysis of the ANN included the input data type and the number and sizes of the hidden layers to be included in the network. The authors demonstrated that ANNs could represent complex non-linear relationships. Coulibaly et al. (2000) used novel techniques for the training phase of a multi-layered ANN, and used both the chosen propagation model and the cross-validation technique to their advantage to avoid over- or under-training of the ANN. The methodology was applied to the multi-variable hydrologic time series of the Chute-du-Diable hydrosystem, located in northern Quebec, Canada. The forecast volumes were better than those obtained through statistical and conceptual operating models. Coulibaly et al. (2001) efficiently forecasted a multi-variable time series of flow into a hydroelectric dam in Quebec, Canada, using temporal neural networks. Cannon and Whitfield (2002) demonstrated that ANNs could model systems with complex non-linear and interactive relationships between input and output. They used runoff series from 21 basins located in British Columbia, Canada. Cigizoglu (2003) successfully applied an ANN to forecast daily runoffs of rivers in the Turkish eastern Mediterranean. Mohammadi et al. (2005) applied an ANN, an autoregressive integrated moving average (ARIMA) model, and regression analysis to the inflow forecast into the Amir Kabir Dam, located in northeastern Teheran, Iran. They concluded that the best results were obtained using an ANN. Diamantopoulou et al. (2006) used a three-layer artificial neural network model, the Time Delay Artificial Neural Network (TDANN), for forecasting the daily runoff of the Aliakmon River in northern Greece. The authors determined the number of nodes of the hidden layers based on the maximum value of the correlation coefficient; they also demonstrated that the TDANN technique has great potential for hydrological and environmental applications. Demirel et al. (2009) analysed runoff forecasts based on soil and water characteristics through the soil and water assessment tool (SWAT) and an ANN for daily runoffs of the Pracana Basin, Portugal, and found that the ANN produced better results than the SWAT. Pierini et al. (2012) forecasted time series of upstream runoffs of a hydroelectric plant in Buta Ranquil, and demonstrated that a three-layer ANN produced better results than autoregressive (AR) models. Arlan (2013) applied different types of ANN architectures for forecasting monthly runoffs of the Euphrates River and concluded that ANNs are important tools for this type of forecast.

The above literature demonstrates that ANNs are effective tools for forecasting runoff series. As has already been mentioned, the accuracy and precision of the forecast play a central role in the effective use of water resources. Thus, any means of reducing the estimation uncertainty will potentially yield economic and social benefits.

An approach to diminish this uncertainty is to apply regional estimation techniques, which use information from stations in the same homogeneous region. The delineation of homogeneous regions is an essential step to ensure the transferability of hydrological information (Hosking and Wallis 1997). Toth (2009, 2013) used a clustering algorithm based on unsupervised self-organising map (SOM) neural networks to improve the rainfall-runoff modelling performance. Demirel et al. (2012) validated an ANN flow prediction model using Ward’s clustering technique. The clusters obtained represented well the similarity of flow characteristics of different rivers in the region. They suggested that their methodology is a good flow prediction model, but it can be improved if other cluster procedures are used. Dehghan et al. (2014) applied a multisite feed-forward artificial neural network
(FFANN) along with principal component analysis (PCA) to conduct an uncertainty analysis of monthly streamflow forecasting. PCA was used for reducing the model architecture complexity. Their results indicate that the model in general is capable of forecasting monthly streamflow acceptably. Although this study did not use cluster analysis, its results show the potential of using multisite ANN models. Latt et al. (2015) proposed a neural network-based regionalisation approach for flood management of ungauged catchments. First, they used PCA to reduce the dimensionality of the data set and to define the significant components to be used in pooling homogeneous regions. Then, Ward’s method was used to search initial cluster numbers prior to k-means clustering. Finally, the regional index flood models were developed via the ANN and regression models based on catchment descriptors. They concluded that the ANN approach captures the nonlinear relationships between the index flood and the catchment descriptors for each cluster much better.

The goal of this study is to develop a regional ANN model that reduces the uncertainty of monthly runoff forecasts. The information used for generating the forecast at a given site is complemented by that from other sites belonging to the same homogeneous region. For this purpose, two options are proposed: the first is a regional neural network model applied to regions delineated using the K-means grouping technique (KM-ANN), and the second is a regional neural network model applied to homogeneous regions using the L-moments technique (LM-ANN). The regional monthly runoff forecasts were compared to those obtained through traditional ANN modelling. The methodology was applied to records from 90 hydrometric stations located in southern Canada.

Delimitation of homogeneous regions

K-means technique

The K-means technique (MacQueen 1967) is a method that divides $m$ $s$-dimensional points into $K_g$ groups, whereby the sum of squares inside each cluster is minimised. This procedure begins by randomly choosing $K_g$ points that will be considered the centroids of the $K_g$ groups. Each group is built by assigning each site to the nearest centroid. Once this grouping is established, the mean value of each cluster is obtained, and these averages become the new centroids. This assignment and averaging process is repeated until none of the centroids moves. Using this technique, regions are obtained that can be considered homogeneous. The measure of the distances between observations is Euclidean.

Let $p = (p_1, p_2, \ldots, p_s)$ and $q = (q_1, q_2, \ldots, q_s)$ be two of $m$ points in a Euclidean space with $s$ attributes (uncorrelated variables). The Euclidean distance from $p$ to $q$ is

$$ d(p, q) = \sqrt{\sum_{i} (p_i - q_i)^2} \tag{1} $$

where $p_i$ represents the attributes of the site of interest, and $q_i$ represents the attributes of the point $q$.

The measurement scale of the different attributes affects the value obtained from Equation (1). Therefore, the attributes must be standardised using the corresponding standard deviation (s.d.(·)), as described in Table 1.

Table 1. Recommended attribute standardisation (Hosking and Wallis 1997).

Variable          Standardised variable
A (km²)           log(A)/s.d.(A)
Elev (m asl)      √Elev/s.d.(√Elev)
Lat (degrees)     Lat/s.d.(Lat)
Long (degrees)    Long/s.d.(Long)
V (m³/s)          V/s.d.(V)

In this study, the attributes used for each site are those recommended by Hosking and Wallis (1997) and include the following: (a) the basin area, A (km²); (b) the mean annual runoff volume, V (m³/s); and (c) the geographic coordinates of latitude [Lat (degrees)], longitude [Long (degrees)] and elevation [Elev (metres above sea level)] for each hydrometric station.

K-means is a simple learning algorithm that solves the clustering problem; however, it does not necessarily find the optimal configuration because the algorithm is sensitive to the $K_g$ initially selected cluster centres. To avoid this inconvenience, in this paper the methodology proposed by Pham et al. (2005) is used to select the best value of $K_g$. This method evaluates the function $f(K_g)$ as follows:

$$ f(K_g) = \begin{cases} 1 & \text{if } K_g = 1 \\ \dfrac{S_{K_g}}{\alpha_{K_g} S_{K_g-1}} & \text{if } S_{K_g-1} \neq 0,\ \forall K_g > 1 \\ 1 & \text{if } S_{K_g-1} = 0,\ \forall K_g > 1 \end{cases} \tag{2} $$

$$ \alpha_{K_g} = \begin{cases} 1 - \dfrac{3}{4 N_d} & \text{if } K_g = 2 \text{ and } N_d > 1 \\ \alpha_{K_g-1} + \dfrac{1 - \alpha_{K_g-1}}{6} & \text{if } K_g > 2 \text{ and } N_d > 1 \end{cases} \tag{3} $$

where $S_{K_g}$ is the sum of the cluster distortions when the number of clusters is $K_g$, $N_d$ is the number of data set attributes, and $\alpha_{K_g}$ is a weight factor. $S_{K_g}$ is given by:

$$ S_{K_g} = \sum_{j=1}^{K_g} I_j \tag{4} $$
where $K_g$ is the specified number of clusters, and $I_j$ is the distortion of cluster $j$, which is obtained as follows:

$$ I_j = \sum_{i=1}^{N_j} [d(p, q)]^2 \tag{5} $$

where $N_j$ is the number of objects belonging to cluster $j$, and $d(p, q)$ is the Euclidean distance from $p$ to $q$.

As mentioned by Pham et al. (2005), the role of $f(K_g)$ is to reveal trends in data distribution. When $K_g$ increases, $f(K_g)$ should converge to some constant value (∼1). Then, if any intermediate value of $K_g$ exhibits a special behaviour, such as a minimum point of $f(K_g)$ (i.e. <0.85, as they recommended), that value of $K_g$ can be taken as the final number of clusters.

L-moments technique

The hydrologic homogeneity of a region requires that all sites are able to share a population distribution (Hosking and Wallis 1997). The statistical similarity between the distributions for each site $i$ can be evaluated using the L-moment ratios (Hosking and Wallis 1997): the coefficient of L-variation (L-CV), $\hat{\tau}_2^{(i)}$; the L-skewness, $\hat{\tau}_3^{(i)}$; and the L-kurtosis, $\hat{\tau}_4^{(i)}$.

The Hosking and Wallis (1997) homogeneity test compares the variability between the observed and simulated L-moment ratios of each site. The simulated ratios are obtained for the regions generated at each step of an iterative search for the appropriate homogeneous region. The variability measure used is

$$ V = \left\{ \frac{\sum_{i=1}^{N_s} n_i \left( \hat{\tau}_2^{(i)} - \bar{\tau}_2 \right)^2}{\sum_{i=1}^{N_s} n_i} \right\}^{1/2} \tag{6} $$

where $\hat{\tau}_2^{(i)}$ represents the L-CV estimate at site $i$, $\bar{\tau}_2$ is the regional average of the L-CV, $n_i$ is the record length at site $i$, and $N_s$ is the number of sites in the region.

Hosking and Wallis (1997) proposed a statistical measure of homogeneity (H), which can be obtained from many simulations as follows:

$$ H = \frac{V - \mu_V}{\sigma_V} \tag{7} $$

where $V$ represents the regional variability obtained from Equation (6); $\mu_V$ and $\sigma_V$ are the mean and standard deviation of the $N_{sim}$ simulated values of $V$, respectively, which are calculated using the kappa probability distribution function.

To obtain reliable values, the number of simulations must be large (Hosking 1994). In this study, $N_{sim} = 500$ was used. According to Hosking and Wallis (1997), a region is considered ‘homogeneous’ if $H < 1$, ‘possibly heterogeneous’ if $1 \le H < 2$, and ‘definitely heterogeneous’ if $H \ge 2$.

To better determine the grouping (clusters) such that the heterogeneity measured by the statistical index H is reduced, it is recommended to move one station or a group of stations to another region or to either join or subdivide regions (Hosking and Wallis 1997).

For regions that are considered definitely heterogeneous, Hosking and Wallis (1997) recommended identifying the discordant sites responsible for the increase in the statistical index H. The discordant sites are statistical outliers with $\hat{\tau}_2$ values substantially different from the regional average ($\bar{\tau}_2$). These outliers can be identified through the discordancy measure

$$ D_i = \frac{1}{3} N_s (u_i - \bar{u})^T A^{-1} (u_i - \bar{u}) \tag{8} $$

where $u_i$ is a vector containing the L-moment ratios $\hat{\tau}_2^{(i)}$, $\hat{\tau}_3^{(i)}$ and $\hat{\tau}_4^{(i)}$ for site $i$, $\bar{u}$ is the regional average vector of the L-moment ratios $(\bar{\tau}_2, \bar{\tau}_3, \bar{\tau}_4)$, and

$$ A = \sum_{i=1}^{N_s} (u_i - \bar{u})(u_i - \bar{u})^T \tag{9} $$

A site $i$ is discordant if $D_i > 3$. However, the number of sites that are considered to belong to a given region affects the discordancy assessment, such that critical discordancy values must be considered as a function of the number of sites in each region. Table 2 presents these values.

Table 2. Critical values of Di for the discordancy test (Hosking and Wallis 1997).

Sites            5     6     7     8     9     10    11    12    13    14    ≥15
Critical value   1.33  1.65  1.92  2.14  2.33  2.49  2.63  2.76  2.87  2.97  3.00

Artificial neural network (ANN)

Network architecture

An ANN is based on the highly structured form of the interconnections of the brain cells. This approach is fast and robust in noisy environments, and highly adaptable to new environments. The network is composed of a set of sensory units that constitute the input layer with $n_1$ nodes, one or more hidden layers with $n_2$ nodes, each of which is calculated through a non-linear transformation described below, and an output layer with $n_3$ nodes. The input signal propagates through the network in a forward direction, layer by layer. Each neuron in a particular layer is connected with all neurons in the following layer. The architecture of a traditional ANN is shown in Figure 1.

The mathematical representation of the three-layer feed-forward ANN is

$$ \hat{Q}_{t+1} = f_3\left\{ \sum_{j=1}^{n_2} w_{1,j}\, f_2\left[ \sum_{i=1}^{n_1} w_{i,j}\, f_1\left( \sum_{h=0}^{k} w_{i,h}\, Q_{t-h} + w_{i,0} \right) + w_{j,0} \right] \right\} \tag{10} $$
where $Q_t$ is the monthly runoff; $k$ is the number of lags (in this study, equal to 4); $w_{i,h}$ is the weight that controls the connection between the input data and the input layer; $w_{i,0}$ is the bias for the $i$th neuron in the input layer; $f_1$ is the input layer activation function; $w_{i,j}$ is the weight that controls the connection between the $i$th neuron in the input layer and the $j$th neuron in the hidden layer; $w_{j,0}$ is the bias for the $j$th neuron in the hidden layer; $f_2$ is the hidden layer activation function; $w_{1,j}$ is the weight that controls the connection between the $j$th neuron in the hidden layer and the neuron in the output layer; $f_3$ is the output layer activation function; and $\hat{Q}_{t+1}$ is the output variable.

Figure 1. Architecture of a traditional artificial neural network with one hidden layer.

The ANN is able to use some a priori unknown information in data that is captured in a previous process of training or learning. Learning corresponds to adjusting the weight coefficients to achieve some specific conditions. In a supervised training process, the network knows the output variables, and the weight coefficients are adjusted such that the calculated and known output variables are as close as possible.

Regional ANN

Network architecture

Once each homogeneous region was identified by either the K-means or L-moments technique, the architecture of the regional ANN shown in Figure 2 was established.

The mathematical representation of the three-layer feed-forward regional ANN is

$$ \hat{Q}_{t+1} = f_3\left\{ \sum_{j=1}^{n_2} w_{s,j}\, f_2\left[ \sum_{i=1}^{n_1} w_{i,j}\, f_1\left( \sum_{h=1}^{N_s} \sum_{g=1}^{k} w_{i,[k(h-1)+g]}\, Q^{h}_{t-g+1} + w_{i,0} \right) + w_{j,0} \right] \right\} \tag{11} $$

where $Q^{h}_{t-g+1}$ is the monthly runoff; $N_s$ is the number of stations in each homogeneous region; $k$ is the number of lags (in this study, equal to 4); $g$ is a counter ranging from 1 to $k$; $h$ is a counter that takes values from 1 to $N_s$; $w_{i,[k(h-1)+g]}$ is the weight that controls the connection between the input data and the input layer; $w_{i,0}$ is the bias for the $i$th neuron in the input layer; $f_1$ is the input layer activation function; $w_{i,j}$ is the weight that controls the connection between the $i$th neuron in the input layer and the $j$th neuron in the hidden layer; $w_{j,0}$ is the bias for the $j$th neuron in the hidden layer; $f_2$ is the hidden layer activation function; $w_{s,j}$ is the weight that controls the connection between the $j$th neuron in the hidden layer and the $s$th neuron in the output layer; $f_3$ is the output layer activation function; and $\hat{Q}_{t+1}$ is the output variable.

Data normalisation

Data normalisation is a very important step to assure the efficiency of training algorithms. For example, the Levenberg-Marquardt Back-Propagation (LMBP) training algorithm is particularly sensitive to the scale of the data that is used.

In this study, the applied normalisation scheme is known as min-max, and it performs a linear transformation of the original data. Thus, assuming that $\min_A$ and $\max_A$ are the minimum and maximum values of a given attribute A, respectively, the min-max normalisation maps a value $x_{obs}$ of A to a value $x$ in the range $[\text{new\_min}_A, \text{new\_max}_A]$:

$$ x = \frac{x_{obs} - \min_A}{\max_A - \min_A} \left( \text{new\_max}_A - \text{new\_min}_A \right) + \text{new\_min}_A \tag{12} $$

The min-max normalisation attempts to preserve the relationship between the original data values. However, an ‘out of bounds’ error can be produced if a future value occurs outside the original data range of A.
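The min-max transformation of Equation (12) can be illustrated with a minimal Python sketch. The function names, the target range of [-1, 1] and the runoff values are assumptions made for illustration only; the paper does not state the range that was used.

```python
def min_max_normalise(x_obs, min_a, max_a, new_min=0.0, new_max=1.0):
    """Equation (12): map x_obs from [min_a, max_a] to [new_min, new_max]."""
    return (x_obs - min_a) / (max_a - min_a) * (new_max - new_min) + new_min


def min_max_denormalise(x, min_a, max_a, new_min=0.0, new_max=1.0):
    """Inverse of Equation (12): return a scaled value to the original units."""
    return (x - new_min) / (new_max - new_min) * (max_a - min_a) + min_a


# Hypothetical monthly runoff values (m3/s) from a training period.
train = [12.0, 30.0, 85.0, 40.0, 18.0]
lo, hi = min(train), max(train)

# Assumed target range [-1, 1]; the training extremes map to -1 and 1.
scaled = [min_max_normalise(q, lo, hi, -1.0, 1.0) for q in train]

# A later value above the training maximum falls outside [-1, 1],
# which is the 'out of bounds' situation described in the text.
future = min_max_normalise(120.0, lo, hi, -1.0, 1.0)
```

Taking the minimum and maximum from the training sample only, as in this sketch, is one common choice; it is what makes an out-of-range value possible in the validation or test phases.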
Figure 2. Architecture of the regional artificial neural network.

Activation function

A neuron computes an output based on the weighted sum of all its inputs according to an activation function $f(x)$. The employed activation functions are as follows (Govindaraju and Rao 2000):

Sigmoidal function:

$$ f(x) = \frac{1}{1 + e^{-x}} \tag{13} $$

Hyperbolic tangent function:

$$ f(x) = \frac{1 - e^{-2x}}{1 + e^{-2x}} \tag{14} $$

where $x$ represents the weighted sum of inputs to the neuron, and $f(x)$ is the neuron’s output. The sigmoidal function is bounded in the range $(0, 1)$, whereas the corresponding range for the hyperbolic tangent is $(-1, 1)$.

Model selection

It is difficult to determine a priori a good network topology, i.e. the size of the hidden layer between the input and output layers. In this case, an optimal model was found by using a trial and error process. For the ANN and regional cases, the maximum number of neurons in the hidden layer was 10.

Network training

The technique employed in the ANN, KM-ANN and LM-ANN training phases was the LMBP algorithm. This is a second-order, non-linear optimisation technique that is faster and more reliable than other back-propagation techniques. The LMBP algorithm uses an approximation of the Hessian matrix, given by

$$ H = J^T J \tag{15} $$

with $J$ representing a Jacobian matrix containing the first derivatives of the neural network errors with respect to the weights and biases. The gradient is calculated as the product of the transposed Jacobian matrix and a vector $e$, which contains the network errors to be minimised:

$$ g = J^T e \tag{16} $$

The LMBP algorithm uses this approximation of the Hessian matrix in the following update:

$$ \Delta W = -(H + \mu I)^{-1} g \tag{17} $$

where $\mu$ is a small scalar variable controlling the learning process, and $I$ is the identity matrix. When the scalar $\mu$ is zero, Equation (17) is equivalent to Newton’s method. When $\mu$ is large, Equation (17) is equivalent to the gradient descent method with a small step size. The goal is to move toward Newton’s method as quickly as possible, because it is faster and more precise for minimising errors compared with the gradient descent method (Coulibaly et al. 2000).

The LMBP algorithm trains the network with the following sequence:
Step 1. Initialise the weights and biases using very small random numbers, and normalise the input data (Equation (12)).
Step 2. Calculate the output from each neuron in each hidden and output layer using the sigmoid (Equation (13)) or hyperbolic tangent (Equation (14)) transfer functions.
Step 3. Calculate the errors.
Step 4. Update the weights and biases; repeat steps 2 and 3.
Step 5. Repeat steps 2 through 4 until the error converges to an acceptable level.

Figure 3. Study area in Canada (Western and Eastern Canada).

Performance evaluation for the ANN, KM-ANN and LM-ANN models

For comparisons between the actual runoff and that forecast by the at-site and regional at-site models, six basic statistical indices were used as performance criteria: root-mean-squared error (RMSE), relative square error (RSE), mean absolute error (MAE) and relative absolute error (RAE), whereas the concordance index (d) was chosen as a criterion for comparing the models, and the coefficient of determination (r²) was used to measure the degree of correlation between the forecast results and observed data.

$$ \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (x_i - \hat{x}_i)^2} \tag{18} $$

$$ \mathrm{RSE} = \frac{\sum_{i=1}^{n} (x_i - \hat{x}_i)^2}{\sum_{i=1}^{n} (\bar{x} - x_i)^2} \tag{19} $$

$$ \mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |x_i - \hat{x}_i| \tag{20} $$

$$ \mathrm{RAE} = \frac{\sum_{i=1}^{n} |x_i - \hat{x}_i|}{\sum_{i=1}^{n} |\bar{x} - x_i|} \tag{21} $$

$$ d = 1 - \frac{\sum_{i=1}^{n} (x_i - \hat{x}_i)^2}{\sum_{i=1}^{n} \left( |\hat{x}_i - \bar{x}| + |x_i - \bar{x}| \right)^2} \tag{22} $$

$$ r^2 = \left[ \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\hat{x}_i - \bar{\hat{x}})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \sum_{i=1}^{n} (\hat{x}_i - \bar{\hat{x}})^2}} \right]^2 \tag{23} $$

In the above equations, $x_i$ is the $i$th observation; $\hat{x}_i$ is the $i$th runoff forecast by the model (ANN or regional); $\bar{x}$ is the mean of the observed values; $\bar{\hat{x}}$ is the mean of the forecast values; and $n$ is the total number of forecast outputs.
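The six indices of Equations (18) to (23) can be computed together, as in the following minimal Python sketch. The function name and the sample values are illustrative assumptions, not data from the study.

```python
import math


def forecast_metrics(obs, fc):
    """Return RMSE, RSE, MAE, RAE, d and r2 as in Equations (18)-(23)."""
    n = len(obs)
    obs_mean = sum(obs) / n
    fc_mean = sum(fc) / n

    sq_err = sum((x - y) ** 2 for x, y in zip(obs, fc))
    abs_err = sum(abs(x - y) for x, y in zip(obs, fc))
    var_obs = sum((x - obs_mean) ** 2 for x in obs)
    var_fc = sum((y - fc_mean) ** 2 for y in fc)
    cov = sum((x - obs_mean) * (y - fc_mean) for x, y in zip(obs, fc))

    return {
        "RMSE": math.sqrt(sq_err / n),
        "RSE": sq_err / var_obs,
        "MAE": abs_err / n,
        "RAE": abs_err / sum(abs(obs_mean - x) for x in obs),
        # Concordance index: both terms are measured from the observed mean.
        "d": 1.0 - sq_err / sum(
            (abs(y - obs_mean) + abs(x - obs_mean)) ** 2
            for x, y in zip(obs, fc)
        ),
        "r2": (cov / math.sqrt(var_obs * var_fc)) ** 2,
    }


# Hypothetical observed and forecast monthly runoff (m3/s).
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 37.0]

scores = forecast_metrics(observed, forecast)
perfect = forecast_metrics(observed, observed)
```

For a forecast identical to the observations, the four error indices are zero and both d and r² equal one, which gives a quick sanity check of an implementation.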
The model is deemed to perform well when RMSE, RSE, MAE and RAE are close to zero, and d and r² are close to one.

Data

The proposed models were used to generate a 1-month runoff forecast at 90 hydrometric stations located in the Canadian provinces of British Columbia, Alberta, Saskatchewan, Manitoba (provinces that belong to the region defined as ‘Western Canada’), and Ontario, Quebec, New Brunswick, Nova Scotia and Newfoundland (belonging to ‘Eastern Canada’) (see Figures 3–5).

The monthly runoff data (in m³/s) were obtained from a CD-ROM produced by Environment Canada (HYDAT 2013). Two data groups were defined: the first corresponded to the period 1915–2008 (1128 months) and the second to the period 1900–2008 (1308 months). Eighty percent of the available sample was used for the training phase, whereas the validation and testing phases used 10% of the available sample each.

Figure 4. Locations of the hydrometric stations used in this study (Western Canada).
Figure 5. Locations of the hydrometric stations used in this study (Eastern Canada).

Results

Delimitation of the homogeneous regions

The five attributes proposed in Table 1 were obtained for each of the 90 hydrometric stations in the study zone. The six regions represented in Figures 6 and 7 were established using the K-means grouping procedure.

Figure 6. Final grouping obtained using the K-means technique (Western Canada).
Figure 7. Final grouping obtained using the K-means technique (Eastern Canada).

To obtain the homogeneous regions using the L-moments technique, the L-CV, L-skewness and L-kurtosis ratios were calculated for each of the 90 analysed stations. The grouping obtained with the K-means technique was used as the first step; a procedure was performed in which the determined regions were divided, and stations were added or removed until a statistical homogeneity index (H) (Equation (7)) smaller than 1 was obtained. As an example of this process, the case of the formation of region ‘9’ is presented, in which 10 hydrometric stations were grouped. Their L-moment ratios and corresponding discordancy measures (Equation (8)) are presented in Table 3.

Figure 8. L-moment ratios in a previous region ‘9’ with 10 sites.
Figure 9. Kappa distribution fitting to nine sites in region ‘9’.

Table 3. L-moment ratios and ‘D’ statistical index in region ‘9’ with 10 sites.

Site                                          A (km²)   n (years)   τ2      τ3       τ4      D
Moira River near Foxboro                      2620      94          0.123   −0.013   0.088   2.3388
Petawawa River near Petawawa                  4120      95          0.125   0.059    0.144   1.3112
Bonnechere River near Castleford              2380      89          0.183   0.046    0.160   0.6722
Madawaska River at Palmer Rapids              5800      80          0.125   0.047    0.079   0.1882
Ottawa River at Britannia                     90,900    95          0.127   −0.099   0.115   0.6781
Mississippi River at Appleton                 2900      92          0.143   −0.037   0.111   0.1573
Rideau River at Ottawa                        3830      76          0.155   −0.053   0.137   0.4832
South Nation River near Plantagenet Springs   3810      94          0.172   0.045    0.111   0.7595
Richelieu (Riviere) Aux Rapides Fryers        22,000    73          0.130   0.081    0.094   0.4119
Humber River at Grand Lake Outlet             5020      85          0.338   0.180    0.117   2.9995
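As an illustration of the discordancy measure of Equations (8) and (9), the following Python sketch recomputes D for the 10 sites using the rounded L-moment ratios listed in Table 3. The function names are hypothetical, and because the table values are rounded to three decimals the results should only approximate the published D column.

```python
def invert_3x3(m):
    """Invert a 3x3 matrix (list of rows) using the adjugate formula."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    adj = [
        [e * i - f * h, c * h - b * i, b * f - c * e],
        [f * g - d * i, a * i - c * g, c * d - a * f],
        [d * h - e * g, b * g - a * h, a * e - b * d],
    ]
    return [[v / det for v in row] for row in adj]


def discordancy(ratios):
    """Equations (8) and (9): D_i for each site from (tau2, tau3, tau4)."""
    n = len(ratios)
    mean = [sum(u[k] for u in ratios) / n for k in range(3)]
    dev = [[u[k] - mean[k] for k in range(3)] for u in ratios]
    # A = sum over sites of (u_i - u_bar)(u_i - u_bar)^T
    a = [[sum(v[r] * v[c] for v in dev) for c in range(3)] for r in range(3)]
    a_inv = invert_3x3(a)
    result = []
    for v in dev:
        quad = sum(
            v[r] * a_inv[r][c] * v[c] for r in range(3) for c in range(3)
        )
        result.append(n / 3.0 * quad)
    return result


# (tau2, tau3, tau4) for the 10 sites of Table 3, in table order.
table3 = [
    (0.123, -0.013, 0.088),  # Moira River near Foxboro
    (0.125, 0.059, 0.144),   # Petawawa River near Petawawa
    (0.183, 0.046, 0.160),   # Bonnechere River near Castleford
    (0.125, 0.047, 0.079),   # Madawaska River at Palmer Rapids
    (0.127, -0.099, 0.115),  # Ottawa River at Britannia
    (0.143, -0.037, 0.111),  # Mississippi River at Appleton
    (0.155, -0.053, 0.137),  # Rideau River at Ottawa
    (0.172, 0.045, 0.111),   # South Nation River near Plantagenet Springs
    (0.130, 0.081, 0.094),   # Richelieu (Riviere) Aux Rapides Fryers
    (0.338, 0.180, 0.117),   # Humber River at Grand Lake Outlet
]

CRITICAL_10_SITES = 2.491  # critical value quoted in the text for 10 sites

d_values = discordancy(table3)
discordant = [i for i, d in enumerate(d_values) if d > CRITICAL_10_SITES]
```

By construction, the D values of a region sum to the number of sites, which the D column of Table 3 also satisfies. The Humber River site (last row) has an L-CV far from the other nine, so its D exceeds the critical value in this sketch as well.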
Figure 10. Location of the 11 homogeneous regions obtained using the L-moments technique.
Table 4. Characteristics of the 11 homogeneous regions obtained using the L-moments technique.
Region  Sites  τ2    τ3    τ4
1       9      0.24  0.25  0.19
2       11     0.31  0.22  0.20
3       5      0.31  0.20  0.15
4       13     0.45  0.29  0.14
5       6      0.17  0.06  0.14
6       6      0.12  0.01  0.12
7       7      0.13  0.03  0.16
8       10     0.16  0.02  0.15
9       9      0.14  0.01  0.12
10      9      0.13  0.05  0.09
11      5      0.08  0.03  0.14
Given that 10 stations are present in region ‘9’, and based on Table 2, the allowed critical value of the discordance measure is D = 2.491, which is exceeded at the Humber River at Grand Lake Outlet site (D = 2.999). Figure 8 presents the diagrams of the L-moment relationships; the site represented by a square (Humber River at Grand Lake Outlet) is a clear outlier and must be removed from region ‘9’ and relocated to another region.
After removing the Humber River at Grand Lake Outlet station from region ‘9’ and using the nine remaining sites, the following statistical values were obtained: τ2 = 0.143, τ3 = 0.009, τ4 = 0.116, and V = 0.027. The parameters of the kappa distribution (Figure 9) were determined to be ξ = 0.1202, α = 0.5952, k = 0.0094, and h = 0.1164. The values of Equation (7) were found to be μV = 0.0212, σV = 0.0106, and H = 0.778 < 1. This indicates that the region can be considered homogeneous.
The final outcome of the procedure was the delimitation of 11 homogeneous regions; their distribution is shown in Figure 10. The characteristic means of the L-moment relationships and the number of sites in each region are presented in Table 4.
Regional ANN models
Once the number of sites belonging to a given region is determined using either the K-means or L-moments techniques, the architecture proposed in Figure 2, which consists of a multilayer perceptron network (multilayer feed-forward neural network), can be applied.
Figure 11. Monthly estimated runoff for the Spray River at Banff site and region in the validation phase at station 05BC001.
Figure 12. Monthly estimated runoff for the Spray River at Banff site and region in the test phase at station 05BC001.
Figure 13. Monthly estimated runoff for Bow River near Seebe site and region in the validation phase at station 05BE004.
Equation (12) was employed for data normalisation. The value of the new minA was assumed to be equal to 0, and the new maxA was assumed to be equal to 1.0, placing the analysed data within the range of [0, 1].
The optimal model was found through a trial and error process, initialising the model with a number of neurons in the hidden layer equal to the lag time; in this case, four neurons were used. It was found that between five and 10 tests were sufficient to produce the best results, and the performance of each test was evaluated using the r² coefficient.
Once the best model was obtained through trial and error, the runoff values were forecasted using the at-site ANN and the regional KM-ANN and LM-ANN models (regional at-site). Finally, the performance (RMSE, RSE, MAE and RAE), concordance (d) and correlation (r²) levels were compared between the forecast data and the observations of each model to select the most adequate method for runoff forecasting.
In particular, Figures 11–14 present fits of the at-site ANN models and the regional KM-ANN and LM-ANN models at stations 05BC001 (Spray River at Banff) and 05BE004 (Bow River near Seebe), located in Alberta (Figure 6) and belonging to region ‘5’ (eight sites) in the K-means grouping, and region ‘1’ (nine sites) in the homogeneous region delimited by L-moments. Table 5 shows the values of the statistical indices of performance, concordance and correlation for both stations.
From the results, it can be observed that the regional at-site models improved the runoff forecasts compared to the at-site model. In the case of station 05BC001, the inclusion of one additional record in the analysis (eight sites in the L-moments model and nine sites in K-means model) was beneficial because in the test phase, the RMSE was reduced from 358 m³/s for the ANN model to 298 m³/s for the KM-ANN model. Therefore, the improvement was 16.7%, based on the following calculation: [(358 − 298)/358] × 100. For station 05BE004, this inclusion produced practically the same results for the same test phase: the RMSE was 739 m³/s for the ANN model and was reduced to 630 m³/s for the LM-ANN model and 635 m³/s for the KM-ANN model. Thus, the results were an average of 15.7% better for both cases when the regional at-site estimations were applied than when using the traditional at-site ANN analysis.
Figure 14. Monthly estimated runoff for Bow River near Seebe site and region in the test phase at station 05BE004.
Table 5. RMSE, MAE, d and r² statistical indices for the fitting of the artificial neural network (ANN) and regional ANN models at stations 05BC001 and 05BE004.
                      05BC001                                 05BE004
Model/Phase           RMSE (m³/s)  MAE (m³/s)  d      r²      RMSE (m³/s)  MAE (m³/s)  d      r²
ANN (Validation)      386          263         0.947  0.891   655          446         0.907  0.906
LM-ANN (Validation)   376          249         0.950  0.905   641          432         0.910  0.914
KM-ANN (Validation)   367          247         0.953  0.912   617          434         0.919  0.921
ANN (Test)            358          228         0.950  0.926   739          472         0.844  0.911
LM-ANN (Test)         356          248         0.920  0.945   630          414         0.885  0.934
KM-ANN (Test)         298          235         0.949  0.951   635          464         0.886  0.933
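For readers who want to compute indices of this kind, the following sketch evaluates RMSE, RSE, MAE, RAE, d and r² for a pair of observed and forecast series. The paper's own defining equations are not shown in this section, so the code assumes the usual textbook definitions, including Willmott's index of agreement for d and the squared Pearson correlation for r². The function name and the sample values are hypothetical.

```python
import math


def forecast_indices(observed, predicted):
    """Return common goodness-of-fit indices for two equally long series."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]
    sse = sum(e * e for e in errors)                  # sum of squared errors
    sae = sum(abs(e) for e in errors)                 # sum of absolute errors
    sso = sum((o - mean_o) ** 2 for o in observed)    # observed variability
    sao = sum(abs(o - mean_o) for o in observed)
    ssp = sum((p - mean_p) ** 2 for p in predicted)
    if sso == 0 or ssp == 0:
        raise ValueError("series must not be constant")
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    potential = sum(
        (abs(p - mean_o) + abs(o - mean_o)) ** 2
        for o, p in zip(observed, predicted)
    )
    return {
        "RMSE": math.sqrt(sse / n),
        "RSE": sse / sso,
        "MAE": sae / n,
        "RAE": sae / sao,
        "d": 1.0 - sse / potential,        # Willmott's index of agreement (assumed)
        "r2": cov * cov / (sso * ssp),     # squared Pearson correlation (assumed)
    }


# Hypothetical monthly runoff values, for illustration only.
obs = [10.0, 20.0, 30.0, 40.0]
sim = [12.0, 22.0, 32.0, 42.0]
for name, value in forecast_indices(obs, sim).items():
    print(name, round(value, 3))
```

RMSE and MAE carry the units of the runoff series, whereas RSE, RAE, d and r² are dimensionless, which makes the latter easier to compare across stations of very different size.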
The results were very similar for the remaining 88 stations (Table 6). The RMSE, RSE, MAE and RAE values were improved when using the regional models (LM-ANN and KM-ANN) in comparison with those obtained using the at-site ANN model. These values were better by an average of 29.6, 44.7, 26.2 and 25.6%, respectively.
When the LM-ANN model was the best forecast option, the average concordance index increased from 0.90 to 0.96. For the KM-ANN model, this index increased from 0.91 to 0.95. Reference values 0.90 and 0.91 were obtained using the at-site ANN model. Moreover, the average coefficient of determination also increased, from 0.88 to 0.93 in the case of the LM-ANN model and from 0.89 to 0.94 for the KM-ANN model.
In general, the at-site ANN model gave the best fit at 15.6% of the 90 stations, whereas 27.8% and 56.7% of stations were better fit using the regional at-site estimation of the KM-ANN and LM-ANN models, respectively. This finding demonstrates that regionalisation has a positive effect on the forecast model performance.
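The percentages quoted above follow directly from the 'Decision' column of Table 6. The short sketch below recomputes them from the number of stations assigned to each model (14, 25 and 51 stations, counted from that column); the variable names are illustrative.

```python
# Number of stations for which each model was selected (Decision column, Table 6).
decisions = {"ANN": 14, "KM-ANN": 25, "LM-ANN": 51}

total = sum(decisions.values())
shares = {model: round(count / total * 100, 1) for model, count in decisions.items()}

# Share of stations where either regional model outperformed the at-site ANN.
regional_share = round(
    (decisions["KM-ANN"] + decisions["LM-ANN"]) / total * 100, 1
)

print(total, shares, regional_share)
```

The combined regional share is the 84.4% figure that appears in the Conclusions.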
Table 6. RMSE, RSE, MAE, RAE, d, and r2 indices obtained by fitting the ANN and regional ANN models at all stations.
ANN LM-ANN KM-ANN

Station RMSE RSE MAE RAE d r2 RMSE RSE MAE RAE d r2 RMSE RSE MAE RAE d r2 Decision
01AF002 4859 0.13 3657 0.37 0.96 0.99 3727 0.08 2937 0.30 0.98 0.94 4549 0.12 3140 0.32 0.97 0.92 LM-ANN
01AK001 55 0.12 38 0.32 0.96 0.91 48 0.09 33 0.27 0.98 0.91 60 0.14 47 0.39 0.95 0.88 LM-ANN
01AQ001 56 0.09 41 0.28 0.97 0.96 52 0.08 30 0.20 0.98 0.93 78 0.18 65 0.44 0.94 0.86 LM-ANN
01AQ002 373 0.16 270 0.35 0.95 0.93 302 0.10 202 0.27 0.97 0.90 340 0.13 229 0.30 0.96 0.90 LM-ANN
01AR004 129 0.11 76 0.27 0.97 0.91 103 0.07 73 0.26 0.98 0.94 98 0.06 59 0.21 0.98 0.94 KM-ANN
01BE001 380 0.08 247 0.25 0.98 0.92 512 0.15 345 0.35 0.96 0.86 348 0.07 259 0.27 0.98 0.94 KM-ANN
01DG003 33 0.18 18 0.28 0.95 0.84 19 0.06 13 0.21 0.98 0.97 23 0.09 15 0.24 0.98 0.91 LM-ANN
01EC001 135 0.13 99 0.32 0.96 0.94 98 0.07 65 0.21 0.98 0.94 139 0.14 94 0.31 0.96 0.91 LM-ANN
01EF001 318 0.11 210 0.27 0.97 0.91 315 0.11 206 0.27 0.97 0.90 312 0.10 154 0.20 0.97 0.90 KM-ANN
01EO001 271 0.08 175 0.23 0.98 0.92 228 0.06 154 0.20 0.98 0.94 854 0.81 504 0.65 0.87 0.73 LM-ANN
01FB001 124 0.10 90 0.31 0.97 0.92 128 0.11 98 0.34 0.97 0.91 116 0.09 95 0.33 0.98 0.94 KM-ANN
01FB003 62 0.09 49 0.29 0.98 0.96 52 0.06 39 0.23 0.98 0.97 75 0.13 58 0.34 0.97 0.96 LM-ANN
02YK001 756 2.31 596 1.54 0.81 0.98 489 0.97 379 0.98 0.89 0.99 784 2.48 619 1.60 0.80 0.99 LM-ANN
02YL001 781 0.16 579 0.39 0.97 0.92 698 0.13 516 0.35 0.97 0.89 936 0.23 646 0.44 0.96 0.91 LM-ANN
02BD002 357 0.18 281 0.44 0.96 0.95 330 0.16 229 0.36 0.97 0.93 647 0.60 423 0.66 0.91 0.90 LM-ANN
02BE002 352 0.48 256 0.64 0.92 0.89 244 0.23 180 0.45 0.96 0.94 644 1.61 459 1.14 0.81 0.76 LM-ANN
02CE002 136 0.08 94 0.26 0.98 0.93 151 0.09 109 0.30 0.98 0.91 161 0.11 120 0.33 0.98 0.92 ANN
04LD001 3197 0.94 1929 0.80 0.86 0.70 1076 0.11 714 0.29 0.97 0.91 1854 0.32 1035 0.43 0.94 0.86 LM-ANN
04LF001 1430 0.54 802 0.56 0.90 0.74 775 0.16 466 0.32 0.96 0.85 697 0.13 423 0.29 0.97 0.90 KM-ANN
04LJ001 1312 0.17 1008 0.43 0.96 0.86 1103 0.12 911 0.39 0.97 0.89 1129 0.13 945 0.41 0.97 0.93 LM-ANN
02EA005 63 0.14 43 0.34 0.96 0.87 44 0.07 37 0.29 0.98 0.93 60 0.12 38 0.30 0.96 0.88 LM-ANN
02EB004 197 0.11 136 0.31 0.97 0.90 168 0.08 128 0.29 0.98 0.94 215 0.13 137 0.31 0.97 0.87 LM-ANN
02EB006 530 0.10 382 0.28 0.97 0.91 389 0.05 315 0.23 0.99 0.95 478 0.08 337 0.25 0.98 0.92 LM-ANN
02EC002 249 0.14 170 0.33 0.96 0.90 207 0.10 159 0.31 0.97 0.91 232 0.12 158 0.30 0.96 0.88 LM-ANN
02FB007 39 0.22 28 0.40 0.92 0.92 23 0.08 18 0.27 0.98 0.94 25 0.09 18 0.26 0.97 0.94 LM-ANN
02FC001 555 0.11 411 0.31 0.96 0.97 426 0.07 308 0.23 0.98 0.94 573 0.12 376 0.29 0.97 0.90 LM-ANN
02FC002 291 0.12 212 0.32 0.96 0.91 245 0.09 175 0.27 0.98 0.92 258 0.10 184 0.28 0.97 0.93 LM-ANN
02GA003 283 0.09 181 0.24 0.97 0.92 267 0.08 170 0.23 0.98 0.92 295 0.10 192 0.25 0.97 0.95 LM-ANN
02GA010 111 0.13 64 0.26 0.96 0.94 80 0.07 54 0.22 0.98 0.94 89 0.08 60 0.25 0.98 0.93 LM-ANN
02GB001 354 0.07 238 0.22 0.98 0.95 327 0.06 210 0.20 0.98 0.94 348 0.07 210 0.20 0.98 0.93 LM-ANN
02GD001 173 0.15 121 0.33 0.95 0.94 148 0.11 94 0.26 0.97 0.91 174 0.16 126 0.35 0.94 0.93 LM-ANN
02GD003 318 0.29 222 0.46 0.88 0.90 213 0.13 165 0.35 0.96 0.92 228 0.15 173 0.36 0.95 0.92 LM-ANN
02GD004 51 0.18 35 0.36 0.93 0.93 39 0.11 27 0.28 0.97 0.92 46 0.15 34 0.35 0.95 0.92 LM-ANN
02GD005 214 0.22 156 0.43 0.91 0.96 169 0.14 115 0.32 0.95 0.92 208 0.21 147 0.41 0.92 0.92 LM-ANN
02GE002 419 0.13 302 0.32 0.95 0.97 326 0.08 238 0.25 0.98 0.94 542 0.22 396 0.42 0.91 0.90 LM-ANN
02HA003 8039 0.49 6464 0.70 0.93 0.99 6536 0.32 5263 0.57 0.95 0.99 6277 0.30 5050 0.54 0.95 0.99 KM-ANN
02HB001 7 0.06 5 0.23 0.99 0.95 9 0.11 6 0.29 0.98 0.96 24 0.69 19 0.91 0.75 0.99 ANN
02HL001 300 0.12 240 0.35 0.97 0.90 214 0.06 152 0.22 0.99 0.95 301 0.12 253 0.36 0.97 0.92 LM-ANN
02KB001 381 0.08 282 0.28 0.98 0.94 323 0.06 230 0.23 0.98 0.94 372 0.08 287 0.28 0.98 0.94 LM-ANN
02KC009 168 0.09 118 0.28 0.97 0.91 166 0.09 129 0.31 0.98 0.91 143 0.07 114 0.27 0.98 0.93 KM-ANN
02KD004 456 0.08 324 0.25 0.98 0.95 336 0.04 264 0.21 0.99 0.96 332 0.04 283 0.22 0.99 0.97 KM-ANN
02KF005 5903 0.10 3320 0.22 0.97 0.92 5002 0.07 3352 0.23 0.98 0.94 5071 0.07 3475 0.23 0.98 0.94 LM-ANN
02KF006 256 0.10 184 0.29 0.98 0.91 264 0.11 185 0.29 0.98 0.92 242 0.09 184 0.29 0.98 0.92 KM-ANN
02LA004 478 0.16 289 0.31 0.96 0.86 450 0.14 290 0.32 0.97 0.89 402 0.11 297 0.32 0.97 0.90 KM-ANN
02LB005 712 0.16 512 0.42 0.95 0.85 755 0.18 533 0.43 0.96 0.86 767 0.19 614 0.50 0.96 0.87 ANN
02OJ007 1781 0.08 1194 0.22 0.98 0.94 2081 0.11 1432 0.27 0.97 0.90 1285 0.04 951 0.18 0.99 0.97 KM-ANN
05AA004 39 0.42 24 0.74 0.78 0.92 17 0.08 11 0.35 0.98 0.93 19 0.10 14 0.44 0.97 0.90 LM-ANN
05AA008 56 0.15 36 0.36 0.97 0.92 41 0.08 26 0.25 0.98 0.93 45 0.09 34 0.34 0.98 0.94 LM-ANN
05AC003 32 0.33 16 0.51 0.85 0.90 23 0.16 13 0.41 0.94 0.98 28 0.25 16 0.50 0.89 0.93 LM-ANN
05AD003 258 0.16 167 0.36 0.96 0.89 180 0.08 124 0.27 0.98 0.94 236 0.14 154 0.33 0.96 0.88 LM-ANN
05AD005 114 0.17 79 0.40 0.96 0.91 79 0.08 55 0.28 0.98 0.94 95 0.12 62 0.31 0.97 0.93 LM-ANN
05AE002 92 0.98 36 0.83 0.57 0.62 35 0.14 21 0.47 0.96 0.90 45 0.23 23 0.52 0.96 0.91 LM-ANN
05AE005 14 1.19 6 1.06 0.59 0.67 5 0.15 3 0.56 0.95 0.95 7 0.28 4 0.64 0.87 0.96 LM-ANN
05AE006 13 0.00 7 0.03 0.99 0.99 112 0.08 98 0.47 0.98 0.94 151 0.14 125 0.60 0.97 0.94 ANN
05AE016 11 0.47 8 0.60 0.92 0.92 20 1.61 15 1.19 0.65 0.67 11 0.47 8 0.60 0.92 0.92 ANN
05AF010 23 4.71 18 1.91 0.63 0.91 10 0.82 9 0.98 0.84 0.96 8 0.62 6 0.65 0.73 0.99 KM-ANN
05AH005 27 3.44 23 1.78 0.62 0.83 31 4.58 25 1.97 0.72 0.98 39 7.16 33 2.59 0.71 0.95 ANN
05BC001 358 0.12 228 0.29 0.95 0.93 356 0.11 248 0.30 0.92 0.95 298 0.08 235 0.28 0.95 0.95 KM-ANN
05BE004 739 0.31 472 0.48 0.84 0.91 630 0.23 414 0.42 0.89 0.93 635 0.23 464 0.47 0.89 0.93 LM-ANN
05BJ001 714 4.61 421 2.33 0.60 0.65 105 0.10 67 0.37 0.97 0.90 138 0.17 76 0.42 0.93 0.98 LM-ANN
05BJ004 92 0.13 45 0.28 0.96 0.90 63 0.06 42 0.26 0.98 0.95 97 0.14 62 0.38 0.95 0.97 LM-ANN
05BL007 190 11.78 92 4.03 0.67 0.83 14 0.06 10 0.44 0.99 0.94 18 0.10 12 0.53 0.97 0.93 LM-ANN
05CC002 475 0.10 222 0.23 0.97 0.90 1550 1.09 698 0.73 0.86 0.84 57 0.00 22 0.02 0.99 0.99 KM-ANN
05DC001 673 0.15 469 0.40 0.97 0.94 989 0.33 576 0.49 0.94 0.91 59 0.00 30 0.03 0.99 0.99 KM-ANN
05DF001 1443 0.23 903 0.43 0.96 0.95 2416 0.66 2071 1.00 0.90 0.89 917 0.09 443 0.21 0.98 0.94 KM-ANN
05EA001 75 0.95 53 0.90 0.86 0.86 141 3.34 108 1.86 0.68 0.74 29 0.14 26 0.44 0.97 0.94 KM-ANN
05GG001 74 0.01 54 0.02 0.99 0.99 2112 0.29 1697 0.57 0.93 0.90 1338 0.12 767 0.26 0.98 0.95 ANN
05HA003 27 2.31 18 2.54 0.61 0.74 13 1.45 7 0.96 0.84 0.91 5 0.18 3 0.48 0.93 0.96 KM-ANN
05HG001 12 0.01 8 0.01 0.99 0.99 4017 1.30 2443 0.96 0.85 0.95 1356 0.15 1017 0.40 0.96 0.88 ANN
05JB001 98 1.09 33 0.86 0.68 0.71 48 0.26 15 0.39 0.89 0.91 101 1.16 50 1.31 0.71 0.85 LM-ANN
05JF001 57 0.08 33 0.26 0.97 0.94 70 0.13 58 0.46 0.97 0.94 103 0.27 54 0.43 0.94 0.80 ANN
05KJ001 208 0.00 111 0.02 0.99 0.99 4606 0.21 3672 0.50 0.95 0.91 1972 0.04 1440 0.20 0.99 0.96 ANN
05LJ007 113 0.71 54 0.72 0.66 0.62 46 0.12 35 0.46 0.97 0.92 56 0.17 37 0.50 0.94 0.86 LM-ANN
05ME001 193 0.18 134 0.41 0.94 0.82 227 0.25 139 0.43 0.92 0.82 138 0.09 68 0.21 0.98 0.93 KM-ANN
05MJ001 746 0.23 397 0.36 0.92 0.87 1079 0.49 742 0.67 0.82 0.85 334 0.05 214 0.19 0.99 0.96 KM-ANN
05NB001 20 0.61 18 2.30 0.81 0.80 77 9.35 21 2.69 0.63 0.99 18 0.50 17 2.19 0.85 0.97 KM-ANN
05ND004 18 0.23 13 0.63 0.92 0.87 26 0.50 19 0.90 0.92 0.93 17 0.21 13 0.61 0.94 0.84 KM-ANN
05ND007 111 0.43 65 0.82 0.91 0.76 390 5.33 150 1.92 0.68 0.83 102 0.36 64 0.82 0.84 0.81 KM-ANN
05NF002 45 1.18 23 1.36 0.81 0.95 84 4.11 32 1.88 0.73 0.91 45 1.18 18 1.04 0.83 0.93 KM-ANN
05NG001 427 0.29 224 0.49 0.88 0.79 383 0.24 205 0.45 0.90 0.88 457 0.33 255 0.56 0.81 0.71 LM-ANN
05OC001 4426 0.23 2603 0.44 0.92 0.86 7819 0.73 4373 0.74 0.64 0.91 5271 0.33 2978 0.51 0.87 0.82 ANN
05OC004 372 0.27 168 0.41 0.89 0.85 283 0.15 166 0.41 0.95 0.88 402 0.31 189 0.47 0.87 0.82 LM-ANN
05OD001 473 0.35 225 0.38 0.91 0.70 478 0.36 369 0.62 0.93 0.95 473 0.35 273 0.46 0.86 0.82 ANN
05OE001 204 0.47 101 0.50 0.79 0.80 172 0.33 93 0.46 0.85 0.92 217 0.53 115 0.57 0.73 0.83 LM-ANN
05PA006 836 0.11 521 0.28 0.97 0.91 868 0.11 602 0.33 0.97 0.90 748 0.09 523 0.28 0.98 0.94 KM-ANN
05PA012 308 0.09 234 0.34 0.98 0.92 578 0.32 429 0.62 0.91 0.86 509 0.25 335 0.49 0.95 0.88 ANN
05PB014 510 0.19 295 0.38 0.95 0.82 405 0.12 328 0.34 0.97 0.89 424 0.13 304 0.35 0.96 0.91 LM-ANN
05PC018 2850 0.12 1828 0.29 0.97 0.88 2586 0.10 1643 0.26 0.97 0.91 3646 0.20 2733 0.44 0.95 0.91 LM-ANN
05PC019 3353 0.24 1950 0.39 0.95 0.82 2084 0.09 1469 0.29 0.98 0.91 3466 0.26 2684 0.54 0.92 0.91 LM-ANN
02AB006 345 0.13 199 0.31 0.97 0.90 546 0.34 453 0.71 0.93 0.89 515 0.30 265 0.41 0.95 0.93 ANN
Notes: RMSE (root-mean-squared error), RSE (relative squared error), MAE (mean absolute error), RAE (relative absolute error), d (concordance index), and r² (coefficient of determination).
Conclusions
In this study, two regional models based on ANNs were developed for forecasting monthly runoffs. The models used information from homogeneous regions delineated using the K-means and L-moments techniques.
The forecasts generated by the regional models, KM-ANN and LM-ANN, were compared to those obtained with at-site ANN models. According to the RMSE, RSE, MAE, RAE, d and r² statistical indices obtained during the model validation and test phases, 56.7% of the cases analysed were better fit by the LM-ANN model, and the KM-ANN model better fit 27.8% of the analysed stations. For the remaining 15.6% of the stations, no improvement was obtained using regional models.
Results showed that in 84.4% of cases, inclusion of information from neighbouring stations together with a multilayer perceptron regional model (multilayer feed-forward neural network) generated more accurate monthly runoff forecasts. According to the RMSE, RSE, MAE and RAE statistics, the forecast values were better by an average of 31.8%. Taking both models into account, the concordance index increased from 0.90 to 0.96, while the coefficient of determination increased from 0.88 to 0.93. This level of improvement in the forecast of monthly volumes has previously been demonstrated in the regional estimation of maximum instantaneous flows (Hosking and Wallis 1997).
The proposed regional models represent a promising field of research for forecasting monthly runoffs. These forecasts may be improved if other climate variables, such as maximum or minimum temperatures and precipitation, are included in the delimitation process of homogeneous regions.
References
Arlan, Cheleng A. 2013. Artificial neural network models investigation for Euphrates River forecasting & back casting. Journal of Asian Scientific Research 3 (11): 1090–1104.
Cannon, A., and P. Whitfield. 2002. Downscaling recent streamflow conditions in British Columbia, Canada using ensemble neuronal network models. Journal of Hydrology 259 (1–4): 136–151.
CHA. 2008. Hydropower in Canada. Past, present and future. Canada: Canadian Hydropower Association, 8.
Cigizoglu, H. K. 2003. Estimation, forecasting and extrapolation of flow data by artificial neural networks. Hydrological Sciences Journal 48 (3): 349–361.
Coulibaly, P., F. Anctil, and B. Bobée. 2000. Daily reservoir inflow forecasting using temporal neural networks with stopped training approach. Journal of Hydrologic Engineering 230 (3–4): 244–257.
Coulibaly, P., F. Anctil, and B. Bobée. 2001. Multivariate reservoir inflow forecasting using temporal neural networks. Journal of Hydrologic Engineering 6 (5): 367–376.
Dehghan, M., B. Saghafian, F. Rivaz, and A. Khodadadi. 2014. Uncertainty analysis of monthly streamflow forecasting. Current World Environment 9 (3): 894–902.
Demirel, Mehmet C., Anabela Venancio, and Ercan Kahya. 2009. Flow forecast by SWAT model and ANN in Pracana basin, Portugal. Advances in Engineering Software 40 (7): 467–473.
Demirel, C., M. Booij, and K. Ercan. 2012. Validation of an ANN flow prediction model using a multistation cluster analysis. Journal of Hydraulic Engineering 17 (2): 262–271.
Diamantopoulou, M. J., P. E. Georgiou, and D. M. Papamichail. 2006. A time delay artificial neural network approach for flow routing in a river system. Hydrology and Earth System Sciences 3: 2735–2756.
Govindaraju, R. S., and A. R. Rao. 2000. Artificial Neural Networks in Hydrology. The Netherlands: Kluwer.
Hosking, J. R. M. 1994. The four-parameter kappa distribution. IBM J. Res. Develop. 38 (3): 251–258.
Hosking, J. R. M., and J. R. Wallis. 1997. Regional frequency analysis: An approach based on L-moments. Cambridge, UK: Cambridge University Press.
HYDAT. 2013. HYDAT data base. National Water Data Archive. Environment and Climate Change Canada. https://ec.gc.ca/rhc-wsc/default.asp?lang=En&n=9018B5EC-1
Latt, Z. Z., H. Wittenberg, and B. Urban. 2015. Clustering hydrological homogeneous regions and neuronal network based index flood estimation for ungauged catchments: An example of the Chindwin River in Myanmar. Water Resources Management 29: 913–928.
MacQueen, J. B. 1967. Some methods for classification and analysis of multivariate observations. In Proceedings of the Fifth Symposium on Math, Statistics, and Probability, 281–297. Berkeley, CA: University of California Press.
Mohammadi, K., H. R. Eslami, and Sh. Dayyani Dardashti. 2005. Comparison of regression, ARIMA and ANN models for reservoir inflow forecasting using snowmelt equivalent (a case study of Karaj). Journal of Agricultural Science and Technology 7: 17–30.
Pearse, P. H., F. Betrand, and J. W. McLaren. 1985. Currents of change: Final report: Inquiry on federal water policy. Ottawa: Environment Canada.
Pham, D. T., S. S. Dimov, and C. D. Nguyen. 2005. Selection of K in K-means clustering. Proc. Instn. Mech. Engrs, Part C: J Mechanical Engineering Science 219: 103–119.
Pierini, Jorge O., Eduardo A. Gomez, and Luciano Telesca. 2012. Prediction of water flows in Colorado River, Argentina. Latin American Journal of Aquatic Research 40 (4): 872–880.
Toth, E. 2009. Classification of hydro-meteorological conditions and multiple artificial neural networks for streamflow forecasting. Hydrology and Earth System Sciences 13: 1555–1566.
Toth, E. 2013. Catchment classification based on characterisation of streamflow and precipitation time series. Hydrology and Earth System Sciences 17: 1149–1159.
Zealand, C. M., D. H. Burn, and S. P. Simonovic. 1999. Short term streamflow forecasting using artificial neural networks. Journal of Hydrology 214: 32–48.