Chapter 20
ABSTRACT
The Stochastic Event Flood Model (SEFM) was developed for analysis of
extreme floods resulting from 72-hour general storms and to provide
magnitude-frequency estimates for flood peak discharge, runoff volume and
maximum reservoir level for use in hydrologic risk assessments at dams. It
can also be used to assess the variability of floods produced by design storms
such as Probable Maximum Precipitation. The model was developed
specifically for application in mountainous areas of the western United States
where snowmelt runoff is commonly a contributor to flooding. This chapter
provides a description of the basic concepts employed in developing the
computer model and identifies the various hydrometeorological components
that are modeled in the computer simulations. Results from some recent
applications of the model are also presented. The model is in the early stages
of implementation and changes are being made as more is learned about the
probabilistic characteristics of the hydrometeorological processes. It is
anticipated that the model will continue to evolve as improvements are made.
20.1. OVERVIEW
simulation contains a set of input parameters that were selected based on the
historical record and collectively preserves the dependencies between
parameters. The simulated floods constitute elements of an annual maxima
flood series that can be analyzed by standard flood-frequency methods. The
resultant flood magnitude-frequency estimates reflect the likelihood of
occurrence of the various combinations of hydrometeorological factors that
affect flood magnitude. The use of the stochastic approach allows the
development of separate magnitude-frequency curves for flood peak discharge
(Fig. 20.1a), flood runoff volume (Fig. 20.1b), and maximum reservoir level
(Fig. 20.1c). Frequency information about maximum reservoir levels is
particularly important for use in hydrologic risk assessments because it
accounts for all the pertinent hydrologic factors - flood peak discharge, runoff
volume, hydrograph shape, initial reservoir level, and reservoir operations. All
of the flood characteristics above, and the hydrologic risk, can be evaluated on
a monthly, seasonal, or annual basis.
20.2.
The stochastic event-based flood model41 has the capability to simulate a wide
range of hydrometeorological and watershed conditions. Computer simulations
are conducted for 72-hour duration general storms based on end-of-month
hydrometeorological conditions. Runoff is computed on a distributed basis for
polygons of land called Hydrologic Runoff Units (HRUs) that have common
mean annual
20.2.1.
20.3.
The stochastic event flood model is currently configured for simulation of
72-hour general storms. There is no computational limit to the size of the
watershed to which it can be applied. However, implicit in the development of
the model is the condition that some hydrometeorological parameters are
highly correlated spatially. For example, soil moisture accounting is conducted
to determine soil moisture conditions at the onset of the extreme storm. In
conducting soil moisture accounting, multi-month periods of precipitation and
snowpack are taken to be highly correlated throughout the watershed.
Specifically, the exceedance probability of multi-month precipitation at a
given location is assumed to not vary significantly from that at other locations.
Thus, the exceedance probability can be adequately represented as one areally
averaged value. In a similar manner, the exceedance probability of multi-month
snowpack can also be adequately represented as one areally averaged
value.
As the watershed size increases, the requirement for high spatial
correlation of multi-month precipitation and snowpack becomes more difficult
to satisfy. This consideration suggests that the stochastic model is applicable to
watersheds up to a nominal size of about 500 mi2. For larger watersheds, the
spatial variability of some hydrometeorological parameters may warrant that
site-specific modules be developed to address the site-specific spatial
characteristics of the watershed under study.
20.4.
conditions in the watershed at the onset of the extreme storm. This requires
that a distributed approach be used in modeling the rainfall-runoff process so
that the spatial variability of soil moisture, soil moisture storage
characteristics, soil infiltration rate, snowpack, and frozen ground conditions
can be properly accounted for in computing runoff.
20.4.1.
20.4.2.
Keechelus Watershed Elevation Zones (figure legend: 100 to 110, Greater Than 110)
20.5. HYDROMETEOROLOGICAL COMPONENTS
Antecedent Temperature
Antecedent Snowpack
Initial Streamflow
7
8
Dependency
Independent
Dependent upon: 1
Dependent upon: 1
Dependent upon: 1 and
2
Independent
Varies By Zone
Mean Annual
Precipitation
Elevation
Mean Annual
Precipitation
Mean Annual
Precipitation, Soils
Precipitation Temporal
Characteristics
Storm Centering
Independent
10
11 Precipitation Spatial Characteristics
12
20.5.1.
Independent
Independent
Elevation
Fig. 20.5. Seasonality of 72-Hour Extreme Storms for West Face of Sierra
Mountains in Central California.
Fig. 20.6. Example of Antecedent Precipitation for October 1st to the
End-of-February, Haines, Oregon. (y-axis: Non-Exceedance Probability)
Snowpack Distribution
20.5.2.
analysis are obtained from gages within the watershed and in climatologically
similar areas. Figures 20.10a,b depict two temporal distributions generated by
SEFM for the Keechelus watershed. The temporal characteristics were
developed from analyses29,32 of 25 storms in the Cascade Mountains of
Washington.
Precipitation Spatial Characteristics - Probabilistic analyses are conducted of
depth-area-duration data developed from historical storms. This information is
applied in a probabilistic manner to allow for variable storm areal coverage and
to describe the spatial distribution of precipitation over the watershed. Figure
20.11 depicts a family of depth-area curves.
Figs. 20.10a,b. Temporal Distributions of 72-Hour Precipitation Generated by
SEFM for the Keechelus Watershed (x-axis: Time in Hours).
20.5.3.
Figure (schematic): Rain + Snowmelt enters Soil Moisture Storage (Root Zone),
producing Surface Runoff and Gravitational or Intermediate flow through the
Vadose Zone.
Minimum Surface Infiltration Rate - is the limiting rate at which the soil can
accept water at the soil surface for a specified soils zone. This occurs when the
soil is fully wetted and soil moisture is at field capacity.
Deep Percolation Rate - is the limiting rate that a soil layer, hardpan within
the soil column, or underlying bedrock can accept water that has infiltrated the
surface of the soil for a specified soils zone. Water that passes through this
limiting soil layer, hardpan, or bedrock contributes to groundwater and does
not return to the stream during the time interval for modeling of the extreme
flood.
Soil Moisture Storage Capacity - is the moisture holding capacity of the soil
column to the depth that can be affected by evapotranspiration.
Evapotranspiration - is the average monthly potential evapotranspiration
amount for a specified zone of mean annual precipitation.
Soil Zone      Surface Infiltration Rates (in/hr)   Soil Moisture            Deep Percolation
               Maximum        Minimum               Storage Capacity (in)    Rate (in/hr)
1              4.00           2.00                  0.60                     0.06
2              6.00           2.00                  0.60                     0.06
3              6.00           2.00                  0.60                     0.08
4              4.00                                 0.00                     0.06
5 (reservoir)  0.00
Surface Runoff Unit Hydrographs - are used to convert the computed surface
runoff volume from each sub-basin into a flood hydrograph. Surface runoff
unit hydrographs can have variable lag time and peak discharge to account for
the variability observed in nature.
Interflow Runoff Unit Hydrographs - are used to convert the computed
interflow runoff volume from each sub-basin into a flood hydrograph.
Interflow runoff unit hydrographs have fixed lag time and peak discharge
based on calibration to observed floods.
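The unit-hydrograph step described above is, at its core, a discrete convolution of the runoff excess series with the unit-hydrograph ordinates. A minimal sketch follows; the ordinates and excess values below are hypothetical illustrations, not calibrated values from the model:

```python
import numpy as np

# Hypothetical unit hydrograph ordinates: cfs of discharge per inch of
# runoff excess, at hourly intervals (illustrative values only)
uh = np.array([0.0, 120.0, 300.0, 180.0, 90.0, 30.0, 0.0])

def runoff_hydrograph(excess_in):
    """Convolve hourly runoff excess (inches) with the unit hydrograph
    to produce a discharge hydrograph (cfs)."""
    return np.convolve(excess_in, uh)

# Three hours of runoff excess produce a 9-ordinate hydrograph
q = runoff_hydrograph(np.array([0.2, 0.5, 0.1]))
```

Variable lag time and peak discharge, as used for the surface runoff unit hydrographs, would amount to stretching and rescaling `uh` before the convolution.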
Reservoir Routing and Dam Operations - reservoir operations are simulated
consistent with standard operating procedures for the project under study. The
computer program is currently set up for the HEC-1 model36, which uses a
fixed reservoir elevation-discharge rating curve. Project-specific modules can
be developed to simulate more complicated operational procedures.
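With a fixed elevation-discharge rating of the kind described above, reservoir routing reduces to level-pool (storage) routing against the rating and elevation-storage curves. A rough sketch, with entirely hypothetical rating and storage values (not project data):

```python
import numpy as np

# Hypothetical elevation-storage-discharge tables (assumed for illustration)
elev    = np.array([2516.0, 2520.0, 2524.0, 2526.0])   # ft
storage = np.array([0.0, 40000.0, 90000.0, 120000.0])  # acre-feet
outflow = np.array([0.0, 500.0, 4000.0, 12000.0])      # cfs

def level_pool_route(inflow_cfs, dt_hr, init_elev):
    """Route an inflow hydrograph through the reservoir using a fixed
    elevation-discharge rating curve (simple explicit time-stepping)."""
    s = np.interp(init_elev, elev, storage)        # initial storage, acre-ft
    cfs_dt_to_af = dt_hr * 3600.0 / 43560.0        # cfs over dt -> acre-ft
    q_out, levels = [], []
    for q_in in inflow_cfs:
        e = np.interp(s, storage, elev)            # current pool elevation
        q = np.interp(e, elev, outflow)            # rated outflow
        s = max(0.0, s + (q_in - q) * cfs_dt_to_af)
        q_out.append(q)
        levels.append(e)
    return np.array(q_out), np.array(levels)

q_out, levels = level_pool_route([2000.0] * 24, dt_hr=1.0, init_elev=2520.0)
```

More complicated operational procedures (gate schedules, flood-control rules) would replace the simple rating-curve lookup inside the loop.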
20.6. SIMULATION PROCEDURE
One of the key features of the stochastic model is the use of Monte Carlo
simulation methods (Jain15, Salas et al26) for selecting the
discharge, the basic steps are to: collect an annual maxima series for the
period of record; view the magnitude-frequency characteristics of the data by
constructing a probability-plot using a standard plotting position formula to
estimate annual exceedance probabilities; and fit a probability distribution to
the annual maxima data in attempting to capture the statistical information
contained in the dataset. Flood peak discharge magnitude-frequency estimates
are then made using the distribution parameters for the fitted probability
model.
If an extremely long period of flood record were available (multi-thousand
years of flood peak discharge annual maxima in a stationary environment),
then a plotting position formula and probability-plot would be sufficient for
capturing the frequency characteristics for all but the rarest flood events within
the dataset.
The computer simulation of multi-thousand years of flood annual maxima
provides a flood record analogous to the latter case described above. With that
in mind, the basic construct for the stochastic computer simulation procedure
can be described as follows.
An extremely long record of 24-hour, 10 mi2 precipitation annual maxima
is generated using Monte Carlo sampling procedures (assuming stationary
climate). A 72-hour general storm is developed for each of the 24-hour
precipitation annual maxima based on the probabilistic characteristics of the
temporal and spatial components of historical extreme storms. A storm date
(end-of-month) is selected for occurrence of the storm. Hydrometeorological
parameters are then selected to accompany each storm based on the historical
record in a manner that preserves the seasonal characteristics and dependencies
between parameters. The general storms and all other hydrometeorological
parameters associated with the storm events are then used to generate an
annual maxima series of floods using rainfall-runoff modeling. Characteristics
of the simulated floods such as peak discharge, runoff volume and maximum
reservoir level are ranked in order of magnitude, and a non-parametric plotting
position formula and probability-plots are used to describe the magnitude-frequency
relationships.
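The generation loop just described can be sketched in a few lines. Everything numeric below is a placeholder: the sampling distribution, its parameters, and the depth-to-peak transform are illustrative assumptions, not SEFM's actual relationships:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_annual_maxima(n_years):
    """Sketch of the stochastic-event construct: for each simulated year,
    sample a 24-hour storm depth, pick an end-of-month storm date, and
    map depth to a flood peak with a placeholder transform."""
    depth = 2.0 + 0.6 * rng.gumbel(size=n_years)   # storm depth, inches (assumed Gumbel)
    month = rng.integers(1, 13, size=n_years)      # end-of-month storm date
    peak = 900.0 * depth ** 1.3                    # placeholder depth-to-peak transform, cfs
    return depth, month, peak

depth, month, peak = simulate_annual_maxima(25000)
```

In SEFM itself, the depth-to-flood step is the full HEC-1 rainfall-runoff simulation with dependent hydrometeorological parameters, not a closed-form transform.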
current Pentium level (300 MHz) computational and storage power of personal
computers, 25,000 flood simulations can be conducted in about 12 hours using
about 3 gigabytes of storage.
If flood events more common than an Annual Exceedance Probability
(AEP) of about 1:2500 are of interest, then 25,000 simulations of annual
maxima are adequate to develop the magnitude-frequency curves. In many
applications, there is a desire to estimate magnitude-frequency curves for flood
events more rare than an AEP of 1:2500. In these cases, an alternative Monte
Carlo sampling procedure is needed that allows development of the
magnitude-frequency curves for extremely rare floods while recognizing the practical
limits posed by computational power/storage constraints. This can be
accomplished using a piecewise approach (Barker et al3) that requires much
less computational effort. Magnitude-frequency curves can be constructed by
computing several simulation sets. Each simulation set is used to define a
different portion of the frequency curve - for example, one to two log-cycles of
annual exceedance probability (Fig. 20.14).
This approach can best be explained with an example. Consider the case
where flood events with an AEP of 1:1,000 to 1:10,000 are of interest. Since
the largest flood events in an annual maxima series (either historical or
computer generated) exhibit the greatest variability, a record length about 10
times greater than the target recurrence interval (1/AEP) is appropriate to
reduce uncertainties due to sampling variability for the upper end of this
frequency range. Thus, a record length of 100,000 annual maxima would be
used to develop probability-plots for making magnitude-frequency estimates
in the target range of 1:1,000 to 1:10,000 AEP.
In a standard Monte Carlo approach, annual maxima storm magnitudes
would first be sampled at random and then the hydrometeorological parameters
would be selected to accompany the storm. This approach would require that
the full 100,000 sample set be generated. In the piecewise approach, it is
recognized that the smallest storms in the sample set are not going to generate
the largest floods. Since we are only interested in the upper-most log-cycle(s)
of extreme flood characteristics, we would simulate floods from the collection
of the larger storms from a reduced sample set. For this example, we would
develop the magnitude-frequency estimates based on the 25,000
Pex = (i - 0.44) / (N + 0.12)
where Pex is the annual exceedance probability, N is the number of years for the
record length being simulated, n is the actual number of simulations being
conducted (n out of N years), and i is the rank of the precipitation annual
maxima being simulated (ranges from 1 to n).
The resulting n floods from each simulation set are ranked in descending
order of magnitude and the Gringorten plotting equation is used to compute the
annual exceedance probabilities for flood peak discharge, flood runoff volume,
and maximum reservoir level.
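The ranking and plotting-position step can be written directly from the Gringorten formula; the peak values below are arbitrary illustrations:

```python
import numpy as np

def gringorten_aep(n):
    """Gringorten plotting positions Pex = (i - 0.44)/(N + 0.12) for
    ranks i = 1..N, with rank 1 assigned to the largest annual maximum."""
    i = np.arange(1, n + 1)
    return (i - 0.44) / (n + 0.12)

peaks = np.array([4200.0, 9100.0, 3300.0, 7600.0, 5000.0])  # illustrative maxima
ranked = np.sort(peaks)[::-1]        # descending order of magnitude
aep = gringorten_aep(len(ranked))    # AEP assigned to each ranked flood
```

Plotting `ranked` against `aep` on probability paper gives the non-parametric magnitude-frequency curve described in the text.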
20.7.
20.8.
SOFTWARE COMPONENTS
The general storm stochastic event flood model (SEFM) is comprised of six
software components. These components include: data entry; input data
pre-processor; HEC-1 template file; stochastic inputs generator; HEC-1
rainfall-runoff flood computation model; and an output data post-processor.
Fig. 20.15. Flow Chart for Operation of the Computer Software for Stochastic
Simulation.
20.8.1.
20.8.2.
The input data pre-processor performs a variety of tasks, and operates within
Microsoft Excel using Microsoft Visual Basic for Applications10. The first
task conducted through the spreadsheets is to perform validity checks of the
values of the input parameters for each of the hydrometeorological variables.
Values that are found to be out of bounds are identified/flagged on the
spreadsheet. The second task is to conduct preliminary Monte Carlo
simulations for each of the hydrometeorological variables to allow examination
of the generated values and to compute sample statistics of the generated
values. This allows a basic confirmation of the validity of the generated values
and allows comparisons to be made with historical data. Lastly, the
pre-processor is used to call the stochastic inputs generator to conduct Monte Carlo
input parameter generation for all hydrometeorological variables for use in the
computer flood simulations and to create the input files for the HEC-1
hydrologic model.
20.8.3.
specific solution.
The stochastic inputs generator reads a HEC-1 input file, called a HEC-1
template file, that contains the Monte Carlo input in 80-column card format
(cards). The output from the routine is an ASCII text HEC-1 input file with
the Monte Carlo cards replaced by HEC-1 cards that reflect the Monte Carlo
simulated surface runoff, interflow, initial reservoir elevation, and initial
streamflow. A separate input file is created for each Monte Carlo simulation of
flood annual maxima. Thus, if 25,000 simulations (25,000 annual maxima) are
performed to define a frequency curve, then 25,000 HEC-1 input files will be
created by the routine.
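The template mechanism can be sketched as a simple card substitution. The card names follow Table 20.3; the card contents and values below are hypothetical and not valid HEC-1 input:

```python
def render_hec1_input(template_lines, realization):
    """Replace Monte Carlo placeholder cards (e.g. MCRS) in a HEC-1
    template with the ordinary HEC-1 cards for one simulation."""
    rendered = []
    for line in template_lines:
        card = line.split()[0] if line.split() else ""
        rendered.append(realization.get(card, line))
    return rendered

template = [
    "ID  stochastic flood run",
    "MCRS",                       # placeholder for the RS card
    "MCBF",                       # placeholder for the BF card
]
realization = {                   # one Monte Carlo draw (hypothetical values)
    "MCRS": "RS  1 ELEV 2521.0",
    "MCBF": "BF  350 -0.05 1.1",
}
lines = render_hec1_input(template, realization)
```

One such file would be written per simulation: 25,000 files for a 25,000-year run, as noted above.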
20.8.3.1.
SEFM program. Table 20.3 lists the HEC-1 Monte Carlo cards that are read
and replaced during the simulation.
Monte Carlo Card   Corresponding HEC-1 Card   Purpose
                   IT                         Simulation Duration and Time-Step
                   ID                         Run Title for Project/Study
MCBA               BA                         Surface and Interflow Components
MCBF               BF                         Initial Streamflow and Base Flow Recession
MCRS               RS                         Initial Reservoir Elevation
20.8.5.
20.8.6.
The output data post-processor is used as a repository for the output from the
flood simulations and is contained within a Microsoft Excel20 workbook.
Values of the hydrometeorological inputs are passed to the post-processor to
allow examination of the inputs that produced a given output. Visual Basic20
routines are used to read the hydrographs saved in the punch files, to extract
the maximum peak flow rate, runoff volume, and reservoir elevation, and to
construct magnitude-frequency curves in Microsoft Excel20. Standard features
of the Excel spreadsheet allow the output to be sorted and analyzed in any
manner desired to examine the hydrologic conditions that produce a given
magnitude flood.
20.9.
APPLICATIONS OF SEFM
20.9.1.
frequencies of extreme storms, flood peaks, and maximum reservoir levels for
AR Bowman Dam and reservoir on the Crooked River in central Oregon.
It is seen in Fig. 20.16a that there are two storm seasons, a winter and a
spring-summer season. Snowpack is at a maximum in the winter and early
spring, and conventional flood analyses had considered the winter period to
offer the greatest potential for rain-on-snow flood events. However, large
floods (Fig. 20.16b) were found to occur more frequently in the late spring,
when soils are fully wetted at the end of the spring snowmelt season and
snowpacks lingered at the high elevations in the watershed. Likewise,
maximum reservoir levels due to floods were found to occur more frequently
in the late spring (Fig. 20.16c), when the reservoir was full or nearly full from
snowmelt runoff and the frequency of extreme floods was highest.
This type of analysis can be useful for evaluating how the seasonality of
floods interacts with current reservoir operations to produce maximum
reservoir levels. This type of information provides a logical starting point for
optimizing reservoir operations to meet multi-purpose goals and to reduce
hydrologic risk.
Figs. 20.16a,b. Seasonality Histograms for the AR Bowman Watershed
(x-axis: End of Month, OCT through SEP).
Fig. 20.17. Scatterplot of Flood Peak Discharge Versus Storm Magnitude for AR
Bowman Watershed, in Central Oregon.
Figure 20.17 depicts a scatterplot of flood peak discharge versus storm
magnitude for 25,000 annual maxima for the AR Bowman watershed. The AR
Bowman watershed resides in a semi-arid to sub-humid climatic setting and
exhibits the characteristically high variability between flood magnitude and
storm magnitude commonly seen in these dry climatic settings.
20.9.3.
Fig. 20.18. Scatterplot of Flood Peak Discharge Versus Flood Runoff Volume
for the AR Bowman Watershed (runoff volume in acre-feet).
20 / M.G. Schaefer, B.L. Barker

Figures 20.19a,b,c depict three flood hydrographs chosen from the simulations.
Each flood hydrograph contains the same runoff volume, but occurs in
different months and represents different antecedent conditions and/or initial
reservoir levels. This variety of conditions has resulted in different maximum
reservoir levels produced by the floods.
20.9.4.
Fig. 20.20. Comparison of Simulated Flood Peak Discharge Annual Maxima with
USGS Regional Growth Curve.

Fig. 20.21a. Magnitude-Frequency Curve for Flood Peak Discharge for Keechelus
Watershed based on SEFM Simulations.
Fig. 20.21b,c. Frequency Curves for Flood Runoff Volume and Maximum Reservoir
Level for Keechelus Watershed based on SEFM Simulations.
Even with the relative sophistication of the stochastic approach, the rare
probabilities of those combinations of conditions that could pose a threat to
some dams are beyond what many experts would consider the limit of credible
calculation. In particular, recommendations put forward by an international
body of experts on flood hydrology (USBR19) suggest that an AEP of 10-5 for
flood characteristics is the limit of credible calculation with
technologies/methodologies currently available.
In keeping with the spirit of the USBR Report19, magnitude-frequency
curves for flood characteristics are shown with a solid line out through an
AEP of 10-5 and a dashed line is used beyond. It should
be noted that simulations are conducted for flood events more rare than 10-5
and are used to define the magnitude-frequency curve; however, the dashed line
is used as a reminder of the limit of credible calculation.
Fig. 20.22. Frequency Histogram for Floods Produced by PMP for the AR
Bowman Watershed.
SUMMARY
The Stochastic Event Flood Model was developed for analysis of extreme
floods resulting from 72-hour general storms and to provide magnitude-frequency
estimates for flood peak discharge, runoff volume and maximum
reservoir level for use in hydrologic risk assessments at dams. It was
developed specifically for application in mountainous areas of the western
United States where snowmelt runoff is commonly a contributor to flooding.
Application of the model also provides insight into the seasonalities of
extreme storms, flood peaks, runoff volumes and maximum reservoir levels.
This type of analysis can be very useful for evaluating how the seasonality of
floods interacts with current reservoir operations to produce maximum
reservoir levels. This information provides a logical starting point for
optimizing reservoir operations to meet multi-purpose goals and to reduce
hydrologic risk.
REFERENCES
1.
2.
3.
4.
Benjamin JR and Cornell CA, Probability, Statistics and Decision for Civil Engineers.
McGraw-Hill, 1970.
5.
Cattanach JD and Luo W, Use of an Atmospheric Model and a Distributed Watershed
Model for Estimating the Probability of Extreme Floods. ASDSO National Conference, Las
Vegas, pp 673-680, October 1998.
6.
7.
Gringorten II, A Plotting Rule for Extreme Probability Paper. Journal of Geophysical
Research, vol. 68, pp 813-814, 1963.
8.
Haan CT, Statistical Methods in Hydrology. Iowa State University Press, 1977.
9.
Helsel DR and Hirsch RM, Statistical Methods in Water Resources. Elsevier Studies in
Environmental Science 49, NY, 1992.
A Monte Carlo Approach to
ASDSO Annual Conference
10. Holtan HN, Stitner GJ, Henson WH and Lopez NC, USDAHL-74 Revised Model of
Watershed Hydrology, Technical Bulletin No 1518. Agricultural Research Service, US
Department of Agriculture, 1975.
11. Hosking JRM, L-Moments: Analysis and Estimation of Distributions using Linear
Combinations of Order Statistics. Journal Royal Statistical Society, Ser B, 52, pp 105-124,
1990.
12. Hosking JRM and Wallis JR, Regional Frequency Analysis: An Approach Based on
L-Moments. Cambridge University Press, 1997.
Chapter 21
ABSTRACT
A new stochastic hydrology program called SAMS (Stochastic Analysis
Modeling and Simulation) has recently been developed by Colorado State
University with support from the US Bureau of Reclamation. SAMS offers
computational and graphical capabilities in the areas of stochastic modeling,
analysis and simulation. SAMS has built on capabilities previously available in
the widely used LAST stochastic hydrology package - developed by William
L. Lane of the Bureau of Reclamation in 1978 and 1979 - but now offers
updated and enhanced capabilities in many areas.
21.1. INTRODUCTION
SPIGOT (Grygier and Stedinger, 1990). The LAST package was developed
during 1977-1979 by the U.S. Bureau of Reclamation (USBR). Originally, the
package was designed to run on a mainframe computer (Lane, 1979) but later it
was modified for use on personal computers (Lane and Frevert, 1990). While
various additions and modifications have been made to LAST over the past 20
years, the package has not kept pace with either advances in time series modeling
or advances in computer technology. This is especially true of the computer
graphics. These facts prompted USBR to promote the initial development of the
SAMS package.
The first version of SAMS (SAMS-96.1) was released in 1996. Since then,
corrections and modifications were made based on feedback received from the
users. In addition, new functions and capabilities have been implemented. SAMS
2000 is the most recent version. It has the following capabilities and limitations:
21.2.

The overall mean and standard deviation of an annual time series y_t, t = 1, ..., N, may be estimated by

ȳ = (1/N) Σ_{t=1}^{N} y_t    (21.1)

and

s = [ (1/N) Σ_{t=1}^{N} (y_t - ȳ)² ]^{1/2}    (21.2)

(21.3)
Stochastic Analysis Modeling and Simulation (SAMS 2000) / 21

The temporal dependence of y_t may be characterized by the autocorrelation
function. The sample autocorrelation coefficient r_k of the time series y_t may
be determined by

r_k = m_k / m_0    (21.4)

where

m_k = (1/N) Σ_{t=1}^{N-k} (y_{t+k} - ȳ)(y_t - ȳ)    (21.5)

and k = time lag. Likewise, for multisite series, the lag-k sample
cross-correlation coefficient between the time series for sites i and j, denoted by
r_k^{ij}, may be estimated by

r_k^{ij} = m_k^{ij} / (m_0^i m_0^j)^{1/2}    (21.6)

where

m_k^{ij} = (1/N) Σ_{t=1}^{N-k} (y_{t+k}^i - ȳ^i)(y_t^j - ȳ^j)    (21.7)

in which m_0^i is the sample variance for site i. Note that m_k^{ij} = m_{-k}^{ji}.
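Equations (21.4)-(21.5) translate directly into code; a minimal sketch:

```python
import numpy as np

def sample_acf(y, k):
    """Lag-k sample autocorrelation r_k = m_k / m_0, with
    m_k = (1/N) * sum_{t=1}^{N-k} (y[t+k] - ybar)(y[t] - ybar)."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    d = y - y.mean()                      # deviations from the mean
    m_k = np.dot(d[k:], d[:n - k]) / n    # lag-k autocovariance, Eq. (21.5)
    m_0 = np.dot(d, d) / n                # sample variance
    return m_k / m_0
```

For instance, `sample_acf(y, 1)` gives the lag-one coefficient used later in fitting the AR models.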
21.2.1.2. Seasonal Statistics
Let y_{ν,τ} be a seasonal time series, where ν = 1, ..., N represent years and
τ = 1, ..., ω seasons, with N = number of years and ω = number of seasons.
This time series can be analyzed by using the overall statistics as in section
21.2.1.1, but seasonal hydrologic time series, such as monthly flows, are better
characterized by seasonal statistics. The mean and standard deviation for
season τ can be estimated by
21 / J.D. Salas, W.L. Lane, D.K. Frevert

ȳ_τ = (1/N) Σ_{ν=1}^{N} y_{ν,τ}    (21.8)

and

s_τ = [ (1/N) Σ_{ν=1}^{N} (y_{ν,τ} - ȳ_τ)² ]^{1/2}    (21.9)
r_{k,τ} = m_{k,τ} / (m_{0,τ} m_{0,τ-k})^{1/2}    (21.11)

where

m_{k,τ} = (1/N) Σ_{ν=1}^{N} (y_{ν,τ} - ȳ_τ)(y_{ν,τ-k} - ȳ_{τ-k})    (21.12)

in which m_{0,τ} represents the sample variance for season τ. Likewise, for
multisite series, the lag-k sample cross-correlations between site i and site j,
for season τ, r_{k,τ}^{ij}, may be estimated by

r_{k,τ}^{ij} = m_{k,τ}^{ij} / (m_{0,τ}^i m_{0,τ-k}^j)^{1/2}    (21.13)

and

m_{k,τ}^{ij} = (1/N) Σ_{ν=1}^{N} (y_{ν,τ}^i - ȳ_τ^i)(y_{ν,τ-k}^j - ȳ_{τ-k}^j)    (21.14)

in which m_{0,τ}^i represents the sample variance for season τ and site i. Note that
in Eqs. (21.11) through (21.14), when τ - k < 1 the terms y_{ν,τ-k}, m_{0,τ-k},
ȳ_{τ-k}, y_{ν,τ-k}^j, and m_{0,τ-k}^j are replaced by the corresponding terms for
year ν - 1 and season ω + τ - k, respectively.
21.2.2.
(21.15)
21.2.2.3.

S_i = Σ_{t=1}^{i} (y_t - ȳ_n),  i = 1, ..., n    (21.17)

where S_0 = 0 and ȳ_n is the sample mean of y_1, ..., y_n. Then, the adjusted range R*
and the rescaled adjusted range R** can be calculated by

R* = max(S_0, S_1, ..., S_n) - min(S_0, S_1, ..., S_n)    (21.18)

and

R** = R* / s_n    (21.19)

respectively, in which s_n is the standard deviation of y_1, ..., y_n that is determined by
Eq. (21.2). Likewise, the Hurst coefficient for the series may be estimated by
(21.22)
(21.23)
Y = ln(X + a)    (21.24)
- Power transformation
- Box-Cox transformation

Y = (X + a)^b,  b ≠ 0    (21.25)

where Y is the transformed series, X is the original observed series, and a and
b are transformation coefficients. Note that the logarithmic transformation is
simply the limiting form of the Box-Cox transform as the coefficient b
approaches zero. Also, the power transformation is a shifted and scaled form
of the Box-Cox transform. The variables Y and X can represent either annual
or seasonal data. For seasonal data, a and b can be chosen to vary with the
season. In addition, the transformed data can be standardized by subtracting
the mean and dividing by the standard deviation (standardization is actually
an option in SAMS). For example, for seasonal series, the standardization
may be expressed as:
Z_{ν,τ} = (Y_{ν,τ} - Ȳ_τ) / S_τ(Y)    (21.26)

where Z_{ν,τ} is the standardized series, and Ȳ_τ and S_τ(Y) are the mean and the
standard deviation of the transformed series for month τ. Then, the stochastic
models can be fitted to the standardized series Z_{ν,τ}.
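Eqs. (21.25) and (21.26) amount to a power transform followed by per-season standardization. A sketch, where the coefficients a and b and the flow values are arbitrary illustrations:

```python
import numpy as np

def power_transform(x, a, b):
    """Power transformation Y = (X + a)**b of Eq. (21.25), b != 0."""
    return (np.asarray(x, dtype=float) + a) ** b

def standardize_seasonal(y):
    """Standardize a (years x seasons) array by each season's mean and
    standard deviation, as in Eq. (21.26)."""
    y = np.asarray(y, dtype=float)
    return (y - y.mean(axis=0)) / y.std(axis=0)

flows = np.array([[12.0, 80.0],     # 3 years x 2 seasons (illustrative)
                  [9.0, 65.0],
                  [15.0, 94.0]])
z = standardize_seasonal(power_transform(flows, a=0.0, b=0.5))
```

Each column of `z` then has zero mean and unit standard deviation, ready for model fitting.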
21.3.
21.3.1.
(21.28a)

(21.28b)

y_t = φ_1 y_{t-1} + ε_t    (21.29)

φ_1 = m_1 / m_0    (21.30)

σ²(ε) = (1 - φ_1²) s²    (21.31)

(21.32)

(21.33)

(21.34)

(21.35)

y_t = φ_1 y_{t-1} + φ_2 y_{t-2} + ε_t - θ_1 ε_{t-1}    (21.36)

(21.37)

(21.38)

(21.40)

ε_t = y_t - Σ_{j=1}^{p} φ_j y_{t-j} + Σ_{j=1}^{q} θ_j ε_{t-j},   F = Σ_{t=1}^{N} ε_t²    (21.41)
where N is the sample size. Once the φ's and θ's are determined, the
noise variance σ²(ε) is determined by (1/N) Σ ε_t². The minimization of the
sum of squares of Eq. (21.41) may be obtained by a numerical scheme.
Powell's algorithm has been commonly employed for least squares estimation
of parameters of ARMA models. The Powell algorithm (Gill et al, 1981 and
Himmelblau, 1972) is an expanded version of the univariate gradient search,
which is a useful optimization technique that does not require derivatives. The
moment estimates of ARMA(p,q) models may be taken as the initial values in
the search algorithm. The non-derivative optimization techniques depend very
much on the starting points when the objective function is not convex. In these
cases there is no guarantee that the solution found corresponds to the global
minimum. The solution may be improved by choosing a different starting
point.
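The least-squares scheme of Eq. (21.41) can be sketched with an ARMA(1,1) residual recursion and a simple derivative-free coordinate search (the univariate search that Powell's method accelerates). The starting values and data here are placeholders; in practice the moment estimates would seed the search:

```python
import numpy as np

def arma11_ssr(params, y):
    """Sum of squared residuals for y_t = phi*y_{t-1} + e_t - theta*e_{t-1},
    conditional on e_0 = 0 (Eq. 21.41 specialized to ARMA(1,1))."""
    phi, theta = params
    e = np.zeros_like(y)
    for t in range(1, len(y)):
        e[t] = y[t] - phi * y[t - 1] + theta * e[t - 1]
    return float(np.sum(e ** 2))

def coordinate_search(f, x0, step=0.2, tol=1e-4):
    """Derivative-free minimization over one coordinate at a time."""
    x = list(x0)
    best = f(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = list(x)
                trial[i] += d
                val = f(trial)
                if val < best:
                    x, best, improved = trial, val, True
        if not improved:
            step *= 0.5
    return x, best

rng = np.random.default_rng(1)
y = rng.standard_normal(300)           # placeholder series
(phi_hat, theta_hat), ssr = coordinate_search(lambda p: arma11_ssr(p, y), [0.0, 0.0])
```

As the text cautions, such searches are sensitive to the starting point when the objective is not convex.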
21.3.2.

(21.42)

(21.43)

where λ, α, and δ are the location, scale, and shape parameters, respectively.
Lawrence (1982) found that ε can be obtained by the following scheme:

ε = λ(1 - φ) + η    (21.44)

where

η = 0  if M = 0    (21.45)

(21.46)

(21.47)

(21.48)

where μ, σ², γ, and ρ_1 are the mean, variance, skewness coefficient, and the
lag-one autocorrelation coefficient, respectively.
Based on results given by Kendall (1968), Wallis and O'Connell (1972),
and Matalas (1966), and based on extensive simulation experiments conducted
by Fernandez and Salas (1990), they suggested the following estimation
procedure:
(21.52)

(21.53)

(21.54)

(21.55)

B = 1.48 N^{-1} + 6.77 N^{-2}    (21.56)

and

L = (N - 2) / √(N - 1)    (21.57)
21.3.3.

(21.58)

(21.59a)

(21.59b)

(21.60)

(21.62)

(21.63)

(21.64)

(21.65)

(21.66)

(21.67)

(21.68)

where s_τ² is the seasonal variance and m_{k,τ} is the estimate of the lag-k
season-to-season covariance of Y_{ν,τ}, i.e. M_{k,τ} = E[Y_{ν,τ} Y_{ν,τ-k}],
because E(Y_{ν,τ}) = 0. Note also that s_τ² = m_{0,τ}.
In a similar manner as for the ARMA(p,q) model, the Least Squares (LS)
method can be used to estimate the model parameters of PARMA(p,q) models.
In this case, the parameters φ's and θ's are estimated by minimizing the sum
of squares of the residuals defined by

F = Σ_{ν=1}^{N} Σ_{τ=1}^{ω} ε_{ν,τ}²    (21.69)
21.3.4.

Φ(B) Y_t = ε_t    (21.70)

ε_t = B ξ_t    (21.72)

(21.73)

(21.74)

where M_k = E[Y_t Y_{t-k}^T] is the lag-k cross-covariance matrix of Y_t (since
E(Y_t) = 0). In finding the MOM estimates, Eq. (21.74) for k = 1, ..., p is solved
simultaneously for the parameter matrices Φ_j, j = 1, ..., p, by
substituting in Eq. (21.74) the population covariance matrices M_k by
the sample covariance matrices M̂_k, k = 1, 2, ..., p. Then Eq.
(21.73) is used to estimate the variance-covariance matrix of the
residuals G. For example, the moment estimators of the MAR(1) model are:

Φ_1 = M_1 M_0^{-1}    (21.75)

G = M_0 - Φ_1 M_1^T    (21.76)

B B^T = G    (21.77)

The above matrix equation can have more than one solution. However, a
unique solution can be obtained by assuming that B is a lower triangular
matrix.
, = t + e, -
(21-78)
j= i
J=i
(2i.79)
y=i
Thus, Eq. (21.79) is the expression of a univariate ARMA(p,q) model for site i such that the parameters φ_j^(i) and θ_j^(i) can be estimated by the usual ARMA model estimation methods.
Further, the vector of the noise terms ε_t = [ε_t^(1), ..., ε_t^(n)]^T can be expressed as

ε_t = B ξ_t
(21.80)

with

G = E[ε_t ε_t^T] = B B^T
(21.81)
Thus, a CARMA model implies that the cross-correlations between sites are
carried through the residuals.
Two methods can be used for estimating the G matrix:
(1). The MLE estimate of G may be obtained by

G = (1/N) Σ_{t=1}^{N} ε_t ε_t^T
(21.82)

where ε_t, t = 1, ..., N, are the residuals calculated from the fitted model.
(2). The MOM estimate of G may be obtained as a function of the parameters Φ and Θ and the cross-covariances M_k of Z_t, i.e.,

G = f(Φ, Θ, M_k)
(21.83)

The multivariate periodic AR model, MPAR(p), of the seasonal flow vectors Y_{ν,τ} is defined by

Y_{ν,τ} = Σ_{j=1}^{p} Φ_{j,τ} Y_{ν,τ−j} + ε_{ν,τ}
(21.84)

in which the noise vector ε_{ν,τ} can be expressed as

ε_{ν,τ} = B_τ ξ_{ν,τ}
(21.85)

where ξ_{ν,τ} is a vector of independent standard normal variables. The moment equations of the model are

M_{k,τ} = Σ_{j=1}^{p} Φ_{j,τ} M_{k−j,τ−j},   k ≥ 1
(21.88a)

G_τ = M_{0,τ} − Σ_{j=1}^{p} Φ_{j,τ} M_{j,τ}^T,   k = 0
(21.88b)

where M_{k,τ} = E[Y_{ν,τ} Y_{ν,τ−k}^T] is the lag-k season-to-season cross-covariance matrix of Y_{ν,τ}. The matrix B_τ can then be estimated from

G_τ = B_τ B_τ^T
(21.89)
As for the MAR(p) model, a solution for the above equation can be obtained by assuming that B_τ is a lower triangular matrix. This requires that G_τ be positive definite.
case at hand) of the corresponding generated flow at a key station (or subkey station) or, in temporal disaggregation, to ensure that the generated seasonal values add exactly to the generated annual value, three methods of adjustment based on Lane and Frevert (1990) are provided in SAMS. These methods will be described in detail in the following sections.
21.3.7.1.
Spatial Disaggregation
Y_t = A X_t + B ε_t
(21.90)

The MOM estimates of the parameter matrices A and B are

A = M₀(YX) M₀^{−1}(X)
(21.91)

B B^T = M₀(Y) − M₀(YX) M₀^{−1}(X) M₀(XY)
(21.92)

where

M₀(X) = E[X_t X_t^T],   M₀(YX) = E[Y_t X_t^T],   M₀(XY) = M₀^T(YX)
(21.93)

The Mejia and Rouselle model adds a lagged term to Eq. (21.90):

Y_t = A X_t + B ε_t + C Y_{t−1}
(21.94)
in which Y_t, X_t, ε_t, A, and B are defined in the same way as for the Valencia and Schaake model and C is an additional (h x h) parameter matrix. As for the Valencia and Schaake model, the number of key stations in the above equations can be more than one, so the model can be used to disaggregate annual flows at several key stations to their corresponding flows at substations.
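A minimal numerical sketch of the Valencia and Schaake estimation and generation steps is given below; the single key station, the two substations, and their flow shares are hypothetical, and the flows are assumed already zero-mean.

```python
import numpy as np

def fit_valencia_schaake(x, y):
    """MOM estimates for Y = A X + B eps: x is (N x l) key-station annual
    flows, y is (N x h) substation flows, both zero-mean."""
    n = x.shape[0]
    m0x = (x.T @ x) / n                      # M0(X)
    m0yx = (y.T @ x) / n                     # M0(YX)
    a = m0yx @ np.linalg.inv(m0x)
    bbt = (y.T @ y) / n - a @ m0yx.T         # M0(Y) - A M0(XY)
    return a, np.linalg.cholesky(bbt)        # lower triangular B

def disaggregate(x_new, a, b, rng):
    """Generate substation flows for new annual key-station flows."""
    eps = rng.normal(size=(x_new.shape[0], b.shape[0]))
    return x_new @ a.T + eps @ b.T

rng = np.random.default_rng(2)
x = rng.normal(size=(5000, 1))               # key-station annual flows
a_true = np.array([[0.6], [0.4]])            # hypothetical substation shares
y = x @ a_true.T + 0.1 * rng.normal(size=(5000, 2))
a, b = fit_valencia_schaake(x, y)
y_new = disaggregate(x[:5], a, b, rng)
print(np.round(a, 2))
```

The sum of the generated substation flows will not match the key-station flow exactly; that residual mismatch is what the adjustment methods described next correct.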
The model parameter matrices A, B, and C can be estimated by using MOM as:

A = [M₁⁰(YX)... 
A = [M₀(YX) − M₁(Y) M₀^{−1}(Y) M₁^T(XY)] [M₀(X) − M₁(XY) M₀^{−1}(Y) M₁^T(XY)]^{−1}
(21.95)

C = [M₁(Y) − A M₁(XY)] M₀^{−1}(Y)
(21.96)

B B^T = M₀(Y) − A M₀(XY) − C M₁^T(Y)
(21.97)

The value of M(Y) calculated in Eq. (21.98) should be used in Eqs. (21.95)-(21.97) for estimating the model parameters. Lane suggested also that M(Y) should be calculated as:

M(Y) = M₁(Y) + M₀(YX) M₀^{−1}(X) [M₁*(XY) − M₁(XY)]
(21.98)
Adjustment for spatial disaggregation

approach 1:

q_i* = q_i + (1/n) (Q − Σ_{j=1}^{n} q_j)
(21.99)

approach 2:

q_i* = q_i Q / Σ_{j=1}^{n} q_j
(21.100)

approach 3:

q_i* = q_i + (Q − Σ_{j=1}^{n} q_j) σ_i / Σ_{j=1}^{n} σ_j
(21.101)

where

r = (1/N) Σ_{t=1}^{N} r_t
(21.102a)

r_t = Q_t / Σ_{i=1}^{n} q_{i,t}
(21.102b)

in which q_i is the generated flow at substation i, q_i* is the corresponding adjusted flow, Q is the generated flow at the key station, n is the number of substations, and σ_i is the standard deviation of the flows at substation i.
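Two of the adjustment ideas are easy to state in code; the sketch below assumes the equal-distribution and proportional forms of the Lane and Frevert adjustments, with hypothetical generated flows.

```python
import numpy as np

def adjust_equal(q, key_total):
    """Spread the difference between the key-station flow and the sum of the
    generated substation flows evenly over the n substations."""
    return q + (key_total - q.sum()) / q.size

def adjust_proportional(q, key_total):
    """Scale the generated substation flows so they add exactly to the
    generated key-station flow."""
    return q * key_total / q.sum()

q = np.array([40.0, 35.0, 30.0])   # generated substation flows (hypothetical)
eq = adjust_equal(q, 100.0)
pr = adjust_proportional(q, 100.0)
print(eq.sum(), pr.sum())
```

Both adjusted vectors sum (to floating-point precision) to the key-station value of 100; the proportional form preserves the relative magnitudes of the substation flows, while the equal-distribution form preserves their differences.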
21.3.7.2.
Temporal Disaggregation

Y_{ν,τ} = A_τ X_ν + B_τ ε_{ν,τ} + C_τ Y_{ν,τ−1}
(21.103)

The seasonally varying parameter matrices A_τ, B_τ, and C_τ are estimated by MOM in a manner similar to that of the spatial disaggregation model; for example,

C_τ = [M_{1,τ}(Y) − A_τ M_{1,τ}(XY)] M₀^{−1}(Y)
(21.105)
approach 2:

q_{ν,τ}* = q_{ν,τ} Q_ν / Σ_{τ=1}^{ω} q_{ν,τ}
(21.110)

and

approach 3:

q_{ν,τ}* = q_{ν,τ} + (Q_ν − Σ_{τ=1}^{ω} q_{ν,τ}) σ_τ / Σ_{τ=1}^{ω} σ_τ
(21.111)

where ω is the number of seasons, Q_ν is the generated annual value, q_{ν,τ} is the generated seasonal value, q_{ν,τ}* is the adjusted generated seasonal value, q̄_τ is the estimated mean of q_{ν,τ} for season τ, and σ_τ is the estimated standard deviation of q_{ν,τ} for season τ.
21.3.9.1.
Testing the residuals properties generally involves testing the normality and
the independence of the residuals. First, the residuals are obtained from the
specified models after the parameters are estimated. For instance, in the case of
the univariate PARMA model of Eq. (21.58), the residuals are the numbers ε_{ν,τ} that are derived from the model. On the other hand, in the case of the MPAR model of Eq. (21.84), the residuals are the sets of numbers ε_{ν,τ}^(i), i = 1, ..., n, each set i corresponding to each site or station. Testing the residual properties
can be done in several ways depending on how the residuals are arranged.
Several tests are available for testing the normality of the residuals.
Common normality tests include the skewness test, the chi-square goodness of fit test, the Kolmogorov-Smirnov test, and the product moment correlation test (Salas et al., 2000). For periodic-stochastic models, the normality tests should
be applied on a month-by-month basis. Often though the tests are applied
considering the entire sample of residuals. In the case of multivariate models,
the normality tests should be applied for each set of data (site by site). In
SAMS, the skewness test of normality is applied on a month-by-month basis
and on a site-by-site basis.
Likewise, several tests are available for testing the independence of the
residuals. The Portmanteau lack of fit test and the Anderson test (Salas et al., 1980) are commonly used for testing independence in time when the residuals are derived from stationary stochastic models. On the other hand, the cross-correlation t-test may be used for testing independence in time when the residuals are derived from periodic-stochastic models such as those described in the previous sections. The t-test is applied for the correlation between the residuals of two successive months, i.e. twelve tests for monthly data.
However, the Portmanteau or Anderson tests may also be applied for testing the independence of residuals derived from periodic-stochastic models, based on the autocorrelation of the entire residual series. In SAMS, the Portmanteau
test of independence was applied. For testing the independence between
residuals of two different sites (independence in space), the usual test is based on the cross-correlation t-test. This test should be applied for the cross-correlation between residuals of two sites on a season-by-season basis (twelve tests for monthly data), although it can also be applied based on the cross-correlation of the entire residual series for each pair of sites.
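The Portmanteau statistic mentioned above can be sketched as follows (in its simple Box-Pierce form; the exact variant used by SAMS may differ): for white-noise residuals the statistic behaves like a chi-square variable, while serially correlated residuals inflate it sharply.

```python
import numpy as np

def portmanteau(e, lags=10):
    """Box-Pierce form of the Portmanteau statistic Q = N * sum r_k^2 over
    lags k = 1..L, where r_k are the sample autocorrelations of e."""
    e = e - e.mean()
    n = len(e)
    c0 = e @ e / n
    r = np.array([(e[k:] @ e[:-k]) / (n * c0) for k in range(1, lags + 1)])
    return n * (r ** 2).sum()

rng = np.random.default_rng(3)
eps = rng.normal(size=500)            # independent residuals
q_white = portmanteau(eps)
x = np.zeros(500)                     # strongly autocorrelated series
for t in range(1, 500):
    x[t] = 0.8 * x[t - 1] + eps[t]
q_corr = portmanteau(x)
print(round(q_white, 1), round(q_corr, 1))
```

The white-noise value stays near the chi-square mean for 10 lags, while the autocorrelated series produces a far larger statistic, leading to rejection of independence.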
21.3.9.2.
Testing ARMA model parsimony
For a fitted ARMA(p,q) model, SAMS tests its model parsimony using Akaike
Information Criterion (AIC) (Salas, et al., 1980). For comparing among
competing ARMA(p,q) models, the following equation is used:
AIC(p, q) = N ln(σ²_e) + 2(p + q)
(21.112)
where N is the sample size and σ²_e is the maximum likelihood estimate of the residual variance. Under this criterion the model which gives the minimum AIC is the one to be selected. SAMS computes AICs for the fitted model and
the models of both one step higher order and one step lower order for
comparison. For instance, for a fitted ARMA(1,1) model, SAMS will compute
the AIC values for ARMA(1,1), ARMA(2,1), ARMA(1,2), ARMA(1,0), and
ARMA(0,1) models for comparison. Besides, to test the assumption of white
noise, the AIC of the ARMA(0,0) is also computed.
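The AIC comparison that SAMS performs can be sketched directly from Eq. (21.112); the residual variances below are hypothetical values chosen only to illustrate the order selection.

```python
import numpy as np

def aic(n, resid_var, p, q):
    """AIC(p, q) = N ln(sigma_e^2) + 2(p + q), per Eq. (21.112)."""
    return n * np.log(resid_var) + 2 * (p + q)

# Fitted ARMA(1,1) versus its one-step neighbours and white noise,
# with hypothetical residual-variance estimates for each candidate.
candidates = {(1, 1): 0.81, (2, 1): 0.80, (1, 2): 0.80,
              (1, 0): 0.90, (0, 1): 0.92, (0, 0): 1.00}
n = 48
scores = {pq: aic(n, v, *pq) for pq, v in candidates.items()}
best = min(scores, key=scores.get)
print(best)
```

Here the extra parameters of ARMA(2,1) and ARMA(1,2) do not reduce the residual variance enough to pay their 2-per-parameter penalty, so ARMA(1,1) is selected.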
21.3.9.3.
Testing the properties of the process generally means comparing the statistical
properties (statistics) of the process being modeled, for instance, the process
Y vr in Eq. (21.58), with those of the historical sample. In general, one would
like the model to be capable of reproducing the necessary statistics that affect
the variability of the data. Furthermore, the model should be capable of
reproducing certain statistics that are related to the intended use of the model.
If Y_{ν,τ} has been previously transformed from X_{ν,τ}, the original non-normal process, then one must test, in addition to the statistical properties of Y, some of the properties of X. Generally, the properties of Y include the seasonal mean, seasonal variance, seasonal skewness, and season-to-season correlations and cross-correlations (in the case of multisite processes), and the properties of X include the seasonal mean, variance, skewness, correlations, and cross-correlations (for multisite systems). Furthermore, additional properties of X_{ν,τ}, such as those related to low flows, high flows, droughts, and storage, may be included depending on the particular problem at hand.
In addition, it is often the case that not only the properties of the seasonal
processes Y_{ν,τ} and X_{ν,τ} must be tested but also the properties of the
corresponding annual processes AY and AX. For example, this case arises
when designing the storage capacity of reservoir systems or when testing the
performance of reservoir systems of given capacities, in which one or more
reservoirs are for over year regulation. In such cases the annual properties
considered are usually the mean, variance, skewness, autocorrelations, cross-correlations (for multisite systems), and more complex properties such as those
related to droughts and storage.
The comparison of the statistical properties of the process being modeled
versus the historical properties may be done in two ways. Depending on the
type of model, certain properties of the Y process such as the mean(s), variance(s), and covariance(s) can be derived from the model in closed form. If the method of moments is used for parameter estimation, the mean(s), variance(s), and some of the covariances should be reproduced exactly; except for the mean, however, that may not be the case for other estimation methods. Finding properties of the Y process in closed form beyond the first two moments, for instance drought-related properties, is complex, and such results generally are not available for most models. Likewise, except for simple models, finding properties in closed form for the corresponding annual process AY is not simple either. In such cases, the required statistical properties are derived by data generation.
Data generation studies for comparing statistical properties of the
underlying process Y (and other derived processes such as A Y, X and AX) are
generally undertaken based on samples of equal length as the length of the
historical record and based on a certain number of samples which can give
enough precision for estimating the statistical properties of concern. While there are some statistical rules that can be derived to determine the number of samples required, a practical rule is to generate say 100 samples, which can give an idea of the distribution of the statistic of interest, say θ. In any case, the statistics θ(i), i = 1, ..., 100, are estimated from the 100 samples and the mean θ̄ and variance S²(θ) are determined. Then, the mean deviation MD(θ)

MD(θ) = θ̄ − θ(H)
(21.113)

and the relative root mean square deviation RRMSD(θ)

RRMSD(θ) = (1/θ(H)) [ (1/100) Σ_{i=1}^{100} (θ(i) − θ(H))² ]^{1/2}
(21.114)

are obtained, in which θ(H) is the statistic derived from the historical sample (historical statistic). The statistics MD(θ) and RRMSD(θ) are useful for comparing between the historical and model statistics derived by data generation. In addition, one can observe where θ(H) falls relative to θ̄ − S(θ) and θ̄ + S(θ). Also graphical comparisons such as box plots are useful.
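A minimal data-generation comparison in the spirit of Eq. (21.113) can be sketched as follows; the generator, the 48-year sample length, and the use of the mean as the statistic of interest are hypothetical choices, and the RRMSD is written here as a relative root-mean-square deviation about the historical value (the exact normalization is an assumption).

```python
import numpy as np

def md_rrmsd(stat_fn, generator, theta_hist, n_samples=100):
    """Estimate a statistic on n_samples generated samples and compare with
    the historical value: MD = mean(theta_i) - theta_hist, and RRMSD =
    sqrt(mean((theta_i - theta_hist)^2)) / |theta_hist|."""
    theta = np.array([stat_fn(generator()) for _ in range(n_samples)])
    md = theta.mean() - theta_hist
    rrmsd = np.sqrt(((theta - theta_hist) ** 2).mean()) / abs(theta_hist)
    return md, rrmsd

rng = np.random.default_rng(4)
theta_hist = 5.0                                        # historical mean
gen = lambda: rng.normal(loc=5.0, scale=1.0, size=48)   # 48-year samples
md, rrmsd = md_rrmsd(np.mean, gen, theta_hist)
print(round(md, 3), round(rrmsd, 3))
```

An MD near zero and a small RRMSD indicate that the model reproduces the statistic well.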
21.4.
STOCHASTIC SIMULATION
and normal random number generators are available in the literature (see for instance, Bradley, 1987 and Press et al., 1986). Subsequently, the normal random numbers must be incorporated into the stochastic model. Section
21.4.1 summarizes a procedure to generate synthetic hydrologic time series by
using stochastic models. Section 21.4.2 discusses how stochastic simulation
can be used for forecasting.
21.4.1.
Let us assume that our original monthly flow data, denoted by X_{ν,τ}, have been transformed into normal flows by using the logarithmic transformation, i.e.

Y_{ν,τ} = ln(X_{ν,τ})
(21.115)

that the Y's have been standardized season by season into Z_{ν,τ}, and that a periodic model driven by residuals ε_{ν,τ} (Eq. 21.117b) has been fitted to the Z's, in which ε_{ν,τ} is normally distributed with mean zero and standard deviation one.

For generating monthly flows, the reverse procedure is followed. We start by generating standard normal random numbers ε_{ν,τ}. Then Eq. (21.117b) is used to generate the Z's. After generating Z_{ν,τ}, Y_{ν,τ} can be obtained by

Y_{ν,τ} = Ȳ_τ + S_τ(Y) Z_{ν,τ}
(21.118)

and the flows in the original domain are recovered by the inverse transformation

X_{ν,τ} = exp(Y_{ν,τ})
(21.119)
used for low order and high order stationary and periodic models while exact
generation procedures available in the literature apply only for stationary
ARMA models or the low order periodic models. Generation based on
multivariate models is carried out in a similar manner except that a vector of
standard normal random numbers must be generated.
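The reverse procedure can be sketched end to end; the PAR(1) form used for the standardized Z's, and the seasonal means and standard deviations, are assumptions for illustration.

```python
import numpy as np

def generate_monthly(phi, y_mean, y_std, n_years, rng):
    """Generate flows by the reverse procedure: draw standard normal numbers,
    run a seasonal AR(1) recursion for the standardized Z's (an assumed form
    of the fitted periodic model), destandardize to Y, and back-transform
    X = exp(Y)."""
    w = len(y_mean)
    z_prev, flows = 0.0, np.zeros((n_years, w))
    for v in range(n_years):
        for t in range(w):
            # Innovation scaled so that Var(Z) stays 1 in every season.
            eps = rng.normal() * np.sqrt(1.0 - phi[t] ** 2)
            z = phi[t] * z_prev + eps
            y = y_mean[t] + y_std[t] * z        # destandardize
            flows[v, t] = np.exp(y)             # invert the log transform
            z_prev = z
    return flows

rng = np.random.default_rng(5)
flows = generate_monthly(np.full(12, 0.4), np.full(12, 3.0),
                         np.full(12, 0.3), n_years=100, rng=rng)
print(flows.shape)
```

Because the last step exponentiates, every generated flow is positive, as streamflows must be.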
21.4.2.
The basic times series models are perfectly adequate for generating data for
planning studies, where the main concern is not the immediate next one or two
years. The generation of data for planning studies is usually performed using a
random set of starting conditions that have nothing whatsoever to do with the
current flows, the immediate past flows or any available forecasts. However,
in real time operations, the main concern is what happens in the next few
months or at the most the next few years. In this case, the current state of the
system and any associated forecasts or variables that can help to forecast the next few months or years are indeed important. In order to make use of this
information, the time series models are either used differently or modified to
better use the available information. These changes will in most cases only
slightly alter the generated flows and then only for a short time. However,
minor changes can be of relatively large importance in terms of the safety and
efficiency of operations. Besides the immediate past flows, current forecasts
could be important as would be variables such as the ocean temperatures or
other variables of potential value in forecasting future flows.
Three ways will be discussed as to how the models may be adapted for stochastic forecasting. The first is simply to utilize the time series models but
making use of the recent past flows. Rather than a random start, the models are
started with the most recent flows or their corresponding transformed and
standardized counterparts. For example, consider a simple annual
autoregressive model of order one. This model has but one lagged term. Rather
than use a random term for the lagged flow when starting the generation, it is
easy to insert the present year's flow when generating flow traces. Often the
lagged flow term is in a transformed and standardized form in the model and
some simple calculations are required to modify the actual flow into the correct form for the model. For autoregressive models of any order, this approach is
very simple. For ARMA models, the process is more complicated as one has
no idea what the correct values are for the random terms for the recently
experienced years.
Generally, they are most easily approximated by first solving
the model equation for the most recent random term. By successive
substitution an equation can then be developed which gives the random term at
any time as an infinite series consisting only of past flows. Of course, only a
few terms are usually needed to adequately estimate the values. The advantage
of this approach is that it makes use only of the time series models and, after
estimating the initial conditions, the generation of data is the same process as
normally used for planning studies. The disadvantage of this approach is that it
does not include some other information, which might be of help.
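For the ARMA(1,1) case, the successive-substitution idea can be sketched concretely: inverting y_t = phi*y_{t-1} + e_t - theta*e_{t-1} gives e_t as a series in past flows, e_t = sum over j >= 0 of theta^j * (y_{t-j} - phi*y_{t-j-1}), which is truncated after a few terms since |theta| < 1.

```python
import numpy as np

def recover_residual(y, phi, theta, n_terms=50):
    """Approximate the most recent ARMA(1,1) random term from past flows by
    successive substitution, truncating the infinite series after n_terms."""
    e = 0.0
    for j in range(min(n_terms, len(y) - 1)):
        e += theta ** j * (y[-1 - j] - phi * y[-2 - j])
    return e

# Check on a simulated record: the recovered value should match the true
# innovation that generated the last observation.
rng = np.random.default_rng(6)
phi, theta = 0.7, 0.3
y, e_true = np.zeros(300), rng.normal(size=300)
for t in range(1, 300):
    y[t] = phi * y[t - 1] + e_true[t] - theta * e_true[t - 1]
e_hat = recover_residual(y, phi, theta)
print(round(e_hat, 3))
```

With theta = 0.3 the truncation error decays like theta raised to the number of terms, so only a handful of past flows really matter.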
The second approach is to expand upon the first case by additionally
adding terms to the time series model to represent current forecasts or vales
of forecasting variables. This is easily done; however, the parameter estimation
may be complicated and care must be taken to avoid pitfalls. Least squares
parameter estimation may in fact be the least prone to problems. The main
advantage is that all current and past knowledge is now utilized. A major
disadvantage is that the model now has many more sets of parameters. For
example, using a monthly model as an example, the model for generating May
flows is dependent upon the starting time. If the generation is started in
January, the model for the May flows has different parameters than if the
generation has started in say March. Further, the model parameters for
generating the first May flows is not the proper set for generating May flows a year hence. If the goal is to generate flows for the next 18 months, 18 sets of
parameters (or equivalent if more than one month is generated at one time) are
needed for each of the 12 starting times or a total of 18 times 12 or 216 sets of
parameters. If the goal is to generate 36 months into the future, three times as
many are needed.
The third approach is where an entire series of variables is available into the future and the time series model is modified to include these exogenous variables. In cases where the future variable values are accurate into the future, the time series model may need only one set of parameters. However, if the accuracy changes with distance into the future, the same approach is needed as for the second approach.
21.5.
DESCRIPTION OF SAMS
21.5.1.
General Overview
SAMS is a computer software package that deals with the stochastic analysis, modeling, and simulation of hydrologic time series.
21.5.2.
21.5.2.1.
Plotting of the data can help in detecting trends, shifts, outliers, and errors in
the data. SAMS can plot the data as curve, stick, and bar graphs. Figure 21.1
illustrates a time series plot for annual data. The scale of the plot is determined
based on the sample maximum and minimum as shown in the control bar at
the bottom, but the user can change it by entering the desired graph scale
range. This enables the user to zoom in and out of the plot to examine the data
and do an on-screen graphical check for the variability of the data.
Fig. 21.1. Plotting of annual time series.
21.5.2.2.
SAMS tests the normality of the data by plotting the data on normal
probability paper and by using the skewness test of normality. To examine the
adequacy of the transformation, the theoretical distribution based on the transformation and the counterpart historical sample distribution are plotted together, as shown in Fig. 21.2 for annual data. For seasonal
data, the results of the seasonal skewness tests are presented in graphical and
tabular formats. The test critical values are also shown on the screen, which are
guides to check whether the data is within the normal range. For example, if the sample skewness coefficient for a given season is less than or equal to the test critical value, the hypothesis of normality is not rejected for that season.
If the data at hand is not normal, one can check whether it can be
normalized by a certain transformation function. The user can choose any type
of transformation by simply clicking on the corresponding button. Three types
of transformations are available: logarithmic, power, and Box-Cox
transformations. The transformation can be done all at once for all seasons or
on a season-by-season basis. Figure 21.3 shows an example of seasonal
transformation results.
In the event that the user wants to model site 1 data with an ARMA(p,q) model, the ARMA model will be fitted to the transformed data and not to the original data.
A save option allows the user to save the transformation parameters in a
special file. To understand this feature of SAMS, suppose that a user
transformed the data and fitted the PARMA(1,1) model to the data.
Subsequently, the user wants to fit a different model to the transformed data.
Instead of doing the transformation process over again, the user can simply
open the transformation file, which was saved previously.
Stochastic Analysis, Modeling and Simulation (SAMS 2000) / 21
Fig. 21.2. Normality check of the annual data for Site 1 (KEECHELUS_RESERVOIR): sample and theoretical distributions plotted on normal probability paper.

Fig. 21.3. Seasonal transformation results for Site 1. Type of transformation: Logarithmic, Y(t) = ln(X(t) + a), where X(t) is the original series, Y(t) the transformed series, and a the parameter of the transformation; with a = 0 for all 12 seasons, the computed seasonal skewness values fall below the tabulated critical value of 0.5436 (10% level).
21.5.2.3.
21.5.3.
Standardization implies that not only the mean will be subtracted but in
addition the data will be further transformed to have a standard deviation equal
to one. For example, for the season 5 data, the mean for season 5 will be
subtracted from each data point, then each observed data point for that season
will be divided by the standard deviation of the 5th season. As a result, the
mean and the standard deviation of the standardized data of the 5th season will
become equal to zero and one, respectively. Then, the order of the model to be
fitted can be selected. Subsequently, the method of estimation of the model
parameters must be selected.
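The season-by-season standardization just described amounts to the following sketch, using a hypothetical 48-year, 12-season record:

```python
import numpy as np

def standardize_seasonal(x):
    """Subtract each season's mean and divide by its standard deviation, so
    every seasonal column ends up with mean 0 and standard deviation 1."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / std, mean, std

rng = np.random.default_rng(7)
x = rng.gamma(shape=2.0, scale=10.0, size=(48, 12))   # 48 years x 12 seasons
z, mean, std = standardize_seasonal(x)
print(round(float(z[:, 4].mean()), 6), round(float(z[:, 4].std()), 6))
```

The stored seasonal means and standard deviations are kept so that generated standardized data can later be transformed back to the flow domain.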
Currently SAMS provides two methods of estimation, namely the method of moments (MOM) and the least squares (LS) method. MOM is available for the ARMA(p,q), GAR(1), MAR(p), PARMA(p,1), and MPAR(p) models while LS is available for the ARMA(p,q), CARMA(p,q), and PARMA(p,q) models. The LS method requires initial parameter estimates (starting points). These starting points can be selected by the user, or the MOM parameter estimates can be used as the starting points. For cases where the MOM estimates are not available, such as for the PARMA(p,q) model where q > 1, the MOM parameter estimates of the closest model will be used instead. For example, for the PARMA(3,3) model, the MOM estimates of the PARMA(3,1) model (including zeros for the two remaining parameters) will be used as the starting points. For fitting CARMA(p,q) models, the residual variance-covariance matrix G can be estimated using either the method of moments (MOM) or the maximum likelihood estimation (MLE) method (Stedinger et al., 1985). The estimated model parameters can be saved in a file selected by the user.
After the model has been fitted and the estimated parameters have been
saved, it is recommended that the fitted model be tested to ensure that it is
appropriate for the data at hand. In general, this can be done by testing the
residuals and comparing the model and historical properties of the data. SAMS
has the ability to perform such testing. Testing of the residuals is an important
part of the modeling process by which the modeler can test whether the fitted
model is adequate. In all the models available in the current version of SAMS except the GAR(1) model, the basic assumptions about the residuals are that
they are normal and independent. SAMS performs certain statistical tests to
check the validity of these assumptions. The hypothesis that the residuals are
normally distributed is tested based on the skewness test of normality. The
results are presented in terms of rejecting or not rejecting the hypothesis. In
addition, the residuals are plotted on normal probability paper in order to
check graphically whether the residuals are normally distributed. For testing
the independence of the residuals, the Portmanteau test of independence (Salas et al., 1980) is utilized. The correlogram of the residuals is also plotted
to help the user in checking the independence of the residuals. Figure 21.4
shows an example of results of both normality and independence tests of the
residuals.
Fig. 21.4. Testing the normality and the independence of the residuals.
Once the model has been fitted to the data, the moments, e.g. the theoretical
covariance structure can be calculated based on the estimated parameters.
Comparing the model and historical covariance (correlation) structure is
another method of testing. SAMS provides the user with the ability to perform
such comparisons. Figure 21.5 is an example of graphical comparison of
model and historical month-to-month correlations. Additional examination of
the model can be made regarding model parsimony. The so-called Akaike Information Criterion (AIC) may be used for this purpose. SAMS uses the AIC for testing model parsimony when stationary ARMA models are utilized.
Fig. 21.5. Comparing model and historical correlograms (Station 1).
The system structure for adjustment usually depends upon the orders
and positions of the stations relative to each other. This is important when
adjustments need to be done to the generated series based on spatial
disaggregation. The system structure means defning for each main river
system the sequence of stations (sites) that conform to the river network.
SAMS uses the concept of key stations and subkey stations (substations
and subsequent stations). A key station is the farthest downstream station
along a main stream. For instance, station 1 is a key station in the river system
shown in Fig. 21.6. Likewise, 2 and 3 are also key stations. On the other hand,
if station 1 did not exist (or were not used in the analysis), then stations 4 and 5 would become key stations. Let us continue the explanation assuming that
stations 1, 2, and 3 in Fig. 21.6 are key stations. Substations are the next
upstream stations draining to a key station. For instance, stations 4 and 5 are
substations draining to key station 1. Likewise, stations 6 and 7 and 8 and 9 are,
respectively, substations for key stations 2 and 3. Subsequent stations are the
next upstream stations draining into a substation. For instance, stations 11 and
12 are subsequent stations relative to substation 5 and station 10 is a
subsequent station regarding substation 4.
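The key / substation / subsequent-station classification can be represented with a simple mapping; the numbering below follows the example network of Fig. 21.6 as described in the text.

```python
# Each entry maps a station to the stations draining directly into it.
drains_into = {
    1: [4, 5], 2: [6, 7], 3: [8, 9],   # key stations and their substations
    4: [10], 5: [11, 12],              # substations and their subsequent stations
}

def classify(drains_into, key_stations):
    """Label every station by its position in the river network."""
    labels = {s: "key" for s in key_stations}
    for parent, children in drains_into.items():
        role = "substation" if parent in key_stations else "subsequent"
        for c in children:
            labels[c] = role
    return labels

labels = classify(drains_into, key_stations={1, 2, 3})
print(labels[5], labels[11])
```

If station 1 were dropped from the analysis, passing key_stations={2, 3, 4, 5} would promote stations 4 and 5 to key stations, as the text notes.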
For instance, key stations 1 and 2 and substations 4, 5, 6, and 7 form one group
in which the flows of all these stations are modeled jointly in a multivariate
framework, while key station 3 and its substations 8 and 9 form another group.
In this case, the cross-correlations between the stations within each group will
be preserved but the cross-correlations among stations in different groups will
not be preserved. For example, in the above configuration, the cross-correlations between stations 1 and 3 will not be preserved but the cross-correlations between stations 1 and 2 will be preserved. On the other hand, if
all the stations are defined in a single group, then the cross-correlations
between all the stations will be preserved. In the final step of disaggregation, a
group may contain stations 4, 5, 10, 11, and 12. In the current version of
SAMS, the total combined number of stations in any defined group must not
exceed 10 stations. After modeling the annual flows using the above
configuration, the annual flows can be disaggregated into seasonal flows. This
is handled again by using the concept of groups as was explained above. The
user, for example, can choose stations 3, 8, 9, 17, 18, and 19 as one group. In
this case, the annual flows for these stations will be disaggregated into
seasonal flows by a multivariate disaggregation model so as to preserve the
seasonal cross-correlations between all the stations.
Currently, SAMS has two schemes for modeling the key stations. The first scheme, denoted as scheme 1, aggregates the annual flows of the key stations that belong to a certain group, uses a univariate ARMA(p,q) model for the aggregated flows, and then disaggregates the aggregated annual flows (spatially) back to each key station by using the Valencia and Schaake or the Mejia and Rouselle disaggregation method. The second scheme, denoted as scheme 2, models the annual flows of the key stations belonging to a given group by a multivariate MAR(p) model. Once the flows at the key stations are modeled, the rest of the procedure for generating annual flows at all substations and subsequent stations, and then for generating the seasonal flows at all stations, is the same as in scheme 1.
21.5.4.
"Multisite". In addition, the data length (in years) and the number of samples
to be generated, and a seed number to initiate the generation process need to
be specified. In SAMS, both the number of samples and the length of data to
be generated are unlimited. The user should consider, however, the computer time it will take to generate many samples or very long samples, especially if the generation is to be done for multisite seasonal data.
Furthermore, one of four options regarding the generation model must be chosen.
Statistical analysis of the generated data is available. In the case of analysis pertaining to drought, surplus, and storage related statistics, SAMS will analyze the data in terms of a desired threshold demand level. The default demand level is the sample mean, but one can change it by keying in a fraction of the sample mean or the actual desired demand level. The results of the statistical analysis of the generated data can be saved into a file with the extension .gst, and this file will be automatically attached to store the results. Note that this feature of statistical comparison of the historical and generated data can also be used for further testing and verifying whether the fitted model performs as desired.
In estimating the generated statistics, the statistics of each generated
sample are first estimated then the means and standard deviations of those
statistics are computed which will be used to compare with their historical
counterparts. The results are presented in graphical or tabular formats. Figure
21.7 shows a comparison of the (observed) historical annual series and the
generated series for one sample. The user can change the station number,
sample number, and the graph scale as needed. For annual series, the
comparisons of the historical and generated mean, standard deviation,
skewness coefficient, coefficient of variation, and sample maximum and
minimum are presented in tabular form. For seasonal series, the comparisons
are presented in both graphical and tabular formats as shown in Fig. 21.8. The
comparisons of correlations for annual and seasonal data may be presented in
graphical or tabular formats as shown in Fig. 21.9 (for seasonal data).
Fig. 21.7. Time series plots of the historical and generated annual flows.
Fig. 21.8. Comparison between the historical and the generated monthly mean and standard deviations.
Fig. 21.9. Comparisons of the historical and generated seasonal cross- correlations.
21.6. EXAMPLES
Statistical Analysis of Data
In this section, SAMS will be used to model actual hydrologic data. The data
used is the monthly data of the Yakima basin. The data will be read from the
file yakima.dat (refer to SAMS Users Manual, Salas et al., 2000). The file
contains data for 12 stations in the Yakima basin. Each station's data consists
of 12 seasons and is 48 years long. SAMS was used to analyze the statistics of
the seasonal and annual data. Some of the annual and seasonal statistics
calculated by SAMS are shown below.
Annual Statistics
Site Number: 1 (KEECHELUS_RESERVOIR)

                          Historical
Mean                        242.9312
Standard Deviation           55.3134
Skewness Coefficient          0.3416
Coef. Variation               0.2277
Maximum                     375.5001
Minimum                     151.7000

Correlation Structure
LAG:     0       1        2       3       4       5        6        7        8        9       10
      1.0000  0.2773  -0.0591  0.0644  0.0104  0.0736  -0.1389  -0.1669  -0.0322  -0.1162  0.0034
Lag-0 Cross Correlations

Sites
1 and 1  (KE & KE)    1.0000
1 and 2  (KE & KA)    0.9877
1 and 3  (KE & YA)    0.7864
1 and 4  (KE & CL)    0.9826
1 and 5  (KE & YA)    0.9834
1 and 6  (KE & YA)    0.9525
1 and 7  (KE & BU)    0.9190
1 and 8  (KE & NA)    0.8831
1 and 9  (KE & TI)    0.8787
1 and 10 (KE & TI)    0.8698
1 and 11 (KE & NA)    0.8626
1 and 12 (KE & YA)    0.9243
Storage and Drought Statistics
Demand Level = 1.0000 * sample mean

Longest Drought          7.0000
Maximum Deficit        344.2187
Longest Surplus          6.0000
Maximum Surplus        244.0125
Storage Capacity       576.3561
Rescaled Range          10.4198
Hurst Coefficient        0.7375
Seasonal Statistics
Site Number:
KEECHELUS_RESERVOIR
Season
Historical
HIM
Mean
1
21.6250
2
22 . 5979
3
17 . 8708
4
14 .1542
5
15.5708
6
26.8333
7
47.4375
8
38.1917
9
14.9604
10
4.7375
11
5.4792
12
13.4729
Standard Deviation
1
13.5856
2
13.9981
3
10.2554
4
8.9925
5
8.5916
6
8.5001
7
14.4123
8
19.0200
9
11.6909
10
2.6210
11
4.3821
12
8 . 4761
[Figure: standard deviation by season, Station 1 (KE)]
Season   Skewness Coefficient
  1           1.0570
  2           1.6400
  3           0.8679
  4           1.0953
  5           2.2601
  6           0.2109
  7           0.1997
  8           0.2420
  9           1.1964
 10           1.3112
 11           2.8219
 12           0.8688
Season-to-Season Correlations (LAG 1)
Season:     1       2       3       4       5       6        7       8       9      10      11      12
         0.5775  0.2969  0.2198  0.4555  0.4143  0.3211  -0.0872  0.5527  0.8343  0.8618  0.2814  0.4562
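The season-to-season correlations pair each season with the preceding one, wrapping season 1 around to season 12 of the previous year. A sketch of that computation (an assumption about the estimator, not SAMS's exact code):

```python
import numpy as np

def season_to_season_lag1(data):
    """Lag-1 season-to-season correlations for a (years x 12) seasonal array.

    For season tau the correlation pairs x[y, tau] with x[y, tau - 1];
    season 1 is paired with season 12 of the preceding year.
    """
    x = np.asarray(data, dtype=float)
    n_years, n_seasons = x.shape
    r = []
    for tau in range(n_seasons):
        if tau == 0:
            a, b = x[1:, 0], x[:-1, -1]     # season 1 vs previous year's season 12
        else:
            a, b = x[:, tau], x[:, tau - 1]
        r.append(np.corrcoef(a, b)[0, 1])
    return r
```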
Lag-0 Cross Correlations, Sites 1 and 2 (KE & KA)
Season:     1       2       3       4       5       6       7       8       9      10      11      12
         0.9853  0.9828  0.9793  0.9847  0.9924  0.9632  0.9788  0.9906  0.9888  0.8572  0.9504  0.9888
[Figure: season-to-season lag-0 cross correlations, Stations 1 and 2 (KE & KA), with 95% confidence limits]
21.6.2. Stochastic Modeling and Generation of Data

SAMS was used to model the annual and monthly flows of site 1 of the Yakima basin. Both the annual and monthly data used in the following examples are transformed using logarithmic transformations.
21.6.2.1.

SAMS was used to model the annual flows of site 1 with an ARMA(1,1) model. The MOM was used to estimate the model parameters. SAMS was also used to generate 150 samples, each 48 years long, using the estimated parameters. The following is a summary of the results of the model fitting and generation using the ARMA(1,1) model.

Results of fitting an ARMA(1,1) model to the transformed and standardized annual flows of site 1:
Model: ARMA
Number_of_sites: 1
Site(s)_ID: 1
Data_Transformations: LOG
Site_1: a-coef= 49.000000
Data_Standardization: YES
Mean_of_the_process: 5.658607
Standard_deviation_of_the_process: 0.189585
Model_order(p,q): 1 1
phi_parameters: (Annual)
phi_1     0.885071
theta_parameters: (Annual)
theta_1  -0.138036
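For the ARMA(1,1) model, MOM estimates can be obtained from the first two sample autocorrelations r1 and r2 of the transformed, standardized series: rho_2 = phi * rho_1 gives phi, and theta then solves a quadratic. A sketch of these textbook moment equations (not the exact routine inside SAMS):

```python
import numpy as np

def arma11_mom(r1, r2):
    """Method-of-moments ARMA(1,1) estimates from lag-1/lag-2 autocorrelations."""
    phi = r2 / r1                        # from rho_2 = phi * rho_1
    # rho_1 = (1 - phi*theta)(phi - theta) / (1 + theta^2 - 2*phi*theta)
    # rearranges to theta^2 + k*theta + 1 = 0:
    k = (1.0 + phi**2 - 2.0 * r1 * phi) / (r1 - phi)
    disc = k * k - 4.0
    if disc < 0:
        raise ValueError("moment equations have no real, invertible solution")
    roots = ((-k + np.sqrt(disc)) / 2.0, (-k - np.sqrt(disc)) / 2.0)
    theta = min(roots, key=abs)          # keep the invertible root, |theta| < 1
    return phi, theta
```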
KEECHELUS_RESERVOIR

                       Historical    Generated
Mean                     242.9312     242.9985
Standard Deviation        55.3134      53.8040
Skewness Coefficient       0.3416       0.4131
Coef. Variation            0.2277       0.2212
Maximum                  375.5001     385.1967
Minimum                  151.7000     138.7450
Correlation Structure
LAG    Historical    Generated
  0      1.0000       1.0000
  1      0.2773       0.2691
  2     -0.0591      -0.0625
  3      0.0644      -0.0349
  4      0.0104      -0.0237
  5      0.0736      -0.0202
  6     -0.1389      -0.0310
  7     -0.1669      -0.0308
  8     -0.0322      -0.0448
  9     -0.1162      -0.0426
 10      0.0034      -0.0277
Storage and Drought Statistics
Demand Level = 1.0 * sample mean
                     Historical    Generated
Longest Drought         7.0          6.0267
Maximum Deficit       344.2187     287.2662
Longest Surplus         6.0          5.2533
Maximum Surplus       244.0125     311.2614
Storage Capacity      576.3561     488.3525
Rescaled Range         10.4198       9.1089
Hurst Coefficient       0.7375       0.6879
Stochastic Analysis, Modeling and Simulation (SAMS 2000) / 21
SAMS was also used to model the transformed and standardized annual flows of site 7 with an ARMA(2,2) model using the approximate LS method. The results of the modeling for this site are shown below:
Model: ARMA
Number_of_sites: 1
Site(s)_ID: 7
Data_Transformations: LOG
Site_7: a-coef= 450.000000
Data_Standardization: YES
Mean_of_the_process: 6.488171
Standard_deviation_of_the_process: 0.081923
Model_order(p,q): 2 2
phi_parameters: (Annual)
phi_1     0.316854
phi_2    -0.122860
theta_parameters: (Annual)
theta_1  -0.002752
theta_2   0.003944
Variance_of_the_residuals: (Annual) 0.918059
21.6.2.2.
Seasonal statistics of site 1, historical vs. generated:

         -------- Mean --------   -- Standard Deviation --   -- Skewness Coefficient --
Season   Historical   Generated   Historical   Generated     Historical   Generated
  1       21.6250      21.4531     13.5856      13.2594        1.0570       1.0899
  2       22.5979      22.5754     13.9981      13.4388        1.6400       1.2611
  3       17.8708      17.8748     10.2554      10.6862        0.8679       1.3163
  4       14.1542      13.9850      8.9925       9.4890        1.0953       1.8644
  5       15.5708      15.4822      8.5916       8.8690        2.2601       2.4466
  6       26.8333      26.5404      8.5001       8.3496        0.2109       0.3551
  7       47.4375      47.5850     14.4123      13.9888        0.1997       0.2544
  8       38.1917      38.5255     19.0200      18.9623        0.2420       0.4822
  9       14.9604      15.4387     11.6909      13.9993        1.1964       2.3082
 10        4.7375       4.7413      2.6210       2.5598        1.3112       1.3478
 11        5.4792       5.4180      4.3821       4.1501        2.8219       2.1814
 12       13.4729      13.3797      8.4761       8.5231        0.8688       1.2928
Season-to-Season Correlations (LAG 1)
Season   Historical   Generated
  1        0.5775       0.6249
  2        0.2969       0.4015
  3        0.2198       0.1513
  4        0.4555       0.4693
  5        0.4143       0.2756
  6        0.3211       0.2770
  7       -0.0872      -0.0946
  8        0.5527       0.5754
  9        0.8343       0.8147
 10        0.8618       0.6320
 11        0.2814       0.4625
 12        0.4562       0.3269

[Figure: historical and generated seasonal means, Station 1 (KE)]
Season-to-Season Correlations (LAG 2), Station 1 (KE)
Season   Historical   Generated
  1        0.3728
  7       -0.1219      -0.1336
  8       -0.3637      -0.3810
  9        0.3692       0.4268
 10        0.7047       0.6075
 11        0.2319       0.2310
 12        0.1770       0.1110

[Figure: season-to-season correlations with 95% confidence limits, Station 1 (KE)]
Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
                     Historical    Generated
Longest Drought        11.0000      10.7400
Maximum Deficit       123.7427     131.8937
Longest Surplus         7.0000       6.5900
Maximum Surplus       163.8901     177.8746
Storage Capacity      640.1103     487.0978
Rescaled Range         39.0407      29.0030
Hurst Coefficient       0.6471       0.5907
Number_of_sites: 3
Site(s)_ID: 3 5 7
Data_Transformations:
Site_3: LOG  a-coef= -205.000000
Site_5: LOG  a-coef= 2000.000000
Site_7: LOG  a-coef= 450.000000
Data_Standardization: YES
Mean_of_the_process:                6.096067  8.147832  6.488171
Standard_deviation_of_the_process:  0.461667  0.103274  0.081923
Model_order(p,q): 1 0
phi_parameters: (Annual)
   0.802852  -0.271925  -0.091863
   0.180350   0.241441   0.127788
   0.243420  -0.083069   0.103272
Variance_of_the_residuals: (Annual)
   0.716938   0.736062   0.704988
   0.736062   0.900586   0.868521
   0.704988   0.868521   0.919150
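The listing above corresponds to a first-order multivariate autoregressive model, Z_t = PHI @ Z_{t-1} + eps_t, where PHI is the 3 x 3 phi matrix and the residuals have the printed covariance matrix G. A sketch of the generation step (not the SAMS routine; a Cholesky factor is one standard way to obtain B with G = B B^T):

```python
import numpy as np

def generate_mar1(phi, g, n_years, rng=None):
    """Generate one trace from Z_t = PHI @ Z_{t-1} + B @ e_t, cov(eps) = G."""
    rng = np.random.default_rng(rng)
    phi = np.asarray(phi, float)
    g = np.asarray(g, float)
    m = phi.shape[0]
    b = np.linalg.cholesky(g)            # G = B @ B.T
    z = np.zeros(m)
    out = np.empty((n_years, m))
    warmup = 50                          # discard so the zero start is forgotten
    for t in range(n_years + warmup):
        z = phi @ z + b @ rng.standard_normal(m)
        if t >= warmup:
            out[t - warmup] = z
    return out
```

The generated comparison tables below (sites 3, 5 and 7) are the kind of output such repeated generation produces after back-transforming to flow units.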
Site Number: 3

                       Historical    Generated
Mean                     699.3479     687.1061
Standard Deviation       246.3507     218.7747
Skewness Coefficient       1.8333       1.1163
Coef. Variation            0.3523       0.3161
Maximum                 1726.4000    1400.4399
Minimum                  346.9000     367.8550
Site Number: 5

                       Historical    Generated
Mean                    1474.3375    1461.5977
Standard Deviation       358.9830     348.3850
Skewness Coefficient       0.2136       0.2240
Coef. Variation            0.2435       0.2386
Maximum                 2300.8103
Minimum                  732.6721

Correlation Structure
LAG    Historical    Generated
  0      1.0000
  1      0.2546       0.0224
  2      0.0445       0.1007
  3     -0.0004       0.0092
  4     -0.0187       0.0426
  5     -0.0137       0.1397
  6     -0.0363       0.1650
  7     -0.0478       0.0598
  8     -0.0251       0.1297
  9     -0.0347       0.0224

Lag-0 Cross Correlations
                          Historical    Generated
Sites 5 and 3 (YA & YA)     0.8040       0.8653
Sites 5 and 5 (YA & YA)     1.0000       1.0000
Sites 5 and 7 (YA & BU)     0.9536       0.9563

Storage and Drought Statistics
Demand Level = 1.0 * sample mean
                     Historical    Generated
Longest Drought         7.0          6.2200
Maximum Deficit      2220.5625    2088.8184
Longest Surplus         6.0          5.6133
Maximum Surplus      1561.5746    2044.6892
Storage Capacity     3803.9871    3397.3022
Rescaled Range         10.5966       9.7093
Hurst Coefficient       0.7428       0.7070
Site Number: 7  BUMPING RESERVOIR

                       Historical    Generated
Mean                     209.5250     207.7169
Standard Deviation        53.9224      52.5678
Skewness Coefficient       0.1097       0.1658
Coef. Variation            0.2574       0.2534
Maximum                  316.4000     332.2505
Minimum                  112.1000      95.6784

Correlation Structure
LAG    Historical    Generated
  0      1.0000       1.0000
  1      0.2548       0.2156
  2      0.0238       0.0339
  3      0.0770      -0.0114
  4      0.0034      -0.0204
  5      0.0430      -0.0103
  6      0.1625      -0.0307
  7      0.1544      -0.0409
  8      0.1121      -0.0252
  9     -0.2085      -0.0357
 10     -0.0532      -0.0369

Lag-0 Cross Correlations
                          Historical    Generated
Sites 7 and 3 (BU & YA)     0.7269       0.8142
Sites 7 and 5 (BU & YA)     0.9536       0.9563
Sites 7 and 7 (BU & BU)     1.0000       1.0000

Storage and Drought Statistics
Demand Level = 1.0 * sample mean
                     Historical    Generated
Longest Drought         4.0          6.0933
Maximum Deficit       255.5000     303.3795
Longest Surplus         6.0          5.5733
Maximum Surplus       268.4500     299.3968
Storage Capacity      498.2249     495.2468
Rescaled Range          9.2397       9.3981
Hurst Coefficient       0.6996       0.6966
Model_order(p,q): 1 1
Variance_of_the_residuals: (Annual) 0.036331
0.000000  0.000001
group_#: 1
Key_stations_ID: 5
Data_Transformations:
Station_5: LOG  a-coef= 2000.000000
Basic_statistics_of_the_key_stations:
Mean_of_the_process: 8.147832
Standard_deviation_of_the_process: 0.103274
Number_of_sub_stations: 2
Sub_stations_ID: 3 4
Data_Transformations:
Station_3: LOG  a-coef= -205.000000
Station_4: LOG  a-coef= 1000.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process: 6.096067  7.417926
Standard_deviation_of_the_process: 0.461667  0.094175
0.000000  0.011408
group_#: 2
Number_of_key_stations: 1
Key_stations_ID: 11
Data_Transformations:
Station_11: LOG  a-coef= 2406.000000
Basic_statistics_of_the_key_stations:
Mean_of_the_process: 8.189722
Standard_deviation_of_the_process: 0.097387
Number_of_sub_stations: 2
Sub_stations_ID: 8 10
Data_Transformations:
Station_8: LOG  a-coef= 2500.0
Station_10: LOG  a-coef= 100.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process: 8.090804  6.195611
Standard_deviation_of_the_process: 0.072572  0.205165
A_matrix
0.738420
1.995138
B_matrix
0.010106  0.000000
0.052522  0.040097
group_#: 1
Number_of_sub_stations: 2
Sub_stations_ID: 3 4
Data_Transformations:
Station_3: LOG  a-coef= 49.000000... (see below)
Station_3: LOG  a-coef= -205.000000
Station_4: LOG  a-coef= 1000.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process: 6.096067  7.417926
Standard_deviation_of_the_process: 0.461667  0.094175
Number_of_subsequent_stations: 2
Subsequent_stations_ID: 1 2
Data_Transformations:
Station_1: LOG  a-coef= 49.000000
Station_2: LOG  a-coef= 210.000000
Basic_statistics_of_the_subsequent_stations:
Mean_of_the_process: 5.658607  6.036669
Standard_deviation_of_the_process: 0.189585  0.124544
A_matrix
0.027025  0.005409
1.867341  1.288824
B_matrix
0.000000  0.013417
Number_of_sub_stations: 2
Sub_stations_ID: 8 10
Data_Transformations:
Station_8: LOG  a-coef= 2500.000000
Station_10: LOG  a-coef= 100.000000
Basic_statistics_of_the_sub_stations:
Mean_of_the_process: 8.090804  6.195611
Standard_deviation_of_the_process: 0.072572  0.205165
Number_of_subsequent_stations: 2
Subsequent_stations_ID: 7 9
Data_Transformations:
Station_7: LOG  a-coef= 450.000000
Station_9: LOG  a-coef= 40.000000
Basic_statistics_of_the_subsequent_stations:
Mean_of_the_process: 6.488171  5.980681
Standard_deviation_of_the_process: 0.081923  0.220482
A_matrix
0.841615  -0.007637
0.093955   1.071983
B_matrix
0.017719  0.007666
0.000000  0.020592
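Each disaggregation step above has a Valencia-Schaake-type form Y = A X + B e: the key-station (or sub-station) flows X are distributed to the dependent stations through the printed A_matrix, with the B_matrix coloring the noise. A minimal sketch, assuming the model operates on the transformed, standardized variables (the function name and shapes are illustrative, not SAMS's API):

```python
import numpy as np

def disaggregate(a, b, x_key, rng=None):
    """One disaggregation draw: Y = A @ X + B @ eps, eps ~ N(0, I)."""
    rng = np.random.default_rng(rng)
    a = np.atleast_2d(np.asarray(a, float))     # printed A_matrix
    b = np.atleast_2d(np.asarray(b, float))     # printed B_matrix
    x = np.atleast_1d(np.asarray(x_key, float)) # key-station value(s)
    eps = rng.standard_normal(b.shape[1])
    return a @ x + b @ eps
```

With B set to zero the step is purely deterministic, which makes the role of A easy to see: each dependent station receives a fixed linear share of the key-station flow.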
matrix
  0.100187  -0.790256  -1.087746  -0.736562  -0.812205
matrix
  0.272571   0.316740   0.313677
matrix
  0.553959   0.669728   0.739637   0.610305   0.730781
  1.056880   1.411542   1.300074   1.144258   0.337333
  0.342454  -5.591171  -5.794002  -6.127742 -10.566408
 -7.984121   0.000000   0.062122   0.037441   0.087106
  1.344898   0.162744   0.154738   0.319091   0.254001
  0.245251   0.342263   0.321621   0.432287   0.507121
  0.064863   0.175279  -0.220576   0.154692  -0.291538
 -0.239491   0.000000   0.000000   0.051319   0.001906
  0.021514   2.833674   3.963215   0.458153   0.967182
 -1.119348  -1.188065  -1.012993  -1.139128   3.974580
  9.683748   5.462620   8.148174   9.577835  10.251024
  0.000000   0.000000   0.000000   0.000000   0.000000
  0.000000   0.000000   0.104401   0.041748   7.173420
  8.640571   0.024983
group #: 1
Season : 5
A matrix
  7.485024  -6.270320  -8.476122  -8.248028  -9.367930
matrix
  0.812630   0.701064   0.697859   0.728257   0.736211
matrix
  4.226117   3.171489   2.938035   2.794332   3.306370
group #: 2
Season : 1
A matrix
  1.716307  -0.044698  -0.326232  -0.327360  -0.147128
B matrix
  0.356707   0.429611   0.285874   0.248105   0.420281
 22.856783  20.105253  23.010061  16.965302  24.837963
  0.000000   0.166081  -0.003222   0.054581   0.000507
 -2.866502  -2.221176  -2.149964  -1.867283  -2.100979
 16.572737  23.627403  20.584972  15.794100  22.826544
  0.000000   0.108808   0.072522   0.057508   0.099421
  1.439505   1.168698   1.391239   0.467788   1.053099
  0.000000   0.000000   0.098642   0.186866   0.011022
  0.141137   0.463517   0.547366   0.927850   1.024580
  0.725862  -9.404758 -10.550832 -11.500857  -8.688657
 -8.968649  -2.391560  -2.099235  -0.642270  -1.273152
 -1.783531   3.279173   1.744206   1.468768   1.905422
  1.898677  -3.392607  -2.921220  -1.793802  -2.042270
 -3.548427   0.000000   0.000000   0.067637   0.059324
  0.026707 -12.453279 -11.709853 -15.063147  -8.150460
-15.114661   0.191075   0.000000   0.000000   0.000000
  0.000000   0.050552   0.000000   0.000000   0.000000
  0.035298   0.014496  -6.186196  -5.100427  -1.557045
  0.988259  -1.299358   0.000000   0.000000   0.000000
  0.000000   0.103991   0.000000   0.000000   0.000000
 -0.697620  -0.114892   0.423544   1.038808   1.114273
matrix
  0.298232   0.272947   0.103876   0.106382   0.254331
  0.338970  -0.406830  -0.334373  -0.131294  -0.066460
 -0.362385   0.236938   0.629304  -0.576146  -0.130068
 -0.164665  -0.476519   0.654334   0.674303   0.182709
  0.211029   0.584160   0.592925   0.381739   0.244746
  0.659288
group #: 2
Season : 5
A matrix
  3.034014  10.645264  -3.917177  -5.067391  -8.766766
matrix
  2.416789   1.659479   1.618642   1.504343   1.396754
 -2.220287  -3.768532  -0.828180  -2.485281  -3.613741
  1.226575   6.803477   6.347968   4.987008   6.793612
  0.000000   0.000000   0.000000   0.000000   0.000000
  0.000000   0.000000   0.000000   0.000000   0.000000
  0.285451   0.160045   0.209446   0.127493   0.115327
 -0.016704   0.075774  -0.030911   0.595607   0.049310
  1.505506   1.830661   1.827437   1.679651   0.869733
 -3.527168  -2.907587  -2.917736  -2.833496  -2.370874
  2.026513   1.818996   2.342135   2.304887   1.566433
  2.032238  -2.139591  -2.449902  -2.352898  -1.248207
matrix
  0.490879   0.408971   0.435884   0.417643
  6.733072  11.803424  -0.284301   3.801157   9.480949
  1.720777   2.314499   0.804703   2.491937   2.021952
Station 2 (KA)

         -------- Mean --------    -- Standard Deviation --
Season   Historical   Generated    Historical   Generated
  1       16.8646      16.9702      12.1013      12.1722
  2       19.7521      20.0277      12.8655      12.9882
  3       16.1458      16.1352       9.2932       9.5560
  4       13.3875      13.4198       7.9937       8.6000
  5       15.2688      15.3021       8.5009       8.7934
  6       26.2375      26.3170       8.1718       7.9901
  7       44.6521      44.3891      14.4182      13.9078
  8       33.4583      33.1077      16.2124      16.0398
  9       11.4625      11.5481       9.5621      11.1432
 10        2.5000       2.5730       2.2897       2.8500
 11        3.0542       3.0691       3.0264       2.7782
 12        8.9646       9.0435       6.7700       7.4574

[Figure: historical and generated seasonal means, Station 2 (KA)]
         Skewness Coefficient
Season   Historical   Generated
  1        0.7127
  2        1.2846
  3        0.9671
  4        1.7533
  5        2.2952       2.4070
  6        0.1600       0.2967
  7        0.3599       0.3659
  8        0.2885       0.5599
  9        1.1013       2.1439
 10        1.2974       2.2914
 11        3.1266       1.9573
 12        1.1720       1.8414

[Figure: historical and generated seasonal skewness coefficients, Station 2 (KA)]
Season-to-Season Correlations (LAG 1)
Season   Historical   Generated
  1        0.6589       0.4416
  2        0.4100       0.5262
  3        0.3546       0.2913
  4        0.4388       0.3904
  5        0.4377       0.3056
  6        0.2811       0.1841
  7        0.0489       0.0670
  8        0.5925       0.6309
  9        0.8565       0.8436
 10        0.8978       0.6757
 11        0.3768       0.4301
 12        0.6031       0.4574
[Figure: season-to-season correlations with 95% confidence limits, Station 2 (KA)]
Storage and Drought Statistics
Demand Level = 1.0 * sample mean
                     Historical    Generated
Longest Drought        10.0000      10.7700
Maximum Deficit       112.4566     124.0420
Longest Surplus         8.0000       7.6300
Maximum Surplus       163.0347     181.7904
Storage Capacity      564.9718     593.0753
Rescaled Range         36.5715      37.6897
Hurst Coefficient       0.6356       0.6359
Station 11 (NA)

              Mean
Season   Historical   Generated
  1       56.0854      56.1426
  2       76.6458      77.0317
  3       64.6104      64.7072
  4       62.5875      63.0430
  5       77.3479      77.0965
  6      151.5771     152.9251
  7      280.5313     280.5401
  8      236.7167     236.6873
  9      103.7792     103.1596
 10       40.6542      40.6756
 11       28.0208      28.1190
 12       36.2292      35.8892

[Figure: historical and generated seasonal means, Station 11 (NA)]
         Standard Deviation
Season   Historical   Generated
  1       40.6182      44.2509
  2       70.4072      64.4993
  3       42.6869      41.9235
  4       39.8475      41.8386
  5       48.1696      43.1282
  6       59.2644      58.0738
  7       96.9022      94.0675
  8      103.3385     100.5874
  9       57.9129      57.7420
 10       17.0232      16.5544
 11       10.6177      10.5495
 12       20.6300      19.6031
         Skewness Coefficient
Season   Historical   Generated
  1        0.9662       1.8455
  2        3.4510       2.1245
  3        2.3219       1.8524
  4        1.3324       1.8464
  5        2.7078       1.7876
  6        0.4732       0.5181
  7        0.5250       0.6177
  8        0.2663       0.4145
  9        0.8792       1.2173
 10        0.0272       0.1560
 11       -0.4915       0.1691
 12        1.3003       1.3090
Season-to-Season Correlations (LAG 1)
Season   Historical   Generated
  1        0.6809       0.4309
  2        0.5099       0.6573
  3        0.7709       0.5482
  4        0.5542       0.5406
  5        0.4644       0.5438
  6        0.3119       0.4710
  7        0.3550       0.3264
  8        0.5396       0.5680
  9        0.8317       0.8473
 10        0.8759       0.8521
 11        0.7917       0.7977
 12        0.5972       0.6452

Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
                     Historical    Generated
Longest Drought        10.0000      11.2400
Maximum Deficit       693.8212     767.2662
Longest Surplus         7.0000       7.7000
Maximum Surplus      1063.9717    1208.4659
Storage Capacity     3839.3513    3697.9873
Rescaled Range         39.6617      38.2223
Hurst Coefficient       0.6499       0.6381
Season-to-Season Lag-0 Cross Correlations

Sites 1 and 2 (KE & KA)
Season   Historical   Generated
  1        0.9853       0.9844
  2        0.9828       0.9780
  3        0.9793       0.9725
  4        0.9847       0.9738
  5        0.9924       0.9650
  6        0.9632       0.9615
  7        0.9788       0.9761
  8        0.9906       0.9891
  9        0.9888       0.9578
 10        0.8572       0.6456
 11        0.9504       0.8028
 12        0.9888       0.9815

Sites 3 and 4
Season   Historical
  1        0.9068
  2        0.8623
  3        0.6949
  4        0.8251
  5        0.9108
  6        0.7394
  7        0.7722
  8        0.8394
  9        0.7933
 10        0.4031
 11        0.2735
 12        0.8937
[Figure: season-to-season lag-0 cross correlations, historical vs. generated, Stations 1 and 2 (KE & KA), with 95% confidence limits]
Sites 3 and 4, Generated
Season
  1        0.8905
  2        0.8189
  3        0.9137
  4        0.9296
  5        0.9286
  6        0.9512
  7        0.9699
  8        0.9462
 10        0.6776
 11        0.3608
 12        0.9007
Sites 8 and 10 (NA & TI)
Season   Historical   Generated
  1        0.9755       0.9557
  2        0.9867       0.9086
  3        0.9796       0.9642
  4        0.9847       0.9656
  5        0.9827       0.9274
  6        0.9781       0.9773
  7        0.9833       0.9752
  8        0.9897       0.9888
  9        0.9770       0.9533
 10        0.7619       0.7168
 11        0.5003       0.4811
 12        0.9741       0.9092
[Figure: season-to-season lag-0 cross correlations, historical vs. generated, Stations 8 and 10 (NA & TI), with 95% confidence limits]
Taking the basic autoregressive lag-one annual model, AR(1), it is easy to show how the generated flows are altered under a forecasting environment. For argument's sake, assume the mean of the process is 20, the standard deviation is 5, the lag-one serial correlation is 0.5, and the data are normally distributed so that no transformation is needed. Assume that many traces are to be generated into the near future, say for three years. We will examine how this differs from a standard generation, first for a normal preceding year, second for an abnormally high preceding year, and third for an abnormally low preceding year.

First, assume that the just-experienced previous year's flow was exactly the mean. This is a zero departure from the mean, and the lagged term has a corresponding value of zero. The generated values for the first year will be distributed with a mean of 20 (unchanged because the lagged term had no effect). The standard deviation of the generated flows is reduced to 4.33. Note that a standard non-forecasting generation would have produced a mean of 20 and a standard deviation of 5. The second year will have a mean of 20, and similarly for the third year. The standard deviation will be 4.84 for the second year and 4.96 for the third. Note that the standard deviation quickly converges to the non-forecasting value, as is expected with a short-memory process.
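These conditional moments follow directly from the AR(1) recursion: k years ahead of an observed flow x0, the mean is mu + rho**k * (x0 - mu) and the standard deviation is sigma * sqrt(1 - rho**(2*k)). A small helper reproduces the numbers quoted in the text:

```python
import numpy as np

def ar1_forecast_moments(mu, sigma, rho, x0, n_ahead):
    """Conditional mean and standard deviation of an AR(1) process
    k = 1..n_ahead years ahead, given the last observed flow x0."""
    means = [mu + rho**k * (x0 - mu) for k in range(1, n_ahead + 1)]
    sds = [sigma * np.sqrt(1.0 - rho ** (2 * k)) for k in range(1, n_ahead + 1)]
    return means, sds

# With mu=20, sigma=5, rho=0.5 and a normal preceding year (x0=20), the
# means stay at 20 while the standard deviations are 4.33, 4.84, 4.96.
```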
Now, if the preceding year had been abnormally high, say two standard deviations high (a flow of 30), the means will be 25, 22.5, and 21.25 for the three years. For an abnormally low preceding year (a flow of 10), the means will be 15, 17.5, and 18.75.

STOCHASTIC FORECASTING EXAMPLE: BEHAVIOR OF THE MEAN

Generated   Non-Forecasting Model   High Preceding   Normal Preceding   Low Preceding
Year        (for comparison)        Year             Year               Year
First               20                  25                20                15
Second              20                  22.5              20                17.5
Third               20                  21.25             20                18.75
STOCHASTIC FORECASTING EXAMPLE: BEHAVIOR OF THE STANDARD DEVIATION

Generated   Non-Forecasting Model   High Preceding   Normal Preceding   Low Preceding
Year        (for comparison)        Year             Year               Year
First                5                  4.33              4.33              4.33
Second               5                  4.84              4.84              4.84
Third                5                  4.96              4.96              4.96
REFERENCES

Bratley, P., Fox, B. L., and Schrage, L. E., 1987, A Guide to Simulation, 2nd Edition, Springer-Verlag, New York.
Fernandez, B., and J. D. Salas, 1990, Gamma-Autoregressive Models for Stream-Flow Simulation, ASCE Journal of Hydraulic Engineering, vol. 116, no. 11, pp. 1403-1414.
Frevert, D. K., M. S. Cowan, and W. L. Lane, 1989, Use of Stochastic Hydrology in Reservoir Operation, J. Irrig. Drain. Eng., vol. 115, no. 3, pp. 334-343.
Gill, P. E., W. Murray, and M. H. Wright, 1981, Practical Optimization, Academic Press, New York.
Grygier, J. C., and J. R. Stedinger, 1990, SPIGOT, A Synthetic Streamflow Generation Software Package, Technical Description, Version 2.5, School of Civil and Environmental Engineering, Cornell University, Ithaca, N.Y.
Himmelblau, D. M., 1972, Applied Nonlinear Programming, McGraw-Hill, New York.