Cap20 Hidro

Section 5: Environmental
Models
Chapter 20
STOCHASTIC EVENT FLOOD MODEL

(SEFM)
M.G. Schaefer, Ph.D. P.E and B.L. Barker, P.E.
MGS Engineering Consultants, Inc., 7326 Boston Harbor Road NE, Olympia, WA 98506
ABSTRACT
The Stochastic Event Flood Model (SEFM) was developed for analysis of
extreme floods resulting from 72-hour general storms and to provide
magnitude-frequency estimates for flood peak discharge, runoff volume and
mximum reservoir level for use in hydrologic rislc assessments at dams. It
can also be used to assess the variability of floods produced by design storms
such as Probable Mximum Precipitation. The model was developed
specifcally for application in mountainous areas of the western United States
where snowmelt runoff is commonly a contributor to flooding. This chapter
provides a description of the basic concepts employed in developing the
Computer model and identifies the various hydrometeorological components
that are modeled in the Computer simulations. Results from some recent
applications of the model are also presented. The model is in the early stages
of implementation and changes are being made as more is learned about the
probabilistic characteristics of the hydrometeorological processes. It is
anticipated that the model will continu to evolve as improvements are made
to the model.
20.1.
OVERVIEW
The basic concept of the stochastic event-based rainfall-runoff model41 is to

employ a deterministic flood computation model and to treat the
input parameters as variables instead of fxed vales. Monte Cario sampling
procedures are used to allow the hydrometeorological input parameters to vary
in accordance with that observed in nature while preserving the natural
dependencies that exist between some climatic and hydrologic parameters.
Multi-thousand Computer simulations are conducted where each
71)7
20 I M.G. Schaefer, B.L. Barker
simulation contains a set of input parameters that were selected based on the
historical record and collectively preserves the dependencies between
parameters. The simulated floods constitute elements of an annual maxima
flood series that can be analyzed by standard flood- frequeney methods. The
resultant flood magnitude-frequeney estimates reflect the likelihood of
occurrence of the various combinations of hydrometeorological factors that
affect flood magnitude. The use of the stochastic approach allows the
development of separate magnitude- frequeney curves for flood peak discharge
(Fig. 20.1a), flood runoff volume (Fig. 20.1b), and mximum reservoir level
(Fig. 20.1c). Frequency information about mximum reservoir levels is
particularly important for use in hydrologic risk assessments because it
accounts for all the pertinent hydrologic factors - flood peak discharge, runoff
volume, hydrograph shape, initial reservoir level, and reservoir operations. All
of the flood characteristics above, and the hydrologic risk can be evaluated on
a monthly, seasonal, or annual basis.
Fig. 20.1a. Magnitude-Frequency Curve for Peak Discharge.
71N
Stochastic Event Flood Model (SEFM) / 20
Fig. 20.1b. Magnitude-Frequency Curve for Runoff Volume.
Fig. 20.1c. Magnitude-Frequency Curve for Mximum Reservoir Level.
20.2.
CAPABILITIES OF THE STOCHASTIC MODEL
The stochastic event-based flood model41 has the capability to simlate a wide
range of hydrometeorological and watershed conditions. Computer simulations
are conducted for 72-hour duration general storms based on end-of-month
hydrometeorological conditions. Runoff is computcd on a distributed basis for
polygons of land called Hydrologic Runo IT Uuils (lIRUs) Ihal have eommon
mean annual
7(1'
20 / M. G. SchaeJ'er, B.L. Barker
precipitation, elevation, and soil characteristics. Hydrometeorological

parameters that vary seasonally with mean annual precipitation, elevation or
soil type, such as antecedent precipitation, antecedent snowpack, and soil
moisture, are also allowed to vary spatially within the watershed via
accounting through the HRUs.
The SEFM model can be run in a completely stochastic mode where all
hydrometeorological parameters are allowed to vary. It can be run in a
completely deterministic mode with all parameters fixed, or it can be run in a
mixed mode with some parameters treated as variables and other parameters
fixed.
In most cases, the flood response of a given watershed and reservoir is
sensitive to only a few of the hydrometeorological parameters. Recognizing
this situation, the data entry interface allows the user to specify how each of
the hydrometeorological parameters is to be treated
- variable or fixed valu. This approach allows the user to provide a greater
level of detail in the simulation of those hydrologic processes that have the
greatest influence on the watershed/project under study.
Simulations are conducted based on end-of-month conditions for the
various hydrometeorological input parameters. A monthly time increment was
chosen for several reasons. It provides reasonable efficiency in data analysis
because many hydrometeorological variables are reported on end-of-month
intervals. Use of a monthly time increment results in dates of storm/flood
occurrence that are on-average 7-days, and at-most 15-days, different from
those obtained if the storm/flood date could occur on any day of the year.
Thus, twelve time increments are deemed sufficient for sub-division of the
climatic year (October 1 to September 30) to depict the natural seasonal
variability in hydrometeorological inputs such as soil moisture, snowpack,
reservoir level, etc.
20.2.1.
Simulation Capabilities of the SEFM model
The simulation capabilities of the model include:

Variable - Month of occurrence of extreme storm
Variable - Precipitation in climatic year prior to extreme storm
Variable - Snowpack snow-water equivalent
Variable - Temperatures for mclling of snowpack
Variable - Soil moisture conditions at onset of the storm
Variable - Surface infltration rate as function of soil moisture
condition Ability to - Simlate surface runoff and interflow
710
Stochastic Event Flood Modcl (SEFM) / 20
runoff Ability to - Compute runoff on a distributed basis using

conditions within each HRU Ability to - Check for frozen
ground conditions and set surface infltration rates accordingly
Ability to - Generate 72-hour general storms with
characteristics observed in historical storms Variable
- Temporal distribution of 72-hour general storms Variable Storm size and spatial distribution of 72-hour general storms
Variable - Storm centering within the watershed Variable Streamflow prior to extreme storm Variable - Reservo ir level
prior to extreme storm Variable - Time lag and peak discharge of
surface runoff unit hydrographs
20.3.
APPLICABILITY OF THE STOCHASTIC MODEL
The stochastic event flood model is currently configured for simulation of 72hour general storms. There is no computational limit to the size of the
watershed to which it can be applied. However, implicit in the development of
the model is the condition that some hydrometeorological parameters are
highly correlated spatially. For example, soil moisture accounting is conducted
to determine soil moisture conditions at the onset of the extreme storm. In
conducting soil moisture accounting, multi-month periods of precipitaron and
snowpack are taken to be highly correlated throughout the watershed.
Specifically, the exceedance probability of multi-month precipitation at a
given location is assumed to not vary significantly from that at other locations.
Thus, the exceedance probability can be adequately representcd as one areally
averaged valu. In a similar manner, the exceedance probability of multimonth snowpack can also be adequately represented as one areally averaged
valu.
As the watershed size increases, the requirement for high spatial
correlalion of inulti-month precipitation and snowpack becomes more (linicul!
lo salisfy. l itis considentlion suggesls that the stochastic model is applicable to
watersheds up to a nominal size of about 500 mi2. For larger watersheds, the
spatial variability of some hydrometeorological parameters may warrant that
site-specific modules be developed to address the site-specific spatial
characteristics of the watershed under study.
20.4.
DISTRIBUTED RAINFALL-RUNOFF MODELING
A key element in the stochastic approach is the selection of realistic initial

711
20 / M.G. Schaefer, B.L. Barker
conditions in the watershed at the onset of the extreme storm. This requires
that a distributed approach be used in modeling the rainfall-runoff process so
that the spatial variability of soil moisture, soil moisture storage
characteristics, soil infiltration rate, snowpack, and frozen ground conditions
can be properly accounted for in computing runoff.
20.4.1.
Hydrologic Runoff Units
To accommodate the distributed approach, the watershed is divided into

numerous sub-areas. These sub-areas are comprised of irregularly shaped land
areas having common mean annual precipitation, elevation, and soil
infiltration characteristics and are termed Hydrologic Runoff Units (HRUs).
Runoff is computed separately for each HRU and then combined to obtain the
response of each sub-basin.
20.4.2.
Mean Annual Precipitation Zones
Mean Annual Precipitation (MAP) often vares widely across mountainous

watersheds in the arid, semi-arid, and sub-humid western US6,23. This spatial
variability requires that a watershed be sub-divided into zones of similar mean
annual precipitation to facilitate the allocation of antecedent precipitation,
allocation of winter snowpacks, and computation of soil moisture budgets.
Sufficient zones should be employed to adequately describe the variability of
monthly antecedent precipitation, snowpack, and soil moisture that occurs due
to differences in the magnitude of annual precipitation. Figure 20.2 depicts an
example delineation of mean annual precipitation zones for the 55 mi 2
Keeehelus watershed29 in Washington State.
712
Stoehastic Event Flood Model (SEFM) / 20
Fig. 20.2. Mean Annual Precipitation Zones for Keechelus Watershed,

Washington.
Mean Annual Precipitation (n)
Less Than 80 80 to 90 ; j 90 to 1CO
n ioo to no
g Greater Than 110
20.4.3. Elevation Zones

Elevation information is needed
to account for
temperatura changes that occur
with elevation
during extreme storms. This
temperature
Scale:
information
is required for
1 Mile
snowmelt
computations
and for checking for frozen
ground
conditions. Selection of upper
and
lower
Keechelus
bake
bounds
Keechelus Watershed Mean
for the Annual Precipitation
elevation zones should be
based on a
hypsometric curve for the
watershed
to
ensure proper apportioning of areas. Figure 20.3 depicts an example
delineation of elevation zones for the Keechelus watershed.
20 / M.G. Schaefer, t.L. Barker

Elevation (Feet):
LessT han
3000 3000 to 4000
|g 4000 to 5000 l G reater T
ha n 5000
Scale: 1:200,000 1 t/liie
Keechelus Watershed
Elevation Zones
Fig. 20.3. Elevation Zones for Keechelus Watershed, Washington.
20.4.4o Soils Zones

Soil zones are used to delineate contiguous areas with similar soil
characteristics. Each soil zone represents a unique combination of mximum
and mnimum surface infiltration rate, deep percolation rate, and soil moisture
storage capacity. These vales are subsequently refined through calibration of
the hydrologic model using observed climate and streamflow data. Figure 20.4
depicts an example delineation of soil zones for the Keechelus watershed.
Fig. 20.4. Soil Zones for Keechelus Watershed, Washington.
20.5.
HYDROMETEOROLOGICAL COMPONENTS
A number of hydrometeorological inputs are required in employing the

stochastic approach. These same inputs are needed in a standard deterministic
analysis. However, in the stochastic approach, the inputs are treated as
variables and allowed to vary in a manner consistent with the historical record.
Further, dependencies between hydrometeorological inputs are preserved.
The various hydrometeorological inputs are listed in Table 20.1 and are
briefly described in the following sections. Table 20.1 also identifies the
dependencies that are preserved between the hydrometeorological components.
Table 20.1. Independent and Dependent Hydrometeorological

Components.
20 IM.G. Schaefer, B.L. Barker

No
Hydrometeorological
Component
1
2
Month of Storm Occurrence

Antecedent Precipitation
Antecedent Temperatura
Antecedent Snowpack
October 1st Soil Moisture
Initial Streamflow
7
8
Dependency
Independent
Dependent upon: 1
Dependent upon: 1
Dependent upon: 1 and
2
Independent
Vares By Zone
Mean Annual
Precipitation
Elevation
Mean Annual
Precipitation
Mean Annual
Precipitation, Soils

2
Initial Reservoir Level
2
Independent
Precipitation Magnitude- Frequency
Precipitation Temporal
Characteristics
Storm Centering
Independent
Temperature During Storm

8
10
11 Precipitation Spatial Characteristics
12
20.5.1.
Independent
Independent
Elevation
Probabilistic Inputs for Initial Watershed Conditions
Date of Occurrence of Extreme Storms - is the end-of-month of occurrence of

the extreme storm. It is based on the seasonality of extreme storms as depicted
by the monthly distribution of the historical occurrences of extreme72-hour
general storms. Figure 20.5 depicts an example of the seasonality of 72-hour
extreme storms on the west face of the Sierra Mountains in central
California31. Numeric storm dates are based on a system where September 1,
December 1, and February 1 are 9.0, 12.0, and 14.0 respectively.
Stochastie Event Flood Model (SEFM) / 20
Fig. 20.5. Seasonality of 72-Hour Extreme Storms for West Face of Sierra
Mountains iu Central, California.
Antecedent Precipitation - is the total precipitation from the start of the

climatic year (October lst) until the given end-of-month for locations within a
specified zone of mean annual precipitation. It is used in computing soil
moisture budgets and as an explanatory variable in correlation relationships
with other hydrometeorological parameters. Figure 20.6 depicts antecedent
precipitation data27 fitted by the threeBarnes Oregon
NON-EXCEEDANCE PROBABILITY
Flg. 20.6.
Fxainple of Antecedent Precipitation for October lst to the
Eiul-of-Fcbruary, llames Oregon.
717
20 / M. G. Schaefer, B.L. Barker
parameter Gamma distribution for a zone of mean annual precipitation within

the Crooked River watershed in Central Oregon.
Soil Moisture at Start of Climatic Year - is the soil moisture for the start of the
climatic year (October lst) for a specified HRU. It is used for computing soil
moisture budgets.
Snowpack - is the snow-water equivalent for a specified zone of mean annual
precipitation for a given end-of-month. It is correlated with antecedent
precipitation and the frequency of snow-free versus snow- on-ground
conditions is preserved. Figure 20.7 depicts the spatial distribution of
snowpack snow-water equivalent for typical end-of- February conditions27 in
the Crooked River watershed in Central Oregon.
Antecedent Temperature - is the average temperature in the two-week period
prior to the selected date of occurrence of the extreme storm. It is used for
checking for frozen ground conditions in each HRU.
Initial Streamflow - is the streamflow at the onset of the storm for the specified
end-of-month. It may be treated as an independent variable or correlated with
antecedent precipitation.
Reservoir Level - is the initial reservoir level at the onset of the storm for the
specified end-of-month. It may be treated as an independent variable or
correlated with antecedent precipitation. Figure 20.8 depicts the variability of
end-of-month reservoir levels for Keechelus Reservoir29 for the period from
1924-1998.
Snowpack Distribution
Fig. 20.7. Example Distribution of End-of-February Snowpack for Crooked River

Watershed, Central Oregon.
Fig. 20.8. Simple Box-Plot for the Variability of End-of-lYIonth Reservolr

Fevels, Kcechclus Reservoir, Washington.
7IV
20.5.2.
Probabilistic Inputs Related to the Occurrence of the

Extreme Storm
Stochastic simulation of the temporal and spatial distribution of extreme

storms is the most complex component of the Stochastic Event Flood Model.
The following general descriptions of the stochastic storm elements provide an
overview of the stochastic storm generation process.
Precipitation Magnitude-Frequency - General storms of 72-hour duration are
assembled utilizing the 24-hour basin-average 10 mi2 precipitation. The
precipitation magnitude-frequeney curve for the 24- hour duration is based on
regional analyses1224,303334 of 24-hour precipitation annual maxima at gages
within the watershed and in climatologically similar areas. Figure 20.9 depiets
a precipitation magnitude-frequeney curve developed from regional analyses3 3
conducted for Washington State for application at the Keechelus watershed.
Fig. 20.9. Basin-Average 24-Hour 10 mi2 Precipitation Magnitude- Frequency

Curve, for Keechelus Watershed
Precipitation Temporal Characteristics - Probabilistic information about the

temporal characteristics of historical 72-hour general storms is used for
assembly of hyetographs. This includes probabilistic information25,32 on:
depth-duration rclationships; clapsed time to Ihe high intensity segment of the
storm; sec|ucncing ol'hourly precipitation incremcnts; and macro storm
patterns. lixtrcmo storms seloetod lor
720
analysis are obtained from gages within the watershed and in climatologically
similar areas. Figures 20.10a,b depict two temporal distributions generated by
SEFM for the Keechelus watershed. The temporal characteristics were
developed from analyses2932 of 25 storms in the Cascade Mountains of
Washington.
Precipitation Spatial Characteristics - Probabilistic analyses are conducted of
depth-area-duration data developed from historical storms. This information is
applied in a probabilistic manner to allow for variable storm areal coverage and
to describe the spatial distribution of precipitation over the watershed. Figure
20.11 depicts a family of depth0.80
24-Hour = 8.0 inches
c
t-
_ 0.70
=- 0.60 Z
O 0.50
< 0.40
72 Hour = 10.7 inches
9:
0.30
nj i
0.20
O
Q.
Rl
0.0
l 1
0
II
InmFlin
1
ILD
0.10
12
18
24
30
36
42
48
54
60
66
72
TIME (Hours)
0.80
0.70 c
^ 0.60 z
12
TIME (Hours)
18
24
30
36
42
48
54
60
66
72
Area-duration curves for the 24-hour duration for Keechelus watershed29

developed from analysis of storms in the Cascade Mountains of Washington.
Storm Centering - Storms are allowed to center over different parts of the
watershed and one storm center is allowed per sub-basin. The user specifies
the spatial allocation of precipitation to the other sub-basins surrounding the
sub-basin with the storm center.
Fig. 20.11. Probabilistic Depth-Area-Duration Curves for Cascade Mountain

Areas in Washington.
20.5.3.
Inputs Related to Rainfall-Runoff Modeling
Rainfall-runoff computations are accomplished in two stages. First, surface

runoff is computed based on a surface infiltration rate using a decay function10
where the surface infltration rate is dependent on the magnitude of soil
moisture. Next, interflow runoff1 is computed based on a deep percolation rate.
1
< 0.40
: 0.30
U
0.20
0L
0.10
0.00
I ij'. 20.H)n,l>. Ilxiiiiiple I'oniponil Dlslrilmlions lor Keechelus Watershed,

722
Separate rainfall-runoff computations are conducted for each HRU to reflect

the site-specific climatic and soil conditions. The runoff from each HRU is
aggregated to the sub-basin level and surface and interflow unit hydrographs
are used to compute the surface and interflow flood hydrographs. Inputs for
rainfall-runoff modeling are described below. A schematic of this procedure is
shown in Fig. 20.12. Table 20.2 lists an example of the soil characteristics for
WIINIIIII((IOII.
723
Keechelus watershed in Washington. The geographic layout of these soils is

shown in Fig. 20.4.
Mximum Surface Infiltration Rate - is the mximum rate at which the soil can
accept water at the soil surface for a specified soils zone. This occurs when the
soil is at the wilting point having been desiccated by evapotranspiration.
Rain + Snowmelt
Soil Moisture
Storage (Rool Zone)
1U
Surface Runoff
Gravitational or
Intermed ate
Vadose Zone
(does not contribute to flood)
Fig. 20.12. Schematic of Soil Moisture and Runoff Processes.
Mnimum Surface Infiltration Rate - is the limiting rate at which the soil can
accept water at the soil surface for a specified soils zone. This occurs when the
soil is fully wetted and soil moisture is at field capacity.
Deep Percolation Rate - is the limiting rate that a soil layer, hardpan within
the soil column, or underlying bedrock can accept water that has infiltrated the
surface of the soil for a specified soils zone. Water that passes through this
limiting soil layer, hardpan, or bedrock contributes to groundwater and does
not return to the stream during the time interval for modeling of the extreme
flood.
Soil Moisture Storage Capacity - is the moisture holding capacity of the soil
column to the depth that can be affected by evapotranspiration.
Evapotranspiration - is the average monthly potential evapotranspiration
amount for a specified zone of mean annual prccipitation.
lis
20 /M.G. Schaefer, B.L. Barker
Temperatures during Extreme Storms used for Snowmelt Computations

- is the temporal sequence of temperatures during the occurrence of the 72hour general storm for a specified elevation zone that is used for computing
snowmelt runoff
Table 20.2. Soil Characteristics, Keechelus Watershed.

Soil
Zone
Soil
Moisture Surface Infiltration Rates (in/hr)
Mximum
Mnimum
Storage Capacity
Deep Percolation
Rate (in/hr)
(in)
l
4.00
2.00
0.60
0.06
6.00
2.00
0.60
0.06
6.00
2.00
0.60
0.08
4.00
0.00
10.00 closed subbasin

0.00
0.06
5
(reservoir)
10.00 closed subbasin

0.0
0.00
Surface Runoff Unit Hydrographs - are used to convert the computed surface
runoff volume from each sub-basin into a flood hydrograph. Surface runoff
unit hydrographs can have variable lag time and peak discharge to account for
the variability observed in nature.
Interflow Runoff Unit Hydrographs - are used to convert the computed
interflow runoff volume from each sub-basin into a flood hydrograph.
Interflow runoff unit hydrographs have fixed lag time and peak discharge
based on calibration to observed floods.
Reservoir Routing and Dam Operations - reservoir operations are simulated
consistent with standard operating procedures for the project under study. The
Computer program is currently set up for the HEC-1 model36, which uses a
fixed reservoir elevation-discharge rating curve. Project specific modules can
be developed to simlate more complicated operational procedures.
20.6.
SIMULATION PROCEDURE
One of the key features of the stochastic modcl is the use of Monte Cario
simulation
methods (Jain15,
Salas el al26) for
selocling 1 he
724
Fig. 20.13. Flow Chart for Stochastic Simulation Procedure.
20.6.1. Comparison with Traditional Flood Frequency

Analysis
A comparison with traditional flood-frequency analysis can be used to obUiin ;i
perspcctive on the approach used with the stochastic model. The primary ibcus iu
Iraclilional lloocl Ircquency
is flood peak
Iu eonihicting m
Rank Alianalysis
Events n Descending
Order(INchiirgi.
of Magnitude and
I
al -si I c Ircquency analysis for
peak
Develop Magnitude-Frequency Curves
Stoehastic Event Flood Model (SEFM) / 20
magnitude and combination of hydrometeorological input parameters for

computation of floods. While the individual elements of the model can be
complex, the basic concepts used in the simulation are straightforward. A
flowchart for the stochastic simulation procedure is depicted in Fig. 20.13 and
the basic concepts of the simulation procedure are described below.
72I
discharge, the basics steps are to: collect an annual maxima series for the
period of record; view the magnitude-frequency characteristics of the data by
constructing a probability-plot using a standard plotting position formula to
estimate annual exceedance probabilities; and fit a probability distribution to
the annual maxima data in attempting to capture the statistical information
contained in the dataset. Flood peak discharge magnitude-frequency estimates
are then made using the distribution parameters for the fitted probability
model.
If an extremely long period of flood record were available (multi- thousand
years of flood peak discharge annual maxima in a stationary environment),
then a plotting position formula and probability-plot would be sufficient for
capturing the frequency characteristics for all but the rarest flood events within
the dataset.
The Computer simulation of multi-thousand years of flood annual maxima
provides a flood record analogous to the latter case described above. With that
in mind, the basic construct for the stochastic Computer simulation procedure
can be described as follows.
An extremely long record of 24-hour, 10 mi2 precipitation annual maxima
is generated using Monte Cario sampling procedures (assuming stationary
climate). A 72-hour general storm is developed for each of the 24-hour
precipitation annual maxima based on the probabilistic characteristics of the
temporal and spatial components of historical extreme storms. A storm date,
end-of-month is selected for occurrence of the storm. Hydrometeorological
parameters are then selected to accompany each storm based on the historical
record in a manner that preserves the seasonal characteristics and dependencies
between parameters. The general storms and all other hydrometeorological
parameters associated with the storm events are then used to generate an
annual maxima series of floods using rainfall-runoff modeling. Characteristics
of the simulated floods such as peak discharge, runoff volume and mximum
reservoir level are ranked in order of magnitude and a non-parametric plotting
position formula and probability-plots are used to describe the magnitudefrequency relationships.
20.6.2. Details of the Simulation Procedure

One approach to conducting simulations would be to Computer genrate a
huge number of storms and floods, say I07 evenls and compute the frequency
curves based on lliose lloods. Ilowcver, Ihe compululional effort to conduct
this large number of flood simulations makes this approach impractical. With
726
current Pentium level (300 mhz) computational and storage power of personal
computers, 25,000 flood simulations can be conducted in about 12 hours using
about 3 gigabytes of storage.
If flood events more common than an Annual Exceedance Probability
(AEP) of about 1:2500 are of interest, then 25,000 simulations of annual
maxima are adequate to develop the magnitude- frequeney curves. In many
applications, there is a desire to estimate magnitude-frequeney curves for flood
events more rare than an AEP of 1:2500. In these cases, an altemative Monte
Cario sampling procedure is needed that allows development of the magnitudefrequeney curves for extremely rare floods while recognizing the practical
limits posed by computational power/storage constraints. This can be
accomplished using a piecewise approach (Barker et al3) that requires much
less computational effort. Magnitude-frequeney curves can be constructed by
computing several simulation sets. Each simulation set is used to define a
different portion of the frequeney curve - for example one to two log-eyeles of
annual exceedance probability (Fig. 20.14).
This approach can be best explained with an example. Consider the case
where flood events with an AEP of 1:1,000 to 1:10,000 are of interest. Since
the largest flood events in an annual maxima series (either historical or
Computer generated) exhibit the greatest variability, a record length about 10
times greater than the target recurrence interval (1/AEP) is appropriate to
reduce uncertainties due to sampling variability for the upper end of this
frequeney range. Thus, a record length of 100,000 annual maxima would be
used to develop probability - plots for making magnitude-frequeney estimates
in the target range of 1:1,000 to 1:10,000 AEP.
In a standard Monte Cario approach, annual maxima storm magnitudes
would first be sampled at random and then the hydrometeorological parameters
would be selected to accompany the storm. This approach would require that
the fiill 100,000 sample set be generated. In the piecewise approach, it is
recognized that the smallest storms in the sample set are not going to generate
the largest floods. Sincc wc are only interested in the upper-most log-cycle(s)
of extreme llood characteristics, we would simlate floods from the collection
of Ilie largor storms from a rcduccd sample set. For this example, we would
develop 1 he magniludc-frequcncy estimates based on the 25,000
727
Fig. 20.14. Example of Piecewise Assembly of Magnitude-Frequency Curve.
largest storms/floods from a record length of 100,000. This would provide a

sufficient number of floods to adequately define the magnitude-frequency
relationship in the target range of 10"3 to 10"4 AEP.
Implicit in the piecewise approach is that that few or none of the
75,0
smallest storms (storms more common than a 4-year event for this
example) will produce floods sufficiently rare to reside within the 10~ 3 to 104
target zone for flood characteristics. Confirmation of this behavior can be made
by examination of the computer-generated floods and determining the range of
storm magnitudes that are producing floods within the target zone.
To simplify generation of the 25,000 largest storms, a plotting position
formula is used to create a representative sample of extreme storms. The annual
exceedance probabilities of the 24-hour 10 mi2 precipitation for assembly of
72-hour general storms within each simulation set are defined by the
Gringorten7 plotting position formula:
=
1-0.44 rex N
+ 0./2
72N
where Pex is the annual exceedance probability, N is the number of years for the
record length being simulated, n is the actual number of simulations being
conducted (n out of N years), and i is the rank of the precipitation annual
maxima being simulated (ranges from 1 to n).
The resulting n floods from each simulation set are ranked in descending
order of magnitude and the Gringorten plotting equation is used to compute the
annual exceedance probabilities for flood peak discharge, flood runoff volume,
and mximum reservoir level.
20.7.
CURRENT CONFIGURATION OF STOCHASTIC

MODEL
The stochastic inputs generation component of the flood model is currently

configured to function with HEC-136, a lumped, event rainfall- runoff model. In
this configuration, the lumped rainfall-runoff inputs common to HEC-1
modeling are replaced by distributed rainfall-runoff inputs that are computed
within the SEFM. However, the SEFM is not limited to the HEC family of
models. The stochastic inputs generation component of the flood model was
intentionally constructed separately to allow it to be used with other rainfallrunoff models. Fully distributed event rainfall-runoff models could also be
used to simlate the flood response of the watershed. This type of approach has
recently been done in combining the SEFM stochastic inputs generator with
WATFLOOD5 6,17 a distributed rainfall-runoff model that can be operated in
both event and continuous modes. The SEFM approach could also be
configured to operate in a pseudo-continuous mode using a resampling
approach for the hydrometeorological components.
The remainder of discussion here will relate to the application of the SEFM
in conjunction with the HEC-1 rainfall-runoff model.
20.8.
SOFTWARE COMPONENTS
The general storm stochastic event flood model (SEFM) is comprised of six
software components. These components include: data entry; input data preprocessor; HEC-1 template file; stochastic inputs generator; IlliC-l rainfallrunoff flood computation model; and an output data
Fig. 20.15. Flow Chart for Operation of the Computer Software for Stochastic
Simulation.
post-processor. Figure 20.15 depicts the sequence of actions required for

condueting the Computer simulations using the software components. Each of
these components is briefly described in the following sections.
20.8.1.
Data Entry Software Component
Input parameters are entered via a Microsoft Excel 9 720 workbook. A

tab/worksheet is allocated for each of the hydrometeorological components.
Each tab contains a data entry screen and on-screen prompts to assist in entry
of the required input parameters.
20.8.2.
Input Data Pre-Processor Software Component
The input data pre-processor performs a variety of tasks, and operates within
Microsoft Excel 9/ using Microsoft Visual Basic for Applications10. The first
task conducted through the spreadsheets is to perform validity checks of the
vales of the input parameters for each of the hydrometeorological variables.
Vales that are found to be out of bounds are identified/flagged on the
spreadsheet. The second task is to conduct preliminary Monte Cario
simulations for each of the hydrometeorological variables to allow examination
of the generated vales and to compute sample statistics of the generated
vales. This allows a basic confirmation of the validity of the generated vales
and allows comparisons to be made with historical data. Lastly, the preprocessor is used to cali the stochastic inputs generator to conduct Monte Cario
input parameter generation for all hydrometeorological variables for use in the
Computer flood simulations and to create the input files for the HEC-1
hydrologic model.
20.8.3.
Stochastic Inputs Generator Software Component
The stochastic inputs generator is comprised of a series of Fortran modules

compiled as dynamic link libraries (dlls). These modules are used for Monte
Cario generation of the inputs that are required by HEC- I, the rainfall-runoff
flood computation model. Interflow runoff, surface runoff, and snowmelt
runoff are also computed by these modules and Ihe runoff vales are passed to
HEC-1 for transformation into hydrographs.
These modules are considered generic sinee they are general in nature and
do not contain fixed parameter vales that are applicable to a spccific
watershed(s). Where site-specific watershed conditions occur that are different
from the approach taken in the generic modules, a site- specific module can be
substituted for the generic module and the model operated as a site-specific
application. This situation is mst likely to occur for larger watersheds (greater
than nominal 500 mi2) where the spatial considerations of antecedent
precipitation, snowpack, and antecedent soil moisture may warrant a site-
7U
specific solution.
The stochastic inputs generator reads a HEC-1 input file, called a HEC-1
template file, that contains the Monte Cario input in 80 column card format
(cards). The output from the routine is an ASCII text HEC-1 input file with
the Monte Cario cards replaced by HEC-1 cards that reflect the Monte Cario
simulated surface runoff, interflow, initial reservoir elevation, and initial
streamflow. A separate input file is created for each Monte Cario simulation of
flood annual maxima. Thus, if 25,000 simulations (25,000 annual maxima) are
performed to define a frequency curve, then 25,000 HEC-1 input files will be
created by the routine.
20.8.3.1.
Random Number Generation and Monte Cario Simulation
Selection of hydrometeorological inputs using Monte Cario sampling

procedures requires the use of a random number generator. The algorithm used
in the SEFM was originally developed by Lewis et alIN and programmed by
Hosking14. It is a multiplicative congruential generator with base 23l-l and
multiplier of 75. This algorithm has been thoroughly tested and is the generator
commonly used at the IBM Research Divisin14,18. Each hydrometeorological
input is allocated a stream of random numbers on the interval [0-1]. Each
stream of random numbers is non-overlapping with streams for other
hydrometeorological components.
Standard Monte Cario sampling procedures are used for selection of vales
from probability distributions. The inverse transformation mcthod (Jain1 , Salas
et al26) is used extensively for generation of random variates.
20.8.3.2. Messages and Error Handling
Infonnational messages and error messages from the stochastic inputs
generator are stored in an ASCII text file. This file is purged at the start of
each simulation so it only contains messages from the most recent Computer
run. If a simulation fails or has problems, this file can be accessed to help
determine the cause of the problem.
20.8.4. HEC-1 Template File

The SEFM program, specifically the stochastic inputs generator, reads the
HEC-1 template file and replaces the Monte Cario cards with standard FIEC-1
cards that simlate the desired hydrologic component. The HEC-1 template
file is similar to a standard HEC-1 input file except that it is much shorter,
because the precipitation and runoff calculations are being performed by the
SEFM program. Table 20.3 lists the HEC-1 Monte Cario cards that are read
and replaced during the simulation.
Pable 20.3. Monte Cario Cards for HEC-1 Template File.

Monte Cario
Card
MCIT
MCID
Corresponding
HEC-1 Card
IT
ID
MCBA
BA
MCBF
BF
MCRS
RS
Purpose
Simulation Duration and Time-Step
Run Title for Project/Study
Surface And Interflow Components
Initial Streamfiow and Base Flow
Recession
Initial Reservoir Elevation
Incorporation of the surface runoff and interflow runoff components is

accomplished by replacing the HEC-1 basin runoff data cards by two sets of
HEC-1 cards. The surface runoff (precipitation minus infiltrated moisture) is
represented by HEC-1 PI incremental precipitation cards. These are followed
by UI unit graph cards for the surface runoff unit hydrograph(s). Since the
SEFM program is performing the soil moisture und surface runoff
calculations, a uniform loss rate card LU is used with (he loss rate sel to /ero.
The interflow for each sub-basin is entered into IIIC-I on Ql clirccl input
hydrograph cards since the interflow unit liydrographs uro ippliod by the
SIIM program. The surface and inlorllow runolT liyili'onnipliN mv liten
eombined.
MI
20.8.5.
HEC-1 Rainfall-Runoff Flood Computation Model

Software Component
HEC-1 is executed in a batch mode utilizing only a few computational features

of the HEC-1 model. HEC-1 is used primarily to: transform surface runoff into
hydrographs for each sub-basin; conduct channel routing of sub-basin
hydrographs through the stream network; and to route the inflow flood through
the reservoir.
A batch file is created in the HEC-1 directory with each stochastic
simulation. The batch file executes the HEC-1 program reading each input file
from the specified subdirectory. For each simulation, an HEC- 1 output punch
file is created that contains computed hydrographs at specific points of interest
such as the inlet to the reservoir and outflow from the reservoir.
20.8.6.
Output Data Post-Processor Software Component
The output data post-processor is used as a repository for the output from the
flood simulations and is contained within a Microsoft Excel20 workbook.
Vales of the hydrometeorological inputs are passed to the post-processor to
allow examination of the inputs that produced a given output. Visual Basic20
routines are used to read the hydrographs saved in the punch files, and extract
the maxima peak flow rate, runoff volume, and reservoir elevation, and to
construct magnitude-frequency curves in Microsoft Excel20. Standard features
of the Excel spreadsheet allow the output to be sorted and analyzed in any
manner desired to examine the hydrologic conditions that produce a given
magnitude flood.
20.9.
APPLICATIONS OF SEFM
The following sections present results from applications of the SEFM on

mountain watersheds in the western US. Results from the Keechelus
watershed29 and AR Bowman watershed27,28 are presented. The Keechelus
watershed is a forested mountain watershed with a drainage area of 55 mi2
located on the east slopes of the Cascade Mountains in Washington. The
climate is humid with mean annual precipitation averaging 94 inches (Fig.
20.2). The AR Bowman watershed is located in central Oregon on the Crooked
River (Fig. 20.7) and luis a drainage area of 2,300 mi2. The climate is semiarid to sub-luimid willi mean annual precipitation ranging from a low of 8
inches, to a high of 32 inches on the mountain ridge tops. Land use consists of
irrigated agriculture in the valley bottoms, open rangeland over much of the
watershed, and forested hillsides and ridge tops where there is adequate water
and soils to support timber.
20.9.1.
Seasonalities of Storms and Floods
Irrigation reservoirs in mountainous watersheds typically fill in the spring of

the year from snowmelt runoff and reach their lowest levels in the fall of the
year following the irrigation season. This pattern of reservoir operation has
implications for the seasonal frequencies of mximum reservoir levels
produced by floods. In arid, semi-arid and sub-humid climates in the western
US, the flood season may be separate from, or partially overlap, the season
when the reservoir is at normal high levels. The SEFM is particularly useful in
conducting flood analyses that allow partitioning out the seasonalities of
extreme storms, extreme flood peaks and runoff volumes, and mximum
reservoir levels.
Figures 20.16 a,b,c depict seasonality histograms for the seasonal
frequencies of extreme storms, flood peaks, and mximum reservoir levels for
AR Bowman Dam and reservoir on the Crooked River in central Oregon.
It is seen in Fig. 20.16a that there are two storm seasons, a winter and
spring-summer season. Snowpack is at a mximum in the winter and early
spring and conventional flood analyses had considered the winter period to
offer the greatest potential for rain-on-snow flood events. However, large
floods (Fig. 20.16b) were found to occur more frequently in the late-spring
when soils are fully wetted at the end of the spring snowmelt season, and
snowpacks lingered at the high elevations in the watershed. Likewise,
mximum reservoir levels due to floods were found to occur more frequently
in the late-spring (Fig. 20.16c) when the reservoir was full or nearly full from
snowmelt runoff and the frequeney of extreme floods was highest.
This type of analysis can be useful for evaluating how the seasonality of
floods interaets with current reservoir operations to produce mximum
reservoir levels. This type of information provides a logical starting point for
optimizing reservoir operations to meet multi- purposc goals and to reduce
hydrologic risk.
SEASONALITY OF EXTREME STORMS

0.50
AR BOWMAN WATERSHED
0.40
>O
5 0 30
Z)
2 020
tL
OCT NOV DEC JAN FEB MAR APR MAY JUN JUL AUG SEP END OF MONTH
SEASONALITY OF EXTREME FLOOD PEAK DISCHARGE

- AR BOWMAN WATERSHED
0.50
0.40
0.30
0.20
: w WM, 1
,
0.10
OCT NOV
0.
DEC JAN FEB MAR APR MAY JUN JUL AUG SEP END OF MONTH
00
Fig. 20.16a,b,c. Seasonality of Extreme Storms, Flood Peak Dischargc, and

Mximum Reservoir Level for AR Bowman Watershed,
Crooked River, Oregon.
736
20.9.2. Relationship Between Storm Magnitude and Flood

Stochastic Evcnt Flood Model (SEFM) / 20
Magnitude
Antecedent conditions can be an important factor in determining the flood
response of a watershed. This is particularly true for watersheds in arid or
semi-arid climates where large soil moisture dficits are common. In these
climatic settings, storm amounts are not large compared to those in humid
climates. Many storms fall on dry watersheds and limited runoff is produced.
Thus, there is high variability in the relationship between storm magnitude and
flood magnitude. Figure 20.17 shows the results of an SEFM simulation of
AR B o w m a n D a m Reservoir Inflow
Vs. Predpitation
90.000
80.0
'S. 70,000
60,000
50,000 !=
40,000
30,000
20,000
10,000
o
0.00 1.00 2.00 3.00 4.00 5.00 6.00 7.00 8.00 9.00 Watershed Average
Predpitation (in)
Fig. 20.17. Scatterplot of Flood Peak Discharge Versus Storm Magnitude for AR
Bowman Watershed, in Central Oregon.
25,0
annual maxima for the AR Bowman watershed. The AR Bowman
watershed resides in a semi-arid to sub-humid watershed and ex h i bits the
characteristically high variability between flood magnitude atld storm magnitude
commonly seen in these dry climatic settings.
7V7
20.9.3.
Relationship Between Flood Peak Discharge and Flood

Runoff Volume
Historical long duration extreme storms in the western US exhibit high

variability in the temporal and spatial distributions of precipitation over large
areas. This high variability in storm characteristics results in high variability in
the shapes of flood hydrographs and in the relationship between flood peak
discharge and flood runoff volume. Figure 20.18 displays the results from a
simulation of 25,000 annual maxima for the AR Bowman in central, Oregon.
This was a site-specific study that included simulations for both 72-hour
storms as well as sequences of long-duration storms over a 15-day period. The
variability of simulated flood peaks and runoff volumes is consistent with that
seen in historical streamflow data.
LU
O
DC
<
100000
(f
O
*
<
LU
CL
Fig. 20.18.
Scatterplot of
Peak Discharge
Flood Runoff
Volume for the
Bowman Watershed.
Flood
Versus
AR
RUNOFF VOLUME (acre-feet)
Figures 20.19a,b,c depict three flood hydrographs chosen from the simulations.
Each flood hydrograph contains the same runoff volume,
738
Event Flood Model (SEFM) / 20

Figure 20.19a,b,c. SEFM Simulated Stochastic
Flood Hydrographs
with Same Runoff
Volume
Produced
by
Storms of
Various
Magnitudes, and Temporal and Spatial Patterns for the AR

Bowman Watershed, in Central Oregon.
Iiowevcr storm characteristics have produced distinctly different peak

illMiharges uul hydrograph shapes, Also, eneh flood occurred during
20 M.G. Schaefer,
different
monthsB.L.
andBarker
represents different antecedent conditions and/ initial
reservoir levels. This variety of conditions has resulted in different mximum
reservoir levels produced by the floods.
20.9.4.
Calibration to Regional Flood-Frequency Curves
Calibration of a hydrologic model for a watershed is an importani element of

the modeling process. In many applications, there is little 01 no at-site
streamflow data available. In humid climates where floods are primarily
generated by rainfall or rain-on-snow events, the SEFM can be calibrated to
regional flood frequeney curves, such as thosi' developed by the US
Geological Survey (USGS).
In those cases where both precipitation and streamflow data are available
for historical storms/floods, that information can be used lo calbrate the
hydrologic model. In this situation, simulated flood peak discharges can be
compared against the regional growth curve for flood peaks as a verifleation of
the calibrated model.
Figure 20.20 shows a comparison of simulated flood peak discharj>,<
annual maxima for the Keechelus watershed with the applicable US( iS
regional growth curve42 for flood peak discharge. Use of the regional growth
curve (dimensionless) allows comparison of the slope and shapi of the flood
frequeney curve with that of the simulated data.
1.0
0 0.50
0.20
Simulated -------- Regional Solution
0.10
0.04
0.02
0.01
ANNUAL EXCEEDANCE PROBABILITY
Fig. 20.20. Comparison of Simulated Flood Peak Discharge Anniml Maxima with
USCS Regional Growth Curve.
20.9.5. Simulatcd Flood Magnitude-Frequency Curves

I lie primary purpose of the model is to develop magnitude-frequency ' i u ves
for flood peak discharge, runoff volume, and mximum reservoir level. The
magnitude-frequency curve for mximum reservoir level is pnrticularly
important because it accounts for the combined effeets of flood peak
discharge, runoff volume, hydrograph shape, initial reservoir li'vel, and
reservoir operations. It is the primary measure for assessing hydrologic risk,
as it directly relates to hydrostatic loads on concrete mictures, determines
spillway releases and hydraulic loads on pillways and energy stilling basins,
and is directly related to erosional I H ccs for overtopping of earthen
structures.
Figures 20.21 a,b,c depict the flood magnitude-frequency curves
ik'veloped from simulations at Keechelus Dam in Washington.
Kstimation of annual exceedance probabilities (AEPs) for extreme flood
peak discharges, runoff volumes, and mximum reservoir levels is ii ilifficult
technological challenge. In recognizing the difficulties of the i i . k , the
Stochastic Event Flood Model makes use of all livdrometeorological
information that can affect the magnitude of lluods. The model also
incorporates information on historical reservoir li'vels and reservoir
operational procedures. Regional analysis methods |rc utilized wherever
possible in analyzing hydrometeorological limmeters to reduce uncertainties
due to sample size.
N||. 20.21 n. IVliignitiulc-Pi'eqiicncy Curve Cor Flood Peak Discharge for Kcccheliis
Wnlershed biiNcd on SEFM Simulations.
741
140,00
0
20 / M.G. Schaefer,
B.L. Barker
J 120,000
UJ
IL
100,000
o
~ 80,000
3 60,000
O
>
it 40,000
O
z
0.5
10'1
102
10'3
10-4
10'5
106
107
10'8
2
2526.0
0 2525.0
, 2524.0
t 2523.0
0 L
0 t 2522.0
1 2521.0
0 I 2520.0
2519
2518.0
2517.0
2516.0
Fig. 20.21b,c. Frequency Curves for Flood Runoff Volume and Mximum Reservoir
Level for Keechelus Watershed based on SEFM Simulations.
Even with the relative sophistication of the stochastic approach, the rare
probabilities of those combinations of conditions that could pose a threat to
some dams are beyond what many experts would consider the limit of credible
calculation. In particular, recommendations put forward by an international
body of experts on flood hydrology (USBRW) suggest that an AEP of 10 5 for
flood characteristics is the limit o!' credible calculation with
technologies/methodologies currently available.
19
In keeping with the spirit of the USBR Rcport , magnitude- frequency
curves for flood characteristics are shown willi a solid lino out through an
AEP of 10 1 and a dashcd l i n e is useil beyond. II should
742
be noted that simulations are conducted for flood events more rare than 10'5
and are used to define the magnitude-frequency curve, however the dashed line
is used as a reminder of the limit of credible calculation.
20.9.6. Comparison of the PMF with the Distribution of

Floods Produced by PMP
SEFM can also be used to evalate the variability of floods that can be
produced by deterministic design storms such as Probable Mximum
Precipitation (PMP). This allows an assessment of the relative conservatism of
the Probable Mximum Flood (PMF) and to malee comparisons between the
PMF and the range of floods that could be produced by PMP23,27. This
analysis can be conducted in a number of ways depending on how the
temporal and spatial elements of PMP are treated. In the following example,
the 24-hour basin-average PMP amount of 8.75 inches was held constant and
the 72-hour storm amount and all temporal and spatial characteristics of the
storm were allowed to vary in accordance with that observed in historical
storms. The PMP storm could occur in any month allowed by HMR-5722,
which is the applicable Hydrometeorological Report for the area. Antecedent
conditions were allowed to vary in accordance with that observed in the
historical record.
Figure 20.22 shows the frequeney histogram of flood peak discharges
produced by PMP based on a simulation of 500 flood events. Simulated flood
peaks varied from a low of 10,000 cfs to a mximum of
328,0
cfs with a mean valu of 88,000 cfs. The frequeney histogram can be
compared to a PMF peak discharge of 267,500 cfs computed for the watershed
by standard deterministic methods. As is common in PMF analyses, the
mltiple conservatisms applied to the computation of the PMF results in a very
conservative approaeh. In this case, 99% of the floods produced by the
occurrence of 72-hour storms containing the 24- hour PMP are smaller than
the PMF computed for the watershed.
This same approaeh can also be used for analysis of runoff volume and
mximum reservoir level and provides an objective measure of the
conservatism of computed PMF vales.
7'M
Fig. 20.22. Frequency Flistogram for Floods Produced by PMP for the AR
Bowman Watershed.
SUMMARY
The Stochastic Event Flood Model was developed for analysis of extreme
floods resulting from 72-hour general storms and to provide magnitudefrequeney estimates for flood peak discharge, runoff volume and mximum
reservoir level for use in hydrologic risk assessments at dams. It was
developed specifically for application in mountainous areas of the western
United States where snowmelt runoff is commonly a contributor to flooding.
Application of the model al so pro vides insight into the seasonalities of
extreme storms, flood peaks, runoff volumes and mximum reservoir levels.
This type of analysis can be very useful for evaluating how the seasonality of
floods interaets with current reservoir operations to produce mximum
reservoir levels. This information provides a logical starting point for
optimizing reservoir operations to meet multi-purpose goals and to reduce
hydrologic risk.
REFERENCES
1.
Barker BL and Johnson DL, Implications of Interflow Modeling on Spillwav Pesien

Computations. 1995 ASDSO Annual Conference Proceedings, Lexington KY, Aug 1995.
2.
Barker BL, Schaefer MG,

Mumford JA, Swain RE,
Determine the Variabilitv
of PMF Estimates. 1996
proceedings, ppl07, Lexington KY, September 1996.
3.
Barker BL, Schaefer MG,

Mumford JA, Swain RE,
A Monte Cario
Approach to
Determine the Variabilitv
of PMF Estimates (Bumping Lake Dam).
prepared for
USBR Flood Hydrology Group, MGS Engineering Consultants, Final Report, July 1997.
4.
Benjamn JR and Cornell CA, Probabilitv. Statistics and Decisin for Civil Engineers.
McGraw-Hill, 1970.
5.
Cattanach, JD, and LuoW, Use of an Atmospheric Models and a Distributed Watershed
Model for Estimating the Probabilitv of Extreme Floods. ASDSO National Conference, Las
Vegas, pp673-680, October 1998.
6.
Daly C, PRISM. Parameter-EIevation Regression on Independent Slopes Mode/, Oregon

State University, Oregon Climate Service, Corvallis Oregon, 1994.
7.
Gringorten II, A Plotting Rule for Extreme Probabilitv Paper. Journal of Geophysical
Research, vol. 68, pp. 813-814, 1963.
8.
Haan CT, Statistical Methods in I-Ivdrology. Iowa State University Press, 1977.
9.
Helsel DR and Hirsch RM, Statistical Methods in Water Resources. Elsevier Studies in
Environmental Science 49, NY, 1992.
A Monte Cario
Approach to
ASDSO Annual Conference
10. Holtan HN, Stitner G J, Henson WH and Lpez NC, USDAHL-74 Revised Model of
Watershed Hydrology, Technical Bulletin No 1518. Agricultural Research Service, US
Department of Agriculture, 1975.
11. Hosking JRM, L-Moments: Analysis and Estimation of Distributions usina Linear
Combinations of Order Statistics. Journal Royal Statistical Society, Ser B, 52, pp 105-124,
1990.
12. Hosking JRM, and Wallis JR, Regional Frequencv Analysis - An Approach Based on I ,Moincnls. Cambridge Press, 1997.
7-15

13. Hosking JRM, The 4-Parameter Kapna Distribution. Mathematical Sciences Dept., IBM
Research Divisin
14. Hosking JRM, FORTRAN Libraries for Use with the Method of L-Moments. Versin 3,
Research report RC20525, IBM Research Divisin, Yorktown Heights, NY, 1996.
15. Jain R, The Art of Computer Systems Performance Analvsis. John Wiley and Sons, 1991.
16. Kouwen NJ, The WATFLOOD Hvdrologic Model. WATFLOOD Users Manual.
WATFLOOD SPL8. Flood Forecasting System. University of Waterloo, March 1997.
17. Kouwen NJ, Soulis ED, Seglenieks F, Graham,A The Use of Distributed Rainfall Data and
Distributed Hvdrologic Models for Estimation of Peak Flows for the Columbia River Basin,
for BC Hydro Hydroelectric Divisin, University of Waterloo, March 1997.
18. Lewis PAW, et al, Multiplieative Congruential Generator with Base 231-1 and Multiplier 75.
IBM Systems Journal, Mathematical Sciences Dept, IBM Research Divisin NY, 1969.
19. Linsley RK, Kohler MA, Paulhus JLH, Hvdrology for Engineers. McGraw-Hill, 1975.
20. Microsoft Corporation, Redmond, WA, Office 97 and Visual Basic for Applications. 1996.
21. National Research Council (NRC), Estimating Probabilities of Extreme Floods, Methods
and Rccommended Research, National Academy Press, Washington DC, 1988.
22. National Weather Service, Probable Mximum Precipitation for the Pacific Northwest
States - Columbia. Snake River. and Pacific Coastal Drainages. Hydrometeorological
Report No. 57, US Department of Commerce, NOAA, Silver Spring, MD, October 1994.
23. Oregon Climate Service, Mean Annual Precipitation Maps for Western United States.
PRISM Model, CorvaIlis Oregon, 1997.
24. Parrett C. Regional Analvses of Annual Precipitation Maxima in Montuna. IJSGS Water
Resources Investigations Report 97-4004, Helena Montana, March 1997.
74(>

25. Parrett C, Characteristics of Extreme Storms in Montana and Methods for Constructing
Svnthetic Storm Hvetographs. USGS, Water Resources Investigations Report 98-4100,
Helena Montana, July 1998.
26. Salas JD, Delleur JW, Yevdjevich Y, and Lae WL, Applied Modeling of Hydrologic
Time Series. Water Resources Publications LLC, 1980.
27. Schaefer MG and Barker BL, Stochastic Modeling of Extreme Floods for A.R. Bowman
Dam. MGS Engineering Consultants, November 1997.
28. Schaefer MG and Barker BL, Assessment of Risk Reduction Measures at A.R. Bowman
Dam using a Stochastic Model of Extreme Floods. MGS Engineering Consultants, Oct
1998.
29. Schaefer MG and Barker BL, Stochastic Modeling of Extreme Floods for Keechelus Dam.
MGS Engineering Consultants, September 1999.
30. Schaefer MG and Barker BL, Precipitation Magnitude-Freauencv Characteristics for the
American River Watershed. MGS Engineering Consultants, January 2000.
31. Schaefer MG and Barker BL, Seasonality of Extreme Storms for the American River
Watershed. MGS Engineering Consultants, November 1999.
32. Schaefer MG, Characteristics of Extreme Precipitation Events in Washington State.
Washington State Department of Ecology, Water Resources Program, 89- 51, October
1989.
33. Schaefer MG, Regional Analvses of Precipitation Annual Maxima in Washington State.
Water Resources Research, Vol. 26, No. 1, pp. 119-132, January 1990.
34. Schaefer MG, Magnitude Frequencv Characteristics of Precipitation Annual Maxima in
Southern British Columbia. MGS Engineering Consultants, Inc., December 1997.
35. Stedinger JR, Vogel RM, and Foufoula-Georgiou E, Frequencv Analysis of Extreme
Events. Chapter 18. Handbook of Hvdrology. McGraw Hill, 1992.
36. US Army Corps of Engineers, HEC-1 Hydrograph Package; Hydrologic Engineering
Center, Davis, California, 1990.
37. US Army Corps of Engineers, Runoff from Snowmelt. Engineering and Design Manuals,
EM 1110-2-1406, January 1960.
7-17

38. US Department of Agricultura, Natural Resources Conservation Service, National Soil
Survey Center, State Soil Geographic STATSGO) DataBase. 1991, revised 1994, 1995.
39. US Department of Interior, Bureau of Reclamation, A Framework for Characterizing
Extreme Floods for Dam Safetv Risk Assessment. USBR Dam Safety Office and Utah
State University, November 1999.
40. US Department of Interior, Bureau of Reclamation, Flood Hvdrology Manual. Arthur G.
Cudworth, Jr., Surface Water Branch, Earth Sciences Divisin, 1989.
41. US Department of Interior, Bureau of Reclamation, Stochastic Event Flood Model (SEFM)
Technical Support Manual. USBR Flood Hydrology Group, MGS Engineering Consultants,
Inc, October 1998.
42. US Dept of the Interior, Geological Survey, Magnitude and Frequeney of Floods in
Washington. Cummans, JE, Collings, MR, and Nassar, EG, Open-File Report 74-336,
1975.
STOCHASTIC ANALYSIS, MODELING,

AND SIMULATION (SAMS 2000)
J.D. Salas1, W.L. Lae2, D.K. Frevert3
'Civil Engineering Dept, Colorado State University , Fort Collins, Colorado 2 Prvate
Consultant, Golden, Colorado 3Technical Service Center, US Bureau of Reclamation,
Lakewood, Colorado
ABSTRACT
A new stochastic hydrology program called SAMS (Stochastic Analysis
Modeling and Simulation) has recently been developed by Colorado State
University with support from the US Bureau of Reclamation. SAMS offers
computational and graphical capabilitics in the areas of stochastic modeling,
analysis and simulation. SAMS has built on capabilities previously available in
the widely used LAST stochastic hydrology package - developed by William
L. Lae of the Bureau of Reclamation in 1978 and 1979, but now offers
updated and cnhanced capabilities in many areas.
7.1 H
This chapter will provide a detailed discussion of the stochastic

capabilities of the SAMS package
and will
Chapter
21 Ilstrate those capabilities for a
stochastic analysis of the Yakima River basin of Washington.
21.1.
INTRODUCTION
Stochastic simulation of water resources time series in general and hydrologic

time series in particular has been widely used for several doendes for various
problcms rclated to planning and management of
water resources systems. Typical examples are determining the capacity of a
reservoir, evaluating the reliability of a reservoir of a given capacity, verifying
the adequacy of a water resources management strategy under various potential
hydrologic scenarios, and evaluating the performance of an irrigation system
under uncertain irrigation water deliveries (Loucks et al, 1981; Salas 1993).
Stochastic simulation of hydrologic time series such as streamflow is typically
based on mathematical models. For this purpose a number of stochastic models
have been suggested in literature (Salas, 1993; Hipel and McLeod, 1994). Using
one type of model or another for a particular case at hand depends on several
factors such as, physical and statistical characteristics of the process under
consideration, data availability, the complexity of the system, and the overall
purpose of the simulation study. Given the historical record, one would like the
model to reproduce the historical statistics. This is why a standard step in
streamflow simulation studies is to determine the historical statistics. Once a
model has been selected, the next step is to estimate the model parameters, then
to test whether the model represents reasonably well the process under
consideration, and finally to carry out the needed simulation study.
The advent of digital computers several decades ago led to the development
of Computer software for mathematical and statistical computations of varied
degree of sophistication. For instance, well known packages such as IMSL,
STATGRAPHICS, ITSM, MINITAB, SAS/ETS, SPSS, and MATLAB have
been available for many years. These packages can be very useful for standard
time series analysis of hydrological processes. However, despite of the
availability of such general purpose programs, specialized software for
simulation of hydrological time series such as streamflow, have been attractive
because of several reasons. One is the particular nature of hydrological processes
in which periodic properties are important in the mean, variance, covariance, and
skewness. Another one is that some hydrologic time series include complex
characteristics such as long term dependence and memory. Still another one is
that many of the stochastic models useful in hydrology and water resources have
been developed speciflcally oriented to flt the needs of water resources, for
7-t')
21 / J.D. Salas, W.L. Lae, D.K. Frevert
instance temporal and spatial disaggregation models. Examples of specific

oriented software for hydrologic time series simulation are I lliC-4 (U.S Army
Corps of Enginecrs, 1971), LAST (Lae and Frevort, 1990), and
7S0
Stochastic Analysis, Modeling and Simulation (SAMS 2000) / 21
SPIGOT (Grygier and Stedinger, 1990). The LAST package was developed
during 1977-1979 by the U. S. Bureau of Reclamation (USBR). Originally, the
package was designed to run on a mainframe Computer (Lae, 1979) but later it
was modified for use on personal computers (Lae and Frevert, 1990). While
various additions and modifications have been made to LAST over the past 20
years, the package has not kept pace with either advances in time series modeling
or advances in Computer technology. This is especially true of the Computer
graphics. These facts prompted USBR to promote the initial development of the
SAMS package.
The first versin of SAMS (SAMS-96.1) was released in 1996. Since then,
corrections and modifications were made based on feedback received from the
users. In addition, new functions and capabilities have been implemented. SAMS
2000 is the most recent versin. It has the following capabilities and limitations:
Analyze the stochastic characteristics of annual and seasonal data. For

seasonal data the mximum number of seasons is 12 (time intervals
within a year).
Tests whether the original hydrologic data is normally distributed.
It includes several types of transformation options to transform the
original data into normal.
It includes a number of single site, multisite, and disaggregation

stochastic models that have been widely used in literature.
It includes two major modeling schemes for modeling and generation
of complex river network systems.
The mximum number of stations is 40.
The mximum number of stations for a group (for multivariate

disaggregation) is 10.
The mximum number of years for the input data file is 600.
The number of samples that can be generated is unlimited.
The number of years that can be generated is unlimited.
The purpose of this chapter is to provide a summarized description of the
current versin of SAMS. Section 21.2 defines the notation and Ihe stochastic
analysis of hydrologic data while section 21.3 describes (lio various stochastic
models, modeling schemes, and testing procedures that are includcd. Section
21.4 summarizes the basic rouccpls nuil proccilurcs uiulerlying stochastic
simulation ncluding
forecasting that can be made by simulation. Section 21.5 provides a

summary description of the software SAMS 2000. The chapter ends with a
section that includes some examples.
21.2.
STOCHASTIC ANALYSIS OF HYDROLOGIC DATA
A hydrologic time series can be characterized by a number of statistical

properties such as the mean, standard deviation, coeffcient of variation,
skewness coeffcient, season-to-season correlations, autocorrelations, crosscorrelations, and drought, surplus and storage related statistics. These statistics
are defined for both annual and seasonal data as shown below.
21.2.1. Basic Statistics

21.2.1.1. Annual Statistics
Consider an annual hydrologic time series y t ,t = \,...,N where N is the sample
size. The mean and the standard deviation of y, are estimated by
N
and
(21.2)
The respectively. The coeffcient of variation is defined as cv = s/y.

Likewise, the skewness coeffcient is estimated by
(21.3)
752
(21.1)
m
Stochast
ic
Analysis
The temporal dependence
of y, may be characterized
,
function. The sample
autocorrelation coeffcient
Modelin
time series y, mayg and
be determined by
Simulati
on
(SAMS
m,
2000) /
21
by the autocorrelation
of the
(21.4)
where
(2L5)
m k = (i / N ) i o'/+ ~ y)(y t ~ y)
t= 1
and k = time lag. Likewise, for multisite series, the lag-k sample crosscorrelation coeffficient between the time series for sites i and j, denoted by
rk J, may be estimated by
(< )
where
<4
(1 / N) i
{/l t -
- JM)
(21.7)
in which ///(" is the sample variance for site i. Note that m, = mi.
21.2.1.2. Seasonal Statistics
Lcty ViT be a seasonal time series, where v =1,...,N represent years and
r =/ ..... co, seasons, with A^=number of years and co =number of seasons.
This time series can be analyzed by using the overall statistics as in section
21.2.1.1, but seasonal hydrologic time series, such as monthly llows, are better
characterized by seasonal statistics. The mean and slnndard deviation (or
season r can be estimated by
1
= ^
(21.8)
Jy v=i
and
1
(A,->,)2
V=1
(21'9)
respectively. The seasonal coefficient of variation is cv z - Similarly,

the seasonal skewness coefficient is determined by
1^
X O'y.T ~^)3
jVv=1
=
, ------------------ (21.10)
.1=
The sample lag-A: season-to-season correlation coefficient may be
determined by
r. = --------------^ ----------
(21.11)
0,1k
where
1
m
in which m 0 T represents the sample variance for season r. Likewise, for
multisite series, the lag-k sample cross-correlations between site i and site j,
for season t, rf T may be estimated by
m{
rL = 7 ---------------- (21-13)
and
yvvi
75-1
in which m( x represents the sample variance for season x and site i. Note that
in Eqs. (21.11) through (21.14) when T- k < 1, the terms,
V = 1, y VtX _ k ,
m^_ k , J_ k , y (/! k , and m" t _ k are replaced by
^
2, yv-\f-vx-k ^ y^f+T-k ^Ojco+r-k yv-l,(0+T-k ym+T-k and
Ko + r-k respectively.
21.2.2.
Drought, Surplus, and Storage Related Statistics
21.2.2.1. Drought Related Statistics

Drought-related statistics are also important in modeling hydrologic time
series. For the hydrologic series y t ,t = l,...,N and demand level d=ay,0 <
a
<
1
(for example, for a = \ ,d =
y)
a
dficit occurs
when y.
<
d
consecutively during one or more
years
until y t < d
again. Such a dficit can be defined by its duration L, its magnitude M, and its
intensity / = M/L. Assume that m dficits occur in a given hydrologic sample,
then the mximum dficit duration (longest drought or mximum run-length)
is given by
iT = max(Z1,...,ZJ
(21.15)
and the mximum dficit magnitude (mximum run-sum) is defined by

M* = max(M 1 ,...,MJ
(21.16)
Similarly, the mximum dficit intensity can de computed. In SAMS, llic
longest dficit duration and the mximum dficit magnitude are cstimated for
both annual and seasonal series.
21.2.2.2. Surplus Related Statistics

l;or our purpose here, surplus related statistics are simply the opposite of
drought related statistics. Considering the same threshold level d, a surplus
occurs when y. < d consecutively until y < d again. Then, ussuming that m
surpluses occur during a given time period N, the mximum surplus period
and mximum surplus magnitude may be delermincd also from Eqs. (21.15)
and (21.16).
7M
21.2.2.3.
Storage Related Statistics
The storage-related statistics are particularly important in modeling time series

for simulation studies of reservoir systems. Such characteristics are generally
functions of the variance and autocovariance of a time series. For the time
series / = 1,N a subsample y v y n with n < N is taken and the sequence of partial
sums S is formed as
(21.17)
where S a = 0 and y n is the sample mean of y v y n . Then, the adjusted range R*
and the rescaled adjusted range R** can be calculated by
K = max(^0, j;, ..., s) - min(J0, J,..., S)
(21 .18 )
and
(21.19)
respectively, in which s n is the standard deviation of y v y n that is determined by
Eq. (21.2). Likewise, the Hurst coeffcient for the series may be estimated by
The calculation of the storage capacity is based on the sequent peak

algorithm (Loucks, et al, 1981), which is equivalent to the well-known Rippl
mass curve method. The algorithm, applied to the time series^, t \,N may
be described as follows. Based on y i and the demand level d, a new sequence
S' can be determined
as
(21 .21 )
75i
where Sg = 0. Then the storage capacity is obtained as

S e = max.[%,...,S]
(21.22)
Note that algorithms described in Eqs. (21.17) to (21.22) apply also to

seasonal series. In this case, the underlying seasonal series y is simply denoted
as y r
21.2.3. Data Transformations and Standardization

In cases where the normality tests indicate that the observed series are not
normally distributed, the data must be transformed into normal before
applying the models. To normalize the data, the following transformations are
available in SAMS:
- Logarithmic transformation
Y=\n{X+a)
(21.23)
r=(jr+7y
(21 .24 )
- Power transformation
Box-Cox transformation
Y=
{X+a)b
b* 0
(21.25)
where Y is the transformed series, X is the original observed series, and a and
b are transformation coefficients. Note that the logarithmic transformation is
simply the limiting form of the Box-Cox transform as I he coefficient b
approaches zero. Also, the power transformation is a Nhifted and scaled form
of the Box-Cox transform. The variables Y and X can represent either annual
or seasonal data. For seasonal data a and b enn be choscn to vary with the
season. In addition, the transformed data enn be standard i zed by subtracting
the mean and dividing by the Hliiiuliird deviation (standardization is actually
an option in SAMS). For example, Ibr seasonal series, the standardiza!ion
may be expressed as:
7S7
21 / J.D. Salas, W.L. Lae, D.K. Frever
(21.26)
where Z VT is the standardized series, and Y T and S T (Y) are the mean and the
standard deviation of the transformed series for month T. Then, the stochastic
models can be fitted to the standardized series Z v1 . .
21.3.
STOCHASTIC MODELING OF HYDROLOGIC DATA
Several stochastic models have been suggested in literature for modeling

hydrologic time series such as streamflow. They include univariate and
multivariate models and stationary and periodic stochastic models. In addition,
in many instances it is desirable to model seasonal data in steps or stages by
using the so called temporal disaggregation models. Likewise, complex
hydrologic systems involving mltiple sites can be approached by several
modeling schemes involving not only temporal disaggregation but spacial
disaggregation. This section summarizes the various models and options that
are included in SAMS.
21.3.1.
Univariate ARMA(p,q) Model
The ARMA(p,q) model may be expressed as:

(21.27)
where Y t represents the streamflow process for year t, it is normally distributed
with mean zero and variance (T2(7), e t is the normally distributed uncorrelated
noise term with mean zero and variance cr 2 (e), and (p(B) and Q(B) are
polynomials in B defined as
<KB) =
-------- typBP
0(B) = 1 - 6 y & - 0 2 iP -------------e t/ B</
75H
(21.28a)
(21,28b)
where </> x , are the autoregressive parameters; 0,,..., 0 are the

moving average parameters; B is the backward shift operator, i.e., BcYt - Y f c ,
and p and q define the order of the ARMA model.
The method of moments (MOM) may be used for parameter estimation of
ARMA(p, q) models. For example, the moment estimators for the ARMA
(1,0), ARMA (1,1) and ARMA (2,1) models are shown below:
-ARMA (1,0) model:
Y, =
+ e,
(21.29)
0, = mx / m0
(21.30)
o 2 () = (1 - f)s 2
(21.31)
- ARMA (1,1) model:

Y t = W t _ x +e-e x e t _ x
* =
-x -
(21.32)
m~,
(21.33)
mx
(s2 - ,m,)
0.=<b.+
41 Yl
(^s 2 -^)
2(e) = ^
0,
(21.34)
(21.35)
0
ARMA (2,1) model:
^ = ^-1+02^2+^-^-.
7V>
(2L36)
,2/%'2
m\
- s^-m,
m~,m,
0, = -2 -
(21.37)
m{ - s-m-1/n m
m^m
x
2 3
s m102 m-,m,
m-.tru
~ 02mi>
%
\
1 S:>
;> %
1
0j s
+ (j)2m]
mx
e,
"
- mx + 02//?j)
/V ^
mx + 02 /?y
)6{
(21.40)
where s2 is the varianee of Y, and is the estimate of the lag-A: autocovariance

of Y, which is defined as = E[Y t Y t -k\. In the foregoing model it is assumed
that the mean has been removed or E(F,)=0. Note also that s 2 = mo, i.e. the
sample estimate of <J 2 (Y).
However, the Least Squares (LS) method is generally a more effcient
parameter estimation method than MOM. In the LS method, the parameters
0s and 0s are estimated by minimizing the sum of squares of the residuals
defined by
F = = [r, - II, W-, + 1%
(21-41)
/=! /=!
where N is the sample size. Once the 0s and 0s are determined, then the
noise varianee c 2 (e) is determined by (1 / JV)E 2 . The minimization of the
sum of squares of Eq. (21.41) may be obtained by a numerical scheme.
Powell's algorithm has been commonly employed for least squares estimation
of parameters of ARMA models. The Powell algorithm (Gil et al, 1981 and
Himmelblau, 1972), is an expanded versin of the univariate gradient search
which is a useful optimization technique that does not require derivatives. The
moment estimates of ARMA(p,q) models may be taken as the initial vales in
the search algorithm. The non-derivative optimization techniques depend very
7<0
(21.38)
much on the starting points when the objective function is not convex. In these
cases there is no guarantee that the solution found corresponds to the global
minimum. The solution may be improved by choosing a different starting
point.
21.3.2.
Univariate GAR(l) Model
Gamma-autoregressive (GAR) models assume that the underlying series is

autocorrelated with a gamma marginal distribution and the models do not
require variable transformation. SAMS provides modeling and data generation
based on the GAR(l) model. The model parameters are estimated based on a
procedure suggested by Fernandez and Salas (1990).
The GAR(l) model can be expressed as (Lawrence and Lewis, 1981)
, = <!> Y t _ x + e t
(21.42)
where Y, is a gamma distributed variable, <> is the autoregression

coefficient, and e / is the independent noise term. Then Y, has a marginal
density function given by:
/AS> =
{214J)
where A, a, and 5 are the location, scale, and shape parameters, respectively.
Lawrence (1982) found that e can be obtained by the following scheme:
e = A(l -</>) +?7
(21.44)
where
r\ = 0
(21.45)
if M= 0
M
if
11=Z
./ i
7(.l
M> 0
where M is an integer random variable Poisson distributed with mean -/?

ln((f)) and U , j =1 ,2 , .... are independent identically distributed (iid) random
variables with uniform (0,1) distribution. Additionally, Wj, j =1 ,2 , .... are iid
random variables exponential distributed with mean 1 / a.
The stationary GAR(l) process of Eq. (21.42) has four parameters, namely
A, a, 5, and 0. It may be shown that the relationships between the model
parameters and the population moments of the underlying variable Y t are:
i - X +
a
(21.46)
(21.47)
P_
a2
7 =7 =
(21.48)
where fl, o 2 , 7 , and Pj are the mean, variance, skewness coeffcient, and the
lag-one autocorrelation coeffcient, respectively.
Based on results given by Kendall (1968), Wallis and OConnell (1972),
and Matalas (1966) and based on extensive simulation experiments conducted
by Fernandez and Salas (1990), they suggested the following estimation
procedure:
Stochastic Analysis, Modeling and

Simulation (SAMS 2000) / 21
K=
[JV(l - pf) - 2^(1 - pT)]
(21.52)
in which r x is the lag-1 sample autocorrelation coeffcient and s 2 is the sample

variance. In addition,
70
(21.53)
7 = (l- 3. 12p 3. 7yV- 0 - 49 )
si
where y 0 is the skewness coeffcient suggested by Bobee and Robitaille (1975)

as
70
(21.54)
in which g x is the sample skewness coeffcient and the constants A, B, and L

are given by
A= 1 + 6.5 IJV~1 +20.2JV-2
(21.55)
B= 1.487V-1 + 6.77 N~ 2 ,
(21.56)
and
L=^ ~ 2 ) ,
V(N-1)
(21.57)
respectively. Furthermore, the mean is estimated by the usual sample mean y.

Therefore, substituting the population statistics [i, o, y, and p, i n Rqs.
(21.46) - (21.54) by the corresponding estimates y, , y, and p, is above
suggested and solving the equations simultaneously give the MOM estimates
of the GAR(l) model parameters.
For more details, the
inlerested reader is rcfcrrcd to Fernandez and Salas
(1990).
7<i'
21 / J.D. Salas, IV. L. Lae, D.K. Frevert
21.3.3.
Univariate PARMA(p,q) Model
Stationary ARMA models have been widely applied in stochastic hydrology to

annual time series where the mean, variance, and the correlation structure do
not depend on time. Seasonal statistics such as the mean and standard
deviation may be reproduced by a stationary ARMA model by means of
standardizing the underlying seasonal series. However, this procedure does not
account for the season-to-season correlations that are generally exhibited by
hydrologic time series such as monthly streamflows. Thus, periodic ARMA
(PARMA) models have been suggested in the literature for this purpose.
A PARMA(p,q) model may be expressed as (Salas, 1993):
= d T (B)e vt
(21.58)
where Y v T represents a seasonal hydrologic process for year v and season r; it

is normally distributed with mean zero and variance <7^(Y); e v T is the
normally distributed and uncoiTelated noise term which has mean zero and
variance
and (j> r (B) and 0 T (B) are periodic
polynomials in B that are defined as
= 1 -^ %B-^-...-^> p %BP
e r (B) = 1 - d lz B - e 2z & -...- 0^
(21.59a)
(21.59b)
where </>, T, , 0 are the seasonal autoregressive parameters;

0 lT , ...,6 qX are the seasonal moving average parameters; i? is a
backward shift operator, i.e., BcYr X Y v T _ c , and p and q define the order
of the PARMA model.
Method of moments (MOM) may be used for parameter estimation of low
order PARMA(p, q) models. In SAMS the MOM estimates are available for
the PARMA(p,l) model. For example, the moment estimators for the PARMA
(1,1) and PARMA (2, 1) models are shown below (Salas et al, 1982):
-PARMA (1,1) model:
K., = 01, V, +
C21.60)
7(.l
///,2,Tand Simulation (SAMS 2000) / 21

Stochastic Analysis, Modeling
(21.61)
01,
]
A2,
= ,^ +
ni.1,T
1
(01,r+l4 ~
~^
(21.62)
(01,t4-1 ~ m \.j)(0!,t4-1 - m u) 0 \x + l
0l,*+i 4-1 - WI,T+I
"W
(21.63)
0,1,T
+1
lie) =
PARMA (2,1) model:

~ ^1,T^V,T-1 + $2, T K,T-2 + ^V,T ^1,T^V,T-1
(21.64)
2,r ^l,r-2 ~ <-2 *'3.r

*V-1 ^,T-2 ~ 4-2 //?2,r-l
01.T
=
3,T
t.^-1
2 T
(21.65)
m. x m,, -/,_,w9 _,
(21.66)
W
l,-r-l //?1,T-2 _ 4-2 ^,T-1
A I . (4 - m u - K m 2^
(i+i^-%r+^+i^)
0, = 0, + -------- --- ---^------- -------72 ----------- 2 ~ ^
W
OT
(01,r 4-1 - ^l.r + 02,T l.T-l) (01,r 4-1 -
l,r + 02,T ^.r-lHt+l
(21.67)
CT2(?)
-T-y2/+1 UT -------- ^
0
1,T + 1
1 T
(21.68)
where s- is the seasonal variance, and m k T is the estmate of the lag-/c 8easonto-season covarianee of J^T, i.e. M kx = E[Y VX Y vr _ /c ] beciuse K(Y vt ) 0. Note
also that s7 /// r.
75
In a similar manner as for the ARMA(p,q) model, the Least Squares (LS)
method can be used to estmate the model parameters of PARMA(p,q) models.
In this case, the parameters 0s and 0s are estimated by minimizing the sum
of squares of the residuals defmed by
N
\2
V = 1
(O
i s ( K ., - u,
T =1
(21.69)
V=1
where (O is the number of seasons and N is the number

years of data. Once the 0s and ds are determined, the
seasonal noise variance o](e) can be estimated by (1 /
N(D)ZZ T . For obtaining the least squares estimates
of the 0s and 0s are, by using Powells algorithm, the moment estimates of
low order PARMA(p,q) models such as PARMA(p,l) may be taken as the
initial vales in the search algorithm.
of
=1
21.3.4.
Multivariate MAR(p) Model
Let us consider a multivariate (vector) series Y t = [..., i^^7rwhere .F1/' is the

series at site i, n=number of sites, and F>] = 0 for i=\,...,n. The MAR(p)
model can be expressed as
0(B)Y( = et
(21.70)
where Q(B) is a square matrix of polynomials in B which is defmed as
0(B) = I - <t>xB{ - 02B^ - - pBP (21.71)

in which I is an (nXn) identity matrix; &/ , j = 1,..., p , are nxn
parameter matrices; B J is a scalar difference operator such that B J Z t = Z and e t
is an (xl) vector of normally distributed noise terms with mean 0 and
variance - covariance matrix G. The noises e, are independent in time but are
dependent in space. Such spatially correlated noise can be modeled by
e, = Be i
(21.72)
where e/ is a (xl) vector of standardized normal variables independent in time

and in space and B is an (nXn) parameter matrix.
It may be shown that the variance-covariance matrices of the MAR(p)
model are given by
K = 't <M? + G
i=i
(21.73)
k^1
(21.74)
= 1
where M k = E[ Y^j'_k] is the lag-/c cross covariance matrix of yT (since E(Y t ) =
0.) In finding the MOM estimates, Eq. (21.74) for k= 1, is solved
simultaneously for the parameter matrices <P y -,j = 1, ..., p, by
substituting in Eq. (21.74) the population covariance matrices M k , k= 1,
by
the sample covariance matrices M k ,
k= 1, 2,
...,p. Then
Eq.
(21.73) is used toestmate the variance-covariance
matrix of the
residuals G. For example, the moment estimators of the MAR(l) model are:
=
(21.75)
G= f 0 -
(21.76)
in which superscript -1 indicates a matrix inverse. After estimating O, / = 1p

and G as indicated above, B of Eq. (21.72) can be determined from
G BB T
(21.77)
Ihe abovc matrix equation can have more than one solution. However, a
nniquc Holulion can be obtained by assuming that B is a lower triangular
767
21 / ./. I). Salas, W.L. Lae, D.K. Frevert
matrix. This solution, however, requires that G be a positive defmite matrix.
21.3.5. Multivariate CARMA(p,q) Model

When modeling multivariate hydrologic processes based on the full
multivariate ARMA model, often problems arise in parameter estimation. The
Contemporaneous Autoregressive Moving Average (CARMA) model was
suggested as a simpler alternative to the full multivariate ARMA model (Salas,
et al., 1980). In the CARMA model, both autoregressive and moving average
parameter matrices are assumed to be diagonal such that a multivariate model
can be decoupled into component univariate models. Thus, the model
parameters O and
O do not need to be estimated jointly, but, instead, they can be estimated
independently for each single site by using the standard univariate ARMA
model estimation procedures. This allows that the best univariate ARMA
model can be identified for each single station. The CARMA(/?,/) model can
be expressed as
, = t + e, -
(21-78)
j= i
J=i
where Zt is a multi-dimensional vector of the normalized series with mean zero,

e( is a multi-dimensional vector of normal noises (residuals) with mean zero
and variance-covariance matrix G, and 0 , and 0 . are
respectively the autoregressive and the moving average diagonal parameter
matrices. Consequently, Eq. (21.78) can be decoupled into the model
components as
2f;> = y/ + ;> -
y=i
'
(2i.79)
y=i
Thus, Eq. (21.79) is the expression of a univariate ARMA(/j,^) model for site i
such that the parameters <//. and 0' can be estimated by the usual ARMA
model estimation methods.
Further, the vector of the noise terms et = []],..., 'n / / can be expressed as
et = ^
(21.80)
where the random vector ^ is uncorrelated in time and space, i.e.
7(H
) 1. It may be shown that the variance covariance matrix G of the

cross-correlated noises et is equal to
G = E(f) = BB r
(21.81)
Thus, a CARMA model implies that the cross-correlations between sites are
carried through the residuals.
Two methods can be used for estimating the G matrix:
(1) . The MLE estmate of G may be obtained by
& = -- ^ (21-82)
where , t = 1,..., N are the residuals calculated from model (3.53).
(2) . The MOM estmate of G may be obtained as a function of the
parameters 0 and 0 and the cross-covariances Mu- of Z,, i.e.,
G = f(0, 0, M k )
(21.83)
where k = 0,..., max(p, q) - 1. Further details on these estimation procedures

may be found in Salas et al. (2000.)
21.3.6. Multivariate Periodic Autoregressive, MPAR(p),

Model
The MPAR(p) model can be expressed as
X (B)Y VX = ^
(21.84)
where </> (/i) is a square diagonal matrix of periodic polynomials in B which

is dofmod lis
7ti'J
X (B) = I - <P U Bi - 2x B ------------------<3?pxBP

in which I is a (nxn) identity matrix; <> JX ,
(21.85)
p are nxn diagonal
parameter matrices for season t; f is a scalar difference operator such that

B J 'Z T = T Y, r is a (n X ]) column vector with elements, '
V f L Fj t /
y
y
t
K, fc
1 =1 ,..., n; and e vX is a (xl) vector of normally distributed noise terms
with mean 0 and varianee - covariance matrix G x . The noises e v T are

independent in time but are dependent in space and n is the number of sites.
Such spatially correlated noise can be modeled by
(21-86)
V,T =
where e yx is a (nx 1) vector of standardized normal variables independent in

both time and space and B x is an (n Xn) parameter matrix.
The parameters of the MPAR(p) model are estimated by MOM by
substituting the population moments by the sample moments in a similar
manner as for the MAR(p) model. The moment equations of the MPAR(p)
model may be shown to be:
+<?<
(21.87)
!= 1
Mt., =
fir T - i 2 Oandk > /
(21.88a)
1=1
k,i =
-*<
0andk
(21.88b)
/= i
where M kx = F[Y vx F vx _ /c ] is
the seasonal lag-k cross
covariance
matrix of Y v in which E(Y

) = 0. The MOM estimates of the
parameters 0 can be found by substituting the population covariance matrices
M k , k = 1by the corresponding sample covariance
matrices M kx , k =
p and solving Eq. (21.88) simultaneously
for the 0s. Then Eq. (21.87) is used to estimate the variance-covariance
770
matrix of the residuals G x .

After estimating
p and G t as indicated above, B % can
be estimated from
<7T = BXB
(21.89)
As for the MAR(p) model, a solution for the above equation can be obtained
by assuming that B% is a lower triangular matrix. This requires that Gx must be
positive definite.
21.3.7. Disaggregation Models

Disaggregation stochastic modeling of hydrologic time series are efficient
techniques for cases where the preservation of statistical characteristics of both
annual and seasonal scales is essential for the project under study. Valencia
and Schaake (1973) and later extensin by Mejia and Rousselle (1976)
introduced the basic disaggregation model for temporal disaggregation of
annual flows into seasonal flows. However, the same model can also be used
for spatial disaggregation. For example, the sum of flows of several stations
can be disaggregated into flows at each of these stations or the total flows at
key stations can be disaggregated into flows at substations which usually, but
not necessarily, sum to form the flows of the key stations. The Valencia and
Schaake and the Mejia and Rousselle models require that many parameters to
be estimated especially for the temporal disaggregation. For example,
Valencia and Schaake model requires 156 parameters for the case of
disaggregating annual flows into 12 seasons for one station. Mejia and
Rouselle model require 168 parameters. If the same disaggregation is to be
held for 3 sites, the models require 1,404 and 1,512 parameters, respectively.
Lae (1979) introduced the condensed model for temporal disaggregation
which reduces the number of parameters required drastically. For example, for
the cases mentioned nbove, Lane's model requires 36 parameters for the one
site case and <24 parameters for the 3 site case. In SAMS, Lanes model will
be used for temporal (seasonal) disaggregation. The Valencia and Schaake and
Mejia and Rousselle models will be used for spatial disaggregation and
univariate seasonal disaggregation where the annual flows for only one site
will be disaggregated into seasonal flows for the same site.
In using disaggregation models for data generation, adjustments may be
needed to ensure certain additivity constraints. For instance, in spatial
disaggregation, to ensure that the generated flows at substations (or at
subsequent stations) add to the total or a fraction (depending on the particular
771
21 / .1.1). Salas, W.L. Lae, D.K. Frevert
case at hand) of the corresponding generated flow at a key station (or subkey
station) or, in temporal disaggregation, to ensure that the generated seasonal
vales add exactly to the generated annual valu, three methods of adjustment
based on Lae and Frevert (1990) are provided in SAMS. These methods will
be described in detail in the following sections.
21.3.7.1.
Spacial Disaggregation
Valencia and Schaake Model
The disaggregation model can be expressed as (Valencia and Schaake, 1973)
Yt = AXt + Bet
(21.90)
in which Yt is an (fx 1) column vector with elements , Y' i = 1, ... , f; Xt is an

(Tzxl) column vector with elements , Xt i = 1, ... , h where h and/are
appropriate matrix dimensions. For example, in the key station to substation
disaggregation / and h represent the number of key and substations,
respectively. et is an (fx 1) vector of normally distributed noise terms with
mean 0 and the identity matrix as its variance - covariance matrix. The noises ef
are independent in both time and space. A and B are (fxh) and (hxh) parameter
matrices. The number of key stations/in the above equations can be more than
one so the above model can be used to disaggregate annual flows at several
key stations to their corresponding flows at substations in a multivariate form
which would be able to preserve the inter (cross) correlations among the
stations.
The model parameter matrices A and B can be estimated by using MOM as
(Valencia and Schaake, 1973):
772
A= MYX)M^(X)
BBT = M 0 (Y) - M 0 (YX)M^(X)M 0 (XY)
(21.91)
(21.92)
and
M,(X) = F[X t Xl ],
M Y )
= 7 ^,7 ,
M k (YX) = E[Y t Xl k], and

= E[Xfi_ k].
Equations (21.92) and (21.93) can be used to obtain estimates of A and B by
substituting the population moments M 0 (X), M0 (Y), M0 (XYJ, and Mg (YX) by their
corresponding sample estimates.
Mejia and Rousselle Model
This model can be expressed as

Yt = AXt + Be f + CY t _ x
(21.93)
in which Y n X p e p A, and B are defmed in the same way as for the Valencia
and Schaake model and C is an additional (hxh) parameter matrix. As for the
Valencia and Schaake model, the number of key stations / in the above
equations can be more than one so the above model can be used to
disaggregate annual flows at several key stations to their corresponding flows
at substations.
The model parameter matrices A, B, and C can be estimated by using
MOM as:
A = {[ M 0 (YX) - M X (Y) M Q '(Y) Mf(XY)]
[M 0 (XY) - M x (XY) M Q X (Y) Mf(XY)]-'}
C=
RB T =
[M X (Y) - AM x (XY)]M^(Y)
M 0 (Y) - AM 0 (XY) - CM'((Y)
77
(21.95)
(21.96)
Equations (21.95) through (21.97) can be used to obtain estimates of A, B, and

C by substituting the population moments M0(X), M0(Y), M/XYJ, M0(YX), M/XJ, M/Y),
M/XYJ and M/YXJ by their corresponding sample estimates. Lae (1981) showed
that some problems exist if one uses the above equations to estimate the
parameters. Specifically, the problem is in using M/XYJ. He showed that the
generated moments are affected and some key moments are not preserved. As
a result, he suggested that, instead of using a sample estimate of M/XYJ, one
should use the model (population) M/XYJ that would result from the model
structure (for further details, the reader is referred to Lae and Frevert, 1990).
In the final analysis, the suggested equation is
M* x (XY) = M x (X)M^(X)M 0 (XY)
(21.97)
The valu of M\(XY) calculated in Eq. (21.98) should be used in Eqs. (21.95) (21.97) for estimating the model parameters. Lae suggested also that M(Y)
should be calculated as:
M(Y) = M l (Y)+ M 0 (YX)M^(Y)[ M* x (XY) - M x (XY)J
(21.98)
Adjustment for spatial disaggregation
Three approaches are available for the adjustment of spatial

disaggregated data. They are:
approach 1:
I cf i] - U () I
= !' + rqt ~ X q / ] --------------------------------j=i | tfj) _ tfj) ,
]
( ]
(21.99)
j= i
approach 2:
=
fiLiA
(21.100)
$/'
J= 1
approach 3:
qi] =
n
+ (rq, - X
#) -
-------
(21.101)
CX^2
/'= 1
y =1
'
where
N
r=(\/N)%d r t
(21.102a)
/=i
I P
r, = ^,
(21.102b)
9t
and N is the number of observations, n is the number of substations (or

subsequent stations), q t is the t-th observed valu at a key station (or
substation), c/' is the t-th observed valu at substation j (or subsequent
station), qt is the generated valu at the key station (or substation), cj'1
is the generated valu at substation i (or subsequent station), q*<> is the
adjusted generated valu at substation i (or subsequent station), and L (/)
and (/) are respectively the estimated mean and standard deviation
ci of for site i.
21.3.7.2.
Temporal Disaggregation Lane's Condensed Model
The model eun ho expressed as
21/ J.D. Salas, W.L. Lae, D.K. Frevert
K,r =
,,-1
(21-103)
in which Y vx is a (nX 1) column vector with elements Y^'], i 1, ... ,n; X v is

an (x 1) column vector with elements , / = 1, ... , n; e v is a (n x 1) vector of
normally distributed noise terms with mean 0 and the identity matrix as its
variance-covariance matrix. The noises ev are
independent in time and space and n is the number of sites.
The model parameter matrices A, B, and C can be estimated
by using MOM as (Lae and Frevert, 1990):
4 = {m*<m - M X X (Y) M ^_ X (Y) M( x (XY)i
[M 0 (X) - M U (XY) M^_ X (Y) M{ x (XY)]~ 1 }
(21.104)
C x = [M lt (Y) - 4M U (JT)]M^(Y)
(21.105)
BXBT = M 0t (Y) - 4M 0x (XY) - C X M( X (Y)
(21.106)
where J//X) = E[X v X*_ k], M kx (Y) = E[Y vx YJ x _ k],

M k T (YX) = E[YV X X?_ J, and M JtiT (XY) = E[X v ^ T _ i ]. The
MOM
parameter matrices can be estimated by substituting the population moments
by their corresponding sample estimates in Eqs. (21.105) - (21.107).
In a similar manner as for the Mejia and Rousselles model, Lae (1981)
suggested that the following moments should be adjusted as:
M* x (XY) = M l (X)M^(X)M 0x _ l (XY) (21.107)
M* Xx (Y) = [M lx (Y) + M 0 x ( Y X ) ( X ) ] [ M* x (XY) - M X x (XY)]

(21.108)
The above adjustments are needed only for the frst season.
771
Adjustment for temporal disaggregation

Three approaches are also available for the adjustment of temporal
disaggregated data. They are: approach 1:
approach 2:
(21.110) &V,T
t= 1
and
approach 3:
(21.111)
\ (j2
where co is the
Qv is the
valu,
/= i
qvX
is
the
number of seasons,
generated annual
generated seasonal
valu, q*r is the adjusted generated seasonal valu, flr is the estimated mean of
qv% for season t, and T is the estimated standard deviation of qvX for season t.
21.3.8. Modeling Schemes

In modeling complex hydrologic systems involving many sites, although in
principie it may be possible to model the hydrologic time series at all sites
jointly, it is generally more convenient to model them by combining a number
of univariate and multivariate models and concepts ofaggrcgation and
disaggregation techniques. Obviously for a complex syslcm lliere is not ;i
uniquc way of combining various models.
777
A particular combination of models and techniques is sometimes called a

modeling scheme. SAMS uses two modeling schemes. We will Ilstrate them
assuming that we want to model a river system involving key stations (the
furthest downstream site), substations (stations draining into a key station), and
subsequent stations (stations draining into a substation). In Scheme 1, the
annual flows of all key stations are aggregated into an artificial index station.
Such aggregated annual flows are modeled by using a stationary univariate
model such as an ARMA(1,1) model. Then a spatial multivariate
disaggregation model is applied so that annual flows that are generated at the
artificial station can be spatially disaggregated into the annual flows at the key
stations. Similar spatial disaggregation models and procedures are applied to
disaggregate the annual flows at the key stations into annual flows for the
substations, and subsequently into the annual flows at the substations. Once the
annual flows at all stations are generated then multivariate disaggregation
models are applied so that the annual flows can be temporally disaggregated to
obtain the corresponding seasonal flows. In Scheme 2, rather that creating an
artificial station, the annual flows at the key stations are modeled (and
generated) by using a stationary multivariate model such as the MAR(l) model.
Then the remaining steps to model and generate the annual and seasonal flows
at all other stations are accomplished by following identical steps as in Scheme
1. Further description on these modeling schemes are illustrated in Sections
21.5 and 21.6.
21.3.9. Model Testing

The fitted model must be tested to determine whether the model complies with
the model assumptions and whether the model is capable of reproducing the
historical statistical properties of the data at hand. Essentially the key
assumptions of the models refer to the underlying characteristics of the
residuals such as normality and independence.
21.3.9.1.
Testing the properties of the residuals
Testing the residuals properties generally involves testing the normality and
the independence of the residuals. First, the residuals are obtained from the
specified models after the parameters are estimated. For instance, in the case of
the univariate PARMA model of Eq. (21.58), the residuals are the numbers
e {v e x2 ,e xv ... that are derived from the
77N
model. On the other hand, in the case of the MPAR model of Eq. (21.84), the
residuals are the set of numbers
... i = 1,..., n
each set i corresponding to each site or station. Testing the residual properties
can be done in several ways depending on how the residuals are arranged.
Several tests are available for testing the normality of the residuals.
Common normality tests include the skewness test, the chi-square goodness of
ft test, the Kolmogorov-Smimov test, and the product moment correlation test
(Salas et al, 2000). For periodic-stochastic models, the normality tests should
be applied on a month-by-month basis. Often though the tests are applied
considering the entire sample of residuals. In the case of multivariate models,
the normality tests should be applied for each set of data (site by site). In
SAMS, the skewness test of normality is applied on a month-by-month basis
and on a site by site basis.
Likewise, several tests are available for testing the independence of the
residuals. The Portmanteau lack of ft test and the Anderson test (Salas et al,
1980) are commonly used for testing independence in time when the residuals
are derived from stationary stochastic models. On the other hand, the crosscorrelation t-test may be used for testing independence in time when the
residuals are derived from periodic- stochastic models such as those described
in the previous sections. The t-test is applied for the correlation between the
residuals of two successive months, i.e. twelve tests for monthly data.
However, the Portmanteau or Anderson tests may be also applied for testing
the independence of residuals derived from periodic-stochastic models, based
on the autocorrelation of the entire residuals series. In SAMS, the Portmanteau
test of independence was applied. For testing the independence between
residuals of two different sites (independence in space), the usual test is based
on the cross-correlation t-test. Also this test should be applied for the crosscorrelation between residuals of two sites on a season-by-season basis (twelve
tests for monthly data), although the test can be applied based on the crosscorrelation of the entire residual series for each pair of sites.
21.3.9.2.
TestingARMA modelparsimony
For a fitted ARMA(p,q) model, SAMS tests its model parsimony using Akaike
Information Criterion (AIC) (Salas, et al., 1980). For comparing among
competing ARMA(p,q) models, the following equation is used:
AIC(p, q) = N ln(G 2 e ) + 2(p + q)
(21.112)
where N is the sample size and cr is the mximum likelihood estimate of the
residual variance. Under this criterion the model which gives th minimum
AIC is the one to be selected. SAMS computes AICs for the fitted model and
77'
the models of both one step higher order and one step lower order for
comparison. For instance, for a fitted ARMA(1,1) model, SAMS will compute
the AIC vales for ARMA(1,1), ARMA(2,1), ARMA(1,2), ARMA(1,0), and
ARMA(0,1) models for comparison. Besides, to test the assumption of white
noise, the AIC of the ARMA(0,0) is also computed.
21.3.9.3.
Testing the properties of the process
Testing the properties of the process generally means comparing the statistical
properties (statistics) of the process being modeled, for instance, the process
Y vr in Eq. (21.58), with those of the historical sample. In general, one would
like the model to be capable of reproducing the necessary statistics that affect
the variability of the data. Furthermore, the model should be capable of
reproducing certain statistics that are related to the intended use of the model.
If Y vz has been previously transformed from X v T, the original nonnormal process, then one must test, in addition to the statistical properties of Y,
some of the properties ofX Generally, the properties of
Y include the seasonal mean, seasonal variance, seasonal skewness, and
season-to-season correlations and cross-correlations (in the case of multisite
processes), and the properties ofX include the seasonal mean, variance,
skewness, correlations, and cross-correlations (for multisite systems).
Furthermore, additional properties of X VT such as those
related to low flows, high flows, droughts, and storage may be includcd
depending on the particular problem at hand.
In addition, it is often the case that not only the properties of the seasonal
processes Y v T and A v T must be tested but also the properties of the
corresponding annual processes AY and AX. For example, this case arises
when designing the storage capacity of reservoir systems or when testing the
performance of reservoir systems of given capacities, in which one or more
reservoirs are for over year regulation. In such cases the annual properties
considered are usually the mean, variance, skewness, autocorrelations, crosscorrelations (for multisite systems), and more complex properties such as those
related to droughts and storage.
The comparison of the statistical properties of the process being modeled
versus the historical properties may be done in two ways. Depending on the
type of model, certain properties of the Y process such as the mean(s),
variance(s), and covariance(s), can be derived from the model in cise form. If
the method of moments is used for parameter estimation, the mean(s),
variance(s), and some of the covariances should be reproduced exactly,
7H0
however, except for the mean, that may not be the case for other estimation
methods. Finding properties of the Y process in cise form beyond the first two
moments, for instance, drought related properties, are complex and generally
are not available for most models. Likewise, except for simple models, finding
properties in cise form for the corresponding annual process A Y, is not
simple either. In such cases, the required statistical properties are derived by
data generation.
Data generation studies for comparing statistical properties of the
underlying process Y (and other derived processes such as A Y, X and AX) are
generally undertaken based on samples of equal length as the length of the
historical record and based on a certain number of samples which can give
enough precisin for estimating the statistical properties of concern. While
there are some statistical rules that can be derived to determine the number of
samples required, a practical me is to generate say 100 samples which can
give an idea of the distribution of the statistic of interest say 6. In any case, the
statistics 0(i), i = 1,...,100 are estimated from the 100 samples and the mean 9
and variance S 2 (0) are determined. Then, the mean deviation, MD(Q)
MD(0) = 6 - 6(J7)
(21.113)
:md die relativo root mean square deviations, RRMSD(Q)

100
X [6(0 - 8(S/)P
are obtained in which 6( H) is the statistie derived from the historical sample
(historical statistie). The statistics MD(Q) and RRMSD(Q) are useful for
comparing between the historical and model statistics derived by data
generation. In addition, one can observe where 6( H) falls relative to 6 - S(0)
and 6 + S(6). Also graphical comparisons such as the Box-Cox diagrams are
useful.
21.4.
STOCHASTIC SIMULATION
In section 21.3 we have presented a number of stochastic models and

modeling schemes that can be used for simulating or generating artificial
records. Generally stochastic simulation begins by generating uniform random
numbers. Then normal random numbers must be obtained by using
appropriate transformations of the uniform random numbers. Several uniform
and normal random numbers generators are available in literature (see for
instance, Bradley, 1987 and Press et al. 1986). Subsequently, the normal
random numbers must be incorporated into the stochastic model. Section
21.4.1 summarizes a procedure to generate synthetic hydrologic time series by
using stochastic models. Section 21.4.2 discusses how stochastic simulation
can be used for forecasting.
21.4.1.
Synthetic Generation of Hydrologic Data
Let us assume that our original monthly flow data denoted by have X v x been
transformed into normal flows by using the logarithmic transformation, i.e.
(21.115)
Then it has been further standardized seasonally as
Subsequently, we fitted a PARMA(1,0) model to the Z series, i.e.

(21.117a)
As described in section 21.3.3 this model assumes that e vx is normally
distributed with mean 0 and standard deviation z (e). Therefore the generating
model can also be written as
ZV_T = $ u Z vr _ l+ c7 r (e) VT ,
(21.117b)
in which eVT is normally distributed with mean zero and standard deviation
one.
For generating monthly flows, the reverse procedure is followed. We start by
generating standard normal random numbers e v T. Then Eq.
(21.117b) is used to generate the Zs. After generating Z vx then Y VT
can be obtained by
Y vr
= Y z + S X (Y)Z VT
and X VT can be generated by applying the appropriate inverse

Iransformation to the Y v T data. In our case, since X v T was transformed
7N }
(21.118)
by using Eq. (21.115), then the Zs are generated by applying the

in verse transformation
= ex P(K,r) ~ a r
(21.119)
Note that for generating the autocorrelated series Z vr , the warm-up

procedure is followed. Suppose that we want to generate N years of monthly
flows. In this procedure, we start by generating ^ ,, from Eq.
(21.117b) in which the previous valu Z 0 12 is assumed to be equal to
zero (i.e. the mean of the process, which is zero in this case). Thus, Z ,....
Z 2 , Zj_ [,...,
Z2 2 , Z N J r L X 2
are generated
(and
subsequently
(he Y 's andthe Zs) where L is the warm-up length
required to remove
(lie elToct of Ihc initial assumption. In SAMS L is arbitrarily chosen equal lo
50. The dvanlagc of (lie warm up procedure is that it can be
7N }
used for low order and high order stationary and periodic models while exact
generation procedures available in the literature apply only for stationary
ARMA models or the low order periodic models. Generation based on
multivariate models is carried out in a similar manner except that vector of
standard normal random numbers must be generated.
21.4.2.
Forecasting Based on Stochastic Simulation
The basic times series models are perfectly adequate for generating data for
planning studies, where the main concern is not the immediate next one or two
years. The generation of data for planning studies is usually performed using a
random set of starting conditions that have nothing what-so-ever to do with the
current flows, the immediate past flows or any available forecasts. However,
in real time operations, the main concern is what happens in the next few
months or at the most the next few years. In this case, the current state of the
system and all associated forecasts or variable that can help to forecast the next
few months or years is indeed important. In order to make use of this
information, the time series models are either used differently or modifled to
better use the available information. These changes will in most cases only
slightly alter the generated flows and then only for a short time. However,
minor changes can be of relatively large importance terms of the safety and
efficiency of operations. Besides the immediate past flows, current forecasts
could be important as would be variables such as the ocean temperatures or
other variables of potential valu in forecasting future flows.
Three ways will be discussed as to how the models may be adopted for
stochastic forecasting. The flrst is simply to utilize the time series models but
making use of the recent past flows. Rather than a random start, the models are
started with the most recent flows or their corresponding transformed and
standardized counterparts. For example, consider a simple annual
autoregressive model of order one. This model has but one lagged term. Rather
than use a random term for the lagged flow when starting the generation, it is
easy to inser the present years flow when generating flows traces. Often the
lagged flow term is in a transformed and standardized form in the model and
some simple calculations are required to modify the actual flow into the corred
form for the model. For autoregressive models of any order, Ihis approach is
7M-I
very simple. For ARMA models, the process is more complicated as one has
no idea what the correct vales are for the random terms for the recently
experienced years.
Generally,
theyModeling
are easiest
approximated
frst/ 21solving
Stochastic
Analysis,
and Simulation
(SAMSby
2000)
the model equation for the most recent random term. By successive
substitution an equation can then be developed which gives the random term at
any time as an infinite series consisting only of past flows. Of course, only a
few terms are usually needed to adequately estimate the vales. The advantage
of this approach is that it makes use only of the time series models and, after
estimating the initial conditions, the generation of data is the same process as
normally used for planning studies. The disadvantage of this approach is that it
does not include some other information, which might be of help.
The second approach is to expand upon the first case by additionally
adding terms to the time series model to represent current forecasts or vales
of forecasting variables. This is easily done, however the parameter estimation
may be complicated and care must be taken to avoid pitfalls. Least squares
parameter estimation my in fact be the least prone to problems. The main
advantage is that all current and past knowledge is now utilized. A major
disadvantage is that the model now has many more sets of parameters. For
example, using a monthly model as an example, the model for generating May
flows is dependent upon the starting time. If the generation is started in
January, the model for the May flows has different parameters than if the
generation has started in say March. Further, the model parameters for
generating the first May llows is not the proper set for generating May flows a
year henee. If the goal is to generate flows for the next 18 months, 18 sets of
parameters (or equivalent if more than one month is generated at one time) are
needed for each of the 12 starting times or a total of 18 times 12 or 216 sets of
parameters. If the goal is to generate 36 months into the future, lliree times as
many are needed.
The third approach is where an entire series of variables are available into
the future and the time series model is modified to include these exogenous
variables. In cases where the future variable vales are nccurate into the
future, the time series model may only one set of parameters. However, if the
accuracy changes with distance into the future, the same approach is needed as
for the second approach.
21.5.
DESCRIPTION OF SAMS
In section 21.5.1, a general description of SAMS is presented in which

different operations undertaken by SAMS are briefly explained. Then, each
operation is explained and illustrated more thoroughly in sections 21.5.2,21.5.3
and 21.5.4.
21.5.1.
General Overview
SAMS is a Computer software package that deais with the stochastic analysis,
modeling, and simulation of hydrologic time series. It is written in C and

Fortran and runs under modera windows operating systems such as
WINDOWS NT and WINDOWS 98. The package enables the user to choose
between different options that are currently available. SAMS performs these
main functions: 1) Statistical Analysis of Data, 2) Fitting a Stochastic Model
(includes parameter estimation and testing) and 3) Generating Synthetic Series.
SAMS has the capability of analyzing single site and multisite annual and
seasonal data and the results of the analysis are presented in graphical or
tabular forms or are written on output fdes. The current versin of SAMS can
be applied to annual and seasonal data, such as quarterly and monthly data.
The Statistical Analysis of Data module consists of data plotting,
checking the normality of the data, data transformation, and data statistical
characteristics. Plotting the data may help detecting trends, shifts, outliers, or
errors in the data. Probability plots are included for verifying the normality of
the data. The data can be transformed to normal by using different
transformation techniques. Currently, logarithmic, power, and Box-Cox
transformations are available. SAMS determines a number of statistical
characteristics of the data. These include basic statistics such as mean, standard
deviation, skewness, serial correlations (for annual data), season-to-season
correlations (for seasonal data), annual and seasonal cross-correlations for
multisite data, and drought, surplus, and storage related statistics. These
statistics are important in investigating the stochastic characteristics of the
data.
The second main application of SAMS Fitting a Stochastic Model
includes parameter estimation and model testing for altcrnative
7N(>
univariate and multivariate stochastic models. The following models are

included: (1) univariate ARMA(p,q) model, where p and q can vary from 1 to
10, (2) univariate GAR(l) model, (3) univariate periodic PARMA(p,q) model,
(4) univariate seasonal disaggregation, (5) multivariate autoregressive MAR(p)
model, (6) contemporaneous multivariate CARMA(p,q) model, where p and q
can vary from 1 to 10, (7) multivariate periodic MPAR(p) model, (8)
multivariate annual (spatial) disaggregation model, and (9) multivariate
temporal disaggregation model. Two estimation methods are available, namely
the method of moments (MOM) and the least squares method (LS). MOM is
available for most of the models while LS is available only for univariate
ARMA, PARMA, and CARMA models. For CARMA models, both the
method of moments (MOM) and the method of mximum likelihood (MLE)
are available for estimation of the variance-covariance (G) matrix. Regarding
multivariate annual (spatial) disaggregation models, parameter estimation is
based on Valencia- Schaake or Mejia-Rousselle methods, while for annual to
seasonal (temporal) disaggregation Lane's condensed method is applied.
For stochastic simulation at several sites in a stream network system a
direct modeling approach based on multivariate autoregressive and CARMA
processes are available for annual data and multivariate periodic
autoregressive process is available for seasonal data. In addition, two schemes
based on disaggregation principies are available. i;or this purpose, it is
convenient to divide the stations into key stations, substations, and
subsequent stations. Generally the key stations are the farthest downstream
stations, substations are the next upstream stations, and subsequent stations are
the next further upstream stations. In the lirst scheme, the annual flows at the
key stations are added creating an annual flow data at an artificial or index
station. Subsequently, a univariate ARMA(p,q) model is ftted to the annual
flows of the index station. Then, a spatial disaggregation model relating the
annual flows o" the index station to the annual flows of the key stations is
fitted. further, a statistical disaggregation model relating the annual flows of
the key station to Ihose of the substations and another disaggregation model
relating the annual llows ol'lhc substations and the subsequent stntions, are
litlod. In taet, this is a Ihroe-levcl (spatial) disaggregration procedure. In the
second scheme a multivariate AR(p) model is fitted to the annual data of the
key stations, then the rest of the model relating the annual flows at the key
station, substations, and subsequent stations are conducted in a similar manner
as in the first scheme. Furthermore, if the objective of the modeling exercise is
to generate seasonal data by using disaggregration approaches, then an

additional temporal disaggregration model is fitted that relates the annual
flows of a group of stations with the corresponding seasonal flows.
The third main application of SAMS is Generating Synthetic Series, i.e.
simulating synthetic data. Data generation is based on the models, approaches,
and schemes as mentioned above. The model parameters for data generation
can be those which are estimated by SAMS or they can be provided by the
user.
21.5.2.
Statistical Analysis of Data
Statistical analysis options available in SAMS include:

1. Plot time series data.
2. Check normality and transform time series.
3. Statistical characteristics of time series.
In the following sections, we will examine each of these options.
21.5.2.1.
Plot Time Series Data
Plotting of the data can help in detecting trends, shifts, outliers, and errors in
the data. SAMS can plot the data as curve, stick, and bar graphs. Figure 21.1
illustrates a time series plot for annual data. The scale of the plot is determined
based on the .sample mximum and minimum as shown in the control bar at
the bottom, but the user can change it by entering the desired graph scale
range. This enables the user to zoom in and out of the plot to examine the data
and do an on- screen graphical check for the variability of the data.
Fig. 21.1. Plotting of annual time series.
21.5.2.2.
Check Normality and Transform Time Series
SAMS tests the normality of the data by plotting the data on normal
probability paper and by using the skewness test of normality. To examine the
adequacy of the transformation, the comparison of the theoretical generated
distribution based on the transformation and the counterpart historical sample
distribution are plotted as shown in Fig. 21.2 for annual data. For seasonal
data, the results of the seasonal skewness tests are presented in graphical and
tabular formats. The test critical vales are also shown on the screen which are
guides to check whether the data is within the normal range. For example, if
the sample skewness coefficient for a given season is less than or equal to the
critical valu, the hypothesis of normality of the data cannot be rejected. On

the other hand, if the sample skewness coefficient is greater than the table
valu, the hypothesis of normality is rejected. In addition, for the specified
season, the normal probability plot for the transformed seasonal data and the
comparison of the theoretical generated distribution and the sample
distribution for that season are also displayed.
7N*
Type ofTransformation: Logarlthmlc Y(D = In

(X(l) * a) where X(t) : Original Serles. Y(t):
Transformed Serles a: Parameter
ofTransformation
nlw Stolion B: fT
, ; ;!, v .. . j~~
Skewness Test of Normallty- Computed

Valu : -0.017 Table Valu (10% significante
level): 0.544 Result: Hypothesls of Normality
not Rejected
'
|7~
Xianarfonnationi
ave
Diipey
TtaoriennftIS
blatas| scepiTians(matiOn|
PioycjsMnnu{
Fig. 21.2. Annual data transformation result.
If the data at hand is not normal, one can check whether it can be
normalized by a certain transformation function. The user can choose any type
of transformation by simply clicking on the corresponding button. Three types
of transformations are available: logarithmic, power, and Box-Cox
transformations. The transformation can be done all at once for all seasons or
on a season-by-season basis. Figure 21.3 shows an example of seasonal
transformation results.
In the event that the user wants to model site 1 data with an ARMA (p,q)
model. Then, the ARMA model will be ftted to the transformed data and not
the original data.
A save option allows the user to save the transformation parameters in a
special file. To understand this feature of SAMS, suppose that a user
transformed the data and ftted the PARMA (1,1) model to the data.
Subsequently, the user wants to fit a different model to the transformed data.
Instead of doing the transformation process over again, the user can simply
open the transformation file, which was saved previously.
Stochastic Analysis,
Modeling and Simulation
(SAMS 2000) / 21
OMAt
wmmwhnm m, testihs:
Site 1 - KEECHELUS_RESERVOIR
Skewness Test of Normality
2 Check normality of
data and use
transformation options:
7'i
Trans
.
Log
Log
Log
Log
Log
Log
Log
Log
Log
Log
Log
Log
Co-eff.a
Coeff.b
0 0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
0.0000
Skewness Test ofNormalit)
Comp.Val. Tab.Val.(1
0.0004
0%
0.0004
level)
-0.0022
0.543
-0.0021
6
-0.0078
0.543
-0.0361
6
-0.0151
0.543
-0.0050
6
-0.0143
0.543
0.0004
6
-0.0438
0.543
0.0070
6
0.543
Type of Transformation: Logarithmic Y(t) = In
6
(X(t) + a) where X(t): Original Series, Y(t)
0.543
:Transformed Series a: Parameter of
6
Transformation
0.543
(IX: SAMftK
_
HR
6
Site 1 Season 1
0.543
NORMAL PPBABIUTY PAER
6
Satriple
Theotelicai
0.543
6
0.543
6
0.543
6
8.0000
3.5000
1.7000
-1.7000
-7.3000
40.0000
120.0000
80.0000
-1.4000
0.0000
-1.1000
2.5000
'OisTiHijn'fin
N orv
Site 1 Season 1 NORMAL PRBABILITV PAPER
NON-EACEEDANCE PPOBABILITY
Enter the valu ofa:. JO.O V 'la
.'.IBWUco | Statjff | 6cceptTiarofoima>ion| Eiinl |
:rilSlation#: |T
10
30 50 70
%
NON-EMCEEOANCE PRBABIUTV
Itarofomalions ave fiiplay Transfoim Ai Sites |
EnterSeason#:
|T
7<>2
21.5.2.3.
Statistical Characteristics of Time Series
A number of statistical characteristics can be calculated for the original and

transformed data. They can be available in graphical and tabular formats and
can be saved in an output file. These are summarized below.
- For Annual Data:
1.
2.
3.
4.
Basic statistics such as mean, standard deviation, skewness

coefficient, coefficient of variation, mximum, and mnimum
vales.
Serial correlation coeffcients.
Cross-correlation coeffcients for multisite data.
Drought, surplus (flood), and storage related statistics.
- For Seasonal Data:

1. Basic statistics such as seasonal means, standard deviations, skewness
coeffcients, coeffcients of variation, mximum, and minimum vales.
2. Season-to-season correlation coeffcients.
3. Season-to-season cross-correlation coeffcients for multisite data.
4. Drought, surplus (flood), and storage related statistics.
Graphs also display the 95% limits. If a correlation coefficient lies
between these two lines, it means that the correlation is not statistically
signifcant.
21.5.3.
Fitting a Stochastic Model
The LAST package included several programs to perform several objectives

regarding stochastic modeling of time series. The basic procedure involved
modeling and generating the annual time series using a multivariate AR(1) or
AR(2) model, then using a disaggregation model to disaggregate the generated
annual flows to their corresponding seasonal flows. In contrast, SAMS has two
major modeling strategies which are direct and indirect modeling. Direct
modeling means fitting an stationary model (univariate ARMA or multivariate
AR or CARMA) directly to the annual data or fitting a pcriodic (seasonal)
model (univariate PARMA or multivariate PAR) directly lo the seasonal dala
7<>2
of the system at hand. Annual to seasonal disaggregation modeling on the

other hand is an indirect procedure since the modeling of seasonal data
involves also modeling of the corresponding annual data as well. Regardless of
whether the input data available is annual data or seasonal (for example
monthly data) the user must select on the annual button if the final objective
of the modeling exercise is to generate annual flows only. Otherwise, if the
objective is to generate monthly quantities then the seasonal button must be
selected.
The following specific models are currently available in SAMS under each
category:
For Annual Modeling:
1.
2.
3.
4.
Univariate ARMA(p,q) model.

Univariate GAR(l) model.
Multivariate AR(p) model (MAR).
Contemporaneous ARMA(p,q) model (CARMA).
Multivariate annual (spatial) disaggregation.

For Seasonal Modeling:
1.
2.
3.
4.
Univariate PARMA(p,q) model.

Univariate seasonal disaggregation.
Multivariate PAR(p) model (MPAR).
Multivariate seasonal disaggregation.
Standardization implies that not only the mean will be subtracted but in
addition the data will be further transformed to have a standard deviation equal
to one. For example, for the season 5 data, the mean for season 5 will be
subtracted from each data point, then each observed data point for that season
will be divided by the standard deviation of the 5th season. As a result, the
mean and the standard deviation of the standardized data of the 5th season will
become equal to zero and one, respectively. Then, the order of the model to be
fitted can be selected. Subsequently, the method of estimation of the model
parameters must be selected.
Currently SAMS provides two methods of estimation namely the method
of moments (MOM) and the least squares (LS) method. MOM is available for
the ARMA(p,q), GAR(l), MAR(p), PARMA(p,l), and MPAR(p) models whilo
I ,S is available for ARMA(p,q), CARMA(p,q), and l'ARMA(p,q) models l'he
I S melhod requires initial parameters estimates (starting points). These
starting points can be selected by the user or the MOM parametcrs estimates
can be used as the starting points. For cases where the MOM estimates are not
available such as for the PARMA (p,q) model where q>l, the MOM parameter
estimates of the closest model will be used instead. For example, for the
PARMA(3,3) model, the MOM estimates of the PARMA(3,1) model
(including zeros for the two remaining parameters) will be used as the starting
points. For fitting CARMA(p,q) models, the residual variance- covariance G
matrix can be estimated using either the method of moments (MOM) or the
mximum likelihood estimation (MLE) method (Stedinger et al., 1985). The
estimated model parameters can be saved in a file selected by the user.
After the model has been fitted and the estimated parameters have been
saved, it is recommended that the fitted model be tested to ensure that it is
appropriate for the data at hand. In general, this can be done by testing the
residuals and comparing the model and historical properties of the data. SAMS
has the ability to perform such testing. Testing of the residuals is an important
part of the modeling process by which the modeler can test whether the fitted
model is adequate. In all the models available in the current versin of SAMS
except the GAR(l) model, the basic assumptions about the residuals are that
they are normal and independent. SAMS performs certain statistical tests to
check the validity of these assumptions. The hypothesis that the residuals are
normally distributed is tested based on the skewness test of normality. The
results are presented in terms of rejecting or not rejecting the hypothesis. In
addition, the residuals are plotted on normal probability paper in order to
check graphically whether the residuals are normally distributed. For testing
the independence of the residuals, the Porte Manteau test of independence
(Salas, et al, 1980) is utilized. The correlogram of the residuals is also plotted
to help the user in checking the independence of the residuals. Figure 21.4
shows an example of results of both normality and independence tests of the
residuals.
7*)5
Skewness Test of Normality - Computed

Valu: 0.622 Table Valu (10% significance
level): 0.544 Result: Hypothesis of
Normality Rejected
Enter Seaon Number:

|T
Porte Manteau Test- Computed Valu.

21.135 Table Valu (5% significance level):
19.680 Result: Hypothesis of Independence
Rejected
: Hds | Slstyt | Eiint | PievioutMenu | fiisplay |
Fig. 21.4. Testing the normality and the independence of the residuals.
Once the model has been fitted to the data, the moments, e.g. the theoretical
covariance structure can be calculated based on the estimated parameters.
Comparing the model and historical covariance (correlation) structure is
another method of testing. SAMS provides the user with the ability to perform
such comparisons. Figure 21.5 is an example of graphical comparison of
model and historical month-to-month correlations. Additional examination of
the model can be made regarding model parsimony. The so called Akaike
Information Criteria (AIC) may be used for this purpose. SAMS uses AIC for
testing model parsimony when stationary ARMA models are utilized.
21 / J.D. Salas, W.L. Lae, D.K.

Frevert
IiMPARINO
MODEL
and
HISTORIO
:
AL'CRRELC3RfMS . ral ________________ Modal
___________ Station 1
COMPARIMd MODEL d
HISTORICAL HlitQiK al ...
MoJgl
<: OMPARINO MODEL and HISTORICAL C ORRELOORAMS

Hr-stoiK al _______ - Model ________
Station i
jRRELOGRA
MS . Stat-on
I
Fig. 21.5. Comparing the model and the historical correlograms.
The system structure for adjustment usually depends upon the orders
and positions of the stations relative to each other. This is important when
adjustments need to be done to the generated series based on spatial
disaggregation. The system structure means defning for each main river
system the sequence of stations (sites) that conform to the river network.
SAMS uses the concept of key stations and subkey stations (substations
and subsequent stations). A key station is the farthest downstream station
along a main stream. For instance, station 1 is a key station in the river system
shown in Fig. 21.6. Likewise, 2 and 3 are also key stations. On the other hand,
if station 1 would not exist (or not used in the analysis), then stations 4 and 5
would become key stations. Let us continu the explanation assuming that
stations 1, 2, and 3 in Fig. 21.6 are key stations. Substations are the next
upstream stations draining to a key station. For instance, stations 4 and 5 are
substations draining to key station 1. Likewise, slalions 6 and 7 and K and l )
are,
7%
respectively, substations for key stations 2 and 3. Subsequent stations are the
next upstream stations draining into a substation. For instance, stations 11 and
12 are subsequent stations relative to substation 5 and station 10 is a
subsequent station regarding substation 4.
On the other hand, for defining a "disaggregation configuration" SAMS

uses the concept of groups. A group consists of one or more key stations and
their corresponding substations. Groups must be defmed in each
disaggregation step. Each group contains a certain number of stations to be
modeled in a multivariate fashion or "jointly" in order to preserve their crosscorrelations. For instance, if a certain group has two key stations and three
substations, then the disaggregation process will preserve the crosscorrelations between all the key and the substations. On 1 he other hand, if two
separate groups are selected, then the cross- correlations between the stations
lliat belong to the same group will be
preserved, but the cross-correlations between stations belonging to different
groups will not be preserved.
The defnition of a group is very important in the disaggregation process.
7H
For instance, key stations 1 and 2 and substations 4, 5, 6, and 7 form one group
in which the flows of all these stations are modeled jointly in a multivariate
framework, while key station 3 and its substations 8 and 9 form another group.
In this case, the cross- correlations between the stations within each group will
be preserved but the cross-correlations among stations in different groups will
not be preserved. For example, in the above configuration, the crosscorrelations between stations 1 and 3 will not be preserved but the crosscorrelations between stations 1 and 2 will be preserved. On the other hand, if
all the stations are defined in a single group, then the cross-correlations
between all the stations will be preserved. In the final step of disaggregation, a
group may contain stations 4, 5, 10, 11, and 12. In the current versin of
SAMS, the total combined number of stations in any defined group must not
exceed 10 stations. After modeling the annual flows using the above
configuration, the annual flows can be disaggregated into seasonal flows. This
is handled again by using the concept of groups as was explained above. The
user, for example, can choose stations 3, 8, 9, 17, 18, and 19 as one group. In
this case, the annual flows for these stations will be disaggregated into
seasonal flows by a multivariate disaggregation model so as to preserve the
seasonal cross-correlations between all the stations.
Currently, SAMS has two schemes for modeling the key stations. The flrst
scheme, denoted as scheme 1. will aggregate the annual flows of the key
stations that belong to a certain group, then use a univariate ARMA(p,q) to
model the aggregated flows, then the aggregated annual flows are
disaggregated (spatially) back to each key station by using the Valencia and
Schaake or the Mejia and Rouselle disagregation method. The second scheme,
denoted as scheme 2. will model the annual flows of the key stations belonging
to a given group by a multivariate MAR(p) model. Once the flows at key
stations are modeled, the rest of the procedure for generating annual flows at
all substations and subsequent stations and then for generating the seasonal
flows at all stations is the same as in scheme 1 (as above mentioned).
21.5.4.
Generating Synthetic Series
Data generation is an important subject in stochastic hydrology and has

received a lot of attention in hydrologic literature. Data generation is used by
hydrologists for many purposes. These include, for example, reservoir sizing,
planning and management of an existing reservoir, and reliability of a water
resources system such as a water supply or irrigation system (Salas et al,
1980). Stochastic data generation can aid in making key management
decisions especially in critical situations such as extended droughts periods
(Frevert et al, 1989). The main philosophy behind synthetic data generation is
that synthetic samples are generated which preserve certain statistical
properties that exist in the natural hydrologic process (Lae and Frevert,
1990). As a result, each generated sample and the historie sample are equally
likely to occur in the future. The historie sample is not more likely to occur
than any of the generated samples (Lae and Frevert, 1990).
Generation of synthetic time series is based on the models, approaches and
schemes presented in section 5.3. Once the model has been defned and the
parameters have been estimated, one can generate synthetic samples based on
this model. SAMS allows the user to generate synthetic data and eventually
compare important statistical characteristics of the historical and the generated
data. Such comparison is important for checking whether the model used in
generation is adequate or not. If important historical and generated statistics
are comparable, then one can arge that the model is adequate. The generated
data is stored in a file. This allows the user to further analyze the generated
data as needed. Furthermore, when data generation is based on spatial or
temporal disaggregation, one may like to make adjustments to the generated
data. This may be necessary in many cases to enforce that the sum of the
disaggregated quantities will add up to the original total quantity. For
example, spacial adjustments may be necessary if the annual flows at a key
station are exactly the sum of the annual flows at the corresponding
substations. Likewise, in the case of temporal disaggregation, one may like to
assure that the sum of monthly vales will add up to the annual valu. Various
options of adjustments are ineluded in SAMS.
The user must specify necessary information for the generation process.
The type of data to generate (either annual or seasonal) and the type of
modeling, which is either univariate (single site) or multivariate (mullisite)
must be selected. For example, if the user wants to generate annual data at a
single station by using an ARMA model, then the option "Annual" and "Single
site" must be selected. On the other hand, to generate seasonal data at several
stations from a disaggregation model, one must select "Seasonal" and
7W
"Multisite". In addition, the data length (in years) and the number of samples
to be generated, and a seed number to initiate the generation process need to
be specified. In SAMS, both the number of samples and the length of data to
be generated are unlimited. The user should consider however the Computer
time it will take to generate many samples or very long samples especially if
the generation is to be done for multisite seasonal data.
Furthermore, one of four options regarding the generation model, must be
chosen.
Statistical analysis of the generated data is available. In the case of
analysis pertaining drought, surplus, and storage related statistics, SAMS will
analyze the data in terms of a desired threshold demand level. The default
demand level is the sample mean, but one can change it by keying a fraction of
the sample mean or the actual desired demand level. The results of the
statistical analysis of the generated data can be saved into a file with the
extensin .gst and this file will be automatically attached to store the results.
Note that the referred feature of the statistical comparison of the historical and
generated data can be also used for further testing and verifying whether the
fitted model performs as desired.
In estimating the generated statistics, the statistics of each generated
sample are first estimated then the means and standard deviations of those
statistics are computed which will be used to compare with their historical
counterparts. The results are presented in graphical or tabular formats. Figure
21.7 shows a comparison of the (observed) historical annual series and the
generated series for one sample. The user can change the station number,
sample number, and the graph scale as needed. For annual series, the
comparisons of the historical and generated mean, standard deviation,
skewness coefficient, coefficient o' variation, and sample mximum and
minimum are presented in tabular form. For seasonal series, the comparisons
are presented in both graphical and tabular formats as shown in Fig. 21.8. The
comparisons of correlations for annual and seasonal data may be presented in
graphical or tabular formats as shown in Fig. 21.9 (for seasonal data).
Fig. 21.7. Time series plots of the historical and generated annual flows.
Fig. 21.8. Comparison between the historical and the generated monilily
mean and standard deviations.
Fig. 21.9. Comparisons of the historical and generated seasonal cross- correlations.
The comparisons of drought, surplus, and storage related statistics include

the longest drought, mximum dficit, longest surplus, mximum surplus,
storage capacity, rescaled range, and Hurst coeffcient. Before showing these
results, a window will pop up again to allow the user to change the demand
level if needed. The results are presented in tabular format and box plots as
shown in Fig. 21.10. The box plots reflect the ratios of the means, quartiles,
mximums, and minimums of those statistics calculated from the generated
series to the observed historical vales. The scale of the box plot can be
adjusted by the user based on the ratio ranges provided in the dialog box.
NO 2
Fig. 21.10. Comparison of drought, surplus, and storage related statistics.
21.6. EXAMPLES
Statistical Analysis of Data
In this section, SAMS will be used to model actual hydrologic data. The data
used is the monthly data of the Yakima basin. The data will be read from the
file yakima.dat (refer to SAMS Users Manual, Salas, et at, 2000). The file
contains data for 12 stations in the Yakima basin. Each station's data consists
of 12 seasons and is 48 years long. SAMS was used to analyze the statistics of
the seasonal and annual data. Some of the annual and seasonal statistics
calculated by SAMS are shown below.
Annual Statistics
Site Number:
KEECHELUS_RESERVOIR
Historical
Mean
242.9312
Standard Deviation
55.3134
Skewness Coefficient
0.3416
Coef. Variation
0.2277
Mximum
375.5001
Minimum
151.7 000
Correlation Structure
LAG
0
1.0000
0.2773
HIM
21 / ././). S/s, W.L. Lae, D.K. Frevert
2
3
4
5
6
7
8
9
10
-0.0591
0.0644
0.0104
0.0736
-0.1389
-0 .1669
-0 . 0322
-0.1162
0 . 0034
Lag-0
Cross
Correlations
Sites
1 and 1 (K & KE)
1.
E
0000
1 and 2 (K
& KA)
0
.
E
9877
1 and 3 (K
& YA)
0.7864
E
1 and 4 (K
& CL)
0.9826
8c YA)
E
1 and 5 (K
0.9834
Se
E
1 and 6 (K
YA)
0
Se
E
.9525
1 and 7 (K
BU)
0
.
Se
E
9190
1 and 8 (K
NA)
0
.
Se
E
8831
1 and 9 (K
TI)
0.8787
E
1 and 10
(K Se TI)
0.8698
Se
E
1 and 11
(K
NA)
0.8626
Se
E
1 and 12
(K
YA)
0.9243
E
Storage and Drought Statistics
Demand Level = 1.0000 * sample mean
Longest Drought
7.0000
Mximum Dficit
344.2187
Longest Surplus
6.0000
Mximum Surplus
244.0125
Storage Capacity
576.3561
Rescaled Range
10.419 8
Hurst Coefficient
0.7375
Seasonal Statistics
Site Number:
KEECHELUS_RESERVOIR
'k'k'k - k'k - k'k' k'k' k - k
Season
Historical
HIM
Mean
1
21.6250
2
22 . 5979
3
17 . 8708
4
14 .1542
5
15.5708
6
26.8333
7
47.4375
8
38.1917
9
14.9604
10
4.7375
11
5.4792
12
13.4729
Standard Deviation
1
13.5856
2
13.9981
3
10.2554
4
8.9925
5
8.5916
6
8.5001
7
14.4123
8
19.0200
9
11.6909
10
2.6210
11
4.3821
12
8 . 4761
" ^>_
... .1.1__SSE
BASIC SEASONAL STATISTICS Mean
---Std.Dev. Station
1 <KE)
___ ___ __ ________ __

____ EM!
BASIC SEASONAL STATISTICS CoalT.Var.
Station 1 (KE)
SltCoeff.
f
l
t
1
/
1
1. 0570
2
1.6400
3
0.8679
4
1.0953
5
2.2601
6
0.2109
7
0.1997
8
0.2420
9
1.1964
10
1.3112
11
2.8219
12
0.8688
NO
".
Season
to
Correlations
LAG 1 1
2
3
4
5
6
7
8
9
10
11
12
Season
0 . 5775
0.2969
0.2198
0.4555
0.4143
0.3211
-0 . 0872
0.5527
0.8343
0.8618
0.2814
0.4562
Sites 1 and 2
KA)
1
0 . 9853
2
3
4
5
6
7
8
9
10
11
12
0.9828
0.9793
0.9847
0.9924
0.9632
0.9788
0.9906
0 .9888
0.8572
0 .9504
0.9888
Lag-0 Season to Season Cross Correlations

(KE
11.0000
123.7427
80(>

-i niara
lag
0
SEASON TO SEASON CROSS CORRELATION lag

- - lag 2 Statlon 1 &2
(KE&KA)
A'
f
f o fc
v
v
___
/ ......... v;-'
95% L
'^-\/
95% L
-0.5
6 1 ib ii
Season
1b

Demand Level = 1.0000 * sample mean
Longest Drought
Mximum Dficit
N(I7
7.0000
163.890
1
640.110
3
39.0407
21.6.1.
Stochastic Modeling and
Generation of Data
0 . 6471
SAMS was used to model the annual and monthly flows of site 1 of Yakima
basin. Both annual and monthly data used in the following examples are
transformed using logarithmic transformations.
Longest Surplus
Mximum Surplus
Storage Capacity
Rescaled Range
Hurst Coefficient
21.6.2.1.
Univariate ARMA (p,q) Model
SAMS was used to model the annual flows of site 1 with an ARMA(1,1)
model. The MOM was used to estmate the model parameters. SAMS was also
used to generate 150 samples each 48 years long using the estimated
parameters. The following is a summary of the results of the model fitting and
generation by using the ARMA(1,1) model.
Results of fitting an ARMA(1,1) model to the transformed and
standardized annual flows of site 1:
Model:ARMA
Model:ARMA
Number_of_sites:
Site(s)_ID:
Data_Transformations:
Site_l: a-coef= Data_Standardization:
Mean_of_the_process:
1
1
LOG
49.000000
YES
5.658607
Standard_deviation_of_the_process:
0.189585
Model_order(p,q): phi_parameters:
(Annual)
phi_l
-0 .138036
l.
hut n p51 amntor ii (Annunl.)
11

theta_l O.49494
7
Variance_of_the_residuals:
(Annual)
Results of statistical analysis of

the ARMA(1,1) model:
Model:
Univariate
Generated Data)
Site Number:
k'k'k'k'k'k'k'kMean
kiz'k
0.885071
the data generated from

(Statistical Analysis of
ARMA,
KEECHELUS RESERVOIR
Standard Deviation
Coef. Variation
Mximum Mnimum
Historical 2 4 2 . 9 3 1 2
Generated
55.3134 0
.3416
0 .227
7
375.5001
151.7000
242.9985
53.8040
0.4131
0.2212
385.1967
138 .7450
LAG
1.0000 1.0000
0
1
2
3
4
5
6
7
8
10
0.2773
-0.0591
0.0644
0.0104
0.0736
-0.1389
-0.1669
-0.0322
-0.1162
0.0034
0.2691
-0.0625
-0.0349
-0.0237
-0.0202
-0.0310
-0.0308
-0.0448
-0.0426
-0.0277

Demand Level =
Longest
Drought
Mximum
Dficit
Longest
Surplus
Mximum
Surplus
Storage Capacity
Rescaled
Range
Hurst
Coefficient
1.0
* sample mean
7.0
344.218
7
6.0
244.012
5
576.356
1
10.4198
0.7375
NON
N(I7
6.0267
287
.2662 5
.2533
311.2614
488.3525
9.1089
0.6879

SAMS was also used to model the transformed and standardized annual
flows of site 7 with an ARMA(2,2) model using the Approximate LS method.
The result of modeling for this site are shown below:
Model:ARMA
Number_of_sites:
Site(s)_ID:
Site_7:
a-coef=
Data_Standardization:
Mean of the process:
6.488171
Standard_deviation_of_the_j>rocess:
0.081923
1
7
LOG
450.000000
YES
/
Model_order(p,q):
phi_parameters:
2 2
(Annual)
phi_l
0.316854
phi_2
-0.122860
theta parameters:
(Annual)
theta_l
-0.002752
theta_2
0.003944
0.918059
21.6.2.2.
(Annual)
Univariate PARMA(p,q) Model
A PARMA (1,1) model was fitted to the transformed and standardized

monthly data of site 1 of the Yakima basin using MOM. Part of the modeling
results obtained by SAMS are shown on the following page:
Model:
Univariate PARMA, (Statistical Analysis of
Generated Data)
Site Number:
1
KEECHELUS_RESERVOIR
'k'k'k'k'k'k'k'k'k'k'k
N0>

***********
Season Historical
Mean
1
21.6250
2
22 .5979
3
17 .8708
4
14 .1542
5
15.5708
6
26 . 8333
7
47.4375
8
38.1917
9
14.9604
10
4.7375
Standard Deviation
11
5 .4792
1
13
.5856
12
13
.4729
2
13.9981
3
4
5
6
7
8
9
10
11
12
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
10.2554
8.9925
8.5916
8.5001
14.4123
19.0200
11.6909
2.6210
4.3821
8 .4761
1.0570
1.6400
0.8679
1.0953
2 . 2601
0 . 2109
0.1997
0.2420
1.1964
1.3112
2.8219
0 . 8688
21.4531
22 . 5754
17 .8748
13.9850
15.4822
26 .5404
47 . 5850
38 .5255
15.4387
4.7413
513.3797
.4180
13.2594
13
.4388
10.6862
9 .4890
8 .8690
8 .3496
13.9888
18 . 9623
13.9993
2.5598
4.1501
8.5231
1.0899
1.2611
1.3163
1.8644
2 .4466
0.3551
0.2544
0 .4822
2 .3082
1. 3478
2.1814
1.2928
to Season Correlations
Season
1
2
3
4
5
6
7
LAG 1
0.5775
0.2969
0.2198
0.4555
0.4143
0.3211
-0.0872
Historical
Generated
0.6249
0.4015
0.1513
0.4693
0.2756
0.2770
-0.0946
NIO
MEAN Generated
/\
/v\
Station 1 (KE)
V/
\i
4h)
Season
ib Vi ib
8
9
10
0.5527
0.8343
0.8618
0.5754
0.8147
0.6320
11
12
0.2814
0.4562
0.4625
0.3269
LAG
HHB
1 (KE)
0 . 3728
2
3
4
5
6
0.4746 0.1630 0.3682

0.0556
0.0162
0.2264
-0.0199
0.1639
0.0267
7
8
9
10
11
12
-0.1219
-0.3637
0.3692
0.7047
0.2319
0.1770
-0.1336
-0.3810
0.4268
0.6075
0.2310
0.1110
Storage
SEASON TO SEASON CORRELATION Historical - - - Generated Station

Lag 2
and Drought
0.2491
0.5
f
....... .. f \ 95% L
0.2180
E
o
-0.5
................... /
^ 95% L
I 5
\h
season
Statistics
Demand Level =
1.0000 * sample mean
Longest Drought
11.0000
10.7400
Mximum Dficit
Longest Surplus
Mximum Surplus
Storage Capacity
Rescaled Range
Hurst Coefficient
123.7427
7.0000
163.8901
640.1103
39.0407
0.6471
131.8937
6.5900
177.8746
487.0978
29.0030
0.5907
Es!13
STORAGE AND DROUGHT STATISTICS
ib l'l 1*2
21.6.2.3. Multivariate MAR(p) Model

SAMS was used to model the transformed and standardized annual data of sites
3, 5, and 7 of the Yakima basin using the MAR (1) model. The modeling
results are shown below:
Model:MAR
3
Number_of_sites:
Site(s)_ID:
Site_3: a-coef=
Site_5: a-coef=
Site_7: a-coef=
Data_Standardization:
LOG
-205.000000
LOG
2000.000000
6.096067
LOG
8.147832
450
6.488171
YES
000000
0.461667
0.103274
0.081923
Model_order(p,q):
phi_parameters:
1 0
(Annual)
phi_l
0.802852
-0.271925
-0.091863
0.180350
0.241441
0.127788
0.243420
-0.083069
103272
(Annual)
0.716938
0.736062
0.704988
0.736062
0.900586
0.868521
0.704988
0.868521
0.919150
These estimated parameters wcre used to generate 150 samples annual

data each of 48 years long for tlie tliree sites. The statistical analysis. result
of the generated data is shown below:
N12

Model: Multivariate AR (MAR), (Statistical Analysis of Generated Data) Site
Number:
3
YAKIMA_RIVER_AT_EASTON_DIVERSIONJDAM
Mean
Standard Deviation
Coefficient Coef.
Minimum
Correlation
Historica
l
699.3479
246.3507
1.8333
0.3523
1726.4000
346.9000
Generate
d
687.1061
218.7747
1.1163
0.3161
1400.439
9
367.8550
1.0000
0.2548 0 . 0238
0.0770 0.0034 0
. 0430 0.1625 0.1544 0.1121
-0.2085
-0.0532
0
1
2
3
4
5
6
7
8
10
Lag-0 Cross Correlations Sites
7 and
3 (BU & YA)
7 and
5 (BU & YA)
7 and
7 (BU & BU)
0.7269
0.9536
1.0000
Skewness
Variation Mximum
Structure LAG
1.0000
0.2156
0.0339
0.0114
0.0204
0.0103
0.0307
0.0409
0.0252
0 . 0357
-0.0369
0.8142
0.9563
1.0000

Demand Level =
Longest
Drought
Mximum
Dficit
Longest
Surplus
Mximum
Surplus
Storage Capacity
Rescaled
Range
Hurst Coefficient
1.0000 * sample mean
4.0000
255.5000
6.0
268.
4500
498.
2249
9.23
97
0.6996
6.0933
303.3795
5.5733
299.3968
495.2468
9.3981
0.6966
YAKIMA RIVER AT CLE ELUM

Mean
2345.5000
Standard Deviation
826.0001
Coef. Variation Mximum
Minimum
1 .000
0
LAG
0.2872 -
Site
Number:
***********
0
1
2
3
4
5
6
7
8
9
10
Historical 1474.3375
358.9830 0.2136 0.2435
Lag-0 Cross
Sites
5 and
3
5 and
5
5 and
7
Storage and
0 .2386
2300.8103
732 .6721
1.0000
0.2546
0 . 0224
0.0445
0.1007
-0.0004
0.0092
-0.0187
0.0426 -
-0.0137
0.1397 -
-0.0363
0.1650 -
-0.0478
0.0598 -
-0.0251
0.1297 -
-0
0.0224
0347
Generated
1461. 5977
348.3850
0.2240
0.0406
Correlations
(YA &
(YA &
(YA &
Drought
Demand Level =
Longest
Drought
Mximum
Dficit
Longest
Surplus
Mximum
Surplus
Storage Capacity
Rescaled
Range
Hurst Coefficient
1.0
*
sample
mean
YA)
YA)
BU)
Statistics
0 .8040
1.0000
0
9536
7.0
2220.5625
6.0
1561.5746
3803.9871
10.596
6
0.7428
0.8653
1.0000
0.9563
6.2200
2088.8184
5.6133
2044.6892
3397.3022
9.7093
0.7070
BUMPING RESERVOIR
Site Number:
***********
Mean
Standard Deviation
Coef. Variation Mximum
Minimum
Historical 209 . 5250
53.9224 0.1097 0.2574
316.4000
112.1000
Generated
207.7169
52.5678
0.1658
0.2534
332.250
5
95.6784
LAG
1.0000
1
.2548
2
0.0238
3
.0770
4
0.0034
5
.0430
6
0.1625
7
0.1544
8
0.1121
0.2156
0 . 0339
-0.0114
-0.0204
-0.0103
-0.0307
-0.0409
-0.0252
-0.0357
-0.0369
-0.2085
9
Demand Level =
Sites
7 and
7 and
7 and
1.0000
3 (BU
5 (BU
7 (BU
Lag-0
Correlations
& YA)
& YA)
& BU)
-0.0532
Cross
0.7269
0.9536
1.0000

1.0
* sample mean
0.8142
0.9563
1.0000
Longest
Drought
Mximum
Dficit
Longest
Surplus
Mximum
Surplus
Storage Capacity
Rescaled
Range
Hurst Coefficient
4.0
255.5000
6.0
268.4500
498.2249
9.2397
0.6996
6.0933
303.3795
5.5733
299.3968
495 .2468
9.3981
0.6966
21 / J.D. Salas, W. L. Lae, D.K. Frevert
21.6.2.4. Disaggregation Models

A spatial-tcmporal disaggregation modeling and generation example using
SAMS based on multivariate data of the Yakima basin is demonstrated here.
In this example both annual and monthly data being modeled are transformed
using logarithmic transformation. The schematic representation of the stations
locations in the basin are shown in Fig. 21.11. Clearly, stations 5 and 11 can
be considered as key stations. Stations 3, 4, 8, and 10 are substations and 1, 2,
7, and 9 are subsequent stations. Scheme 1 will be used to model the key
stations so that the annual flows of the key stations will be added together to
form one series of annual data as an index station. The index station data will
be fitted with an ARMA(1,1) model and then a disaggregation model (either
Valencia and Schaake or Mejia and Rousselle) will be used to disaggregate the
annual flows of the index station into the annual flows at the key stations. The
key station to substation disaggregation will be done using two groups. The
first group contains key station 5 and substations 3 and 4. The second group
contains key station 11 and substations 8 and 10. The substation to subsequent
station disaggregation was also done based on two groups. The first group
contains substations 3 and 4 and subsequent stations 1 and 2. The second
group contains substations 8 and 10 and subsequent stations 7 and 9. The
modeling results for the annual and monthly data are summarized below.
h'ij;. 21.11. Slalions local ions in (lie basin.
Hh.
H1 7

Annual (spatial) disaggregation
Disaggregation Model: Valencia and Schaake
Modeling of Key stations

Disaggregation scheme: 1
Key stations Id : 5 and 11
Model of Index station: ARMA(1,1)
Key station to substation disaggregation modeling

Number of groups: 2 Group
#: 1
Keystations Id: 5
Substations Id : 3 and 4 Group #: 2
Keystations Id : 11
Substations Id : 8 and 10
Substation to subsequent station disaggregation modeling

Number of groups: 2 Group
#: 1
Substations Id : 3 and 4 Subsequent
stations Id : 1 and 2 Group #: 2
Substations Id : 8 and 10 Subsequent
stations Id : 7 and 9
Using the above configuration the estimated model parameters are given below.
Modeling of Key stations

Basic_statistics_of_the_index_station:
Mean:
16.337555
Standard_deviation:
0.196377
Model_order(p,q):
phi_parameters:
phi_l
0.095154
theta_jparameters:
theta_l
-0.151970
1 1
(Annual)
(Annual)
0.036331
(Annual)
H1 7

Disaggregation_of_index_to_key_stations:
A__matrix
0.515312 0.484688
B_matrix
0.020397
-0.020397
0.000000
0.000001
Disaggregation of Kev stations to substations

Number_of groups:
group_#:
Number of key stations:
Key__stations ID:
Data Transformations:
Station_5:
a-coef=
5
LOG
2000.000000
Basic__statistics_of_the_key_stations: Mean_of_the_process:
8 .147832
0.103274
Number_of_sub_stations:
Sub_stations_ID:
Station_3:
a-coef=
Station_4:
a-coef=
2
34
LOG
-205.000000
LOG
1000.000000
Basic_statistics_of_the_sub_stations: Mean_of_the_process:
6 . 096067
7 .417926
Standard_deviation_of__the_process :
0 .461667 0 . 094175

A matrix
3.933447
0.904966
B matrix
0.219627
-0.002626
0
.
000000 0
. 011408
group_#:
Number_of_key_stations:
1
11
Key_stations_ID:
Station_ll: a-coef=
LOG
2406.000000
Basic_statistics_of_the_key_stations:
8.189722
0.097387
Sub_stations_ID:
Data Transformations:
2
8
Station_8:
a-coef=
Station_10:
a-coef=
LOG
2500.0
LOG
100.000000
Basic_statistics_of_the_sub_stations:
8.090804
6.195611
0.072572
0.205165
A_matrix
0.738420
1.995138
B matrix
0.010106
0.052522
10
0.000000
0.040097
Hl'
Disaggregation of substations to subsequent stations

Nurnber_of_groups:
group_#:
Sub_stations_ID:
Station_3:
a-coef=
Station_4:
a-coef=
2
3
LOG
-205.000000
LOG
1000.000000
6.096067
7.417926
0.461667
0.094175
Number_of_subsequent_stations: 2
Subsequent_stations_ID:
1
2
Station_l:
LOG
a-coef=
49.000000
Station_2:
LOG
a-coef=
210.000000
Basic_statistics_of_the_subsequent_stations
5.658607
6.036669
0
.189585
0
.124544
A_matrix
0.027025
0.005409
1.867341
1.288824
B_matrix
N0

0.033196
0.008933
0.000000
0.013417

group_#:
Sub_stations_ID:
Station_8:
a-coef=
Station_10:
a-coef=
2
8
10
LOG
2500.000000
LOG
100.000000
8.090804
6.195611
Standard_deviation__of__the_process :
0.072572
0.205165
Number_of_subsequent_stations: 2
Subsequent_stations_ID:
Data__Transformations :
Station_7:
a-coef=
Station_9:
a-coef=
LOG
450.000000
LOG
40.000000
Basic_statistics_of_the_subsequent_stations:
6 .488171
5 . 980681
Standard deviation of the process:
0 . 081923
0 . 220482
A_matrix
0 . 841615 -0 . 007637
B_matrix
0.017719 0.007666
0
.
093955
1.071983
0 . 000000
0
020592
H/I
Seasonal (temporal) disaggregation

For annual-monthly disaggregation modeling, the stations were divided into
two groups. The first group contains the stations 1, 2, 3, 4, and 5, while the
second group contains stations 7, 8, 9, 10, and 11. Parts of the annual-monthly
disaggregation modeling results are shown bclow.
Disaggregation Model: Lae condensed Model
Number of groups: 2 Group #: 1
stations id : 1, 2, 3, 4, and 5
Group #: 2
stations id : 7, 8, 9, 10, and 11
group #: 1
Season : 1
A
matrix
0.100187
-0.790256
-1.087746
-0.736562
-0.812205
matrix
0.272571
0.316740
0.313677
matrix
0.553959
0.669728
0.739637
0.610305
0.730781
1.056880
1.411542
1.300074
1.144258
0.337
333
0.342
454
-5.591171
-5.794002
-6.127742
-10.566408
-7.984121
0.000000
0.062122
0.037441
0.087106
1.344898
0.162744
0.154738
0.319091
0.254001
0.245251
0.342263
0.321621
0.432287
0.507121
0.064863 0.175279
-0.220576
0.154692
-0.291538
-0.239491
0.000000
0.000000
0.051319
0.001906
0.021514
2.833674
3.963215
0.458153 0.967182
-1.119348
-1.188065
-1.012993
-1.139128
3.974580
9.683748
5.462620
8.148174
9.577835
10.251024
0.00000
0
0.00000
0
0.00000
0
0.00000
0
0.00000
0
0.00000
0
0.00000
0
0.104401
0.041748
7.173420
8.640571
0.024983
group #: 1
Season : 5
A matrix 7.485024
-6.270320
-8.476122
-8.248028
-9.367930
matrix
0.812630
0.701064
0.697859
0.728257
0.736211
matrix
4.226117
3.171489
2.938035
2.794332
3.306370
group #: 2
Season : 1
A
matrix
1.716307
-0.044698
-0.326232
-0.327360
-0.147128
B matrix
0.356707
0.429611
0.285874
0.248105
0.420281
22.856783
20.105253
23.010061
16.965302
24.837963
0.000000
0.166081
-0.003222
0.054581
0.000507
-2.866502
-2.221176
-2.149964
-1.867283
-2.100979
16.572737
23.627403
20.584972
15.794100
22.826544
0.000000
0.108808
0.072522
0.057508
0.099421
1.439505
1.168698
1.391239
0.467788
1.053099
0.000000
0.000000
0.098642
0.186866
0.011022
0.141137
0.463517
0.547366
0.927850
1.024580
0.725862
-9.404758
-10.550832
-11.500857
-8.688657
-8.968649
-2.391560
-2.099235
-0.642270
-1.273152
-1.783531
3.279173
1.744206
1.468768
1.905422
1.898677
-3.392607
-2.921220
-1.793802
-2.042270
-3.548427
0.000000
0.000000
0.067637
0.059324
0.026707
-12.453279
-11.709853
-15.063147
-8.150460
-15.114661
0.191075
0.000000
0.000000
0.000000
0.000000
0.050552
0.000000
0.000000
0.000000
0.035298
0.014496
-6.186196
-5.100427
-1.557045
0.988259
-1.299358
0.000000
0.000000
0.000000
0.000000
0.103991
0.000000
0.000000
0.000000
-0.697620
-0.114892
0.423544
1.038808
1.114273
matrix
0.298232
0.272947
0.103876
0.106382
0.254331
0.338
970
-0.406830
-0.334373
-0.131294
-0.066460
-0.362385
0.236938 0.629304
-0.576146
-0.130068
-0.164665
-0.476519
0.654334
0.674303
0.182709
0.211029
0.584160
0.592925
0.381739
0.244746
0.659288
group #: 2
Season : 5
A matrix 3.034014
10.645264
-3.917177
-5.067391
-8.766766
matrix
2.416789
1.659479
1.618642
1.504343
1.396754
-2.220287
-3.768532
-0.828180
-2.485281
-3.613741
1.226575
6.803477
6.347968
4.987008
6.793612
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.000000
0.285451
0.160045
0.209446
0.127493
0.115327
-0.016704
0.075774
-0.030911
0.595607
0.049310
1.505506
1.830661
1.827437
1.679651
0.869733
-3.527168
-2.907587
-2.917736
-2.833496
-2.370874
2.026513
1.818996
2.342135
2.304887
1.566433
2.032238
-2.139591
-2.449902
-2.352898
-1.248207
matrix
0.490879
0.408971
0.435884
0.417643
6.733072
11.803424
-0.284301
3.801157
9.480949
1.720777
2.314499
0.804703
2.491937
2.021952
These estimated parameters were used to generate 100 samples of monthly

data each of 48 years long for the 10 sites. Part of the statistical analysis results
of the generated data is shown on the next page.
Stochastic Analysis, Modeling and Simulation (SAMS 2000)

/ 21
Model: Seasonal Disaggregation, (Statistical
Analysis of Generated Data) Site Number: 2
KACHESS
RESERVOIR
Historical Generated
Season
Mean
1
16 . 8646
16.9702
2
19.7521
20.0277
3
16.1458
16.1352
4
13.3875
13.4198
5
15.2688
15.3021
6
26.2375
26.3170
7
44 . 6521
44.3891
8
33 .4583
33.1077
9
11.4625
11. 5481
1
2.5000
2.5730
0
1
3.0542
3.0691
1
1
8.9646
9.0435
2
Standard Deviation
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
12.1013
12.8655
9.2932
7 . 9937
8.5009
8 . 1718
14.4182
16.2124
9.5621
2.2897
3 . 0264
6.7700
ISMIS3
.. Historical
Station 2 (KA)
MEAN Generated
5
4
A,
/\
I i i
S 1b 11 1^
i )
Season
STANDARD DEVIATION Generated Station
12.1722
12.9882
9.5560
8.6000
8.7934
7.9901
13.9078
16.0398
11.1432
2.8500
2.7782
7.4574
H 25
(KA)

i Skewness Coefficient
1.1320 1.4399
2
1.8967
1.5356
0.7127
1.2846
0.9671
1.7533
5
6
7
8
9
1
0
1
1
1
2
2 .2952
0.1600
0.3599
0.2885
1.1013
1.2974
3.1266
1.1720
2.4070
0.2967
0.3659
0.5599
2.1439
2.2914
1.9573
1.8414
OII
El
5
SKEWNESS C0EFF.
Historical Generated Station 2 (KA)
2
1
* ......................................... ............
aA A
1\
1 i 3 4 S 7 1b 1'1 1^ Season
H 26
Season to Season Correlations

LAG
0.6589
0.4100
2
3
4
5
0 .3546
0.4388
0.4377
0.2811
0 . 0489
0 .5925
0.8565
0 .8978
0 .3768
0.6031
7
3
9
1
0
1
1
El 13
0 .4416
0.5262
SEASON TO SEASON CORRELATION Historical

Generated Station 2 (KA) Lag 1
1.0
0.2913
0.3904
0.3056
0.1841
0.0670
0.6309
0 . 8436
0.6757
0.4301
0.4574
0.6
S
05 0
O
-0.5
-1.0
'A
1
2
Demand Level
"I 3 5 I l ? 5 ib i'i iT
season

1.0
* sample mean
Longest Drought
Mximum Dficit
Longest Surplus
Mximum Surplus
Storage Capacity
Rescaled Range
Hurst Coeffcient
10.0000
112.4566
8.0000
163.0347
564.9718
36.5715
0.6356
10.7700
124.0420
7.6300
181.7904
593.0753
37.6897
0.6359
_____________________________ jBB

site Number: 11 NACHES_R_BELOW_TIETON_R_NEAR_NACHES Season
Historical Generated Mean
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
56.0854
76.6458
64.6104
62.5875
77.3479
151.5771
280.5313
236.7167
103.7792
40.6542
28.0208
36.2292
56.1426
77.0317
64.7072
63.0430
77.0965
152.925
1
280.540
1
236.687
3
103.159
6
40.6756
28.1190
35.8892
BSD
MEAN
'Historical
Standard Deviation
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
40.6182
70.4072
42.6869
39.8475
48.1696
59 .2644
96.9022
103 .3385
57.9129
17.0232
10.6177
20.6300
44.2509
64.4993
41.9235
41.8386
43.1282
58.0738
94.0675
100.587
4
57.7420
16.5544
10.5495
19.6031
1
2
3
4
5
6
7
8
9
1
0
1
1
1
2
0.9662
3.4510
2.3219
1.3324
2.7078
0.4732
0.5250
0.2663
0.8792
0.0272
-0.4915
1.3003
1.8455
2.1245
1.8524
1.8464
1.7876
0.5181
0 . 6177
0.4145
1.2173
0.1560
0.1691
1.3090
N27
Generated
Station 11
(NA)
21 / J.D. Salas, W.L. Lae, D.K.

Frevert
Season
to
Season
Correlations
LAG 1
0
.6809
3
0
.
4
5099 0
5
. 7709
6
0
.
7
5542
3
0.4644
9
0.3119
10
0.3550
11
0.5396
12
0.8317
Storage and
0 Drought
.
8759
Demand
Level
=
0.7917
Longest 0.5972
Drought
Mximum
Dficit
Longest
Surplus
Mximum
Surplus
Storage
Capacity
Rescaled
Range
Hurst Coefficient
2
1.0000 *
0
.
4309
0
.6573
0.548
2
0.540
6
0.543
8
0.471
0
Statistics
0.326
4 0 .
sample mean
5680
10.0000
0.847
693.8212
3 0 .
8521
7.0000
0.797
1063.9717
7
3839.3513
0.645
39.6617
2
0.6499
11.2400
767.2662
7.7000
1208.4659
3697.9873
38.2223 0
. 6381
Lag-0 Season to Season

Sites 1 and 2 (KE & KA)
Cross
Correlations
MIE
3

1
2
3
4
5
6
7
8
9
10
11
12
0.9853
0.9828
0 .
9793
0.9847
0.9924
0.9632
0.9788
0.9906
0.9888
0 .
8572
0 .
9504
0.9888
S tes 3 and
i
1
0. . 9068
2
3
4
5
6
7
8
9
1
0
1
1
1
2
0. ,8623
0. .6949
0. . 8251
0. , 9108
0. .7394
0. .7722
0. .8394
0. .7933
0. .4031
0. .2735
0, .8937
0.9844
0.9780
0.9725
0.9738
0.9650
0.9615
0.9761
0.9891
0.9578
0.6456
0.8028
0.9815
BHE3
SEASON TO SEASON CROSS CORRELATION Historical -"-Generated Stalion
1 & 2 (KE&KA) LagO
........ ...
0.5
95% L
c
t
O
-0.5
95% L
13451
6 1b 11 iS
season
(Y
A
0 .3507
.
0,
. .12506056
0
.0 . 1 2 7 5
. .0830
0
.0 . 0 9 9 4
. .2210
0
.0 . 3 0 8 7
.0 . 2 8 4 1
. .1972
0
.0 . 2 2 7 2
. .2705
0
,
ites
1
5 and 11 (YA & NA)

0.9392
0.3422
2
3
4
5
6
7
8
9
0.8905
0.8189
0.9137
0.9296
0.9286
0.9512
0.9699
0.9462
0.1526
0.2284
0.1553
0.0695
0.0676
0.2773
0.3916
0.3967
H2

10
11
12
0.6776
0.3608
0.9007
Sites
8 and 10
0.9755
0.9557
io 0 e
2
3
4
5
6
7
8
9
10
11
12
0.9867
0.9796
0.9847
0.9827
0.9781
0.9833
0.9897
0.9770
0.7619
0.5003
0.9741
0.9086
0.9642
0.9656
0.9274
0.9773
0.9752
0.9888
0.9533
0.7168
0.4811
0.9092
21.6.2.
0.4131
0.3005
0.2319
(NA & TI)
EiEa
SEASON TO SEASON CROSS CORRELATION Historical Generated
Station 8 fi. 10 (NA&TI) Lag 0 .
0.5
g
V
.................................................................................................... 95% L
95% L
-0.5 -1.0
\ 2 a 4 s 6 ? d iti i'i 1*2

season
Example of Forecasting by Stochastic Simulation
Taking the basic autoregressive lag one annual model, AR(1), it is easy to
show how the generated flows are altered under a forecasting environment.
For argument sake, assume the mean of the process is 20, the standard
deviation is 5, the lag one serial correlation is 0.5, and the data is indeed
normally distributed and there is no need for any transformation. Assume that
many traces are to be generated into the near future, say for three years. We
will examine how this differs from a standard generation for first an normal
preceding year, second for an abnormally high preceding year and third for an
abnormally low preceding year.
First assuming that the previous just experienced years flow was exactly
the mean. This is a zero departure form the mean and the lagged term has a
corresponding valu of zero. The generated vales for the first year will be
distributed with a mean of 20 (unchanged because the lagged term had no
effect). The standard deviation of the generated flows is reduced to 4.33. Note
that a standard non-forecasting generation would have produced a mean of 20
and a standard deviation of 5. The second year will have a mean of 20.
Similarly for the third year. The standard deviation will be 4.84 for the second
year and 4.96. Note that the standard deviation quickly converges to the nonforecasting valu as is expected with a short term memory process.
Now, if the preceding year had been abnormally high, say lwo Standard
deviations high (a flow of 30), the means w i l l he 25, 22.5, and
N MI
21.25 respectively for the three stochastically forecasted annual flows.

Interestingly, the standard deviations are as before, 4.33, 4.84, and 4.96.
If the preceding year had been abnormally low by two standard deviations
(a flow of 10), the means will be 15, 17.5 and 18.75 for the three forecasted
years. Again the standard deviations are unchanged. The following tables
show the behavior of the means and standard deviations for these examples:
STOCHASTICALLY FORECASTING EXAMPLE BEHAVIOR
OF THE MEAN
Generated
Year
Non-Forecasting
Forecasting Preceding Years
Model (for
Flow High Average Low
First
comparison)
20
25
20
15
Second
20
22.5
20
17.5
Third
20
21.25
20
18.75
STOCHASTICALLY FORECASTING
OF THE STANDARD DEVIATION
Generated
Year
Non-Forecasting
Model (for
EXAMPLE
BEHAVIOR
Forecasting Preceding Years

Flow
High Average Low
First
comparison)
5
4.33
4.33
4.33
Second
4.84
4.84
4.84
Third
4.96
4.96
4.96
REFERENCES
liratley, P., Fox, B.L., and Schrage, L.E., 1987, A Guide to Simulation, 2nd. Edition, Springer
Verlag, New York.
l emandez, B., and J. D. Salas, 1990, Gamma-Autoregressive Models for Stream-Flow
Simulation, ASCE Journal of Hydraalic Engineering, vol. 116, no. 11, pp. 14031414.
I revert, D. K., M. S. Cowan, and W. L. Lae, 1989, Use of Stochastic Hydrology in Reservoir
Operation, J. Irrig. Drain. Eng., vol. 115, no. 3, pp. 334-343.
( l i l i , P. E., W. Murray, and M. H. Wright, 1981, Practical Optimization, Academic
Press, N. York.
(irygicr, J. C., and J. R. Stcdinger, 1990, SPIGOT, A Synthetic Streamflow
Generation Software Package, technical description, versin 2.5, School of Civil
and linvironmental linginccring, Comoll University, Ithaca, N.Y..
1111 n i non 11> I mi, I). M., I >72, Applied Nonlini'nr Pmgramming, McGraw-Hill, New York.
Hll
O 0.50
Fig. 21.3. Seasonal data transformation results.
The steps that are usually involved in using this

transformation window option are summarized below:
2. Transform the data by using the selected transformation type
and coefficients
3. Saving the selected transformation type and coefficients in
a file
4. Transforming data by loading the previously saved
transformation parameter file
It is suggested that if transformations are needed for both
annual and seasonal data, the user should conduct annual data
transformation before conducting seasonal data transformation.

Cap20 Hidro

Caricato da

Informazioni sul documento

Descrizione originale:

Titolo originale

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Cap20 Hidro

Caricato da

Copyright:

Formati disponibili

Section 5: Environmental

STOCHASTIC EVENT FLOOD MODEL

The basic concept of the stochastic event-based rainfall-runoff model41 is to

20 I M.G. Schaefer, B.L. Barker

Fig. 20.1a. Magnitude-Frequency Curve for Peak Discharge.

Stochastic Event Flood Model (SEFM) / 20

Fig. 20.1b. Magnitude-Frequency Curve for Runoff Volume.

Fig. 20.1c. Magnitude-Frequency Curve for Mximum Reservoir Level.

CAPABILITIES OF THE STOCHASTIC MODEL

20 / M. G. SchaeJ'er, B.L. Barker

precipitation, elevation, and soil characteristics. Hydrometeorological

Simulation Capabilities of the SEFM model

The simulation capabilities of the model include:

Stochastic Event Flood Modcl (SEFM) / 20

runoff Ability to - Compute runoff on a distributed basis using

APPLICABILITY OF THE STOCHASTIC MODEL

DISTRIBUTED RAINFALL-RUNOFF MODELING

A key element in the stochastic approach is the selection of realistic initial

20 / M.G. Schaefer, B.L. Barker

Hydrologic Runoff Units

To accommodate the distributed approach, the watershed is divided into

Mean Annual Precipitation Zones

Mean Annual Precipitation (MAP) often vares widely across mountainous

Stoehastic Event Flood Model (SEFM) / 20

Fig. 20.2. Mean Annual Precipitation Zones for Keechelus Watershed,

20.4.3. Elevation Zones

20 / M.G. Schaefer, t.L. Barker

Scale: 1:200,000 1 t/liie

Fig. 20.3. Elevation Zones for Keechelus Watershed, Washington.

20.4.4o Soils Zones

Stochastic Event Flood Model (SEFM) / 20

Fig. 20.4. Soil Zones for Keechelus Watershed, Washington.

A number of hydrometeorological inputs are required in employing the

Table 20.1. Independent and Dependent Hydrometeorological

20 IM.G. Schaefer, B.L. Barker

Month of Storm Occurrence

October 1st Soil Moisture

Dependent upon: 1 and

Temperature During Storm

Dependent upon: 1 and

Probabilistic Inputs for Initial Watershed Conditions

Date of Occurrence of Extreme Storms - is the end-of-month of occurrence of

Stochastie Event Flood Model (SEFM) / 20

Antecedent Precipitation - is the total precipitation from the start of the

20 / M. G. Schaefer, B.L. Barker

parameter Gamma distribution for a zone of mean annual precipitation within

Stochastic Event Flood Model (SEFM) / 20

Fig. 20.7. Example Distribution of End-of-February Snowpack for Crooked River

Fig. 20.8. Simple Box-Plot for the Variability of End-of-lYIonth Reservolr

20 / M.G. Schaefer, B.L. Barker

Probabilistic Inputs Related to the Occurrence of the

Stochastic simulation of the temporal and spatial distribution of extreme

Fig. 20.9. Basin-Average 24-Hour 10 mi2 Precipitation Magnitude- Frequency

Precipitation Temporal Characteristics - Probabilistic information about the

Stochastic Event Flood Model (SEFM) / 20

24-Hour = 8.0 inches

72 Hour = 10.7 inches

20 / M.G. Schaefer, B.L. Barker

Area-duration curves for the 24-hour duration for Keechelus watershed29

Fig. 20.11. Probabilistic Depth-Area-Duration Curves for Cascade Mountain

Inputs Related to Rainfall-Runoff Modeling

Rainfall-runoff computations are accomplished in two stages. First, surface