
12th International Conference on Urban Drainage, Porto Alegre/Brazil, 11-16 September 2011

Intensity-Duration-Frequency Estimation using Generalized Pareto Distribution for Urban Area in a Tropical Region
M.D. Norlida1*, I. Abustan1, R. Abdullah1, A. S. Yahaya1, O. Sazali1, M.D. Mohd Nor2 and M.S. Lariyah2
1 School of Civil Engineering, Engineering Campus, Universiti Sains Malaysia, 14300 Nibong Tebal, Seberang Perai Selatan, Pulau Pinang, Malaysia
2 College of Engineering, Civil Engineering Department, Universiti Tenaga Nasional, Jalan Ikram-Uniten, 43000 Kajang, Selangor, Malaysia
* Corresponding author, e-mail norlidamd@water.gov.my, norlidamd@gmail.com

ABSTRACT
The Generalized Pareto Distribution (GPD) is used to derive the Intensity-Duration-Frequency (IDF) curve for an urban area located in a tropical region using partial duration series (PDS). The Method of L-Moments (LMOM) is used to fit the distribution, while the Kolmogorov-Smirnov (K-S) test is used to assess goodness of fit. The procedure was repeated for eleven rainfall durations, ranging from 5 minutes to 4320 minutes. Data were extracted from five urban rainfall stations. For comparison, the three-parameter Log-Logistic, LL(3)P, and the Generalized Extreme Value (GEV) distributions were also fitted. The continuous shape (k), scale and location parameters of the GPD were used to derive recurrence intervals for predicting rainfall intensities at rainfall stations with less than 10 years of data. The study found that the GPD is the most appropriate of the three distributions. In the majority of cases the GPD provided good fits to the PDS, while in the remaining cases its performance fell to third place in the ranking. Summed over the eleven rainfall durations, the rankings of the GPD, GEV and LL(3)P totalled 89, 111 and 130 respectively. Over the eleven durations, the GPD, GEV and LL(3)P were ranked first in 61.8%, 18.2% and 20% of cases respectively.

KEYWORDS
Generalized extreme value; generalized Pareto distribution; log-logistic; method of L-moments; maximum likelihood estimates

INTRODUCTION
The rainfall Intensity-Duration-Frequency (IDF) relationship is one of the most commonly used tools in water resources engineering, whether for planning, designing and operating water resources projects or for protecting engineering works (e.g. highways or dams) against floods. IDF curve estimation is used to estimate floods for different rainfall intervals at specific return periods. Several statistical distributions have been applied to characterise the extreme behaviour of rainfall by fitting a mathematical framework to ordinary rainfall data and discharge observations (Koutsoyiannis et al., 1998). In this study the GPD, GEV and LL(3)P are used to characterise the partial duration series of recorded rainfall at stations with less than 10 years of data. The GEV distribution was re-introduced and reviewed by Bertin and Clusel (2006) to provide a general framework for the frequency analysis of extreme hydrological and meteorological events. GPD and GEV parameter estimation has been widely investigated, for example in the search for the most accurate continuous parameters on time series samples in Italy (Deidda and Puliga, 2009) and in a review of the performance of parameter estimators in the USA (Zea Bermudez and Kotz, 2010). Ashkar and Mahdi (2006) fitted the log-logistic distribution by generalized moments, alongside Maximum Likelihood Estimates (MLE), probability weighted moments (PWM) and LMOM, and found that parameter accuracy depends on the type of generalized moment chosen. Cunnane (1989) reviewed flood frequency estimation. Ahmad (2008) used a rainfall threshold to separate convective from non-convective rainfall in the tropical region of Kuala Lumpur. Abustan et al. (2000) used urban rainfall and stream flow to find the relationships between the two hydrologic parameters and rainfall station altitudes in a highly urbanised area of Kuala Lumpur. This paper explains and illustrates the continuous parameter estimation of the GPD, GEV and LL(3)P based on PDS (POT) data, while the Kolmogorov-Smirnov (K-S) test is used to assess goodness of fit. The study found that the GPD is the most appropriate of the three distributions. The GPD parameters are used to derive IDF curves for the rainfall stations in Kuala Lumpur.

METHODS
Probability Distribution
Generalized Pareto Distribution. The GPD is defined by its probability density function with a continuous shape parameter k, a continuous scale parameter, and a continuous threshold or location parameter (Zea Bermudez and Kotz, 2010).
Generalized Extreme Value (GEV) Distribution. The GEV distribution, which is widely recommended for flood frequency analysis, is defined by its probability density function and cumulative distribution function (Bertin and Clusel, 2006). The class of GEV distributions is very flexible, with a tail shape parameter k together with a scale and a location parameter.
Log-Logistic. The LL(3)P distribution has been used in hydrology for modelling rainfall and stream flow (Singh and Guo, 1995). In its unbounded form its range extends to +∞. The three-parameter Log-Logistic distribution, LL(3)P, has a shape, a scale and a location parameter.

Goodness of Fit Test
The K-S test (Chakravarti et al., 1967; Andres-Domenech et al., 2010) is used to decide whether a sample comes from a population with a specific distribution. The test compares the K-S statistic with critical values, which must be calculated for each distribution; here the tests were run in EasyFit version 5.3 Professional to select the best distribution. The distribution with the smallest K-S statistic was taken as the best fit, following the classical Glivenko-Cantelli Theorem (Topsoe, 1970).

Estimation Methods
Fitting a statistical distribution to extreme rainfall event data is necessary to establish good IDF curves. The Method of L-Moments (LMOM) and Maximum Likelihood Estimates (MLE) have been employed for this purpose.

Figure 1. The location of Sg. Kerayong in Malaysia.

Method of L-Moments. L-moments are linear combinations of order statistics. They are robust to outliers and virtually unbiased for small samples, which makes them suitable for rainfall and flood frequency analysis, including the identification of distributions and parameter estimation (Hosking and Wallis, 1993). Pearson (1991) and Yang et al. (2009) developed regional flood frequency analyses based on LMOM for New Zealand and China respectively. Vogel and Fennessey (1993) argued that L-moment diagrams should replace product moment diagrams.
Maximum Likelihood Estimates. The MLE equations depend on the sample size N and involve matrix-form calculation (Singh et al., 1993). MLE determines the parameters that maximize the probability (likelihood) of the sample data. In statistical hydrology, the method of MLE is considered to be more robust (Singh et al., 1993) and versatile, and it yields estimators with good statistical properties. MLE methods apply to most models and to different types of data. In addition, they provide efficient methods for quantifying uncertainty through confidence bounds.
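The L-moment fitting step described above can be illustrated with a minimal sketch. It is not the EasyFit procedure used in the study. It assumes the GPD is written as F(x) = 1 - (1 + k(x - mu)/sigma)^(-1/k), and it uses the standard three-parameter L-moment estimators with unbiased sample probability-weighted moments. The function names and the synthetic parameter values are hypothetical.

```python
import math
import random


def sample_l_moments(data):
    """Return (l1, l2, t3): the first two sample L-moments and the L-skewness."""
    x = sorted(data)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")
    # Unbiased probability-weighted moments b0, b1, b2 (i is the 0-based rank).
    b0 = sum(x) / n
    b1 = sum(i * xi for i, xi in enumerate(x)) / (n * (n - 1))
    b2 = sum(i * (i - 1) * xi for i, xi in enumerate(x)) / (n * (n - 1) * (n - 2))
    l1 = b0
    l2 = 2.0 * b1 - b0
    l3 = 6.0 * b2 - 6.0 * b1 + b0
    return l1, l2, l3 / l2


def fit_gpd_lmom(data):
    """L-moment estimates (k, sigma, mu) for the assumed GPD form
    F(x) = 1 - (1 + k*(x - mu)/sigma)**(-1/k)."""
    l1, l2, t3 = sample_l_moments(data)
    kh = (1.0 - 3.0 * t3) / (1.0 + t3)    # shape in Hosking's sign convention
    sigma = (1.0 + kh) * (2.0 + kh) * l2
    mu = l1 - (2.0 + kh) * l2
    return -kh, sigma, mu                 # flip the sign to match the form above


def gpd_sample(n, k, sigma, mu, rng):
    """Draw n synthetic GPD values by inverting the assumed CDF."""
    out = []
    for _ in range(n):
        u = rng.random()
        if abs(k) < 1e-12:
            out.append(mu - sigma * math.log(1.0 - u))
        else:
            out.append(mu + sigma / k * ((1.0 - u) ** (-k) - 1.0))
    return out


# Synthetic check with hypothetical parameter values (not data from the study).
demo = gpd_sample(2000, 0.1, 10.0, 35.0, random.Random(1))
print(fit_gpd_lmom(demo))
```

The sign convention for k matters. Hosking's parameterisation uses the opposite sign, so fitted shapes should be compared only after confirming which form a given software package reports.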

STUDY AREA AND DATA DESCRIPTION


Based on the available rainfall data, averaging 8.5 years for the majority of stations, the POT series for five gauged sites in the Klang region, in the centre of the city of Kuala Lumpur, Malaysia, were considered for analysis. The chosen area has undergone extensive urbanisation over the years, which has resulted in an increase in population. Rainfall varies in time and space. It is therefore crucial to investigate the urban rainfall IDF, and this study includes some of the rainfall stations within the catchment. Figure 1 shows the study area, Table 1 gives the inventory of the five rainfall stations, and Figure 2 illustrates the rainfall station locations in the catchment. JPS Ampang station has the longest rainfall record of the five stations.

Table 1. Station Inventory for Selected Urban Rainfall Stations.
No  Station Number  Station Name        Data From   Data To
1   3117070         JPS Ampang          30-Jun-70   18-Jan-09
2   3117101         Kg. Cheras Baru     27-Mar-98   20-Nov-08
3   3117102         Taman Miharja       31-Mar-98   15-Feb-09
4   3117104         Pandan Indah        02-Jan-06   01-Jul-06
5   3117130         Jam. Jalan Cheras   07-Dec-06   13-Jan-08

Partial Duration Series and Threshold Data
Studies of Partial Duration Series (PDS), also called Peak Over Threshold (POT) series, with various distributions have been conducted by Diebolt et al. (2003), Madsen and Rosbjerg (2004), and Zea Bermudez and Kotz (2010). For the regions identified, the GPD is a suitable distribution for the heavier tail at the upper end of the series. In longer records, however, the threshold was usually raised so that on average only three or four floods a year are included. Van Montfort and Witter (1986) examined PDS having on average 110 events per year. Rosbjerg and Madsen (2004) demonstrated that the PDS/GPD model is competitive with the Annual Maximum Series (AMS)/GEV model and highly efficient for regionalisation. Dahal and Hasegawa (2008) recommended a threshold relationship fitted to the lower boundary of the data group defined by landslide-triggering rainfall events, which is suitable for the Nepal Himalaya. Both visual and statistical methods have been proposed for obtaining a precipitation threshold from the various monthly rainfall thresholds in an arid environment (Lopez et al., 2008). In the present study, PDS ranging from small to large are considered. A threshold of 35 mm/hr, following Ahmad (2008), is used as the basis and is varied according to the rainfall time interval. Data quality is improved by checking against neighbouring stations within a radius of 2 km.
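As an illustration of how a peaks-over-threshold series might be extracted, the sketch below keeps one peak per cluster of exceedances. The paper does not state its declustering rule or how the threshold is varied with duration, so the `min_gap` rule, the function names and the sample values are assumptions made only for this example.

```python
def extract_pot(series, threshold, min_gap=1):
    """Return [(index, value), ...] with one peak per event.

    Exceedances whose indices are at most `min_gap` steps apart are treated
    as the same event (an assumed rule); the largest value of each event
    is kept.
    """
    peaks = []
    current = None     # (index, value) of the running event maximum
    last_idx = None    # index of the latest exceedance in the running event
    for idx, value in enumerate(series):
        if value <= threshold:
            continue
        if current is not None and idx - last_idx <= min_gap:
            if value > current[1]:
                current = (idx, value)
        else:
            if current is not None:
                peaks.append(current)
            current = (idx, value)
        last_idx = idx
    if current is not None:
        peaks.append(current)
    return peaks


def events_per_year(peaks, record_years):
    """Mean number of retained events per year of record."""
    return len(peaks) / record_years


# Hypothetical intensities in mm/hr, with the 35 mm/hr base threshold.
intensities = [10, 40, 55, 20, 36, 5, 5, 5, 80, 30]
print(extract_pot(intensities, threshold=35.0, min_gap=2))
```

Raising the threshold or widening `min_gap` reduces the number of events per year, which is the quantity the text describes as typically three or four in longer records.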

Figure 2. The study catchment in a highly developed urban area of Kuala Lumpur and the distribution of the five rainfall stations within the Sg. Kerayong catchment.

Figure 3. GPD, GEV and LL(3)P Continuous Parameter Plot.
[Nine panels: the shape, scale and location parameters of the GPD, GEV and LL(3)P plotted against duration (minutes), each with minimum and maximum fitted curves. Fitted equations legible in the panels, with x the duration in minutes:
GPD shape k: y = -4E-05x + 0.0995 (R² = 0.6439); y = 3E-09x^2 - 6E-06x - 0.1979 (R² = 0.0199)
GPD scale: y = 4.3233x^0.2611 (R² = 0.8718); y = 2.1627x^0.3219 (R² = 0.9089)
GPD location: y = 12.98x^0.2789 (R² = 0.9315); y = 9.7849x^0.185 (R² = 0.8362)
GEV shape k: y = -3E-05x + 0.3017 (R² = 0.6333); y = 0.1145e^(4E-05x) (R² = 0.0261)
GEV scale: y = 2.3524x^0.2471 (R² = 0.8818); y = 1.9239ln(x) - 2.8527 (R² = 0.8724)
GEV location: y = 15.461x^0.2703 (R² = 0.945); y = 10.453x^0.2189 (R² = 0.8798)
LL(3)P shape: y = -0.092ln(x) + 2.6089 (R² = 0.2529); y = 1.2548x^0.0353 (R² = 0.3616)
LL(3)P scale: y = 2.8835x^0.2646 and y = 1.8786x^0.2942 (R² = 0.9383 and 0.9073)
LL(3)P location: y = 13.285x^0.273 (R² = 0.9323); y = 9.0855x^0.1864 (R² = 0.8075)]
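The power equations in Figure 3 relate a fitted parameter to rainfall duration. A minimal sketch of how such a curve could be obtained is given below. It assumes an ordinary least-squares fit of y = a x^b in log-log space, with R² also computed in log space. The averaging helper reflects the idea of taking a general equation midway between the lower and upper bound curves. The function names and the duration list are hypothetical, and the study's own fitting tool is not stated.

```python
import math


def fit_power_law(x_values, y_values):
    """Least-squares fit of y = a * x**b in log-log space; returns (a, b, r2)."""
    lx = [math.log(x) for x in x_values]
    ly = [math.log(y) for y in y_values]
    n = len(lx)
    mx = sum(lx) / n
    my = sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    syy = sum((v - my) ** 2 for v in ly)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 1.0
    return a, b, r2


def mean_of_bounds(lower, upper, x):
    """Average of two power curves, each given as (a, b), evaluated at x."""
    (a1, b1), (a2, b2) = lower, upper
    return 0.5 * (a1 * x ** b1 + a2 * x ** b2)


# Hypothetical durations (minutes) and an exact power curve as a sanity check.
durations = [5, 10, 15, 30, 60, 180, 360, 720, 1440, 2880, 4320]
values = [3.0 * d ** 0.25 for d in durations]
print(fit_power_law(durations, values))

# GPD scale bound curves as read from the Figure 3 panel text.
print(mean_of_bounds((2.1627, 0.3219), (4.3233, 0.2611), 60.0))
```

Fitting in log space weights relative rather than absolute errors, so the resulting coefficients can differ from a direct nonlinear fit on the original scale.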

RESULTS AND DISCUSSION


Small and larger PDS (Zea Bermudez and Kotz, 2010), or POT series, are considered and fitted with the various distributions. POT data are extracted from five rainfall stations in the urban area, with records averaging 8.5 years for the majority of stations. The continuous shape (k), scale and location parameters of the GPD are used to derive rainfall estimates for various recurrence intervals following Langbein (1949). The study found that samples of fewer than 20 events, as compared with samples of about 100, gave unstable IDF curve shapes when plotted. The characteristics of the parameter estimation equations are illustrated in Figure 3. For the GPD, the plotted continuous shape parameter k ranges from -0.30 to 0.18, the scale parameter from 2.23 to 44.51 and the location parameter from 8.81 to 121.61. The shape parameters plotted for the three distributions tend to converge to a point as the rainfall interval increases. The scale parameters of the three distributions show clear boundaries: their minimum and maximum values follow power equations of similar shape that lie adjacent to each other and form envelope curves. The third continuous parameter, the location parameter of the GPD, GEV and LL(3)P, shows diverging shapes as the rainfall duration increases. In other words, the difference between the minimum and maximum location values becomes greater as the rainfall duration increases. The Probability-Probability plot (P-P Plot) is used to compare P (model) and P (empirical) of the cumulative distribution functions of the GPD, GEV and LL(3)P distributions (Figure 4). The P-P Plot shows how well the GPD fits the observed data. This plot will be approximately linear if the specified theoretical distribution is the correct model. The K-S tests did not reject the null hypothesis that the data follow the specified distribution. Meanwhile, the percentage of first-ranking preference is 61.8% for the GPD, 18.2% for the GEV and 20% for the LL(3)P, as shown in Table 2.

Table 2. K-S Test Preferable First Ranking of GPD, GEV and LL(3)P.
Distribution   Total 1st Ranking (no. of times)   % Preference 1st Ranking
GPD            34                                 61.8%
GEV            10                                 18.2%
LL(3)P         11                                 20%

Furthermore, the GPD continuous shape, scale and location parameters were used to derive rainfall intensity estimates for various recurrence intervals. The average recurrence interval (ARI) lines appear undulating and give improper IDF curves where small rainfall samples, with fewer than 20 events in one population, are included in the analysis, as at stations 3117104 and 3117130. The study therefore proceeded with three selected rainfall stations: 3117070, 3117101 (Figure 4) and 3117102. The three stations provided continuous parameter estimates for each of the three parameters and showed a common shape for the GPD, GEV and LL(3)P in Figure 3. Two further investigations were carried out to minimise the continuous parameter error and to increase the accuracy. First, it was found that the power equations of the scale and location parameters gave the highest accuracy, ranging between 83.6% and 94.5% on a linear-log scale. Secondly, the accuracy of the continuous shape parameter equations ranged from 2% to 64.4% on a linear-linear scale with a linear or a polynomial equation. A general equation can be taken as the average of the lower and upper bound continuous parameter equations.
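To make the step from fitted GPD parameters to an intensity at a given ARI concrete, the following sketch inverts the GPD for a partial duration series. It assumes the form F(x) = 1 - (1 + k(x - mu)/sigma)^(-1/k) and a known mean number of events per year. It also includes the Langbein (1949) relation between annual-series and partial-duration recurrence intervals. The paper does not state which convention its ARI values follow, and the parameter values below are hypothetical, chosen within the ranges reported above.

```python
import math


def partial_duration_ari(annual_ari):
    """Langbein relation: partial-duration ARI equivalent to an annual-series ARI."""
    return 1.0 / -math.log(1.0 - 1.0 / annual_ari)


def gpd_quantile(p, k, sigma, mu):
    """Inverse of the assumed CDF F(x) = 1 - (1 + k*(x - mu)/sigma)**(-1/k)."""
    if abs(k) < 1e-12:                      # exponential limit as k -> 0
        return mu - sigma * math.log(1.0 - p)
    return mu + sigma / k * ((1.0 - p) ** (-k) - 1.0)


def gpd_return_level(ari_years, events_per_year, k, sigma, mu):
    """Value exceeded on average once every `ari_years` in a PDS."""
    p_exceed = 1.0 / (events_per_year * ari_years)
    if not 0.0 < p_exceed < 1.0:
        raise ValueError("ARI too short for this event rate")
    return gpd_quantile(1.0 - p_exceed, k, sigma, mu)


# Hypothetical parameters for a single duration; 4 events per year assumed.
for ari in (2, 5, 10, 20, 50, 100):
    print(ari, round(gpd_return_level(ari, 4.0, -0.1, 10.0, 35.0), 2))
```

Repeating the calculation for each duration, with parameters taken from the duration-dependent equations, would trace out one IDF curve per ARI.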

[Left panel: P-P Plot for station 3117101 (30 minutes), P (Model) against P (Empirical), both from 0 to 1, for the Gen. Extreme Value, Gen. Pareto and Log-Logistic (3P) distributions. Right panel: Rainfall Intensity-Duration-Frequency Curve, Site 3117101 - Kg Cheras Baru (1998-2008), Rainfall Intensity (mm/hr) against Duration (minutes) on logarithmic axes. Curves 1 to 6 denote 1 in 2-year, 5-year, 10-year, 20-year, 50-year and 100-year respectively.]

Figure 4. P-P Plot and IDF Curve for Station Kg. Cheras Baru, 3117101.
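The P-P comparison and the K-S statistic both measure how far a model CDF departs from the empirical CDF. The sketch below shows one way to compute them. It assumes the same GPD form as above and a Hazen plotting position, (i - 0.5)/n, for the empirical probabilities. EasyFit's plotting position is not stated in the paper, and the function names and sample values are hypothetical.

```python
import math


def gpd_cdf(x, k, sigma, mu):
    """Assumed GPD form: F(x) = 1 - (1 + k*(x - mu)/sigma)**(-1/k)."""
    z = (x - mu) / sigma
    if z <= 0.0:
        return 0.0
    if abs(k) < 1e-12:
        return 1.0 - math.exp(-z)
    t = 1.0 + k * z
    if t <= 0.0:            # beyond the upper bound, which exists when k < 0
        return 1.0
    return 1.0 - t ** (-1.0 / k)


def pp_points(data, cdf):
    """Return (P empirical, P model) pairs using the Hazen plotting position."""
    x = sorted(data)
    n = len(x)
    return [((i + 0.5) / n, cdf(v)) for i, v in enumerate(x)]


def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical and model CDFs."""
    x = sorted(data)
    n = len(x)
    d = 0.0
    for i, v in enumerate(x):
        f = cdf(v)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d


# Hypothetical 30-minute peaks (mm/hr) and hypothetical GPD parameters.
peaks = [36.0, 38.5, 41.0, 44.0, 47.5, 52.0, 58.0, 66.0, 79.0, 101.0]

def model(v):
    return gpd_cdf(v, 0.1, 20.0, 35.0)

print(pp_points(peaks, model)[:3])
print(ks_statistic(peaks, model))
```

Points lying close to the 1:1 line in the P-P plot correspond to a small K-S statistic, which is why the two diagnostics usually agree on the ranking of candidate distributions.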


CONCLUSION
The GPD continuous parameters are used to derive rainfall IDF estimates for various ARI. The figures and tables presented illustrate the behaviour of the GPD, GEV and LL(3)P continuous parameters in the study area, and some conclusions can be drawn from the results. From the analysis of five rainfall stations, the important findings are as follows:
(1) The continuous scale parameters of the GPD, GEV and LL(3)P have a consistent pattern and are bounded by their minimum and maximum continuous parameter estimates, which follow power equations.
(2) The continuous shape parameters of the GPD, GEV and LL(3)P follow linear equations.
(3) The location parameters of the GPD, GEV and LL(3)P diverge, following power equations, as the rainfall duration increases.
(4) The continuous parameter equations of the GPD, GEV and LL(3)P are able to estimate a general shape, scale and location parameter at various rainfall durations, and hence can be used to derive IDF curves.
Future studies should use different distributions with more recorded rainfall data, as longer records can minimise errors and difficulties and will eventually lead to better rainfall IDF estimation for various ARI.

ACKNOWLEDGEMENTS
This paper is financially supported by Public Administrative Department Malaysia, Department of Irrigation and Drainage Malaysia and Universiti Sains Malaysia.

REFERENCES
Abustan I., Mohd Nor M.D. and Abdul Wahid N. (2000). SWMM modelling for a small catchment in Kuala Lumpur. Proceedings of Fresh Perspectives on Hydrology and Water Resources in Southeast Asia and the Pacific, Christchurch, New Zealand, 21-24 Nov 2000, pp. 166-172.
Ahmad N. (2008). Characterization of convective rain in Klang Valley, Malaysia. Master Thesis (Hydrology and Water Resources). Universiti Teknologi Malaysia.
Andres-Domenech I., Montanari A. and Marco J.B. (2010). Stochastic rainfall analysis for storm tank performance evaluation. Hydrol. Earth Syst. Sci., 14, 1221-1232.
Ashkar F. and Mahdi S. (2006). Fitting the log-logistic distribution by generalized moments. Journal of Hydrology, 328, 694-703.
Bertin E. and Clusel M. (2006). Generalized Extreme Value Statistics and sum of correlated variables. Journal of Physics A: Mathematical and General, 39(24), 7607.
Chakravarti, Laha and Roy (1967). Handbook of Methods of Applied Statistics, Volume I. John Wiley and Sons, pp. 392-394.
Cunnane C. (1989). Review of statistical models for flood frequency analysis. WMO Operational Hydrology Rep. no. 33, WMO no. 718, World Meteorological Organization.
Dahal R.K. and Hasegawa S. (2008). Representative rainfall thresholds for landslides in the Nepal Himalaya. Geomorphology, 100, 429-443.
Deidda R. and Puliga M. (2009). Performances of some parameter estimators of the generalized Pareto distribution over rounded-off samples. Physics and Chemistry of the Earth, 34, 626-634.
Diebolt J., El-Aroui M.A., Garrido M. and Girard S. (2003). Quasi-conjugate Bayes estimates for GPD parameters and application to heavy tails modeling. Rapport de recherche no 4803, 29 pages.
Hosking J.R.M. and Wallis J.R. (1993). Some statistics useful in regional frequency analysis. Water Resources Research, 29(2), 271-281.
Koutsoyiannis D., Kozonis D. and Manetas A. (1998). A mathematical framework for studying rainfall intensity-duration-frequency relationships. Journal of Hydrology, 206, 118-135.
Langbein W.B. (1949). Annual floods and the partial-duration flood series. Transactions, American Geophysical Union, 30(6), 879-881.
Lopez B.C., Holmgren M., Sabate S. and Gracia C.A. (2008). Estimating annual rainfall threshold for establishment of tree species in water-limited ecosystems using tree-ring data. Journal of Arid Environments, 72, 602-611.
Madsen H. and Rosbjerg D. (1993). Application of the partial duration series approach in the analysis of extreme rainfalls. IAHS Publ. no. 213.
Pearson C.P. (1991). New Zealand regional flood frequency analysis using L-moments. The New Zealand Hydrological Society, J. Hydrol., 30(2), 53-64.
Rosbjerg D. and Madsen H. (2004). Advanced approaches in PDS/POT modelling of extreme hydrological events. British Hydrology Society. Hydrology: Science & Practice for the 21st Century, 1, 217-220.
Singh V.P. and Guo H. (1995). Parameter estimation for 3 parameter generalized pareto distribution by the principle of maximum entropy (POME). Hydrological Sciences Journal (Journal des Sciences Hydrologiques), 40(2).
Singh V.P., Guo H. and Yu F.X. (1993). Parameter estimation for 3 parameter log logistic distribution (LLD3) by POME. Stochastic Hydrol. Hydraul., 7, 163-177.
Topsoe F. (1970). On the Glivenko-Cantelli Theorem. Probability Theory and Related Fields, 14(3), 239-250.
Van Montfort M.A.J. and Witter J.V. (1986). The Generalized Pareto distribution applied to rainfall depths. Hydrological Sciences Journal, 31(2), 151-162.
Vogel R.M. and Fennessey N.M. (1993). L moment diagrams should replace product moment diagrams. Water Resources Research, 29(6), 1745-1752.
Yang T., Xu C.Y., Shao Q.X. and Chen X. (2009). Regional Flood Frequency and Spatial Patterns analysis in the Pearl River Delta Region using L-Moments Approach. Stoch. Environ. Res. Risk Assess., 24(2), 165-182.
Zea Bermudez P.D. and Kotz S. (2010). Parameter estimation of the generalized Pareto distribution, Part 1. Journal of Statistical Planning and Inference, 140, 1353-1373.
