ABSTRACT
Most of the widely used statistical models are based on “average behavior”; that is, they are built
around explaining and predicting the outcome of the mean of available data. And, in most
analyses, observations that are too “far” from the mean, i.e. outliers, are discarded. But, an
important question needs to be addressed: do outliers (also called extreme values) contain
information that can be useful in practice and are there any statistical techniques that provide for
their modeling? The goal of this paper is twofold. First, to offer a short exposition of the
fundamental principles governing the analysis of extreme values and, second, to offer a traffic
example of how the use of extreme value analysis can improve on the results of more commonly
used statistical techniques. The results of the case study suggest that both the sign and the
magnitude of an estimated trend term vary significantly between extreme value and regression
models, a finding with very important policy implications.
keywords: traffic data analysis, extreme values, outliers, extreme order statistics
INTRODUCTION
Transportation and traffic considerations most frequently revolve around the analysis of
available data; data on accident counts, traffic injuries, freight volumes, passengers, traffic
flows, highway speeds, etc. Appropriate models are fit to these data to “explain” and predict the
behavior of corresponding variables. As is well known, most of the commonly used models are
based on “average behavior”; that is, most models are built around explaining and predicting the
outcome of the mean of the available data. And, while this approach is very satisfactory in modeling most of the commonly
encountered problems there are, many times, transportation problems that deal with boundary
conditions. These conditions can be, for example, congested traffic, speed limit violators,
extreme terminal delays, etc. From a methodological standpoint all these problems have been
dealt with using the commonly available statistical analysis tools (linear, non-linear and Poisson
regression, time-series analyses, etc.). But, an important question needs to be addressed: are the
TRB 2003 Annual Meeting CD-ROM Paper revised from original submittal.
commonly used statistical methods still unbiased, efficient and asymptotically consistent when
dealing with boundary or extreme conditions? And, furthermore, are there any other statistical
techniques that can provide for improved modeling of such phenomena and also allow for useful
inferences to be made?
In most statistical analyses the usual course of action is to discard observations that are
“too far” from the mean; these observations are commonly termed outliers. But, as is widely
recognized, outliers frequently contain information that can shed much light into the phenomena
being studied. Furthermore, in many cases, these outliers can help predict the occurrence of
other such outliers much better than by studying the remainder of the values. For example, it is
quite possible that by examining the occurrence of, say, very high traffic occupancy values
(occupancy is frequently recognized as the “best” indicator of congestion), the future occurrence
of such values can be better predicted. That is, by concentrating the study on congestion itself,
rather than on the entire series of occupancy values, it may be easier and more effective to
predict its future occurrence.
Interestingly, the study of phenomena far from the mean, phenomena that are infrequent,
or, in mathematical terms, phenomena that occur in the tail of probability distributions, has, for
a long time, been the focus of many scientific disciplines. Seismologists and geotechnical
engineers have been very interested in the occurrence of earthquakes of extreme magnitudes;
oceanographers and hydraulic engineers have been interested in the occurrence of the T-year
flood level; meteorologists and structural engineers have been interested in the occurrence of
strong hurricanes. All these phenomena, powerful hurricanes and earthquakes, T-year floods, etc.
have been described as extreme events and, as such, their analysis is widely known as “Analysis
of Extreme Values” or “Extreme Order Statistics”. As a point of scientific interest, it should be
noted that extreme value analyses have been used to study a very wide variety of subjects,
ranging from atmospheric ozone levels to earthquakes, hurricanes, stock market returns,
population aging and sea levels. Of course, a discussion of these studies is beyond the scope
of this paper; interested readers should refer to (1), (2) and (3) for an in-depth discussion of
the theory and a review of its applications.
The spirit and the goal of the analysis of extreme values is to develop models that can
capture the essential information contained in them, i.e. in the tails of distributions, and to
improve on the projection of re-occurrence of such events. The potential for applying these
techniques to transportation problems is considerable: congestion
occurrence, trends in traffic flows, extreme highway speeds, high accident areas, etc. can all be
studied using the tools provided by extreme value analysis. The goal of this paper is twofold.
First, to offer a short exposition of the fundamental principles governing extreme value analysis
and, second, to offer a traffic example of how the use of extreme value analysis can improve on
the results of more commonly used statistical techniques. The remainder of this paper is
organized as follows; the next section reviews some of the fundamental mathematical principles
upon which extreme value analysis is built. The third section describes and offers some results
from applying extreme value analysis to a traffic database and, finally, the fourth section offers
some concluding remarks.
Fundamental Concepts
The issues related to the analysis of Extreme Order Statistics (Extreme Values) may be
described, in broad terms, as an effort to make statistical inferences about the extreme values in a
population or a random process. The traditional approach (4) is based on the extreme value limit
distributions first identified by Fisher and Tippett (5) and applied to a large number of cases as
reviewed in the introductory section. A typical application consists of the generalized extreme
value distribution fitted to annual (or of any predefined time period) maxima (that is, the
maximum value of a given series for a given time period) and can be defined as
$$H(y;\mu,\psi,\kappa)=\exp\left\{-\left[1-\frac{\kappa(y-\mu)}{\psi}\right]^{1/\kappa}\right\}\qquad(1)$$
over the range of y for which κ(y − µ) < ψ, where ψ > 0 and κ, µ are arbitrary. The case κ = 0 is
$$H(y;\mu,\psi,0)=\exp\left\{-\exp\left[-\frac{y-\mu}{\psi}\right]\right\}\qquad(2)$$
the type I (Gumbel) family of distributions (4). These distributions are justified as the limiting
stable distributions of extreme value theory. The numerical fitting procedures were developed by
Prescott and Walden (6, 7) and a Newton-Raphson algorithm was developed by Hosking (8) and
amended by Macleod (9). For the asymptotics of Maximum Likelihood, see Smith (10); it should
be noted though that there is evidence that Maximum Likelihood estimation does not work well
in small samples (8, 9, 10), and other methods have been proposed, the most well known of
which is the “probability weighted moments” approach proposed by Hosking et al. (11). It
should also be noted that an extension to the usual time period maxima is to use several of the
largest order statistics within each time period (12, 13, 14).
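As an illustration of the classical block-maxima approach, the following sketch fits the generalized extreme value distribution of Eq. (1) to synthetic block maxima using scipy; the data and block sizes are hypothetical stand-ins for a real series, and scipy's shape parameter c follows the same sign convention as κ in Eq. (1) (c > 0 implies an upper-bounded tail).

```python
# Sketch: fit the GEV of Eq. (1) to block maxima of a hypothetical series.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Hypothetical data: 48 "months" of 30 "daily" observations each
daily = rng.gumbel(loc=50.0, scale=8.0, size=(48, 30))
monthly_maxima = daily.max(axis=1)

# Maximum likelihood fit: c ~ kappa, loc ~ mu, scale ~ psi in Eq. (1)
c_hat, mu_hat, psi_hat = stats.genextreme.fit(monthly_maxima)

# T-block return level: the value exceeded once every T blocks on average
T = 100
return_level = stats.genextreme.ppf(1 - 1 / T, c_hat, loc=mu_hat, scale=psi_hat)
print(c_hat, mu_hat, psi_hat, return_level)
```

Since the synthetic "daily" data are Gumbel, the fitted shape should be near zero, recovering the type I family of Eq. (2) as a special case.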
Threshold methods, the second most important area of interest in Extreme Order
Statistics after the Maxima method, have been extensively developed by hydrologists over the
last 30 years, often under the acronym of the POT (peaks over threshold) method. These
methods are based on fitting a stochastic model to either the exceedances or the peaks over a
threshold for a given series. Such methods have been used for quite some time, but the first
systematic treatment was by Todorovic and Zelenhasic (15) and Todorovic and Rousselle (16).
The approach most commonly employed in practice is based on the generalized Pareto
distribution, which has an interpretation similar to that which motivates Eq. (1). The remainder
of this Section deals with POT methods, since this class of models shows the most promise in
transportation and traffic related problems: these methods study the occurrence of phenomena over a
“tolerance” value (acceptable threshold), such as highway speeds over the speed limit, traffic
flows in the unstable region, etc. These are typical cases where the analyst may wish to identify
and analyze classes of events that could potentially lead to severe consequences in a particular
context. In this case, in which the potential extreme events have not yet been identified or
defined, the selection of an appropriate threshold is crucial.
The basic assumption behind threshold methods is a Poisson
process of exceedance times (when, for example, congestion occurs during a time period)
combined with independent excesses over the threshold (the magnitude of congestion). Consider
a threshold u, and let N denote the number of exceedances of u during a period of, say, m
months. Suppose the excesses, or the differences between the observations over the threshold
and the threshold itself, are independent with common distribution function G. The theory upon
which POT models are built suggests that G is the generalized Pareto distribution given by
$$G(y;\sigma,\kappa)=1-\left(1-\frac{\kappa y}{\sigma}\right)^{1/\kappa}\qquad(3)$$
where σ > 0 and κ is arbitrary; the range of y is 0 < y < ∞ if κ ≤ 0, and 0 < y < σ/κ if κ > 0.
When κ = 0 then Eq. (3) corresponds to the exponential distribution with mean σ . There are
several ways of motivating Eq. (3). Pickands (17) showed that this equation arises as a limiting
distribution for excesses over thresholds if and only if the parent distribution is in the domain of
attraction of one of the extreme value distributions. Another motivation suggests that if Y is
generalized Pareto and u>0, then the conditional distribution of Y-u given Y>u is also
generalized Pareto. Another property suggests that if N has a Poisson distribution and,
conditionally on N, Y1, ..., YN are independent identically distributed generalized Pareto random
variables, then max(Y1, ..., YN) has the generalized extreme value distribution. Thus, a Poisson
process of exceedance times with generalized Pareto excesses implies the classical extreme
value distributions. These properties suggest that the generalized Pareto distribution will be a
practical family for statistical estimation, provided that the threshold is taken sufficiently high.
Of course, the question that naturally arises is how high to take the threshold. Theoretical results
are available that guide toward threshold selection (see, for example, Smith (18)); in this study
the threshold was selected on practical grounds, as discussed in the case study that follows.
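To make the preceding discussion concrete, the following sketch fits the generalized Pareto distribution of Eq. (3) to excesses over a high empirical quantile of a synthetic series; the data and the quantile levels are hypothetical. Note that scipy parameterizes the shape as c = −κ relative to Eq. (3), and the loop over thresholds is a simple stability diagnostic of the kind the threshold-selection literature formalizes.

```python
# Sketch: POT fitting of Eq. (3) to excesses over a high threshold u.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
series = rng.exponential(scale=5.0, size=5000)  # placeholder "occupancy" series

u = np.quantile(series, 0.95)           # a high threshold; its choice is the
excesses = series[series > u] - u       # delicate step discussed in the text

c_hat, _, sigma_hat = stats.genpareto.fit(excesses, floc=0.0)
print(c_hat, sigma_hat)

# Quick stability check: shape estimates should be roughly constant across
# thresholds above a "high enough" u if the GPD model is adequate.
for q in (0.90, 0.95, 0.98):
    u_q = np.quantile(series, q)
    ex = series[series > u_q] - u_q
    print(q, stats.genpareto.fit(ex, floc=0.0)[0])
```

For exponential input the fitted shape should be near zero at every threshold, consistent with the memoryless special case noted after Eq. (3).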
Methods for estimating the generalized Pareto distribution parameters have been
thoroughly reviewed by Hosking and Wallis (20). Maximum Likelihood estimators exist in large
samples provided that κ < 1, and are asymptotically normal and efficient if κ ≤ 1/2. In general,
the log-likelihood for N independent excesses Y1, ..., YN is
$$\ell(\sigma,\kappa)=-N\log\sigma+\left(\frac{1}{\kappa}-1\right)\sum_{i=1}^{N}\log\left(1-\frac{\kappa Y_i}{\sigma}\right)$$
Writing σ = κ/τ and differentiating with respect to τ and κ, the maximum with respect to κ
is achieved when
$$\hat{\kappa}=\hat{\kappa}(\tau)=-N^{-1}\sum_{i=1}^{N}\log\left(1-\tau Y_i\right)$$
Thus, the maximum likelihood estimation reduces to solving the single equation
$$\frac{N}{\tau}=\left\{\frac{1}{\hat{\kappa}(\tau)}-1\right\}\sum_{i=1}^{N}\frac{Y_i}{1-\tau Y_i}$$
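This one-dimensional reduction can be implemented directly. The sketch below uses synthetic generalized Pareto data (in the paper's parameterization, with assumed values κ = 0.2 and σ = 1), profiles out κ, and solves the single equation in τ numerically by bracketing a sign change of the score.

```python
# Sketch: profile ML for the GPD via the single equation in tau above.
import numpy as np
from scipy.optimize import brentq

rng = np.random.default_rng(1)
kappa_true, sigma_true, n = 0.2, 1.0, 500
u = rng.uniform(size=n)
y = sigma_true * (1.0 - u**kappa_true) / kappa_true  # GPD inverse-CDF sampling

def kappa_of(tau):
    # kappa(tau) = -(1/N) * sum log(1 - tau * Y_i)
    return -np.mean(np.log1p(-tau * y))

def score(tau):
    # N/tau - (1/kappa(tau) - 1) * sum( Y_i / (1 - tau * Y_i) )
    return n / tau - (1.0 / kappa_of(tau) - 1.0) * np.sum(y / (1.0 - tau * y))

# Scan (0, 1/max(Y)) for the first sign change, then refine with brentq
grid = np.linspace(1e-4, 0.9999 / y.max(), 4000)
vals = np.array([score(t) for t in grid])
idx = np.flatnonzero(np.sign(vals[:-1]) != np.sign(vals[1:]))[0]
tau_hat = brentq(score, grid[idx], grid[idx + 1])

kappa_hat = kappa_of(tau_hat)
sigma_hat = kappa_hat / tau_hat
print(kappa_hat, sigma_hat)
```

The estimates should land near the generating values, with σ recovered as κ/τ by construction.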
The most frequently encountered “competitor” for maximum likelihood estimation is the
probability weighted moments method of Hosking and Wallis (20). The rth probability weighted
moment is defined as
$$\alpha_r=E\left[Y\left\{1-G(Y)\right\}^r\right]=\frac{\sigma}{(r+1)(r+1+\kappa)}$$
where G is the generalized Pareto distribution function. The method essentially consists of
equating the theoretical and sample-based values of the probability weighted moments for r=0,1.
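The probability weighted moments estimator is simple enough to state in full. The following sketch equates α0 and α1 to their sample counterparts and inverts the closed-form expressions; the data are a synthetic exponential (κ = 0) sample, so the estimates should return roughly κ = 0 and the generating scale.

```python
# Sketch: PWM estimation for the GPD by matching alpha_0 and alpha_1.
import numpy as np

rng = np.random.default_rng(7)
y = np.sort(rng.exponential(scale=2.0, size=2000))  # exponential = GPD, kappa = 0

n = len(y)
a0 = y.mean()                                              # alpha_0 = E[Y]
a1 = np.sum(y * (n - np.arange(1, n + 1)) / (n - 1)) / n   # unbiased alpha_1

# Inverting alpha_r = sigma / ((r+1)(r+1+kappa)) for r = 0, 1:
kappa_hat = a0 / (a0 - 2.0 * a1) - 2.0
sigma_hat = 2.0 * a0 * a1 / (a0 - 2.0 * a1)
print(kappa_hat, sigma_hat)
```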
POT Regression
The most widely used application of statistics is the description of systematic variation in a
response variable in terms of a set of covariates (also termed explanatory or independent
variables). In the context of POT models, regression assumes that the GPD parameters vary
across observations as $\kappa_i = \kappa(z_i^T\gamma)$ and $\sigma_i = \sigma(x_i^T\beta)$, where $x_i$ and $z_i$ are known vectors of covariates of
respective lengths p+1 and q+1. Suppose in addition that the $Y_i$ may be upper truncated at
random, and say that $\delta_i = 0$ if $y_i$ is the observed value of the random variable $Y_i$, but that $\delta_i = 1$
if $y_i$ is a lower bound for the true but unobserved $Y_i$. The contribution to the overall log-likelihood $L = \sum_i \ell_i$ made by $Y_i$ is
$$\ell_i(y_i;\sigma_i,\kappa_i)=\begin{cases}-\dfrac{y_i}{\sigma_i}-\log\sigma_i+\delta_i\log\sigma_i, & \kappa_i=0\\[2ex]\left(\dfrac{1}{\kappa_i}-1\right)\log\left(1-\dfrac{\kappa_i y_i}{\sigma_i}\right)-\log\sigma_i+\delta_i\left[\log\sigma_i+\log\left(1-\dfrac{\kappa_i y_i}{\sigma_i}\right)\right], & \kappa_i\neq 0\end{cases}\qquad(4)$$
$$\frac{\partial L}{\partial\gamma}=\sum_{i=1}^{N} z_i\,\dot{\kappa}_i\,\frac{\partial \ell_i(y_i;\sigma_i,\kappa_i)}{\partial\kappa_i},\qquad \frac{\partial L}{\partial\beta}=\sum_{i=1}^{N} x_i\,\dot{\sigma}_i\,\frac{\partial \ell_i(y_i;\sigma_i,\kappa_i)}{\partial\sigma_i}\qquad(5)$$
where $\dot{\kappa}_i$ and $\dot{\sigma}_i$ denote the derivatives of the link functions $\kappa(\cdot)$ and $\sigma(\cdot)$ evaluated at $z_i^T\gamma$ and $x_i^T\beta$, respectively.
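A minimal illustration of this regression structure is sketched below for the uncensored κi = 0 case, with an assumed log-linear link σi = exp(β0 + β1 xi); the covariate and coefficient values are hypothetical, and the negative log-likelihood follows the first line of Eq. (4).

```python
# Sketch: POT regression with a covariate-dependent GPD scale,
# restricted to the kappa = 0 (exponential) case of Eq. (4).
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(3)
n = 2000
x = rng.uniform(size=n)                 # e.g. a time-of-day covariate
sigma = np.exp(0.5 + 1.0 * x)           # true scale: b0 = 0.5, b1 = 1.0
y = rng.exponential(scale=sigma)        # exceedance magnitudes

def negloglik(beta):
    s = np.exp(beta[0] + beta[1] * x)
    return np.sum(y / s + np.log(s))    # minus sum of l_i for kappa = 0

res = minimize(negloglik, x0=np.zeros(2), method="Nelder-Mead")
b0_hat, b1_hat = res.x
print(b0_hat, b1_hat)
```

The log link keeps σi positive for any coefficient values, which is one common way of satisfying the constraint σ > 0 in Eq. (3).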
The Case Study
The basic problem considered in this study is whether traffic flow and congestion in one of
Athens’ main traffic corridors has increased over the last 4 years; furthermore, with the 2004
Olympic Games fast approaching, it is imperative to determine whether key road axes that will
serve as main gateways for the Olympic Family are approaching congestion or whether policy
measures instituted by the authorities have begun to show an effect in tapering congestion.
For information purposes, it is noted that the urban area of Athens, the capital of Greece,
has an area of 60 km2 and a population of approximately 3.8 million people. During the last
decade the population in the greater Athens area has increased by about 10% while at the same
time car ownership has increased considerably, approaching 310 automobiles per 1000
inhabitants. This has led to an increase in travel time by 26% in the last 12 years which has
resulted in the deterioration of traffic conditions in the capital. Overall, planning authorities had
to deal with a 3.5% annual increase in traffic for the last ten years. An increasing proportion of
the signalized intersection approaches in the center of the city is highly congested (levels of
service E-F). Obviously, travel times in such a congested network are quite long.
The Data
Background
A dynamic traffic map and ATIS for the central Athens area has been operating since 1996. As a
first attempt to meet driver information needs, travel time information is given over the internet
updated every 15 minutes. To do this, 6 origin points have been defined at the major city
entrances, from which a total of 17 possible (commonly used) routes have been considered.
Speeds and travel times along these 17 routes are being estimated using flow and occupancy data
collected directly from the 140 controllers installed in various points of the down-town urban
network. These are single (inductive) loops measuring traffic volume and queue presence at an
intersection approach level.
Raw data covering traffic volumes and occupancies are collected by sensors of the Urban
Traffic Control Center (UTCC) and arrive at the special processing facilities of the National
Technical University of Athens (NTUA) every 90 seconds. All data are batch processed, with a
combination of specific tools and algorithms. Data integrity is ensured by the controllers
themselves, as malfunctioning loops quickly default to the value of 255 and are
not taken into account in the analysis period or thereafter. Data quality control is provided
automatically by the data management software that performs screening and data repair
functions. The processed data files are then transmitted using the File Transfer Protocol (FTP)
procedure from the batch processing environment to the UNIX based Web Server of the NTUA
Faculty of Civil Engineering (FCE) via the University’s telematics (ISDN) network. Combining
all the information contained in the data files, ten Graphic Interchange Format (GIF) images are
generated. These are included in Web Pages of the NTUA Faculty of Civil Engineering Web
Server giving access to information on the traffic situation in Athens to Internet users.
The first two GIF images contain volume and congestion data for the entire central
Athens urban area. The remaining eight GIFs (four for volume and four for congestion) are
blown-up parts of the network, generated along with the main image every quarter of an hour via
a UNIX script. Users can have access to those blown-up parts by clicking on a part of the main
image in the Web Page (the final result is available at the address: www.transport.ntua.gr/map).
The Database
Data on traffic flow and occupancies are collected, as already mentioned, at 90 sec intervals at
140 locations around Athens, continuously since 1996. All this data has been stored and is
available for analysis. For the purposes of this analysis further manual checking was performed.
Detailed police records were used to exclude periods where major or minor events (e.g.
demonstrations) or incidents external to normal traffic operations (e.g. roadworks) had been
reported. In an effort to reduce the size of the data set to be processed without losing much
essential information, a new data set was created containing occupancy measurements (at 90
second intervals) for the first two months (January and February) of the years 1998 – 2001. This
yielded more than 15,000 measurements for each of the 17 cross-sections along a part of the
eastern section of the Olympic Ring that is examined in this study (Figure 1). To make more
practical sense of the results, and after preliminary testing indicated this to be a statistically
acceptable approach, data were further aggregated into directions: occupancy toward the city
center (direction NE to SW) and toward the Olympic Stadium (direction SW to NE; the Olympic
Stadium lies in the opposite direction from the city center). In this paper only the results for the
direction toward the Olympic Stadium are presented; the results for the other direction are
available from the authors upon request. All data refer to 4 measuring locations along the
aforementioned corridor (shown by arrows in Figure 1). The distance between the first and the
last location is 4100m, with intermediate distances between the first and the second, the second
and the third etc, equal to 1900m, 1200m and 1000m respectively.
The Methodological Approach
As previously mentioned, the main goal of this study is to demonstrate the potential applications
of Extreme Order Statistics to transportation problems. The case study selected for this
purpose concerns one of Athens’ major roadway corridors that serves not only a large part of the
daily commuting
activity, but is also part of the Olympic Ring and will serve as a major artery for the Olympic
visitors in the 2004 Games. The goal is to examine whether congestion has increased since 1998,
or whether measures instituted by the traffic authorities have been effective; to achieve this, the
main parameter considered is that of a trend. That is, interest in this study focuses around
determining the existence of a trend in congestion over the past four years. In light of the
discussion in the previous Section, the approach taken here is to fit the model:
$$y_i = \alpha + b\,t_i + \varepsilon_i,\qquad 1 \le i \le n\qquad(6)$$
where $y_i$ is the exceedance magnitude for the ith observation (occupancy magnitude above the
92% threshold), $\alpha$ is the constant, $t_i$ is the time trend for the ith observation ($t_i = 1,\ldots,n$, where n
= total number of observations), and the ith residual $\varepsilon_i$ is assumed to come from a distribution in
the generalized Pareto family.
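Although the case study data are not reproduced here, the qualitative point of this design, namely that the trend in the exceedances need not coincide with the least squares trend in the full series, can be illustrated with synthetic data whose mean is flat while its variability grows; the 92% threshold of the case study is mimicked by a high empirical quantile.

```python
# Sketch: full-series LS trend vs. LS trend of excesses over a threshold.
import numpy as np

rng = np.random.default_rng(11)
n = 20000
t = np.linspace(0.0, 1.0, n)
y = rng.normal(loc=0.0, scale=1.0 + 2.0 * t)   # flat mean, growing spread

slope_all = np.polyfit(t, y, 1)[0]             # LS trend on the full series

u = np.quantile(y, 0.92)                       # mimic the 92% threshold
mask = y > u
slope_exc = np.polyfit(t[mask], y[mask] - u, 1)[0]  # LS trend on excesses
print(slope_all, slope_exc)
```

The full-series slope hovers near zero while the exceedance slope is clearly positive, which is exactly the kind of divergence between "average behavior" and tail behavior that motivates Eq. (6).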
Estimation Results
Three basic models were fitted to the data: the first was a least squares model fitted to the entire
data set; the second was again a least squares model fitted to the exceedances data (as previously
noted, exceedance is considered any occupancy value over 92% and clearly indicates the onset
of congestion); the third was the model shown in Eq. (6) fitted again to the exceedances data
using maximum likelihood estimation. The results for all three models are presented in Table 1.
It should be noted here that the Generalized Gamma Model (GGM) was initially estimated as the
distribution of the error terms in the third model. Then, since most of the other widely used
parametric forms of the Generalized Pareto distribution (such as the Exponential, the Beta, the
Weibull, the Standard Gamma, the Reverse Weibull, etc.) are nested within the GGM model,
they were evaluated using the likelihood ratio test. In most cases, as can be seen in Table 1, the
Reverse Weibull model could not be rejected and fitted the data better than all
other models (a result also supported by graphical diagnostics not shown in this paper).
Interestingly, Heckert and Simiu (21) found the Reverse Weibull to be the “best” fitting distribution
for hurricane wind speeds, since its upper tail is finite, lending itself well to modeling phenomena bounded
from above (in the occupancy case examined here, 100% occupancy can be considered as the
upper bound).
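The nested-model comparisons described above follow the usual likelihood ratio recipe: twice the log-likelihood gain of the larger model is referred to a chi-squared distribution. The sketch below applies it to the simpler generalized Pareto versus exponential pair on synthetic data rather than the paper's generalized gamma family; in scipy's convention c = 0.8 is a heavy upper tail, so the exponential restriction is clearly rejected.

```python
# Sketch: likelihood ratio test of a nested tail model (exponential
# inside the generalized Pareto), analogous to the nesting used above.
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
y = stats.genpareto.rvs(c=0.8, scale=1.0, size=1000, random_state=rng)

# Restricted model: exponential (one parameter, MLE scale = mean)
scale_e = y.mean()
ll_exp = np.sum(stats.expon.logpdf(y, scale=scale_e))

# Full model: generalized Pareto (two parameters), location fixed at 0
c_hat, _, s_hat = stats.genpareto.fit(y, floc=0.0)
ll_gpd = np.sum(stats.genpareto.logpdf(y, c_hat, scale=s_hat))

lr = 2.0 * (ll_gpd - ll_exp)           # LR statistic, 1 restriction
p_value = stats.chi2.sf(lr, df=1)
print(lr, p_value)
```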
The results are very interesting. First, it appears that both the sign and the magnitude of
the trend term vary between the models estimated. Second, the least squares estimated trend
varies significantly between the full and the extreme values/exceedances (congestion) data set as
shown in the second and third columns of Table 1. Third, the trend term demonstrates marked
differences for some loops within given daily time periods. For example, congestion has
increased at loop 67 at similar magnitudes for all time periods; the slope coefficient estimated
via Maximum Likelihood is between 0.57 and 0.99. Congestion has remained unchanged at
loop 70, while it has increased significantly for most periods (excluding the period 13:30-17:00)
at loop 93. Interestingly, traffic at loop 87 has decreased for the morning peak period (06:30-
10:00) and for the early afternoon (13:30-17:00), but shows a large increase for the late morning
(10:00-13:30) and evening (17:00-20:30) periods.
It is also worth noting that had the trend term been estimated via simple least squares the
results would have been statistically both biased and inconsistent for both data sets. This is a
typical example of estimation errors that occur when either an inappropriate technique is used to
analyze a given data set, or when a wrong functional form is specified during estimation (for a
more elaborate discussion see Washington et al. (22)). Of course, since this study focuses on
extreme loadings that are indicative of congestion trends, the extreme values (exceedances) data
set is more appropriate. A least squares estimation on the congestion data yields results similar
to those of Eq. (6) for loop 67, but largely contradictory results for all other loops. Least squares
estimation for loop 70, for example, would suggest a decreasing trend in congestion rather than
no trend (estimated from Eq. (6)), and would also suggest no trend (or a slightly decreasing one)
for loop 93, for which Eq. (6) estimates significant increases for most periods.
CONCLUSIONS
As is well known, most of the commonly used statistical models are based on explaining and
predicting “average behavior”; that is, most models are built around explaining and predicting
the outcome of the mean of the available data. And, while this approach is very satisfactory in
modeling most of the commonly encountered problems there are, many times, transportation
problems that deal with boundary conditions. These conditions can be, for example, congested
traffic, speed limit violators, extreme terminal delays, etc. In particular, severe traffic conditions
that may lead to a general network breakdown can be seen and treated as a physical calamity.
From a methodological standpoint all these problems have been dealt with either by using the
commonly available statistical analysis tools or by discarding the outlying observations
altogether. But, an important question needs to be addressed: do outliers (also called extreme
values) contain information that can be useful in practice and are there any statistical techniques
that provide for their modeling? As such, the first part of this paper offered a short exposition of
the fundamental principles governing the analysis of extreme values and the second offered a
traffic example of how the use of extreme value analysis can improve on the results of more
commonly used statistical techniques.
The analysis of Extreme Order Statistics (Extreme Values) may be described, in broad
terms, as an effort to make statistical inferences about the extreme values in a population or a
random process. This analysis is based on two fundamental principles: first, the generalized
extreme value distribution can be fitted to minima and/or maxima (that is, the minimum and/or
maximum value of a given series for a given time period) and, second, the generalized Pareto
distribution can be fitted to exceedances of a given measure over a threshold value. Using these
two principles inferences and predictions can be made regarding extreme values (outliers). To
demonstrate the principles of extreme value analysis, this paper presented a case study of traffic
congestion in Athens, Greece.
The basic problem considered in this paper is whether traffic flows and congestion in
one of Athens’ main traffic arteries has increased over the last 4 years; further, with the 2004
Olympic Games fast approaching, it is imperative to determine whether key road axes that will
serve as main gateways for the Olympic Family are approaching endemic congestion conditions
or whether policy measures instituted by the authorities have begun to show an effect in tapering
congestion. To estimate this trend in congestion, two basic models were estimated: first, a least
squares model was fit, and then an extreme values trend model was estimated (via maximum
likelihood estimation). The results were very revealing. It appears, rather conclusively, that both
the sign and the magnitude of the trend term vary between the models estimated. Had the
commonly used least squares estimates been used to determine the trend the results would have
pointed toward either no increase in congestion or slight decrease. Extreme value techniques
point toward an increase in congestion, a finding with tremendous policy implications. Although
this case study is only a first step in demonstrating the usefulness of extreme value analyses in
transportation problems, it is hoped that more researchers will make use of the available
statistical tools and gain a deeper understanding of the occurrence of extreme events.
REFERENCES
1. Galambos, J. (1978) The Asymptotic Theory of Extreme Order Statistics. Wiley, New York.
2. Hipel, K.W. (ed.) (1994) Stochastic and Statistical Methods in Hydrology and
Environmental Engineering: Extreme Values. Kluwer Academic Publishers, New York.
3. Reiss, R.D. and Thomas, M. (2001) Statistical Analysis of Extreme Values 2nd Ed.
Birkhauser, Germany.
4. Gumbel, E. J. (1958) Statistics of Extremes. Columbia University Press, New York.
5. Fisher, R.A. and Tippett, L.H.C. (1928) Limiting forms of the frequency distribution of the
largest or smallest member of a sample. Proceedings of the Cambridge Philosophical Society,
24, 180-190.
6. Prescott, P. and Walden, A.T. (1980) Maximum likelihood estimation of the parameters of
the generalized extreme-value distribution. Biometrika, 67, 723-724.
7. Prescott, P. (1983) Maximum likelihood estimation of the parameters of the three parameter
generalized extreme-value distribution from censored samples. Journal of Statistical Computing
and Simulation, 16, 241-250.
8. Hosking, J.R.M. (1985) Algorithm AS 215: Maximum likelihood estimation of the
parameters of the generalized extreme-value distribution. Applied Statistics, 34, 301-310.
9. Macleod, A.J. (1989) AS R76 – A remark on algorithm AS 215: Maximum likelihood
estimation of the parameters of the generalized extreme-value distribution. Applied Statistics, 38,
198-199.
10. Smith, R.L. (1985) Maximum likelihood estimation in a class of nonregular cases.
Biometrika, 72, 67-92.
11. Hosking, J.R.M., Wallis, J.R. and Wood, E.F. (1985) Estimation of the generalized extreme-
value distribution by the method of probability-weighted moments. Technometrics, 27, 251-261.
12. Leadbetter, M.R., Lindgren, G. and Rootzén, H. (1983) Extremes and Related Properties of
Random Sequences and Processes. Springer, New York.
13. Smith, R.L. (1986) Extreme value theory based on the r largest annual events, Journal of
Hydrology, 86, 27-43.
14. Tawn, J.A. (1988) An extreme-value theory model for dependent observations. Journal of
Hydrology, 101, 227-250.
15. Todorovic, P., and Zelenhasic, E. (1970). A stochastic model of flood analysis, Water
Resources Research, 6(6), 1641-1648.
16. Todorovic, P. and Rousselle, J. (1977) Analyse stochastique des crues et son effet sur les
plaines inondables, Colloque canadien d'hydrologie: inondations, pp. 279-288, Edmonton,
Alberta.
17. Pickands, J. (1975) Statistical inference using extreme order statistics. Annals of Statistics, 3,
119-131.
18. Smith, R.L. (1987) Estimating tails of probability distributions. Annals of Statistics, 15,
1174-1207.
19. Bensalah, Y. (2000) Steps in applying extreme value theory to finance: A review. Working
paper 2000-20, Bank of Canada.
20. Hosking, J.R.M. and Wallis, J.R. (1987) Parameter and quantile estimation for the
generalized Pareto distribution, Technometrics, 29, 339-349.
21. Heckert, N.A. and Simiu, E. (1998) Estimates of hurricane wind speeds by “Peaks over
Threshold” method. ASCE Journal of Structural Engineering, 124(4), 445-449.
22. Washington, S., Karlaftis, M.G. and Mannering, F.L. (2003) Statistical and Econometric
Methods for Transportation Data Analysis. Chapman & Hall/CRC Press, Boca Raton, FL.
Table 1. Parameter Estimates for Various Trend Models

               All Data   Extreme Values (Congestion Data)
               LSE(1)     LSE(1)    ML(2)    GGM(3)           RW(4)
                                             scale    shape   scale
Loop 93
06:30-20:30    4.2        -0.02     2.51     0.02     2.6
06:30-10:00    7.6        ns(5)     1.72     0.04     -4.6
10:00-13:30    3.7        ns        4.66     0.051
13:30-17:00    1.1        ns        ns
17:00-20:30    0.9        -0.6      4.73     0.048
Loop 87
06:30-20:30    -1.3       ns        1.1      0.037
06:30-10:00    0.7        ns        -1.94    0.03
10:00-13:30    -0.3       0.8       4.23     0.011
13:30-17:00    -0.7       ns        -0.8     0.042
17:00-20:30    -2.5       0.7       2.94     0.027
Loop 67
06:30-20:30    0.97       1         0.84     0.03     1.85
06:30-10:00    1.23       0.8       0.87     0.048
10:00-13:30    0.61       0.9       0.94     0.051
13:30-17:00    ns         0.7       0.57     0.053
17:00-20:30    1.92       1.01      0.99     0.043
Loop 70
06:30-20:30    -1.92      -1.6      ns
06:30-10:00    1.1        ns        ns
10:00-13:30    -4.7       -3.4      ns
13:30-17:00    0.7        ns        ns
17:00-20:30    -0.7       -3.9      ns

(1) Least Squares Estimate for the trend term
(2) Maximum Likelihood Estimate for the trend term
(3) Generalized Gamma Model
(4) Reverse (Negative) Weibull
(5) ns denotes no statistical significance (90% significance level)
Figure 1: Loop detector location sequence along the SW to NE corridor