Laurens de Haan
Ana Ferreira
Springer
Laurens de Haan
Erasmus University
School of Economics
P.O. Box 1738
3000 DR Rotterdam
The Netherlands
ldehaan@few.eur.nl
Ana Ferreira
Instituto Superior de Agronomia
Departamento de Matematica
Tapada da Ajuda
1349-017 Lisboa
Portugal
anafh@isa.utl.pt
Series Editors:
Thomas V. Mikosch
University of Copenhagen
Laboratory of Actuarial Mathematics
DK-1017 Copenhagen
Denmark
mikosh@act.ku.dk
Stephen M. Robinson
University of Wisconsin-Madison
Department of Industrial
Engineering
Madison, WI 53706
U.S.A.
smrobins@facstaff.wisc.edu
Sidney I. Resnick
Cornell University
School of Operations Research and Industrial Engineering
Ithaca, NY 14853
U.S.A.
sirl@cornell.edu
e-ISBN: 0-387-34471-3
ISBN-13: 978-0-387-23946-0
Printed on acid-free paper.
© 2006 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media LLC, 233 Spring Street,
New York, NY 10013, U.S.A.), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use in this publication of trade names, trademarks, service marks and similar terms, even if
they are not identified as such, is not to be taken as an expression of opinion as to whether or
not they are subject to proprietary rights.
Printed in the United States of America.
(TXQ/MP)
987654321
springer.com
In cauda venenum
Preface
distribution function. The two parameters that play a role, scale and shape, are based
roughly on derivatives of the distribution function.
In order to be able to apply this theory some conditions have to be imposed. They
are quite broad and natural and basically of a qualitative nature. It will become clear
that the so-called extreme value condition is on the one hand quite general (it is not
easy to find distribution functions that do not satisfy it) but on the other hand is
sufficiently precise to serve as a basis for extrapolation.
Since we do not know the tail, the conditions cannot be checked (however, see
Section 5.2). But this is a common feature in more traditional branches of statistics.
For example, when estimating the median one has to assume that it is uniquely defined.
And for assessing the accuracy one needs a positive density. Also, for estimating a
mean one has to assume that it exists and for assessing the accuracy one usually
assumes the existence of a second moment.
In these two cases it is easy to see what the natural conditions should be. This is not
the case in our extrapolation problem. Nevertheless, some reflection shows that the
"extreme value condition" is the natural one. For example (cf. Section 1.1.4), one way
of expressing this condition is that it requires that a high quantile (beyond the scope
of the available data) be asymptotically related in a linear way to an intermediate
quantile (which can be estimated using the empirical distribution function).
The theory described in this book is quite recent: only in the 1980s did the contours of the statistical theory take shape. One-dimensional probabilistic extreme value
theory was developed by M. Fréchet (1927), R. Fisher and L. Tippett (1928), and
R. von Mises (1936), and culminated in the work of B. Gnedenko (1943). The statistical theory was initiated by J. Pickands III (1975).
The aim of this book is to give a thorough account of the basic theory of extreme
values, probabilistic and statistical, theoretical and applied. It leads up to the current
state of affairs. However, the account is by no means exhaustive, for this field has
become too vast. For these two reasons, the book is called an introduction.
The outline of the book is as follows. Chapters 1 and 2 discuss the extreme
value condition. They are of a mathematical and probabilistic nature. Section 2.4 is
important in itself and essential for understanding Sections 3.4, 3.6 and Chapter 5,
but not for understanding the rest of the book. Chapter 3 discusses how to estimate
the main (shape) parameter involved in the extrapolation and Chapter 4 explains the
extrapolation itself. Examples are given.
In Chapter 5 some interesting but more advanced topics are discussed in a one-dimensional setting.
The higher-dimensional version of extreme value theory offers challenges of a
new type. The model is explained in Chapter 6, the estimation of the main parameters
(which are infinite-dimensional in this case) in Chapter 7, and the extrapolation in
Chapter 8.
Chapter 9 (probabilistic) and Chapter 10 (statistical) treat the infinite-dimensional
case.
Appendix B offers an introduction to the theory of regularly varying functions,
which is basic for our approach. This text is partly based on the book Regular Variation, Extensions and Tauberian Theorems, by J.L. Geluk and L. de Haan, which is
out of print. The authors wish to thank Jaap Geluk for his permission to use the text.
In a book of this extent it is possible that some errors may have escaped our attention. We are very grateful for feedback on any corrections, suggestions or comments
(ldhaan@few.eur.nl, anafh@isa.utl.pt). We intend to publish possible corrections at
Ana's webpage, http://www.isa.utl.pt/matemati/~anafh/anafh.html.
We wish to thank the statistical research unit of the University of Lisbon (CEAUL)
for offering an environment conducive to writing this book. We acknowledge the
support of FCT/POCTI/FEDER as well as the Gulbenkian foundation. We thank
Holger Drees and the editors, Thomas Mikosch and Sidney Resnick, for their efforts
to go through substantial parts of the book, which resulted in constructive criticism.
We thank John Einmahl for sharing his notes on the material of Sections 7.3 and 10.4.2.
The first author thanks the Université de Saint Louis (Senegal) for the opportunity
to present some of the material in a course. We are very grateful to Maria de Fatima
Correia de Haan, who learned LaTeX for the purpose of typing a substantial part of
the text. Laurens de Haan also thanks Maria de Fatima for her unconditional support
during these years. Ana Ferreira is greatly indebted to those who propitiated and
encouraged her learning on the subject, especially to Laurens de Haan. Ana also
thanks the long-enduring and unconditional support of her parents as well as her
husband, Bernardo, and son, Pedro.
Lisbon,
2006
Laurens de Haan
Ana Ferreira
Contents

Preface

3.6.2 The Negative Hill Estimator (γ < −1/2)
3.7 Simulations and Applications
3.7.1 Asymptotic Properties
3.7.2 Simulations
3.7.3 Case Studies
Exercises

5 Advanced Topics
5.1 Expansion of the Tail Distribution Function and Tail Empirical Process
5.2 Checking the Extreme Value Condition
5.3 Convergence of Moments, Speed of Convergence, and Large Deviations
5.3.1 Convergence of Moments
5.3.2 Speed of Convergence; Large Deviations
5.4 Weak and Strong Laws of Large Numbers and Law of the Iterated Logarithm
5.5 Weak "Temporal" Dependence
5.6 Mejzler's Theorem
Exercises

6 Basic Theory
6.1 Limit Laws
6.1.1 Introduction: An Example
6.1.2 The Limit Distribution; Standardization
6.1.3 The Exponent Measure

10 Estimation in C[0, 1]
10.1 Introduction: An Example
10.2 Estimation of the Exponent Measure: A Simple Case
10.3 Estimation of the Exponent Measure

Part IV Appendix
A
References
Index
Notation that is largely confined to sections or chapters is mostly excluded from the
list below.

=d             equality in distribution
→d             convergence in distribution
→p             convergence in probability
a(t) ~ b(t)    lim_t a(t)/b(t) = 1
α              tail index
η              residual dependence index
γ              extreme value index
Γ              gamma function
ν              exponent measure
ρ              metric |1/x − 1/y|
1_(p)          indicator function: equals 1 if p is true and 0 otherwise
F_n^−          left-continuous empirical distribution function
2ERV           second-order extended regular variation
a_+            max(a, 0)
a_−            min(a, 0)
a ∨ b          max(a, b)
a ∧ b          min(a, b)
[a]            largest integer less than or equal to a
⌈a⌉            smallest integer greater than or equal to a
a.s.           almost surely
C[0, 1]        space of continuous functions on [0, 1] equipped with the supremum norm
C⁺[0, 1]       {f ∈ C[0, 1] : f ≥ 0}
C₁⁺[0, 1]      {f ∈ C[0, 1] : f ≥ 0, |f|_∞ = 1}
C̄₁⁺[0, 1]      {f ∈ C[0, 1] : f > 0, |f|_∞ = 1}
(0, ∞]_ρ × C₁⁺[0, 1]   product space, the lower index ρ meaning that the space (0, ∞] is equipped with the metric ρ
CSMS           complete separable metric space
D and D′       dependence conditions
D[0, T]        space of functions on [0, T] that are right-continuous and have left-hand limits
𝒟(G_γ)         domain of attraction of G_γ
ERV            extended regular variation
f_−            left-continuous version of the function f
f_+            right-continuous version of the function f
f^←            generalized inverse function of f; (usually left-continuous) inverse function of f
|f|_∞          sup_s |f(s)|
F              distribution function
F_n            right-continuous empirical distribution function
G_γ            extreme value distribution function
GP             generalized Pareto
i.i.d.         independent and identically distributed
L              dependence function
R_+            [0, ∞)
R²_+ \ {(0, 0)}   the nonnegative quadrant with the origin removed
R(X_i)         rank of X_i among (X₁, X₂, …, X_n)
RV_α           regularly varying with index α
U              (usually left-continuous) inverse of 1/(1 − F)
x*             sup{x : F(x) < 1} = U(∞)
x_*            inf{x : F(x) > 0}
Part I
One-Dimensional Observations

1
Limit Distributions and Domains of Attraction
One may think of the two theories as concerned with failure. A tire of a car can
fail in two ways. Every day of driving will wear out the tire a little, and after a long
time the accumulated decay will result in failure (i.e., the partial sums exceed some
threshold). But also when driving one may hit a pothole or one may accidentally hit
the sidewalk. Such an incident either has no effect or punctures the tire. In the
latter case it is just one big observation that causes failure, which means that partial
maxima exceed some threshold.
In fact, in its early stages, the development of the theory of extremes was mainly
motivated by intellectual curiosity.
Outline of This Chapter
Our interest is in finding possible limit distributions for (say) sample maxima of
independent and identically distributed random variables. Let F be the underlying
distribution function and x* its right endpoint, i.e., x* := sup{x : F(x) < 1}, which
may be infinite. Then

  max(X₁, X₂, …, Xₙ) → x* ,  n → ∞ ,

in probability, since P(max(X₁, X₂, …, Xₙ) ≤ x) = Fⁿ(x),
which converges to zero for x < x* and to 1 for x ≥ x*. Hence, in order to obtain a
nondegenerate limit distribution, a normalization is necessary.
Suppose there exists a sequence of constants a_n > 0, and b_n real (n = 1, 2, …),
such that

  (max(X₁, X₂, …, Xₙ) − b_n)/a_n

has a nondegenerate limit distribution as n → ∞, i.e.,

  lim_{n→∞} Fⁿ(a_n x + b_n) = G(x)  (1.1.1)

for each continuity point x of G, where G is a nondegenerate distribution function.
Results for minima follow from those for maxima, since

  min(X₁, X₂, …, Xₙ) = −max(−X₁, −X₂, …, −Xₙ) .
Clearly it follows that F(a_n x + b_n) → 1, for each such x. Hence

  lim_{n→∞} (−log F(a_n x + b_n)) / (1 − F(a_n x + b_n)) = 1 ,  (1.1.2)

and in fact, by (1.1.2), relation (1.1.1) is equivalent to

  lim_{n→∞} n (1 − F(a_n x + b_n)) = −log G(x) ,

or

  lim_{n→∞} 1/(n (1 − F(a_n x + b_n))) = −1/log G(x) ,  (1.1.3)

for each continuity point x of G with 0 < G(x) < 1.
Lemma 1.1.1 Suppose f_n is a sequence of nondecreasing functions and g is a nondecreasing function. Suppose that for each x in some open interval (a, b) that is a continuity point of g,

  lim_{n→∞} f_n(x) = g(x) .  (1.1.4)

Let f_n^←, g^← be the left-continuous inverses of f_n and g. Then, for each x in the
interval (g(a), g(b)) that is a continuity point of g^←, we have

  lim_{n→∞} f_n^←(x) = g^←(x) .  (1.1.5)
Proof. Let x be a continuity point of g^←. Fix ε > 0. We have to prove that there exists n₀ ∈ ℕ such that for n ≥ n₀,

  g^←(x) − ε ≤ f_n^←(x) ≤ g^←(x) + ε .

We are going to prove the left-hand inequality; the proof of the right-hand inequality is
similar.
Choose 0 < ε₁ < ε such that g^←(x) − ε₁ is a continuity point of g. This is
possible since the continuity points of g form a dense set. Since g^← is continuous
at x, g^←(x) is a point of increase for g; hence g(g^←(x) − ε₁) < x. Choose 0 < δ <
x − g(g^←(x) − ε₁). Since g^←(x) − ε₁ is a continuity point of g, there exists n₀ such
that f_n(g^←(x) − ε₁) < g(g^←(x) − ε₁) + δ < x for n ≥ n₀. The definition of the
function f_n^← then implies g^←(x) − ε₁ < f_n^←(x).
We are going to apply Lemma 1.1.1 to relation (1.1.3). Let the function U be the
left-continuous inverse of 1/(1 − F). Note that U(t) is defined for t > 1. It follows
that (1.1.3) is equivalent to

  lim_{n→∞} (U(nx) − b_n)/a_n = G^←(e^{−1/x}) =: D(x) ,  (1.1.6)

for each positive x. This is encouraging, since relation (1.1.6) looks simpler than
(1.1.3). We are now going to make (1.1.6) more flexible in the following way:
Theorem 1.1.2 Let a_n > 0 and b_n be sequences of constants and G a nondegenerate distribution function. The following statements are equivalent:

1. lim_{n→∞} Fⁿ(a_n x + b_n) = G(x)  (1.1.7)
for each continuity point x of G.

2. lim_{t→∞} t (1 − F(a(t)x + b(t))) = −log G(x)
for each continuity point x of G for which 0 < G(x) < 1, a(t) := a_{[t]}, and
b(t) := b_{[t]} (with [t] the integer part of t).

3. lim_{t→∞} (U(tx) − b(t))/a(t) = D(x) ,  (1.1.8)
with D(x) := G^←(e^{−1/x}), for each x > 0 that is a continuity point of D, a(t) := a_{[t]}, and b(t) := b_{[t]}.
Proof. The equivalence of (2) and (3) follows from Lemma 1.1.1. We have already
checked that (1) is equivalent to (1.1.6). So it is sufficient to prove that (1.1.6) implies
(3). Let x be a continuity point of D. For t ≥ 1,

  (U([t]x) − b_{[t]})/a_{[t]} ≤ (U(tx) − b_{[t]})/a_{[t]} ≤ (U([t]x(1 + 1/[t])) − b_{[t]})/a_{[t]} .

The right-hand side is eventually less than D(x′) for any continuity point x′ > x with
D(x′) > D(x). Since D is continuous at x, we obtain

  lim_{t→∞} (U(tx) − b_{[t]})/a_{[t]} = D(x) .

This is (3).
We shall see shortly (Section 1.1.4) the usefulness of these two alternative conditions for statistical applications.
1.1.3 Extreme Value Distributions
Now we are in a position to identify the class of nondegenerate distributions that can
occur as a limit in the basic relation (1.1.1). This class of distributions is called the
class of extreme value distributions.
Theorem 1.1.3 (Fisher and Tippett (1928), Gnedenko (1943)) The class of extreme
value distributions is G_γ(ax + b) with a > 0, b real, where

  G_γ(x) = exp(−(1 + γx)^{−1/γ}) ,  1 + γx > 0 ,  (1.1.9)

with γ real and where for γ = 0 the right-hand side is interpreted as exp(−e^{−x}).
Definition 1.1.4 The parameter γ in (1.1.9) is called the extreme value index.
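The family (1.1.9) is easy to evaluate numerically. The following small Python sketch (an illustration, not part of the text) implements G_γ and checks that small γ reproduces the Gumbel case exp(−e^{−x}):

```python
import math

def gev_cdf(x, gamma):
    """Standard extreme value distribution function G_gamma of (1.1.9).

    For gamma != 0: G(x) = exp(-(1 + gamma*x)**(-1/gamma)) on 1 + gamma*x > 0;
    for gamma == 0 the limit exp(-exp(-x)) (the Gumbel case) is used.
    """
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        # Outside the support: G is 0 there for gamma > 0 and 1 for gamma < 0.
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))

print(gev_cdf(1.0, 0.0))    # Gumbel value exp(-exp(-1))
print(gev_cdf(1.0, 1e-8))   # nearly identical for small gamma
```

The continuity in γ at γ = 0 is exactly the interpretation stated after (1.1.9).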
Proof (of Theorem 1.1.3). Let us consider the class of limit functions D in (1.1.8).
First suppose that 1 is a continuity point of D. Then note that for continuity points
x > 0,

  lim_{t→∞} (U(tx) − U(t))/a(t) = D(x) − D(1) =: E(x) .  (1.1.10)
Take y > 0 and write

  (U(txy) − U(t))/a(t) = ((U(txy) − U(ty))/a(ty)) · (a(ty)/a(t)) + (U(ty) − U(t))/a(t) .  (1.1.11)

Choose sequences t₁(n) → ∞ and t₂(n) → ∞ such that the limits
A_i := lim_{n→∞} a(t_i(n) y)/a(t_i(n)) and B_i := lim_{n→∞} (U(t_i(n) y) − U(t_i(n)))/a(t_i(n)) exist. Then

  E(xy) = E(x) A_i + B_i ,  (1.1.12)

i = 1, 2, for all x continuity points of E(·) and E(· y). For an arbitrary x take a
sequence of continuity points x_n with x_n ↑ x (n → ∞). Then E(x_n y) → E(xy)
and E(x_n) → E(x), since E is left-continuous. Hence (1.1.12) holds for all x and y
positive. Subtracting the expressions for i = 1, 2 from each other, one obtains

  E(x)(A₁ − A₂) = B₂ − B₁ .  (1.1.13)

Since G is nondegenerate, E cannot be constant; hence A₁ = A₂ and consequently
B₁ = B₂, i.e., the limits

  A(y) := lim_{t→∞} a(ty)/a(t)  and  H(y) := lim_{t→∞} (U(ty) − U(t))/a(t)

exist for all y > 0, and

  E(xy) = E(x) A(y) + H(y)  (1.1.14)

for all positive x and y, with H(y) = E(y) at continuity points y. Write Q(s) := E(e^s);
then Q(0) = E(1) = 0, and (1.1.14) becomes

  Q(t + s) = Q(t) A(e^s) + Q(s) ,  (1.1.15)

and by (1.1.15), interchanging the roles of s and t,

  Q(t + s) − Q(t) = Q(s) A(e^t) .  (1.1.16)

Since Q is monotone and nonconstant, it is differentiable at some point; dividing
(1.1.16) by s and letting s → 0 shows that Q′(0) := lim_{s→0} Q(s)/s exists, is
nonzero, and that Q is differentiable everywhere with Q′(t) = Q′(0) A(e^t). From
(1.1.15), A(e^{t+s}) = A(e^t) A(e^s); hence, with

  γ := (log A(e^t))′(0) ,  A(e^t) = e^{γt} ,

and (since Q(0) = 0)

  Q(t) = Q′(0) ∫₀ᵗ e^{γs} ds = Q′(0) (e^{γt} − 1)/γ ,

where for γ = 0 the right-hand side is interpreted as Q′(0) t. Hence E(t) = Q(log t) =
Q′(0)(t^γ − 1)/γ, and

  D(t) = D(1) + Q′(0) (t^γ − 1)/γ .  (1.1.17)

Since D(x) = G^←(e^{−1/x}) (cf. (1.1.6)), inversion gives

  1/(−log G(x)) = D^←(x) = (1 + γ (x − D(1))/Q′(0))^{1/γ} ,

i.e.,

  −log G(x) = (1 + γ (x − D(1))/Q′(0))^{−1/γ} .  (1.1.18)

Combining (1.1.17) and (1.1.18) we obtain the statement of the theorem:
G(D(1) + Q′(0) x) = exp(−(1 + γx)^{−1/γ}) = G_γ(x).
If 1 is not a continuity point of D, follow the proof with the function U(t x₀), with
x₀ a continuity point of D.
[Figure: graphs of the extreme value distribution functions G_γ for several values of γ.]
The class of extreme value distributions contains the three classical types. For γ > 0
(with α := 1/γ) one obtains, after a transformation of location and scale, the Fréchet class

  Φ_α(x) = 0 ,  x ≤ 0 ;  Φ_α(x) = exp(−x^{−α}) ,  x > 0 ;

for γ = 0 the Gumbel class exp(−e^{−x}); and for γ < 0 (with α := −1/γ) the reverse
Weibull class

  Ψ_α(x) = exp(−(−x)^α) ,  x < 0 ;  Ψ_α(x) = 1 ,  x ≥ 0 .
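As a numerical sanity check (an illustration, not from the text), one can verify that the Fréchet and reverse Weibull forms are indeed functions G_γ(ax + b) as in Theorem 1.1.3: with γ = 1/α one has Φ_α(x) = G_{1/α}(α(x − 1)), and with γ = −1/α one has Ψ_α(x) = G_{−1/α}(α(x + 1)):

```python
import math

def gev_cdf(x, gamma):
    # Standard G_gamma from (1.1.9); Gumbel limit for gamma == 0.
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    z = 1.0 + gamma * x
    if z <= 0.0:
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-z ** (-1.0 / gamma))

alpha = 2.0

x = 1.7
frechet = math.exp(-x ** (-alpha))       # Frechet class Phi_alpha, x > 0
assert abs(frechet - gev_cdf(alpha * (x - 1.0), 1.0 / alpha)) < 1e-12

xw = -0.4
weibull = math.exp(-((-xw) ** alpha))    # reverse Weibull class Psi_alpha, x < 0
assert abs(weibull - gev_cdf(alpha * (xw + 1.0), -1.0 / alpha)) < 1e-12
```

So the three classical families differ from the one-parameter family G_γ only by location and scale.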
Theorem 1.1.6 For γ ∈ ℝ the following statements are equivalent:

1. There exist constants a_n > 0 and b_n real such that

  lim_{n→∞} Fⁿ(a_n x + b_n) = G_γ(x) = exp(−(1 + γx)^{−1/γ})  (1.1.19)

for all x with 1 + γx > 0.

2. There is a positive function a such that for x > 0,

  lim_{t→∞} (U(tx) − U(t))/a(t) = (x^γ − 1)/γ ,  (1.1.20)

where for γ = 0 the right-hand side is interpreted as log x.

3. There is a positive function a such that

  lim_{t→∞} t (1 − F(a(t)x + U(t))) = (1 + γx)^{−1/γ}  (1.1.21)

for all x with 1 + γx > 0.

4. There exists a positive function f such that

  lim_{t↑x*} (1 − F(t + x f(t)))/(1 − F(t)) = (1 + γx)^{−1/γ}  (1.1.22)

for all x for which 1 + γx > 0, where x* = sup{x : F(x) < 1}.

Moreover, (1.1.19) holds with b_n := U(n) and a_n := a(n). Also, (1.1.22) holds
with f(t) = a(1/(1 − F(t))).
Proof. The equivalence of (1), (2), and (3) has been established in Theorem 1.1.2.
Next we prove that (2) implies (4).
It is easy to see from the definition of U that for ε > 0 and t < x*,

  U(1/(1 − F(t))) ≤ t < U((1 + ε)/(1 − F(t))) .

Hence, applying (1.1.20) with t replaced by 1/(1 − F(t)),

  0 ≤ (t − U(1/(1 − F(t)))) / a(1/(1 − F(t))) ≤ ((1 + ε)^γ − 1)/γ + o(1) ,  t ↑ x* ,

and since ε > 0 is arbitrary,

  lim_{t↑x*} (t − U(1/(1 − F(t)))) / a(1/(1 − F(t))) = 0 .

Consequently, writing f(t) := a(1/(1 − F(t))) and applying Lemma 1.1.1 to (1.1.20),

  lim_{t↑x*} (1 − F(t + x f(t)))/(1 − F(t)) = (1 + γx)^{−1/γ} ,

i.e., (4) holds.
The converse (i.e., (4) implies (2)) is similar.
Example 1.1.7 Let F be the standard normal distribution function. We are going to prove that
(1.1.3) holds: for all x,

  lim_{n→∞} n (1 − F(a_n x + b_n)) = e^{−x}  (1.1.23)
with

  b_n := (2 log n − log log n − log(4π))^{1/2}  (1.1.24)

and

  a_n := 1/b_n .  (1.1.25)

Note first that b_n/(2 log n)^{1/2} → 1 and hence

  b_n²/2 + log b_n − log n + (1/2) log(2π) → 0  (1.1.26)
for x ∈ ℝ. Hence, substituting u = b_n + v/b_n in the tail integral and using (1.1.26),

  n (1 − F(a_n x + b_n)) = n ∫_{x/b_n + b_n}^∞ (2π)^{−1/2} e^{−u²/2} du
    = exp{ −(b_n²/2 + log b_n − log n + (1/2) log(2π)) } ∫_x^∞ e^{−v²/(2b_n²)} e^{−v} dv
    → ∫_x^∞ e^{−v} dv = e^{−x} ,  n → ∞ ,

by dominated convergence.
exceeding the top of the dike) in a given year is 10⁻⁴. The question is then how high
the dikes should be built to meet this requirement. Storm data have been collected
for more than 100 years. In this period, at the town of Delfzijl, in the northeast of
the Netherlands, 1877 severe wind storms have been identified. The collection of
high-tide water levels at Delfzijl during those storms forms approximately a set of
independent observations, taken under similar conditions (i.e., we may assume that
they are independent and identically distributed).
First we convert the 10⁻⁴ probability to a probability per storm (since that is what
the data give us). Since there are 1877 storms in 111 years, we look for the level that
is exceeded by one such storm with probability (111/1877) × 10⁻⁴. Let F be the
distribution function of the high-tide water level during one such storm. Then we are
looking for the 1 − (111/1877) × 10⁻⁴ quantile, i.e., F^←(1 − (111/1877) × 10⁻⁴) =
U((1877/111) × 10⁴) ≈ U(17 × 10⁴).
Of course this is a simplification of what really goes on: if X is the maximum in
a year and Z_i the maximum during the i-th storm, we have X = max_{1≤i≤N} Z_i with N the
(random) number of storms in a year. We ignore the randomness of N.
Normally one would estimate a quantile by the empirical quantile, that is, one
of the order statistics. But the highest order statistic corresponds in this case to
F^←(1 − 1/1878) ≈ U(19 × 10²). So we need to extrapolate beyond the range
of the available data.
At this point Theorem 1.1.2(3) can help. In view of Theorem 1.1.3 the condition
can be written as

  lim_{t→∞} (U(tx) − U(t))/a(t) = (x^γ − 1)/γ

for x > 0, some real parameter γ, and an appropriate positive function a. Let us
use the approximation with t ≤ 19 × 10² (so that we can estimate U(t) using the
empirical quantile function) and tx := 17 × 10⁴. Then the requested quantile is

  U(17 × 10⁴) ≈ U(t) + a(t) ((17 × 10⁴/t)^γ − 1)/γ .  (1.1.27)
For the moment we just remark that Theorem 1.1.2(3) seems to provide a possibility
to estimate an extreme quantile by fitting the function (x^γ − 1)/γ (in the present case
with γ close to zero) to the quantile-type function U.
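The extrapolation formula (1.1.27) can be illustrated on a distribution for which U is known in closed form. In the sketch below (an illustration; the value γ = 1/2 and the Pareto tail are assumptions, not estimates from the storm data), the quantile function U(s) = s^γ of a Pareto tail 1 − F(x) = x^{−1/γ}, with a(s) = γ s^γ, satisfies (1.1.20) exactly, so the extrapolation reproduces the far quantile:

```python
gamma = 0.5           # hypothetical tail index, chosen for the illustration
t = 19 * 10**2        # "intermediate" level, within the range of the data
tx = 17 * 10**4       # far-tail level to be extrapolated to

U = lambda s: s**gamma          # U(s) for the Pareto tail 1 - F(x) = x**(-1/gamma)
a = lambda s: gamma * s**gamma  # an admissible scale function a(t)

x = tx / t
extrapolated = U(t) + a(t) * (x**gamma - 1.0) / gamma   # right-hand side of (1.1.27)
print(extrapolated, U(tx))   # identical here, since (1.1.20) holds exactly
```

In practice U(t), a(t), and γ have to be estimated from the data, which is the subject of Chapters 3 and 4.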
S&P500
Daily price quotes of the S&P 500 total return index over the period from 01/01/1980 to
14/05/2002 are available, corresponding to 5835 observations. The daily price quotes,
p_t say, are used to compute daily "continuously" compounded returns r_t by taking
the logarithmic first differences of the price series, r_t = log(p_t/p_{t−1}). Stock returns
generally exhibit a positive mean due to positive growth of the economy. Therefore
we shall focus only on the loss returns. We shall assume here that the observations
are independent and identically distributed (cf. Jansen and de Vries (1991)).
14
Now consider the situation in which one has to decide on a big risky investment
while one cannot afford to have a loss larger than a certain amount. Then it is of
interest to know the probability of the occurrence of such a loss.
If F is the distribution function of the log-loss returns and x is the critical (large)
amount, the posed problem is the estimation of 1 − F(x). Then Theorem 1.1.6(3)
suggests that for some positive function a and large x,

  1 − F(x) ≈ (1/t) (1 + γ (x − U(t))/a(t))^{−1/γ}
for some large t. This again motivates a tail probability estimator under the extreme
value theory approach.
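The tail-probability approximation based on Theorem 1.1.6(3) can be checked on a Pareto tail, for which it is exact; the tail index γ = 1/2 below is a hypothetical choice for the illustration, not an estimate from the S&P 500 data:

```python
gamma = 0.5          # hypothetical tail index for the illustration
t = 10**3            # intermediate level

U = lambda s: s**gamma           # Pareto tail: 1 - F(x) = x**(-1/gamma)
a = lambda s: gamma * s**gamma   # admissible scale function

x = 5000.0                        # large loss level
true_tail = x ** (-1.0 / gamma)   # exact 1 - F(x) for this tail
approx = (1.0 / t) * (1.0 + gamma * (x - U(t)) / a(t)) ** (-1.0 / gamma)
print(approx, true_tail)          # coincide for the Pareto tail
```

For a general F in the domain of attraction the two sides agree only asymptotically, for t large and x beyond U(t).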
Life Span
There is some discussion among demographers and physicians about whether there
is limited life span for humans; that is, if we consider the life span of a human as
random, does its probability distribution have a finite endpoint? The problem can be
considered from the point of view of extreme value theory.
A data set consists of the total life span (in days) of all people born in the Netherlands in the years 1877–1881, still alive on January 1, 1971, and who died as a resident
of the Netherlands. This concerns about 10 000 people.
Now we want to decide whether the right endpoint of the distribution is finite.
The endpoint, finite or not, is lim_{t→∞} U(t), which we denote by U(∞). It will be
verified later on (Section 3.7) that for the given data set the hypothesis y < 0 is not
rejected. Moreover, it is known from Section 1.2 below that for y negative we must
have U(oo) < oo. Hence we shall believe that there is an age that cannot be exceeded
by this cohort.
Next we estimate this maximal age U(∞). For this we use the limit relation
(1.1.20) again:

  (U(tx) − U(t))/a(t) → (x^γ − 1)/γ  (t → ∞) ,

but as we shall show later, if γ < 0, this relation is also valid for x = ∞, i.e.,

  (U(∞) − U(t))/a(t) → −1/γ  (t → ∞) ,

or

  U(∞) ≈ U(t) − a(t)/γ  (t → ∞) .
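For a distribution with a known finite endpoint, the approximation U(∞) ≈ U(t) − a(t)/γ can be checked directly. For the uniform distribution on [0, 1] one has U(t) = 1 − 1/t, γ = −1, and one may take a(t) = 1/t, so the formula recovers the endpoint 1 exactly (a toy illustration, not related to the life-span data):

```python
# Uniform(0, 1): F(x) = x on [0, 1], so U(t) = 1 - 1/t, gamma = -1, a(t) = 1/t.
gamma = -1.0
U = lambda t: 1.0 - 1.0 / t
a = lambda t: 1.0 / t

for t in (10.0, 100.0, 1000.0):
    endpoint = U(t) - a(t) / gamma   # endpoint estimate U(inf) ~ U(t) - a(t)/gamma
    print(t, endpoint)               # equals the true endpoint 1 for every t here
```

For a general distribution with γ < 0 the approximation holds only as t → ∞, and U(t), a(t), γ must be estimated.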
  lim_{n→∞} Fⁿ(a_n x + b_n) = G_γ(x)  (1.1.28)

for some given real γ and all x. These conditions, basically due to von Mises (1936),
require the existence of one or two derivatives of F.
It is easy to see, using relation (1.1.6) from Section 1.1.2, that F cannot be in the
domain of attraction of G_{γ₁} and G_{γ₂} with γ₁ ≠ γ₂.
The following theorem states a sufficient condition for belonging to a domain of
attraction. The condition is called von Mises' condition.
Theorem 1.1.8 Let F be a distribution function and x* its right endpoint. Suppose
F″(x) exists and F′(x) is positive for all x in some left neighborhood of x*. If

  lim_{t↑x*} ((1 − F(t))/F′(t))′ = γ ,  (1.1.29)

or equivalently

  lim_{t↑x*} (1 − F(t)) F″(t)/(F′(t))² = −γ − 1 ,  (1.1.30)

then F is in the domain of attraction of G_γ.
Proof (of Theorem 1.1.8). Here, as elsewhere, the proof is much simplified by formulating everything in terms of the inverse function U rather than the distribution
function F. By differentiating the relation

  1/(1 − F(U(t))) = t

we obtain

  U′(t) = (1 − F(U(t)))² / F′(U(t)) = 1/(t² F′(U(t))) .

Differentiating once more, we find that

  U″(t) = −2 (1 − F(U(t)))³ / F′(U(t)) − (1 − F(U(t)))⁴ F″(U(t)) / (F′(U(t)))³ ,

so that

  t U″(t)/U′(t) = −2 − (1 − F(U(t))) F″(U(t))/(F′(U(t)))² → −2 − (−γ − 1) = γ − 1  (1.1.31)

as t → ∞, by (1.1.30).
16
,.
t U"(t)
j^Hds,
U'(S)
JXQ
= 0.
logjr
'->fl<*<*
Hence also, since \es el \ < c \s t | on a compact interval for some positive constant c,
/'(/*)
r^-l
lim sup
0.
tf'(0
This implies that
t/ftjc) - l/(Q
xy - 1
-H^-i*
converges to zero.
Corollary 1.1.10 Condition (1.1.29) is equivalent to

  lim_{t→∞} t U″(t)/U′(t) = γ − 1 ,  (1.1.32)

which implies

  lim_{t→∞} U′(tx)/U′(t) = x^{γ−1}  (1.1.33)

for x > 0, and hence

  lim_{t→∞} (U(tx) − U(t))/(t U′(t)) = (x^γ − 1)/γ .
Theorem 1.1.11 Let F be a distribution function. If F′ exists and

  lim_{t→∞} t F′(t)/(1 − F(t)) = 1/γ  (1.1.34)

for some positive γ, then F is in the domain of attraction of G_γ.
Proof. As in the proof of Theorem 1.1.8 we see that condition (1.1.34) is equivalent
to

  lim_{t→∞} t U′(t)/U(t) = γ .  (1.1.35)

Further,

  log U(tx) − log U(t) = ∫₁ˣ ( ts U′(ts)/U(ts) ) ds/s → γ log x ,  t → ∞ ,  (1.1.36)

or

  lim_{t→∞} (U(tx) − U(t))/(γ U(t)) = (x^γ − 1)/γ ,

so that F is in the domain of attraction of G_γ by Theorem 1.1.6(2) with a(t) := γ U(t).

Theorem 1.1.13 Let F be a distribution function with finite right endpoint x*. If F′
exists in a left neighborhood of x* and

  lim_{t↑x*} (x* − t) F′(t)/(1 − F(t)) = −1/γ  (1.1.37)

for some negative γ, then F is in the domain of attraction of G_γ.

Proof. As before, condition (1.1.37) is equivalent to

  lim_{t→∞} t U′(t)/(U(∞) − U(t)) = −γ .  (1.1.38)
Since

  log(U(∞) − U(tx)) − log(U(∞) − U(t)) = −∫₁ˣ ( ts U′(ts)/(U(∞) − U(ts)) ) ds/s ,

relation (1.1.38) implies

  lim_{t→∞} (U(∞) − U(tx))/(U(∞) − U(t)) = x^γ ,

hence

  lim_{t→∞} (U(tx) − U(t))/(−γ (U(∞) − U(t))) = (x^γ − 1)/γ ,

so that F is in the domain of attraction of G_γ by Theorem 1.1.6(2) with
a(t) := −γ (U(∞) − U(t)).
To get an idea about the tail behavior of the distribution functions in the various
domains of attraction, note that for x* = ∞ and t > x₁,

  log(1 − F(t)) = log(1 − F(x₁)) − ∫_{x₁}^t ds/f(s)

with f(s) := (1 − F(s))/F′(s). Von Mises' condition (1.1.34) states that

  lim_{t→∞} f(t)/t = γ ;

hence

  lim_{t→∞} log(1 − F(t))/log t = −1/γ ,

i.e., for γ > 0 the function 1 − F(t) behaves roughly like t^{−1/γ}, which means a
heavy tail. For γ = 0, however,

  lim_{t→∞} t^a (1 − F(t)) = lim_{t→∞} x₁^a (1 − F(x₁)) exp{ −∫_{x₁}^t (s/f(s) − a) ds/s } = 0

for all a > 0, and hence the tail is light. Similar reasoning reveals that for γ < 0 (in
which case necessarily x* < ∞, as we shall see later on in Theorem 1.2.1),

  lim_{t↑x*} log(1 − F(t))/log(x* − t) = −1/γ ,

so that the function 1 − F(x* − t) behaves roughly like t^{−1/γ} as t ↓ 0.
The reader may want to verify that the Cauchy distribution satisfies Theorem
1.1.11 with γ = 1 (Exercise 1.6); the normal, exponential, and any gamma distribution
satisfy Theorem 1.1.8 with γ = 0 (Exercise 1.7); and a beta(μ, ν) distribution satisfies
Theorem 1.1.13 with γ = −μ⁻¹ (Exercise 1.8).
Remark 1.1.15 Conditions (1.1.29) for γ = 0 and (1.1.34) for γ > 0 are due to von
Mises. Sometimes a condition, much less general in the case γ = 0, is referred to as
von Mises' condition (cf., e.g., Falk, Hüsler, and Reiss (1994), Theorem 2.1.2).
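Von Mises' condition (1.1.34) is easy to verify numerically for the Cauchy distribution, for which F′(t) = 1/(π(1 + t²)) and 1 − F(t) = 1/2 − arctan(t)/π; the ratio tF′(t)/(1 − F(t)) should approach 1/γ = 1 (a small check, not part of the text):

```python
import math

def cauchy_density(t):
    # F'(t) for the standard Cauchy distribution.
    return 1.0 / (math.pi * (1.0 + t * t))

def cauchy_tail(t):
    # 1 - F(t) for the standard Cauchy distribution.
    return 0.5 - math.atan(t) / math.pi

# t F'(t) / (1 - F(t)) should approach 1/gamma = 1 (Theorem 1.1.11 with gamma = 1):
for t in (10.0, 100.0, 1000.0):
    print(t, t * cauchy_density(t) / cauchy_tail(t))
```

The printed ratios approach 1 at rate of order t⁻², consistent with the expansion 1 − F(t) = (πt)⁻¹(1 − t⁻²/3 + …).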
19
t-+oo
(1.2.1)
a(t)
for all x > 0, where y is a real parameter and a a suitable positive function.
We prove the following results.
Theorem 1.2.1 The distribution function F is in the domain of attraction of the
extreme value distribution G_γ if and only if

1. for γ > 0: x* = sup{x : F(x) < 1} is infinite and

  lim_{t→∞} (1 − F(tx))/(1 − F(t)) = x^{−1/γ}  (1.2.2)

for all x > 0. This means that the function 1 − F is regularly varying at infinity
with index −1/γ; see Appendix B;

2. for γ < 0: x* is finite and

  lim_{t↓0} (1 − F(x* − tx))/(1 − F(x* − t)) = x^{−1/γ}  (1.2.3)

for all x > 0;

3. for γ = 0 (here x* may be finite or infinite):

  lim_{t↑x*} (1 − F(t + x f(t)))/(1 − F(t)) = e^{−x}  (1.2.4)

for all real x, with the positive function

  f(t) := ∫_t^{x*} (1 − F(s)) ds / (1 − F(t)) .  (1.2.5)
Theorem 1.2.2 The distribution function F is in the domain of attraction of the
extreme value distribution G_γ if and only if

1. for γ > 0: F(x) < 1 for all x, ∫_1^∞ (1 − F(x))/x dx < ∞, and

  lim_{t→∞} ( ∫_t^∞ (1 − F(x)) dx/x ) / (1 − F(t)) = γ ;  (1.2.6)

2. for γ < 0: there is x* < ∞ such that ∫_0^1 (1 − F(x* − x))/x dx < ∞ and

  lim_{t↓0} ( ∫_0^t (1 − F(x* − x)) dx/x ) / (1 − F(x* − t)) = −γ ;  (1.2.7)

3. for γ = 0 (here the right endpoint x* may be finite or infinite): ∫_x^{x*} ∫_t^{x*} (1 −
F(s)) ds dt < ∞ and the function h defined by

  h(x) := (1 − F(x)) ∫_x^{x*} ∫_t^{x*} (1 − F(s)) ds dt / ( ∫_x^{x*} (1 − F(s)) ds )²  (1.2.8)

satisfies

  lim_{x↑x*} h(x) = 1 .  (1.2.9)

In fact,

  ( ∫_t^∞ (1 − F(x)) dx/x ) / (1 − F(t)) = E( log X − log t | X > t ) ,

since

  ∫_t^∞ (log x − log t) dF(x) = ∫_t^∞ (1 − F(x)) dx/x .

Relation (1.2.6) will be the basis for the construction of the Hill estimator of γ (cf.
Section 3.2). Similarly, (1.2.7) can be interpreted as

  lim_{t↓0} E( log(x* − X) − log t | X > x* − t ) = γ ,

which will be the basis for the construction of the negative Hill estimator (Section
3.6.2), and (1.2.9) is equivalent to

  lim_{t↑x*} E( (X − t)² | X > t ) / ( E( X − t | X > t ) )² = 2 .
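Relation (1.2.6), read as E(log X − log t | X > t) → γ, can be illustrated by simulation. The sketch below (an illustration of the idea behind the Hill estimator, not the estimator as defined in Section 3.2) averages log-excesses of a simulated Pareto-type sample over a high empirical threshold:

```python
import random, math

random.seed(42)
gamma = 0.5
n = 100_000
# Pareto-type sample: X = V**(-gamma) with V uniform(0,1) has 1 - F(x) = x**(-1/gamma).
sample = [random.random() ** (-gamma) for _ in range(n)]

t = sorted(sample)[n - n // 100]      # an intermediate threshold (99% empirical quantile)
excesses = [math.log(x) - math.log(t) for x in sample if x > t]
hill = sum(excesses) / len(excesses)  # empirical version of E(log X - log t | X > t)
print(hill)                           # should be close to gamma = 0.5
```

For this exact Pareto tail the conditional log-excesses are exponentially distributed with mean γ, so the average is an unbiased estimate; for general F in the domain of attraction the bias vanishes only as the threshold grows with the sample size.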
21
(-x-l/y)
\
fory=0:
lim Fn(anx + bn) = exp (e x)
holds for all x with
bn := /(n),
tfra/ / as in Theorem 1.2.1(3).
We reformulate Theorem 1.2.1 in a seemingly more uniform way.

Theorem 1.2.5 The distribution function F is in the domain of attraction of the
extreme value distribution G_γ if and only if for some positive function f,

  lim_{t↑x*} (1 − F(t + x f(t)))/(1 − F(t)) = (1 + γx)^{−1/γ}  (1.2.10)

for all x with 1 + γx > 0. If (1.2.10) holds for some f > 0, then it also holds with

  f(t) = γ t ,  γ > 0 ,
  f(t) = −γ (x* − t) ,  γ < 0 ,
  f(t) = ∫_t^{x*} (1 − F(x)) dx / (1 − F(t)) ,  γ = 0 .

Moreover, any f for which (1.2.10) holds satisfies

  lim_{t→∞} f(t)/t = γ ,  γ > 0 ,
  lim_{t↑x*} f(t)/(x* − t) = −γ ,  γ < 0 ,  (1.2.11)
  lim_{t↑x*} f(t)/t = 0 if x* = ∞ and lim_{t↑x*} f(t)/(x* − t) = 0 if x* < ∞ ,  γ = 0 .
Theorem 1.2.6 The distribution function F is in 𝒟(G_γ) if and only if there exist
positive functions c and f, f continuous, such that for all t ∈ (t₀, x*), t₀ < x*,

  1 − F(t) = c(t) exp{ −∫_{t₀}^t ds/f(s) } ,

where lim_{t↑x*} c(t) = c ∈ (0, ∞), and

  lim_{t→∞} f(t)/t = γ ,  γ > 0 ,
  lim_{t↑x*} f(t)/(x* − t) = −γ ,  γ < 0 ,
  lim_{t↑x*} f′(t) = 0 and lim_{t↑x*} f(t) = 0 if x* < ∞ ,  γ = 0 .
Remark 1.2.7 The auxiliary functions f in Theorems 1.2.5 and 1.2.6 are asymptotically the same. If von Mises' condition is satisfied for γ = 0, then we can take
f(t) = (1 − F(t))/F′(t).

Remark 1.2.8 Note that F₀(t) := max(0, 1 − c exp(−∫_{t₀}^t f⁻¹(s) ds)) is a probability distribution function and that

  1 − F(t) ~ 1 − F₀(t) ,  t ↑ x* .  (1.2.12)

It follows that for any F ∈ 𝒟(G_γ) there exists a distribution function F₀ with (1.2.12)
such that F₀ satisfies von Mises' condition of Theorem 1.1.8 (for γ = 0), Theorem
1.1.11 (for γ > 0), or Theorem 1.1.13 (for γ < 0).
In order to prove the results of Theorems 1.2.1–1.2.6 we are going to study relation
(1.2.1) for the inverse function U first.

Lemma 1.2.9 Suppose (1.2.1) holds.

1. If γ > 0, then lim_{t→∞} U(t) = ∞ and

  lim_{t→∞} U(t)/a(t) = 1/γ .  (1.2.13)

2. If γ < 0, then lim_{t→∞} U(t) < ∞ and, with U(∞) := lim_{t→∞} U(t),

  lim_{t→∞} (U(∞) − U(t))/a(t) = −1/γ .  (1.2.14)

In particular this implies that lim_{t→∞} a(t) = 0.

3. If γ = 0, then

  lim_{t→∞} U(tx)/U(t) = 1  (1.2.15)

for all x > 0, and lim_{t→∞} a(t)/U(t) = 0. Moreover, if U(∞) < ∞,

  lim_{t→∞} (U(∞) − U(tx))/(U(∞) − U(t)) = 1  (1.2.16)

for x > 0, and lim_{t→∞} a(t)/(U(∞) − U(t)) = 0. Further,

  lim_{t→∞} a(tx)/a(t) = 1  (1.2.17)

for x > 0.
Corollary 1.2.10

1. For γ > 0, relation (1.2.1) is equivalent to

  lim_{t→∞} U(tx)/U(t) = x^γ  for x > 0 .  (1.2.18)

2. For γ < 0, relation (1.2.1) is equivalent to U(∞) < ∞ and

  lim_{t→∞} (U(∞) − U(tx))/(U(∞) − U(t)) = x^γ  for x > 0 .  (1.2.19)

Furthermore, suppose F₁ and F₂ are distribution functions, with inverse functions U₁
and U₂ (as above), such that

  lim_{t→∞} (1 − F₁(t))/(1 − F₂(t)) = 1 .

If U₁ satisfies (1.2.1), then (1) F₂ is in the same domain of attraction, and (2)

  lim_{s→∞} (U₂(s) − U₁(s))/a(s) = 0 .  (1.2.20)

Indeed, for ε > 0 and s sufficiently large,

  U₁((1 − ε)s) ≤ U₂(s) ≤ U₁((1 + ε)s) ,

hence

  (U₁((1 − ε)s) − U₁(s))/a(s) ≤ (U₂(s) − U₁(s))/a(s) ≤ (U₁((1 + ε)s) − U₁(s))/a(s) .

The left- and right-hand sides converge respectively to ((1 − ε)^γ − 1)/γ and
((1 + ε)^γ − 1)/γ, which tend to zero as ε ↓ 0. Hence statement (2) has been proved.
The converse is similar.
Proof (of Lemma 1.2.9 and Corollary 1.2.10). We prove the assertions for γ > 0.
The proof of the other assertions is similar.
It is easy to see that if (1.2.1) and (1.2.13) hold, (1.2.18) is true, and that (1.2.18)
implies the other two.
Note that (1.2.1) implies, for x, y > 0, via the identity

  (U(txy) − U(t))/a(t) = ((U(txy) − U(tx))/a(tx)) (a(tx)/a(t)) + (U(tx) − U(t))/a(t) ,

that

  lim_{t→∞} a(tx)/a(t) = ( ((xy)^γ − 1)/γ − (x^γ − 1)/γ ) / ( (y^γ − 1)/γ ) = x^γ .

Now fix Z > 1 and ε ∈ (0, 1), with ε so small that Z^γ(1 − ε) > 1. By (1.2.1) and
the previous limit, for k sufficiently large, say k ≥ n₀,

  Z^γ (1 − ε) < (U(Z^{k+1}) − U(Z^k)) / (U(Z^k) − U(Z^{k−1})) < Z^γ (1 + ε) .  (1.2.21)

Hence, for N > n₀,

  U(Z^N) ≥ U(Z^{n₀}) + (U(Z^{n₀}) − U(Z^{n₀−1})) Σ_{n=n₀}^{N−1} (Z^γ(1 − ε))^{n−n₀} → ∞ ,  N → ∞ ,

since Z > 1 and by assumption γ > 0. Hence U(t) → ∞, t → ∞.
In order to prove (1.2.13), add the inequalities (1.2.21) for k = n₀, …, n. Dividing the
result by U(Z^n) and taking the limit as n → ∞ gives

  lim_{n→∞} U(Z^{n+1})/U(Z^n) = Z^γ .  (1.2.22)

Consequently, by (1.2.1) and (1.2.22),

  U(Z^n)/a(Z^n) = ( U(Z^n)/(U(Z^{n+1}) − U(Z^n)) ) · ( (U(Z^{n+1}) − U(Z^n))/a(Z^n) )
    → (1/(Z^γ − 1)) · ((Z^γ − 1)/γ) = 1/γ ,  n → ∞ .

Finally, for arbitrary t take n(t) with Z^{n(t)} ≤ t < Z^{n(t)+1}; by the monotonicity
of U, relation (1.2.22), and a(tx)/a(t) → x^γ, both U(t)/U(Z^{n(t)}) and a(t)/a(Z^{n(t)})
are, for Z close to 1, arbitrarily close to 1, so that (1.2.13) follows by letting Z ↓ 1.
Then (1.2.18) follows from (1.2.1) and (1.2.13).
Proof (of Theorem 1.2.1 for γ ≠ 0). We prove the theorem for γ > 0. The proof for
γ < 0 is similar. From the definition of the inverse function U one sees that for
any ε > 0,

  U((1 − ε)/(1 − F(t))) ≤ t ≤ U((1 + ε)/(1 − F(t))) .  (1.2.25)

Suppose (1.2.1) holds, i.e., we have (1.2.18). Dividing U(x/(1 − F(t))) by the right-
and left-hand sides of (1.2.25) and applying (1.2.18), we see that t⁻¹ U(x/(1 − F(t)))
is asymptotically between (x/(1 + ε))^γ and (x/(1 − ε))^γ. Hence, since the
relation holds for all ε > 0, it implies

  lim_{t→∞} t⁻¹ U(x/(1 − F(t))) = x^γ .  (1.2.26)

Applying Lemma 1.1.1 to (1.2.26) we get

  lim_{t→∞} (1 − F(tx))/(1 − F(t)) = x^{−1/γ} ,

i.e., (1.2.2). The proof of the converse implication is similar and is left to the reader.
Proof (of Theorem 1.2.2 for γ ≠ 0). We prove the theorem for γ > 0. The proof for
γ < 0 is similar. First note that by (1.2.2), for any ε > 0 and sufficiently large t,

  (1 − F(te))/(1 − F(t)) ≤ e^{ε − 1/γ} .

Hence

  (1 − F(te^n))/(1 − F(t)) = Π_{k=1}^n (1 − F(te^k))/(1 − F(te^{k−1})) ≤ e^{(ε − 1/γ) n} ,

and for all x > 1,

  (1 − F(tx))/(1 − F(t)) ≤ e^{(ε − 1/γ) [log x]} .

Taking ε < 1/γ, it follows that ∫_t^∞ (1 − F(x))/x dx < ∞ and, by dominated
convergence and (1.2.2),

  lim_{t→∞} ( ∫_t^∞ (1 − F(x)) dx/x ) / (1 − F(t))
    = lim_{t→∞} ∫_1^∞ ( (1 − F(tx))/(1 − F(t)) ) dx/x = ∫_1^∞ x^{−1/γ} dx/x = γ ,

i.e., (1.2.6) holds.
For the converse, define

  a(t) := (1 − F(t)) / ∫_t^∞ (1 − F(x))/x dx ,

so that by assumption lim_{t→∞} a(t) = 1/γ. Note that

  −log ∫_t^∞ (1 − F(x)) dx/x + log ∫_{t₀}^∞ (1 − F(x)) dx/x = ∫_{t₀}^t a(x) dx/x .

Hence, using the definition of the function a again, we have

  1 − F(t) = a(t) ∫_t^∞ (1 − F(x)) dx/x
    = a(t) ∫_{t₀}^∞ (1 − F(x)) dx/x · exp( −∫_{t₀}^t a(x) dx/x ) .  (1.2.27)

Consequently,

  (1 − F(tx))/(1 − F(t)) = ( a(tx)/a(t) ) exp( −∫_t^{tx} a(s) ds/s ) → x^{−1/γ} ,  t → ∞ ,

which is (1.2.2).
Proof (of Theorem 1.2.5). Since U is the left-continuous inverse of 1/(1 − F), for
ε > 0,

  (1 − F(U(t) + ε f(U(t))))/(1 − F(U(t))) ≤ 1/(t (1 − F(U(t)))) ≤ (1 − F(U(t) − ε f(U(t))))/(1 − F(U(t))) .

Combining this with (1.2.10), it follows that

  lim_{t→∞} t (1 − F(U(t) + x f(U(t)))) = (1 + γx)^{−1/γ} .  (1.2.28)

Now, Theorem 1.1.6 tells us that F ∈ 𝒟(G_γ). This proves the first part of the theorem.
The second statement just rephrases Theorem 1.2.1. Relation (1.2.11) for γ ≠ 0
follows from the easily established fact that if (1.2.10) holds for f = f₁ and f = f₂,
then lim_{t↑x*} f₁(t)/f₂(t) = 1. For the case γ = 0 use Theorem 1.2.6.
Proof (of Theorem 1.2.6 for γ ≠ 0). For the "if" part just check directly that (1.2.2)
or (1.2.3) of Theorem 1.2.1 is satisfied. Next suppose F ∈ 𝒟(G_γ) for γ > 0. Write

  a(t) := (1 − F(t)) / ∫_t^∞ (1 − F(x))/x dx .

Note that

  ( −log ∫_t^∞ (1 − F(x)) dx/x )′ = a(t)/t ,

hence

  1 − F(t) = a(t) ∫_{t₀}^∞ (1 − F(x)) dx/x · exp( −∫_{t₀}^t (a(s)/s) ds ) ,

which is the representation with c(t) := a(t) ∫_{t₀}^∞ (1 − F(x)) dx/x and f(t) := t/a(t).
Theorem 1.2.2 states that lim_{t→∞} a(t) = 1/γ; hence the representation holds with
lim_{t→∞} c(t) ∈ (0, ∞) and lim_{t→∞} f(t)/t = γ. The proof for
γ < 0 is similar.
28
For the proof of Theorems 1.2.1, 1.2.2, and 1.2.6 with y = 0 we need some
additional lemmas.
Lemma 1.2.14 Suppose that for all x > 0, (1.2.1) holds with γ = 0, i.e.,

\[ \lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)} = \log x , \]

where a is a suitable positive function. Then for all ε > 0 there exist c > 0, t₀ > 1
such that for x ≥ 1, t ≥ t₀,

\[ \frac{U(tx)-U(t)}{a(t)} \le c\,x^{\varepsilon} . \]

Proof. For all Z > 1, there exists t₀ > 1 such that for t ≥ t₀,

\[ \frac{U(te)-U(t)}{a(t)} < Z \qquad\text{and}\qquad \frac{a(te)}{a(t)} < Z \]

(use (1.2.17) for the last inequality). For n = 1, 2, ... and t ≥ t₀,

\[ \frac{U(te^{n})-U(t)}{a(t)} = \sum_{k=1}^{n}\frac{U(te^{k})-U(te^{k-1})}{a(te^{k-1})}\,\prod_{r=1}^{k-1}\frac{a(te^{r})}{a(te^{r-1})} \le \sum_{k=1}^{n}Z\cdot Z^{\,k-1} \le n\,Z^{n} . \]

Hence for x ≥ 1, t ≥ t₀, with n := [\log x]+1,

\[ \frac{U(tx)-U(t)}{a(t)} \le n\,Z^{n} \le (\log x+2)\,Z\,x^{\log Z} \le \frac{2Z}{\log Z}\,x^{2\log Z} \]

(use for the last inequality a + 2 ≤ 2(log Z)^{−1}Z^{a} for any a > 0 and 1 < Z < e).
Choosing Z := e^{ε/2} gives the statement with c := 2Z/log Z.
Corollary 1.2.15 If (1.2.1) holds for γ = 0, then ∫₁^∞ U(s)/s² ds < ∞ and

\[ \lim_{t\to\infty}\frac{U_0(t)-U(t)}{a(t)} = 0 \]

with

\[ U_0(t) := \frac te\int_{t/e}^{\infty}U(s)\,\frac{ds}{s^{2}} = \int_1^{\infty}U(st/e)\,\frac{ds}{s^{2}} . \]

Note that U₀ is continuous and strictly increasing.

Proof.

\[ \frac{U_0(t)-U(t)}{a(t)} = \int_1^{\infty}\frac{U(st/e)-U(t)}{a(t)}\,\frac{ds}{s^{2}} . \]

We can now apply Lebesgue's theorem on dominated convergence: (1.2.1) gives the
pointwise convergence and Lemma 1.2.14 the uniform bound, and the pointwise limit
log(s/e) = log s − 1 satisfies ∫₁^∞(log s − 1)/s² ds = 0. In particular we have
∫₁^∞ U(s)/s² ds < ∞.
Proof (of Theorem 1.2.1 for γ = 0). We have proved already in Theorem 1.1.6 that
(1.2.4) is necessary and sufficient for the domain of attraction. Since the function F
is monotone and e^{−x} is continuous, relation (1.2.4) holds locally uniformly. Hence,
in order to prove that (1.2.4) holds with the function f from (1.2.5), it is sufficient to
prove that if (1.2.4) holds, then it also holds with f replaced by

\[ \frac{\int_t^{x^*}\bigl(1-F(s)\bigr)\,ds}{1-F(t)} . \]

Take U₀ and F₀ from Corollaries 1.2.15 and 1.2.16. Note that (1.2.4) holds with
F replaced by F₀, i.e.,

\[ \lim_{t\uparrow x^*}\frac{1-F_0\bigl(t+xf(t)\bigr)}{1-F_0(t)} = e^{-x} . \]

Since also (use l'Hôpital's rule)

\[ \lim_{t\uparrow x^*}\frac{\int_t^{x^*}\bigl(1-F_0(s)\bigr)\,ds}{1-F_0(t)}\bigg/\frac{\int_t^{x^*}\bigl(1-F(s)\bigr)\,ds}{1-F(t)} = 1 , \]

it is sufficient to prove the statement for F₀ rather than for F.
By dominated convergence, using the inequality from Lemma 1.2.14, we have

\[ \lim_{z\to\infty}\int_1^{\infty}\frac{U_0(zx)-U_0(z)}{a(z)}\,\frac{dx}{x^{2}} = \int_1^{\infty}\log x\,\frac{dx}{x^{2}} = 1 , \]

which translates into the desired statement for ∫_t^{x*}(1−F₀(u)) du.
Proof (of Theorem 1.2.2 for γ = 0). Clearly relation (1.2.4) implies

\[ \lim_{t\uparrow x^*}\bigl(t+xf(t)\bigr) = x^* \qquad\text{for all }x . \]

Now we replace the running variable t in (1.2.4) by t′ + yf(t′) (t′ ↑ x*) and get

\[ \lim_{t'\uparrow x^*}\frac{1-F\bigl((t'+yf(t'))+xf(t'+yf(t'))\bigr)}{1-F\bigl(t'+yf(t')\bigr)}\cdot\frac{1-F\bigl(t'+yf(t')\bigr)}{1-F(t')} = e^{-x}\,e^{-y} , \]

that is,

\[ \lim_{t'\uparrow x^*}\frac{1-F\bigl(t'+yf(t')+xf(t'+yf(t'))\bigr)}{1-F(t')} = e^{-x}\,e^{-y} . \]  (1.2.31)

Also, directly from (1.2.4),

\[ \lim_{t'\uparrow x^*}\frac{1-F\bigl(t'+(x+y)f(t')\bigr)}{1-F(t')} = e^{-x-y} . \]  (1.2.32)

Keep in mind that the convergence in (1.2.4) is locally uniform. It then follows
from (1.2.31) and (1.2.32) that

\[ \lim_{t'\uparrow x^*}\frac{f\bigl(t'+yf(t')\bigr)}{f(t')} = 1 \]  (1.2.33)

for all y (formally this is proved by contradiction: suppose that for some sequence
t′_n ↑ x* the limit in (1.2.33) equals c ∈ [0, ∞], c ≠ 1; then (1.2.31) cannot be true).
This holds in particular for the function f from (1.2.5), i.e.,

\[ \lim_{t\uparrow x^*}\frac{\int_{t+xf(t)}^{x^*}\bigl(1-F(s)\bigr)\,ds}{\int_t^{x^*}\bigl(1-F(s)\bigr)\,ds}\cdot\frac{1-F(t)}{1-F\bigl(t+xf(t)\bigr)} = 1 . \]  (1.2.34)

Hence the distribution function F₁ defined for t₀ < t < x* by

\[ 1-F_1(t) := \frac{\int_t^{x^*}\bigl(1-F(s)\bigr)\,ds}{\int_{t_0}^{x^*}\bigl(1-F(s)\bigr)\,ds} \]

satisfies

\[ \lim_{t\uparrow x^*}\frac{1-F_1\bigl(t+xf(t)\bigr)}{1-F_1(t)} = e^{-x} , \]  (1.2.35)

which tells us by (1.2.4) that F₁ is in the domain of attraction of G₀. But then again by
(1.2.4) and (1.2.5) we must have

\[ \lim_{t\uparrow x^*}\frac{1-F_1\bigl(t+xf_1(t)\bigr)}{1-F_1(t)} = e^{-x} \]  (1.2.36)

with

\[ f_1(t) := \frac{\int_t^{x^*}\bigl(1-F_1(s)\bigr)\,ds}{1-F_1(t)} . \]  (1.2.37)

Since the convergence in (1.2.35) and (1.2.36) is locally uniform, the functions
f and f₁ must be asymptotically equivalent:

\[ f_1(t) \sim f(t) \qquad\text{as } t\uparrow x^* . \]  (1.2.38)

Write

\[ h(t) := \frac{f(t)}{f_1(t)} > 0 , \]  (1.2.39)

so that h(t) → 1, as t ↑ x*,  (1.2.40)

where

\[ 1-F_2(t) := \frac{\bigl(1-F_1(t)\bigr)^{2}}{\int_t^{x^*}\bigl(1-F_1(s)\bigr)\,ds} . \]

Moreover, by (1.2.9),

\[ 1-F_2(t) \sim 1-F(t) . \]

Since the error term ∫_{t₀}^{t}(h(s)−1) ds satisfies

\[ \lim_{t\uparrow x^*}\int_0^{x}\bigl(h\bigl(t+uf_1(t)\bigr)-1\bigr)\,du = 0 \qquad\text{and}\qquad \lim_{t\uparrow x^*}\frac{f_1\bigl(t+xf_1(t)\bigr)}{f_1(t)} = 1 , \]

we conclude that

\[ \lim_{t\uparrow x^*}\frac{1-F_2\bigl(t+xf_1(t)\bigr)}{1-F_2(t)} = e^{-x} . \]  (1.2.42)
Proof (of Theorem 1.2.6 for γ = 0). Suppose F ∈ D(G₀). Define for n = 1, 2, ...
recursively

\[ F_n(t) := \max\left(0,\ 1-\int_t^{x^*}\bigl(1-F_{n-1}(s)\bigr)\,ds\right) \]  (1.2.43)

and F₀(t) := F(t). The integrals are finite: the arguments in the previous proof show
that F_n ∈ D(G₀) for all n. Moreover, we have for all n, as t ↑ x*, Q_{n+1}(t) ~ Q_n(t),
where

\[ Q_n(t) := \frac{1-F_{n+1}(t)}{1-F_n(t)} . \]

Define

\[ 1-F_*(t) := \frac{\bigl(1-F_3(t)\bigr)^{4}}{\bigl(1-F_4(t)\bigr)^{3}} . \]  (1.2.45)

Note that by (1.2.43), for sufficiently large t,

\[ \frac{d}{dt}\bigl\{-\log\bigl(1-F_*(t)\bigr)\bigr\} = 4\,\frac{1-F_2(t)}{1-F_3(t)} - 3\,\frac{1-F_3(t)}{1-F_4(t)} =: \frac{1}{f_*(t)} . \]

Hence

\[ \frac{1}{f_*(t)} = \frac{1-F_3(t)}{1-F_4(t)}\left(4\,\frac{Q_3(t)}{Q_2(t)}-3\right) > 0 \]

and

\[ f_*'(t) = \frac{d}{dt}\left(\frac{4}{Q_2(t)}-\frac{3}{Q_3(t)}\right)^{-1} = -\left(\frac{4}{Q_2(t)}-\frac{3}{Q_3(t)}\right)^{-2}\left(\frac{-4\,Q_2'(t)}{\bigl(Q_2(t)\bigr)^{2}}+\frac{3\,Q_3'(t)}{\bigl(Q_3(t)\bigr)^{2}}\right) . \]  (1.2.46)
It follows that f_* is eventually positive with f_*'(t) → 0, and the representation of
Theorem 1.2.6 follows with

\[ 1-F(t) \sim c\,\bigl(1-F_*(t)\bigr) \qquad\text{and}\qquad f(t) = f_*(t) . \]

To prove the converse, first note that

\[ \lim_{t\to\infty}\frac{f(t)}{t} = 0 \ \text{ if } x^*=\infty , \qquad \lim_{t\uparrow x^*}\frac{f(t)}{x^*-t} = 0 \ \text{ if } x^*<\infty , \]

since if x* = ∞,

\[ \frac{f(t)-f(t_0)}{t} = \frac1t\int_{t_0}^{t}f'(s)\,ds \to 0 , \]

and if x* < ∞,

\[ t+xf(t) \le x^* < \infty \]  (1.2.47)

for sufficiently large t and all real x. Obviously t + xf(t) → x*, t ↑ x*. Next note
that

\[ \frac{f\bigl(t+xf(t)\bigr)-f(t)}{f(t)} = \frac{1}{f(t)}\int_t^{t+xf(t)}f'(s)\,ds = \int_0^{x}f'\bigl(t+sf(t)\bigr)\,ds \to 0 , \]

hence

\[ \lim_{t\uparrow x^*}\frac{f\bigl(t+xf(t)\bigr)}{f(t)} = 1 . \]  (1.2.48)

Finally,

\[ \frac{1-F\bigl(t+xf(t)\bigr)}{1-F(t)} = \frac{c\bigl(t+xf(t)\bigr)}{c(t)}\,\exp\left(-\int_0^{x}\frac{f(t)}{f\bigl(t+sf(t)\bigr)}\,ds\right) \to e^{-x} \]

by (1.2.48) and c(t) → c.
Exercises
1.1. Let f be any nondecreasing function and f^← its right- or left-continuous inverse,
respectively f^←(y) := inf{s : f(s) > y} or f^←(y) := inf{s : f(s) ≥ y}. Check
that:
(a) (f^←)^← = f^− if f^← is the left-continuous inverse, with f^− the left-continuous
version of f.
(b) (f^←)^← = f^+ if f^← is the right-continuous inverse, with f^+ the right-continuous version of f.
(c) f^−(f^←(t)) ≤ t ≤ f^+(f^←(t)), whether f^← is the right- or left-continuous
inverse.
1.2. Verify that G_γ^n(a_n x + b_n) = G_γ(x) = exp(−(1+γx)^{−1/γ}), with 1 + γx > 0,
for a_n = n^γ and b_n = (n^γ − 1)/γ, for all n.
… Show that the beta distribution, with density
f(x) = Γ(μ+ν)(Γ(μ))^{−1}(Γ(ν))^{−1} x^{ν−1}(1−x)^{μ−1},
μ > 0, ν > 0, 0 < x < 1, is in the Weibull domain of attraction with γ = −μ^{−1}.
1.9. Check the domain of attraction conditions for F(x) = e^x, x ≤ 0, and F(x) =
1 − e^{1/x}, x < 0.
1.10. Show that F ∈ D(G_γ), for some real γ, is equivalent to lim_{t→∞}(V(tx) −
V(t))/a(t) = (x^γ − 1)/γ, x > 0, for some positive function a, where V :=
(1/(−log F))^←.
1.11. Suppose F_i ∈ D(G_{γ_i}) for i = 1, 2 and γ₁ < γ₂. Suppose also that the two
distributions have the same right endpoint x*. Show that 1 − F₁(x) = o(1 − F₂(x)),
as x ↑ x*.
1.12. Let F_i ∈ D(G_{γ_i}), i = 1, 2. Show that for 0 < p < 1 the mixture pF₁ +
(1−p)F₂ ∈ D(G_{max(γ₁,γ₂)}) if:
(a) γ₁ ≠ γ₂,
(b) γ₁ = γ₂ ≠ 0.
Can you say something about the case γ₁ = γ₂ = 0?
1.13. Show that if F ∈ D(G_γ), then (for any γ)

\[ \lim_{t\uparrow x^*}\frac{1-F(t)}{\lim_{s\uparrow t}\bigl(1-F(s)\bigr)} = 1 . \]

Conclude that the geometric distribution F(x) = 1 − e^{−[x]}, x ≥ 0, and also the
Poisson distribution are in no domain of attraction.
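The failure of this condition for the geometric-type distribution above is immediate to check numerically; the snippet below (our own illustration, not from the text) evaluates the ratio of the survivor function to its left limit at integer points, where the left limit is e^{−(n−1)} while the value is e^{−n}.

```python
import math

# F(x) = 1 - exp(-[x]): at an integer n the survivor function jumps
# from exp(-(n - 1)) (left limit) down to exp(-n)
ratios = [math.exp(-n) / math.exp(-(n - 1)) for n in range(1, 10)]
print(ratios[0])  # every ratio equals e^{-1}, so the limit is not 1
```

Since the ratio is constantly e^{−1} ≈ 0.368, it cannot tend to 1 along s ↑ ∞, so no domain of attraction is possible.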
1.14. Find a discrete distribution in the domain of attraction of an extreme value
distribution.
1.15. Let X₁, X₂, ... be an i.i.d. sample with distribution function F. Show that
if F is in the domain of attraction of G_γ with γ negative and c ∈ (0, ∞), there
exist constants a_n > 0 such that (max(X₁, ..., X_n) − F^←(1−(cn)^{−1}))/a_n converges in distribution to Y − c^γ/γ, where Y has a distribution function of the type
exp(−(−y)^{−1/γ}), y < 0 (i.e., Weibull type).
1.16. Prove that if F ∈ D(G_γ) and X is a random variable with distribution function
F, then for all −∞ < x < U(∞) (U(∞) is the right endpoint of F),

\[ E\bigl(|X|^{a}\,1_{\{X>x\}}\bigr) < \infty \]

if 0 < a < 1/γ₊, with γ₊ := max(0, γ), and E(|X|^{a} 1_{\{X>x\}}) = ∞ if a > 1/γ₊. Recall
that U(∞) < ∞ if γ < 0 and U(∞) = ∞ if γ > 0.
1.17. Let F(x) := P(X ≤ x) be in D(G_γ) with γ > 0. Let A be a positive random
variable with EA^{1/γ+ε} < ∞ for some ε > 0. Let A and X be independent. Show
that

\[ \lim_{x\to\infty}\frac{P(AX>x)}{P(X>x)} = E\,A^{1/\gamma} . \]

Hint: For t₀ sufficiently large,

\[ P\bigl(AX>x,\ A\le x/t_0\bigr) = \int_0^{x/t_0}P\bigl(X>x/a\bigr)\,dP(A\le a) \le (1+\varepsilon)\,P(X>x)\int_0^{x/t_0}a^{1/\gamma+\varepsilon}\,dP(A\le a) . \]

Take the limits x → ∞ and then ε ↓ 0. Further use P(AX > x, A > x/t₀) ≤ P(A >
x/t₀).
1.18. Show that F(x) = 1 − e^{−x−sin x}, x > 0, is not in any domain of attraction (R.
von Mises).
Hints: Show that lim_{k→∞} n_k(1 − F(x + log n_k)) = e^{−x−sin x} for all x > 0, with
n_k = [e^{2πk}] for k = 1, 2, ..., i.e., lim_{k→∞} U(n_k x) − log n_k = U₁(x), where U :=
(1/(1 − F))^← and U₁ is the inverse of e^{x+sin x}. Now proceed by contradiction.
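The oscillation behind von Mises' counterexample is easy to see numerically; the sketch below (our own illustration, with x = 1 and the subsequences chosen as in the hint) shows that n(1 − F(x + log n)) stabilizes along n_k = [e^{2πk}] but converges to a different value along the π-shifted subsequence, so no single limit exists.

```python
import math

def sf(x):
    # survivor function 1 - F(x) = exp(-x - sin x)
    return math.exp(-x - math.sin(x))

x = 1.0
# along n_k = [e^{2 pi k}] the scaled tail tends to e^{-x - sin x} ...
vals = []
for k in (1, 2, 3):
    n = math.floor(math.exp(2 * math.pi * k))
    vals.append(n * sf(x + math.log(n)))
# ... but along m = [e^{2 pi + pi}] it tends to e^{-x + sin x}
m = math.floor(math.exp(2 * math.pi + math.pi))
other = m * sf(x + math.log(m))
print(vals, other)
```

The first three values cluster near e^{−1−sin 1} ≈ 0.159 while the last is near e^{−1+sin 1} ≈ 0.853, exhibiting the two distinct subsequential limits.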
The domain of attraction condition

\[ \lim_{n\to\infty}F^{n}(a_n x+b_n) = \exp\left(-(1+\gamma x)^{-1/\gamma}\right) \]  (2.1.1)

or, equivalently,

\[ \lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)} = \frac{x^{\gamma}-1}{\gamma} \]  (2.1.2)

for each x > 0, where γ is a real constant called the extreme value index, is designed
to allow convergence in distribution of normalized sample maxima, as in (1.1.1). But
the conditions also imply convergence of other high-order statistics.
Let us start by deriving the result for the exponential distribution. Suppose E₁, E₂, ...
are independent and identically distributed standard exponential random variables and
E_{1,n} ≤ E_{2,n} ≤ ⋯ ≤ E_{n,n} are the nth order statistics. By Rényi's (1953) representation we have for
fixed k ≤ n,

\[ \bigl(E_{1,n},\,E_{2,n},\,\ldots,\,E_{k,n}\bigr) \stackrel{d}{=} \left(\frac{E_1}{n},\ \frac{E_1}{n}+\frac{E_2}{n-1},\ \ldots,\ \frac{E_1}{n}+\frac{E_2}{n-1}+\cdots+\frac{E_k}{n-k+1}\right) \]  (2.1.3)

with E₁, ..., E_k independent and identically distributed standard exponential.
This suggests, and we shall show this later on, that the point process of normalized
lower extreme-order statistics converges to a homogeneous Poisson process.
Next we generalize the result (2.1.3) to the entire domain of attraction, and as
usual, we formulate it for upper order statistics rather than lower ones.
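The representation (2.1.3) is easy to probe by simulation; the sketch below (plain Python, helper names ours) compares a Monte Carlo estimate of E E_{k,n}, the mean of the kth smallest of n standard exponentials, with the value Σ_{j=0}^{k−1} 1/(n−j) implied by Rényi's representation.

```python
import random

def renyi_mean(k, n):
    # E E_{k,n} = sum_{j=0}^{k-1} 1/(n - j), from representation (2.1.3)
    return sum(1.0 / (n - j) for j in range(k))

def mc_mean_kth_smallest(k, n, trials, seed=1):
    # Monte Carlo estimate of the same expectation
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = sorted(rng.expovariate(1.0) for _ in range(n))
        total += sample[k - 1]          # E_{k,n}, the kth smallest
    return total / trials

k, n = 3, 10
print(renyi_mean(k, n))                 # 1/10 + 1/9 + 1/8 = 0.33611...
print(mc_mean_kth_smallest(k, n, 20000))
```

With 20,000 replications the two numbers agree to about two decimal places, consistent with the distributional identity.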
Theorem 2.1.1 Let X₁, X₂, ... be i.i.d. with distribution function F. Suppose F is
in the domain of attraction of G_γ for some γ ∈ ℝ. Let X_{1,n} ≤ X_{2,n} ≤ ⋯ ≤ X_{n,n}
be the nth order statistics. Then with the normalizing constants a_n > 0 and b_n from
(1.1.1) and fixed k ∈ ℕ,

\[ \left(\frac{X_{n,n}-b_n}{a_n},\ \frac{X_{n-1,n}-b_n}{a_n},\ \ldots,\ \frac{X_{n-k+1,n}-b_n}{a_n}\right) \]

converges in distribution to

\[ \left(\frac{\Gamma_1^{-\gamma}-1}{\gamma},\ \frac{\Gamma_2^{-\gamma}-1}{\gamma},\ \ldots,\ \frac{\Gamma_k^{-\gamma}-1}{\gamma}\right) , \qquad \Gamma_i := E_1+E_2+\cdots+E_i , \]

with E₁, ..., E_k i.i.d. standard exponential.

Proof. Let Y_{1,n} ≤ ⋯ ≤ Y_{n,n} be the nth order statistics from the distribution function
1 − 1/y, y ≥ 1, so that (X_{n,n}, ..., X_{n−k+1,n}) =_d (U(Y_{n,n}), ..., U(Y_{n−k+1,n})),
where X₁ has distribution function F. By (2.1.3) and the fact that n(1 − e^{−x/n}) → x,
n → ∞, for x > 0,

\[ \left(\frac{n}{Y_{n,n}},\ \ldots,\ \frac{n}{Y_{n-k+1,n}}\right) \stackrel{d}{\to} \bigl(\Gamma_1,\ \ldots,\ \Gamma_k\bigr) , \]

and hence by (2.1.2) we get the result.

Consider now the sequence of points {(i/n, (X_i − b_n)/a_n)}_{i=1}^{n}
in ℝ₊ × ℝ and define a point process (random measure) N_n as follows: for
each Borel set B ⊂ ℝ₊ × ℝ,

\[ N_n(B) := \#\left\{i\in\{1,\ldots,n\}:\ \left(\frac in,\ \frac{X_i-b_n}{a_n}\right)\in B\right\} . \]
Moreover, consider a Poisson point process N on ℝ₊ × (x_*, x*], where x_* and x*
are the lower and upper endpoints of the distribution function G_γ, with mean measure
ν given by, for 0 < a < b, x_* < c < d ≤ x*,

\[ \nu\bigl([a,b]\times[c,d]\bigr) = (b-a)\left[(1+\gamma c)^{-1/\gamma}-(1+\gamma d)^{-1/\gamma}\right] . \]
The following limit relation holds. For information about point processes see, e.g.,
Jagers (1974).

Theorem 2.1.2 The sequence of point processes N_n converges in distribution to the
Poisson point process N, i.e., for any Borel sets B₁, ..., B_r ⊂ ℝ₊ × (x_*, x*] with
ν(∂B_i) = 0 for i = 1, 2, ..., r,

\[ \bigl(N_n(B_1),\ldots,N_n(B_r)\bigr) \stackrel{d}{\to} \bigl(N(B_1),\ldots,N(B_r)\bigr) . \]

Proof. By Theorem 4.7 of Kallenberg (1983), see also Theorem A.1, p. 309, of Leadbetter, Lindgren, and Rootzén (1983), and Proposition 3.22, p. 156, of Resnick (1987),
it is sufficient to check that for all half-open rectangles I := (x₁, x₂] × (y₁, y₂],

\[ \lim_{n\to\infty}EN_n(I) = EN(I) , \]  (2.1.4)

and that for each B = ∪_{i=1}^{m} I_i, a finite union of half-open rectangles parallel to the
axes,

\[ \lim_{n\to\infty}P\bigl(N_n(B)=0\bigr) = P\bigl(N(B)=0\bigr) . \]  (2.1.5)

Now

\[ EN_n(I) = \sum_{nx_1<i\le nx_2}P\left(y_1<\frac{X_i-b_n}{a_n}\le y_2\right) \to (x_2-x_1)\left[(1+\gamma y_1)^{-1/\gamma}-(1+\gamma y_2)^{-1/\gamma}\right] , \qquad n\to\infty . \]

For relation (2.1.5) note that the rectangles I_i can be taken to be disjoint. In fact, by
the independence of the X_i, i = 1, 2, ..., it is sufficient to consider a set B of disjoint
half-open rectangles with identical first coordinates, i.e., in a vertical strip. Then

\[ P\bigl(N_n(B)=0\bigr) = \prod_{nx_1<i\le nx_2}P\left(\frac{X_i-b_n}{a_n}\notin\bigcup_j\,(y_{j,1},y_{j,2}]\right) \to \exp\bigl(-\nu(B)\bigr) = P\bigl(N(B)=0\bigr) . \]
The result is quite helpful for developing intuition in extreme value theory: the
larger order statistics can be thought of as points of a Poisson point process with mean
measure determined by the extreme value distribution.
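A quick way to see the Poisson mechanism at work is to count exceedances of a high threshold. In the sketch below (a standard Pareto sample, with all parameter choices and helper names ours) the number of observations above the level n/τ is Binomial(n, τ/n), hence approximately Poisson(τ): mean and variance should both be near τ.

```python
import random

def exceedance_counts(n, tau, trials, seed=42):
    # X standard Pareto: 1 - F(x) = 1/x for x > 1, so P(X > n/tau) = tau/n
    rng = random.Random(seed)
    thresh = n / tau
    counts = []
    for _ in range(trials):
        c = sum(1 for _ in range(n) if 1.0 / rng.random() > thresh)
        counts.append(c)
    return counts

counts = exceedance_counts(n=1000, tau=5.0, trials=2000)
mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / len(counts)
print(mean, var)   # both close to tau = 5
```

That mean ≈ variance is the Poisson signature; the top order statistics are then the largest points of this approximately Poisson configuration.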
A clear and useful way to see what convergence in distribution of the point process
means is the following (cf. Appendix A): there exists a sequence of point processes
Ñ, Ñ₁, Ñ₂, ... defined on one sample space such that Ñ =_d N and Ñ_i =_d N_i for
i = 1, 2, ..., and Ñ_i → Ñ a.s., as i → ∞. That is, for every relatively compact set
B whose boundary has zero mass under the limit measure, the number of points in B
under Ñ_i converges to the number of points in B under Ñ. Moreover (note that the
numbers of points will be eventually equal), the position of all the points in B under
Ñ_i will asymptotically coincide with the position of the points in B under Ñ.
\[ \frac{U_{k,n}-b_n}{a_n} \]

is asymptotically standard normal with

\[ b_n := \frac{k-1}{n-1} , \qquad a_n := \sqrt{\frac{b_n(1-b_n)}{n-1}} ; \]

the density of the kth order statistic U_{k,n} from a standard uniform sample is proportional to x^{k−1}(1−x)^{n−k}.
Using Stirling's formula for n! one sees easily that the first factor tends to
(2π)^{−1/2}. Next note that the exponent of the remaining factor, evaluated at b_n + a_n x, equals

\[ -\frac{x^{2}}{2}\bigl(1+o(1)\bigr)+o(1) , \]

so the highest-order terms cancel. The coefficient of −x²/2 is 1 + o(1).
The other terms are of smaller order. Since the sequence of densities converges
pointwise, we have weak convergence of the probability distributions (Scheffé's
theorem).
Hence also

\[ \sqrt k\left(\frac{k}{n\,U_{k+1,n}}-1\right) \]

is asymptotically standard normal. Since X_{n−k,n} =_d U(1/U_{k+1,n}),

\[ \sqrt k\,\frac{X_{n-k,n}-U\left(\frac nk\right)}{\frac nk\,U'\left(\frac nk\right)} \stackrel{d}{=} \sqrt k\int_{n/k}^{1/U_{k+1,n}}\frac{U'(s)}{\frac nk\,U'\left(\frac nk\right)}\,ds . \]

By (1.1.33) and Potter's inequalities (Proposition B.1.9), for n ≥ n₀ and s ≥ 1,

\[ \frac{U'\left(\frac nk\,s\right)}{U'\left(\frac nk\right)} \le (1+\varepsilon)\,s^{\gamma-1+\varepsilon} . \]

Hence

\[ \sqrt k\left|\int_{n/k}^{1/U_{k+1,n}}\frac{U'(s)}{\frac nk\,U'\left(\frac nk\right)}\,ds\right| \le (1+\varepsilon)\,\sqrt k\,\left|\frac{\left(\frac{k}{n\,U_{k+1,n}}\right)^{\gamma+\varepsilon}-1}{\gamma+\varepsilon}\right| , \]

and the right-hand side has the same limit distribution as √k(k/(nU_{k+1,n}) − 1). Since ε > 0 is arbitrary we
find that

\[ \sqrt k\,\frac{X_{n-k,n}-U\left(\frac nk\right)}{\frac nk\,U'\left(\frac nk\right)} \]

has the same limit distribution as √k(k/(nU_{k+1,n}) − 1).
So we see that the normal distribution is a natural limit distribution for intermediate order statistics. As in the case of extreme order statistics, where we made the
connection with point processes, we want to put the present limit result in a wider
framework, which in this case will be convergence toward a Brownian motion. However, for this result we need more than just the domain of attraction condition. One can
consider the domain of attraction condition as a special kind of asymptotic expansion
of U near infinity. For the approximation by Brownian motion, as well as for many
statistical results as we shall see later on, it is very useful to have a higher-order expansion. We call this the second-order condition. This condition will be discussed in the
next section. The extension (or rather analogue) of Theorem 2.2.1 in this framework
will be discussed in Section 2.4.
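The normal limit described above is easy to observe numerically; the following sketch (all parameter choices and helper names ours) simulates √k(k/(nU_{k+1,n}) − 1) for uniform order statistics and checks that its empirical mean and standard deviation are near 0 and 1.

```python
import random

def simulate(n, k, trials, seed=7):
    # draws sqrt(k) * (k / (n * U_{k+1,n}) - 1) repeatedly
    rng = random.Random(seed)
    out = []
    for _ in range(trials):
        u = sorted(rng.random() for _ in range(n))
        u_k1 = u[k]                      # U_{k+1,n} (0-based index k)
        out.append(k ** 0.5 * (k / (n * u_k1) - 1.0))
    return out

z = simulate(n=5000, k=100, trials=1000)
mean = sum(z) / len(z)
sd = (sum((v - mean) ** 2 for v in z) / len(z)) ** 0.5
print(mean, sd)   # approximately 0 and 1
```

Here k is intermediate (k → ∞, k/n → 0), exactly the regime of Theorem 2.2.1.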
Recall the first-order condition: for x > 0,

\[ \lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)} = \frac{x^{\gamma}-1}{\gamma} =: D_{\gamma}(x) . \]  (2.3.1)

Suppose now that for some positive function a and some positive or negative function A,

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-D_{\gamma}(x)}{A(t)} \]  (2.3.2)
exists. The function A could be either positive or negative. Write H for the limit
function. Of course the case H(x) = 0 for all x > 0 is not very informative.
Let us rewrite relation (2.3.2) as follows: for all x > 0,

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-D_{\gamma}(x)}{a_1(t)} = H(x) \]  (2.3.3)

with a as before and a₁ := aA. The first question is, which functions H are possible
limit functions in (2.3.3)?
Note first that when we replace the function a by a + ca₁, for some constant c,
the limit in (2.3.3) changes to

\[ H(x) - c\,D_{\gamma}(x) . \]  (2.3.4)

This means that we can always add a multiple of D_γ to the function H. It follows
that if (2.3.3) holds with H(x) = cD_γ(x), the relation is still not very informative.
So we require that the function H in (2.3.3) not be a multiple of D_γ. In particular, H
should not be identically zero.
Definition 2.3.1 The function U (or the probability distribution connected with it) is
said to satisfy the second-order condition if for some positive function a and some
positive or negative function A with lim_{t→∞} A(t) = 0,

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)} =: H(x) , \qquad x>0 , \]  (2.3.5)

where H is some function that is not a multiple of the function (x^γ − 1)/γ. In particular,
H should not be identically zero. Occasionally we shall refer to the functions a and
A as (respectively) first-order and second-order auxiliary functions.

Remark 2.3.2 Note that the second-order condition implies the domain of attraction
condition.

We have the following results. Proofs are given in the appendix on regular variation, Section B.3.
Theorem 2.3.3 Suppose the second-order condition (2.3.5) holds. Then there exist
constants c₁, c₂ ∈ ℝ and some parameter ρ ≤ 0 such that

\[ H(x) = c_1\int_1^{x}s^{\gamma-1}\int_1^{s}u^{\rho-1}\,du\,ds + c_2\int_1^{x}s^{\gamma+\rho-1}\,ds . \]  (2.3.6)

Moreover,

\[ \lim_{t\to\infty}\frac{\dfrac{a(tx)}{a(t)}-x^{\gamma}}{A(t)} = c_1\,x^{\gamma}\,\frac{x^{\rho}-1}{\rho} \]  (2.3.7)

and

\[ \lim_{t\to\infty}\frac{A(tx)}{A(t)} = x^{\rho} . \]  (2.3.8)

For ρ ≠ 0 the function H can be written as

\[ H(x) = \frac{c_1}{\rho}\bigl(D_{\gamma+\rho}(x)-D_{\gamma}(x)\bigr) + c_2\,D_{\gamma+\rho}(x) , \]  (2.3.9)

for ρ = 0 ≠ γ as

\[ H(x) = \frac{c_1}{\gamma}\bigl(x^{\gamma}\log x - D_{\gamma}(x)\bigr) + c_2\,D_{\gamma}(x) , \]  (2.3.10)

and for ρ = γ = 0 as

\[ H(x) = \frac{c_1}{2}\,(\log x)^{2} + c_2\log x . \]  (2.3.11)
Next we are going to simplify the limit function H by changing the functions a
and a₁ a little, as in (2.3.4). We work out one of the three cases. Suppose ρ ≠ 0, so
(2.3.9) holds. Replace a by a + c₂a₁. Then the limit H changes to

\[ \frac{c_1+\rho c_2}{\rho}\bigl(D_{\gamma+\rho}(x)-D_{\gamma}(x)\bigr) = (c_1+\rho c_2)\int_1^{x}s^{\gamma-1}\int_1^{s}u^{\rho-1}\,du\,ds , \]

which is just the first term in (2.3.6) with c₁ = 1, after replacing A by (c₁ + ρc₂)A. Notice that in the process we may
have changed the positive function a₁ into a negative one. One can simplify relations
(2.3.10) and (2.3.11) as well. We formulate our result.
Corollary 2.3.4 Suppose relation (2.3.3) holds for all x > 0 and the function H is
not a multiple of D_γ. Then there exist (possibly different) functions a, positive, and
a₁, positive or negative, such that

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-D_{\gamma}(x)}{a_1(t)} = \int_1^{x}s^{\gamma-1}\int_1^{s}u^{\rho-1}\,du\,ds \]  (2.3.12)

and

\[ \lim_{t\to\infty}\frac{\dfrac{a(tx)}{a(t)}-x^{\gamma}}{a_1(t)/a(t)} = x^{\gamma}\,\frac{x^{\rho}-1}{\rho} . \]  (2.3.13)

The limit function in (2.3.12) can be evaluated as

\[ \int_1^{x}s^{\gamma-1}\int_1^{s}u^{\rho-1}\,du\,ds = \frac1\rho\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\right) , \]  (2.3.14)

which for the cases γ = 0 and ρ = 0 is understood to be equal to the limit of (2.3.14)
as γ → 0 or ρ → 0, respectively, that is,

\[ \frac1\rho\left(\frac{x^{\rho}-1}{\rho}-\log x\right) , \ \ \rho\ne0=\gamma ; \qquad \frac1\gamma\left(x^{\gamma}\log x-\frac{x^{\gamma}-1}{\gamma}\right) , \ \ \gamma\ne0=\rho ; \qquad \frac12(\log x)^{2} , \ \ \rho=0=\gamma . \]
Corollary 2.3.5 Suppose relation (2.3.3) holds for all x > 0 and the function H
is not a multiple of D_γ. Then there exist functions a*, positive, and A*, positive or
negative, such that

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a^{*}(t)}-D_{\gamma}(x)}{A^{*}(t)} = \Psi_{\gamma,\rho}(x) , \]  (2.3.15)

where

\[ \Psi_{\gamma,\rho}(x) := \begin{cases}\dfrac{x^{\gamma+\rho}-1}{\gamma+\rho} , & \gamma+\rho\ne0 ,\ \rho<0 ,\\ \log x , & \gamma+\rho=0 ,\ \rho<0 ,\\ \dfrac1\gamma\,x^{\gamma}\log x , & \rho=0\ne\gamma ,\\ \tfrac12(\log x)^{2} , & \rho=0=\gamma , \end{cases} \]  (2.3.16)

and where a* and A* are obtained from a and A of (2.3.5) by asymptotically negligible
modifications (in particular A* is a constant multiple of A).

Theorem 2.3.6 Suppose the second-order condition (2.3.5) holds. Then there exist
functions a₀, with a₀(t)/a*(t) − 1 = o(A*(t)), and A₀, with A₀(t) ~ A*(t), as t → ∞,
with the following property: for any ε, δ > 0 there exists t₀ = t₀(ε, δ) such that for
all t, tx ≥ t₀,

\[ \left|\frac{\dfrac{U(tx)-U(t)}{a_0(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A_0(t)}-\Psi_{\gamma,\rho}(x)\right| \le \varepsilon\,x^{\gamma+\rho}\max\bigl(x^{\delta},x^{-\delta}\bigr) \]  (2.3.17)

and

\[ \left|\frac{\dfrac{a_0(tx)}{a_0(t)}-x^{\gamma}}{A_0(t)}-x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\right| \le \varepsilon\,x^{\gamma+\rho}\max\bigl(x^{\delta},x^{-\delta}\bigr) . \]  (2.3.18)
The functions a₀ and A₀ can be given explicitly in terms of U. With

\[ \bar U(t) := U(t)-\frac1t\int_0^{t}U(s)\,ds \]

and, for a suitable constant c,

\[ \tilde U(t) := \begin{cases} U(t)-c\,\dfrac{t^{\gamma}}{\gamma} , & \rho<0 ,\\ -\gamma\,\bigl(U(\infty)-U(t)\bigr) , & \gamma<\rho=0 ,\\ t^{-\gamma}\,U(t) , & \gamma>\rho=0 ,\\ \bar U(t) , & \gamma=\rho=0 , \end{cases} \]

one can take

\[ A_0(t) := \begin{cases} -(\gamma+\rho)\,\dfrac{\tilde U(t)}{a_0(t)} , & \gamma+\rho<0 ,\ \rho<0 ,\\ (\gamma+\rho)\,\dfrac{\tilde U(t)}{a_0(t)} , & \gamma+\rho>0 ,\ \rho<0 ,\\ \dfrac{\tilde U(t)}{a_0(t)} , & \gamma+\rho=0 ,\ \rho<0 ,\\ \dfrac{\tilde U(t)}{U(t)} , & \gamma\ne\rho=0 ,\\ \dfrac{\tilde U(t)}{a_0(t)} , & \gamma=\rho=0 , \end{cases} \]

with a₀ a positive function chosen in accordance with Theorem 2.3.6.
The next corollary is an alternative formulation of the result of the last theorem
that is sometimes useful.

Corollary 2.3.7 Under the conditions of Theorem 2.3.6, with the same functions a₀
and A₀ satisfying, as t → ∞, A₀(t) ~ A*(t) and a₀(t)/a*(t) − 1 = o(A*(t)), for
any ε, δ > 0 there exists t₀ = t₀(ε, δ) such that for all t, tx ≥ t₀,

\[ \left|\frac{\dfrac{U(tx)-b_0(t)}{a_0(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A_0(t)}-\psi_{\gamma,\rho}(x)\right| \le \varepsilon\,x^{\gamma+\rho}\max\bigl(x^{\delta},x^{-\delta}\bigr) , \]  (2.3.19)

where

\[ b_0(t) := \begin{cases} U(t)-\dfrac{a_0(t)\,A_0(t)}{\gamma+\rho} , & \gamma+\rho\ne0 ,\ \rho<0 ,\\ U(t) , & \text{otherwise} , \end{cases} \]

and

\[ \psi_{\gamma,\rho}(x) := \begin{cases}\dfrac{x^{\gamma+\rho}}{\gamma+\rho} , & \gamma+\rho\ne0 ,\ \rho<0 ,\\ \log x , & \gamma+\rho=0 ,\ \rho<0 ,\\ x^{\gamma}\log x , & \rho=0\ne\gamma ,\\ \tfrac12(\log x)^{2} , & \rho=0=\gamma . \end{cases} \]  (2.3.20)
Next we formulate the second-order condition in terms of the distribution function,
rather than in terms of the function U.

Theorem 2.3.8 Suppose (2.3.5) holds. Then for all x with 1 + γx > 0,

\[ \lim_{t\uparrow x^*}\frac{\dfrac{1-F\bigl(t+xf(t)\bigr)}{1-F(t)}-Q_{\gamma}(x)}{\tilde A(t)} = \bigl(Q_{\gamma}(x)\bigr)^{1+\gamma}\,\Psi_{\gamma,\rho}\!\left(\frac{1}{Q_{\gamma}(x)}\right) , \]  (2.3.21)

where f(t) := a(1/(1 − F(t))), Ã(t) := A(1/(1 − F(t))), and Q_γ(x) := (1 +
γx)^{−1/γ}. Conversely, (2.3.21) implies (2.3.5).
For convenience we state the simpler corresponding results for the case γ > 0
separately.

Theorem 2.3.9 Suppose that for some positive γ and some positive or negative function A,

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{A(t)} =: K(x) \]

exists for all x > 0 and K is not identically zero. Then for a possibly different function
A, positive or negative,

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{A(t)} = x^{\gamma}\,\frac{x^{\rho}-1}{\rho} \]  (2.3.22)

for all x > 0, with ρ ≤ 0. Moreover, for any ε, δ > 0 there exists t₀ = t₀(ε, δ) > 1
such that for all t, tx ≥ t₀,

\[ \left|\frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{A_0(t)}-x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\right| \le \varepsilon\,x^{\gamma+\rho}\max\bigl(x^{\delta},x^{-\delta}\bigr) , \]  (2.3.23)

with A₀(t) ~ A(t), t → ∞. Finally, with Ã(t) := A(1/(1 − F(t))),

\[ \lim_{t\to\infty}\frac{\dfrac{1-F(tx)}{1-F(t)}-x^{-1/\gamma}}{\tilde A(t)} = x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho} . \]  (2.3.24)

Remark 2.3.10 For the equivalence of (2.3.22) and (2.3.24) see Exercise 2.11.
Example 2.3.11 The function U(t) = c₀t^γ + c₁, with c₀ and γ positive and c₁ ≠
0, satisfies the second-order condition of Theorem 2.3.9 but not the second-order
condition of Definition 2.3.1.

It is interesting to observe that if (2.3.22) holds with ρ < 0, then for some positive
constant c the function |U(t) − ct^γ| is regularly varying with index γ + ρ. In particular,
U(t) ~ ct^γ, t → ∞. So the second-order condition with ρ < 0 makes the first-order
relation particularly simple.
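A concrete instance: for the hypothetical choice U(t) = t^γ + 1 (that is, c₀ = c₁ = 1) one may take A(t) = −γt^{−γ} and ρ = −γ in (2.3.22); the snippet below (an illustration of ours, not from the text) checks the convergence numerically at a large t.

```python
gamma = 0.5
rho = -gamma                      # second-order parameter for U(t) = t**gamma + 1

def U(t):
    return t ** gamma + 1.0

def A(t):
    # auxiliary function: A(t) = -gamma * t**(-gamma)
    return -gamma * t ** (-gamma)

def lhs(t, x):
    # (U(tx)/U(t) - x**gamma) / A(t), the pre-limit quantity in (2.3.22)
    return (U(t * x) / U(t) - x ** gamma) / A(t)

def rhs(x):
    # limit x**gamma * (x**rho - 1) / rho
    return x ** gamma * (x ** rho - 1.0) / rho

for x in (0.5, 2.0, 10.0):
    print(x, lhs(1e8, x), rhs(x))   # the two columns agree closely
```

Note also that here |U(t) − t^γ| = 1 is slowly varying, i.e., regularly varying with index γ + ρ = 0, matching the remark above.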
Finally, we provide a sufficient second-order condition of von Mises type.

Theorem 2.3.12 Suppose the function U = (1/(1 − F))^← is twice differentiable.
Write

\[ A(t) := \frac{tU''(t)}{U'(t)}-\gamma+1 . \]

If the function A has constant sign for large t, lim_{t→∞} A(t) = 0, and the function
|A| is regularly varying with index ρ ≤ 0, then for x > 0,

\[ \lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{tU'(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)} = H_{\gamma,\rho}(x) . \]  (2.3.25)

Proof. For u ≥ 1,

\[ \frac{U'(tu)}{U'(t)} = u^{\gamma-1}\exp\left(\int_1^{u}A(tv)\,\frac{dv}{v}\right) , \]

since the function A(t) vanishes at t → ∞, i.e., U' ∈ RV_{γ−1}. Now

\[ \frac{\dfrac{U(tx)-U(t)}{tU'(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)} = \frac{\int_1^{x}s^{\gamma-1}\left(\exp\left(\int_1^{s}A(tv)\,\frac{dv}{v}\right)-1\right)ds}{A(t)} \to \int_1^{x}s^{\gamma-1}\int_1^{s}v^{\rho-1}\,dv\,ds = H_{\gamma,\rho}(x) , \]

by the uniform convergence theorem for regularly varying functions.
is asymptotically normal (when properly normalized) under von Mises' extreme value
condition. However, we want to consider many intermediate order statistics at the
same time; hence we want to consider the tail (empirical) quantile process.
It is instructive to start by proving the main result of Section 2.2, Theorem 2.2.1,
i.e., the asymptotic normality of a sequence of intermediate order statistics, again,
now not under von Mises' conditions but under the second-order condition.

Theorem 2.4.1 Let X_{1,n} ≤ X_{2,n} ≤ ⋯ ≤ X_{n,n} be the nth order statistics from an
i.i.d. sample with distribution function F. Suppose that the second-order condition
(2.3.21), or equivalently (2.3.5), holds for some γ ∈ ℝ, ρ ≤ 0. Then

\[ \sqrt k\,\frac{X_{n-k,n}-U\left(\frac nk\right)}{a_0\left(\frac nk\right)} \]

is asymptotically standard normal provided the sequence k = k(n) is such that
k(n) → ∞, k(n)/n → 0, and

\[ \lim_{n\to\infty}\sqrt k\,A_0\!\left(\frac nk\right) = 0 . \]  (2.4.1)

Proof. Write X_{n−k,n} =_d U(Y_{n−k,n}), with Y_{n−k,n} the (n−k)th order statistic of a
sample of size n from the distribution function 1 − 1/y, y ≥ 1. By Corollary 2.3.7,

\[ \sqrt k\,\frac{X_{n-k,n}-U\left(\frac nk\right)}{a_0\left(\frac nk\right)} = \sqrt k\,\frac{\left(\frac kn\,Y_{n-k,n}\right)^{\gamma}-1}{\gamma}+\sqrt k\,A_0\!\left(\frac nk\right)\left\{\Psi_{\gamma,\rho}\!\left(\frac kn\,Y_{n-k,n}\right)+o_P(1)\right\} , \]

which has the same limit distribution as √k(kY_{n−k,n}/n − 1), i.e., is asymptotically standard
normal; indeed, since kY_{n−k,n}/n →_P 1, the other terms go to zero by
assumption (note that Ψ_{γ,ρ}(1) = 0).
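The statement of Theorem 2.4.1 is easy to probe by simulation in a case where the bias term vanishes: for the exact model U(t) = t^γ one may take a₀(t) = γt^γ and A₀ ≡ 0 (these choices, and all parameters below, are ours), and √k(X_{n−k,n} − U(n/k))/a₀(n/k) should look standard normal.

```python
import random

gamma = 0.25

def simulate(n, k, trials, seed=3):
    rng = random.Random(seed)
    out = []
    u_nk = (n / k) ** gamma              # U(n/k) for U(t) = t**gamma
    a_nk = gamma * u_nk                  # a_0(n/k) = gamma * (n/k)**gamma
    for _ in range(trials):
        y = sorted(1.0 / rng.random() for _ in range(n))   # Pareto(1) sample
        x_nk = y[n - k - 1] ** gamma     # X_{n-k,n} = U(Y_{n-k,n})
        out.append(k ** 0.5 * (x_nk - u_nk) / a_nk)
    return out

z = simulate(n=5000, k=100, trials=800)
mean = sum(z) / len(z)
sd = (sum((v - mean) ** 2 for v in z) / len(z)) ** 0.5
print(mean, sd)   # roughly 0 and 1
```

With a genuinely second-order model (A₀ not identically zero) the same experiment exhibits the bias that condition (2.4.1) is designed to suppress.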
The last result can be vastly generalized and yields the following, relating the tail
quantile process to Brownian motion in a strong sense.
Theorem 2.4.2 (Drees (1998), Theorem 2.1) Suppose X₁, X₂, ... are i.i.d. random
variables with distribution function F. Suppose that F satisfies the second-order
extreme value condition (2.3.21) for some γ ∈ ℝ and ρ ≤ 0. Let X_{1,n} ≤ X_{2,n} ≤
⋯ ≤ X_{n,n} be the nth order statistics. We can define a sequence of Brownian motions
{W_n(s)}_{s≥0} such that for suitably chosen functions a₀ and A₀ and each ε > 0,

\[ \sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\,\frac{X_{n-[ks],n}-U\left(\frac nk\right)}{a_0\left(\frac nk\right)}-\sqrt k\,\frac{s^{-\gamma}-1}{\gamma}-s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right| \stackrel{P}{\to} 0 , \]  (2.4.2)

provided k = k(n) → ∞, k/n → 0, and √k A₀(n/k) = O(1).
Definition 2.4.3 Let X_{1,n} ≤ X_{2,n} ≤ ⋯ ≤ X_{n,n} be the nth order statistics and
k = k(n) a sequence satisfying k → ∞, k/n → 0, as n → ∞. We define the tail
(empirical) quantile process to be the stochastic process {X_{n−[ks],n}}_{s>0}.

Remark 2.4.4 It may happen that the convergence of (U(tx) − U(t))/a(t) to
(x^γ − 1)/γ (cf. Section 2.3, relation (2.3.1)) is faster than any negative power of
t, i.e.,

\[ \lim_{t\to\infty}t^{\alpha}\left(\frac{U(tx)-U(t)}{a(t)}-\frac{x^{\gamma}-1}{\gamma}\right) = 0 \]

for all x > 0 and α > 0. In that case the result of Theorem 2.4.2 holds with the
bias part √k A₀(n/k)Ψ_{γ,ρ}(s^{−1}) replaced by zero provided k(n) = o(n^{1−ε}) for some
ε > 0. A similar remark can be made in connection with the convergence results for
the various estimators of Chapter 3.
One can extend the interval of definition of s as follows.

Corollary 2.4.5 With X_{n−[ks],n} := X_{n,n} for 0 < s < k^{−1}, the supremum in (2.4.2)
can be extended to the whole interval 0 < s ≤ 1:

\[ \sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\,\frac{X_{n-[ks],n}-U\left(\frac nk\right)}{a_0\left(\frac nk\right)}-\sqrt k\,\frac{s^{-\gamma}-1}{\gamma}-s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right| \stackrel{P}{\to} 0 , \]

and the analogous statement holds with the random centering X_{n−k,n} in place of U(n/k):

\[ \sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\,\frac{X_{n-[ks],n}-X_{n-k,n}}{a_0\left(\frac nk\right)}-\sqrt k\,\frac{s^{-\gamma}-1}{\gamma}-\bigl(s^{-\gamma-1}W_n(s)-W_n(1)\bigr)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right| \stackrel{P}{\to} 0 , \]

as n → ∞, provided k = k(n) → ∞, k/n → 0, and √k A₀(n/k) = O(1).
Moreover, for γ > 0, provided in addition √k A₀(n/k) → 0,

\[ \sup_{0<s\le1}s^{1/2+\varepsilon}\left|\sqrt k\left(\log X_{n-[ks],n}-\log U\!\left(\frac nk\right)+\gamma\log s\right)-\gamma\,s^{-1}W_n(s)\right| \stackrel{P}{\to} 0 . \]
The remainder of this section is devoted to proving the above results. It is instructive to prove the result of Theorem 2.4.2 first for the special case F(x) =
1 − (1+γx)^{−1/γ}, γ ∈ ℝ, and all x with 1 + γx > 0. The next proposition will be
used in its proof and in the proof of Theorem 2.4.2. Let Q(t) := F^←(t), let x_* and x*
be the left and right endpoints of F respectively, and ⌈x⌉ the smallest integer greater
than or equal to x.

Proposition 2.4.9 (Csörgő and Horváth (1993), Theorem 6.2.1) Let X₁, X₂, ... be
i.i.d. random variables with distribution function F and assume:
1. F is twice differentiable on (x_*, x*), −∞ ≤ x_* < x* ≤ ∞;
2. F'(x) > 0 for x ∈ (x_*, x*);
3. sup_{0<t<1} t(1−t)|F''(Q(t))|/(F'(Q(t)))² < ∞.
Then we can define a sequence of Brownian bridges {B_n(t)} such that

\[ \sup_{1/(n+1)\le t\le n/(n+1)}\bigl(t(1-t)\bigr)^{-1/2}\left|n^{1/2}F'\bigl(Q(t)\bigr)\bigl(Q_n(t)-Q(t)\bigr)-B_n(t)\right| = O_P(1) , \qquad n\to\infty , \]  (2.4.3)

where Q_n denotes the empirical quantile function.
Lemma 2.4.10 Let Y₁, Y₂, ... be i.i.d. random variables with distribution function
1 − 1/y, y ≥ 1. Consider the nth order statistics Y_{1,n} ≤ Y_{2,n} ≤ ⋯ ≤ Y_{n,n}. For each
γ ∈ ℝ we can define a sequence of Brownian motions {W_n(s)}_{s≥0} such that for each
ε > 0,

\[ \sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\left(\frac{\left(\frac kn\,Y_{n-[ks],n}\right)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right)-s^{-\gamma-1}W_n(s)\right| \stackrel{P}{\to} 0 . \]

Proof. We apply Proposition 2.4.9 to the distribution function 1 − 1/y, for which
Q(t) = (1−t)^{−1} and F'(Q(t)) = (1−t)²: we can define Brownian bridges {B_n(t)}
such that

\[ \sup_{1/(n+1)\le t\le n/(n+1)}\bigl(t(1-t)\bigr)^{-1/2}\left|n^{1/2}(1-t)^{2}\left(Q_n(t)-\frac{1}{1-t}\right)-B_n(t)\right| = O_P(1) . \]

Note that Y_{n−[ks],n} corresponds to Q_n(t) with 1 − t of the order ks/n,
and that

\[ B_n\left(1-\frac{ks}{n}\right) \stackrel{d}{=} B_n\left(\frac{ks}{n}\right) \stackrel{d}{=} W_n\left(\frac{ks}{n}\right)-\frac{ks}{n}\,W_n(1) . \]  (2.4.4)

It follows that, after multiplication by s^{γ+1/2+ε}, replacing the Brownian bridge by
the Brownian motion costs only terms that are O_P(1) uniformly for k^{−1} ≤ s ≤ 1.
Hence for ε > 0 the part within the absolute value is o_P(1) uniformly. Moreover,
notice that

\[ \sup_{k^{-1}\le s\le1}\left(\frac{ks}{n}\right)^{1/2}s^{-1/2}\,|W_n(1)| = \left(\frac kn\right)^{1/2}|W_n(1)| = o_P(1) . \]

We still have to verify condition 3 of Proposition 2.4.9 for the present distribution,
i.e., that

\[ \sup_{0<t<1}t(1-t)\,\frac{|F''(Q(t))|}{\bigl(F'(Q(t))\bigr)^{2}} < \infty . \]
Since we are interested only in the right tail, we may without loss of generality change
our distribution near the left endpoint in such a way that

\[ \sup_{0<t<1}t(1-t)\,\frac{|F''(Q(t))|}{\bigl(F'(Q(t))\bigr)^{2}} < \infty . \]

It remains to verify the bound near the right endpoint, which in terms of the function
U amounts to

\[ \sup_{s\ge s_0}\left|\,2+\frac{sU''(s)}{U'(s)}\,\right| < \infty . \]

Now by assumption,

\[ \lim_{s\to\infty}\left(2+\frac{sU''(s)}{U'(s)}\right) = 1+\gamma . \]

Proceeding as before we obtain, with a(t) = tU'(t),

\[ \sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\,\frac{U\bigl(Y_{n-[ks],n}\bigr)-U\left(\frac nk\right)}{a\left(\frac nk\right)}-\sqrt k\,\frac{s^{-\gamma}-1}{\gamma}-s^{-\gamma-1}B_n\left(1-\frac{ks}{n}\right)\right| = o_P(1) . \]  (2.4.5)

Now take ε > 0. Then the expression within the absolute value must be o_P(1)
uniformly in s. Next we look at the Brownian bridge part. Recall (2.4.4) with {W_n}
Brownian motion. Further note that

\[ \sup_{k^{-1}\le s\le1}s^{-1/2}\left(\frac kn\right)^{1/2}|W_n(1)| = o_P(1) . \]  (2.4.6)

Combining these relations we obtain

\[ \sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\,\frac{U\bigl(Y_{n-[ks],n}\bigr)-U\left(\frac nk\right)}{a\left(\frac nk\right)}-\sqrt k\,\frac{s^{-\gamma}-1}{\gamma}-s^{-\gamma-1}W_n(s)\right| \stackrel{P}{\to} 0 . \]  (2.4.7)

It is not difficult to see that (2.4.7) still holds if we replace the function a with any
function a₁ provided a₁(t) ~ a(t), t → ∞. In fact, we shall use the function a₀ from
Theorem 2.3.6.
Finally, we can handle the expansion in the statement of the theorem:

\[ s^{\gamma+1/2+\varepsilon}\left(\sqrt k\,\frac{X_{n-[ks],n}-U\left(\frac nk\right)}{a_0\left(\frac nk\right)}-\sqrt k\,\frac{s^{-\gamma}-1}{\gamma}-s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right) \]  (2.4.8)

differs from the left-hand side of (2.4.7) only by terms built from

\[ \frac{a\left(\frac nk\right)}{a_0\left(\frac nk\right)}-1 , \qquad \frac{U\left(\frac nk\right)-\tilde U\left(\frac nk\right)}{a_0\left(\frac nk\right)} , \qquad \sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right) , \]

which are controlled by Theorem 2.3.6 together with the fact that

\[ \sup_{0<s<1}\frac{|W_n(s)|}{s^{1/2-\varepsilon}} < \infty \quad\text{a.s.} \]
Proof (of Theorem 2.4.2). Since U satisfies the second-order condition, there exists
a function U\ satisfying von Mises' second-order condition such that
lim
JT U(Xn-lksln)
* \k))
Tin x , / x =
t-+oo tU[{t)Ax{t)
- y + 1 (Theorem B.3.13). With q(t) =
(2.4.11)
tU[(t)Ax(t)
Ul(Yn-[ks),n)
q(Yn-[ks],n)
9(f) / '
(2.4.12)
57
Thefirstfactor is bounded by assumption and the second one tends to zero by (2.4.11).
For the last factor recall Potter's inequalities (Proposition B. 1.9(5)): for t
>to,x>j
(l-6)xy^min(x-\xe^
<^ 1
sr+^
(k-Yn-^ n)
=J - P - I / W ^
fg),-P^(2.4.14)
sup
0<s<l
is bounded a.s., we find that the third factor of (2.4.12) is bounded as well. Hence
17 U(Yn-[ks],n)
y+l/2+e
~ U\{Yn-[ks],n)
k~l<s<\
/1x
= oP (I) ,
kUl^k'
n -> oo .
(2.4.15)
We already know from Lemma 2.4.11 that the result of Theorem 2.4.2 holds with
Xn-[ks],n replaced by U\(Yn-\ks],n)' Relation (2.4.15) then implies that the result of
Theorem 2.4.2 also holds with Xn-[ks],n replaced by U{Yn-\ks^n). This completes
the proof.
Proof (of Corollary 2.4.5). The range of t values in (2.4.3) is (n + 1) _ 1 < t <
rt(n + l)~ 1 .FortheresultofTheorem2.4.2weusedonlytherangen~ 1 < t < 1 n~l.
By taking t = n/(n + 1) in (2.4.3) and following the lines of the proof of Theorem
2.4.2 with s = n/(k(n + 1)) we obtain
\k(n + l)J
io(f)
(2.4.16)
Let us consider the case y > 5 first. Since
sup s~l/2+e
\W(s)\ 4 - 0 ,
(2.4.17)
4>0,
(2.4.18)
0<s<k~l
SUp
0<s<k~l
^+l/2+
*K,p(^_1)
58
and
sup ^+1/2+*
0<5<^ _1
J-''-l
(2.4.19)
Xn,n - /() 1
+
oo(f)
K
k~y-
(2.4.20)
Since
sup
jy+i/2+* <
fc-y-i/2-*
0<s<k~l
5.-1
and X_[^]>n = Xn, for 5 < A
: x, we get, using (2.4.17)-(2.4.20),
sup s
Vk(Xn-insu-uty
y+l/2+e
0<s<k
s-y-i\
X
o(f)
y )
(2.4.21)
_1
sup s
y+l/2+e K * -
0.
0<s<k~
(2.4.22)
Now, (2.4.21) is dominated by the sum of two terms: the left-hand side of (2.4.2),
which goes to zero, and
+l 2+e
sup
Sr
' Vk
x, - uq)
k~ <s<\
<
0(f)
k-v-l'2-Jk X, - U(i)
o()
k-y-l/2-e
V*!
*, - t/(g)
ao(f)
i
Y
1
(JOT)
(^o)""-fe)-^G)^(^)
59
-y
|fc.y
_(*(*+!))
Y
-y-l
The first term of this expression tends to zero by (2.4.16). One easily checks that the
other three terms tend to zero too. Similarly one checks that (2.4.22) tends to zero by
considering the three terms separately.
Proof (of Theorem 2.4.8). As before, we prove the result with Xn-[ks],n replaced with
U (Yn-[ks],n), where \Yi^n} are the nth order statistics from the distribution function
1 - 1/JC, x > 1. Theorem 2.3.9 tells us that for s > 0,
ujYn-iksm) _(k
y
n [ksln
u(i)
V* - )
+ A
\k) \\nYn-[ks]>n)
^ ( 1 ) \nYn-[ksU)
}'
(2.4.23)
<s<\,
+ op(\)s-y-xl2-e)
(2.4.24)
(ri-[fai,i,)p - 1 = Lzi
L-P-IWH{S) + 0
P
P Vk V
{l)s-P-w-e\
(2425)
'
Now note that in fact the product of the right-hand sides of (2.4.24) and (2.4.25) can
be written as
s-rS-^-Z+op(l)s-y-V2-.
Hence
Y
(-Yn-lksln)
^Yn-lkslnY
1=
- y L l ^ l
(l)s-Y-l/2-e
( 2 A 2 6 )
(2.4.27)
60
It follows by combining (2.4.23), (2.4.24), (2.4.26), and (2.4.27) that the supremum over k~l < s < 1 of the expression in the first statement of the theorem is
op (1). The rest of the proof is like that of Corollary 2.4.5.
For the second statement note that the second-order condition (2.3.22) is equivalent to
,. logU(tx)-logU(t)-ylogx
xp-\
lim
t-+oo
A(t)
xp - 1
A0(t)
<xpmd^(x\x~8)
Exercises
2.1. Let X\, X2,... be i.i.d. random variables with distribution function F and
X\,n < %2,n "m < Xn,n the nth order statistics. Let F be in the domain of attraction of Gy with y > 0. Prove that Xn,n/Xn-\tn -+d Yy as n -> 00, where Y
has distribution function 1 1/jt, x > 1.
Hint: Renyi's representation (Section 2.1) implies that for exponential order statistics,
(_!,, Enyn En-\,n) are independent and En^n En-\,n has a standard exponential distribution. Hence (X n ,/X_i,) =d(U(Y*Yn-hn)/u(Yn-.hn),
where y* and
*n-i,n are independent, Y* has distribution function 1 l/x, x > 1, and F_i,n is
the second maximum of a sample of size n with distribution function 1 l/x, JC > 1.
Finally, use Corollary 1.2.10.
Remark: The converse statement is also true (see Smid and Stam (1975)).
2.2. (Beirlant and Teugels (1986)) Let Xi, X 2 , . . . be i.i.d. random variables with distribution functionFwithJC* > O.DefineM^ :== k'1 X!?=o logX-,> - l o g X _ M .
If F is in the domain of attraction of some extreme value distribution Gy with auxiliary
function a(n) and k < n is a fixed integer, then
M (1)
M
n,k
a(n)/U(n)
V>k k 2U=o
, y < u,
asn -> 00, with/ := (1/(1-F))*~, Qk,Z0, Z\,..., Z*_i independent, Q* gamma
distributed with A: degrees of freedom, and Z/, 1 = 0 , 1 , . . . , k 1, i.i.d. exponential.
2.3. Derive the limit distribution of Xn,n from the point process convergence of Theorem 2.1.2. Do the same for the joint distribution of (Xn-\tn, Xn,n).
2.4. What are the possible limit distributions of (X_i,w, Xn
n)l
61
2.5. Let Y\, Y2> be independent and identically distributed with distribution function 1 l/x, x > 1. Using the point process convergence of Theorem 2.1.2 find the
limit distribution under a trend, i.e., the limit distribution of maxi<;< n (X; i).
Hint: Recall what convergence of point process means (cf. last paragraph of Section
2.1).
2.6. Let U\,n < U2,n < - < Un, n be the order statistics from a standard uniform
distribution. Let k = k{n) be a sequence of integers such that for some p e (0,1),
lim^oo <s/n(k/n p) = 0. Prove that y/n(Uk,n p)/y/p(l - p) has a standard
normal limit distribution as n -> oo.
2.7. Show that the distribution function F defined by 1 - F(x) = x~l(l +
JC -1 exp(sinlogx)) satisfies the domain of attraction condition but not the secondorder relation (2.3.24).
2.8. Find the second-order relation for the Cauchy distribution.
2.9. Check the second-order condition for the normal distribution: note that with O
the standard normal distribution function,
1 - <D(f) = (27t)~l/2e-t2/2(l/t
Write ir(t) := 1/(1 - O(0). Prove that for
- l/t3+o(l/t3))
JCGR,
- V(t) - (\ogx)/V(t)}
= -(logjc) 2 /2 - log* ,
x >0,
f-00
-V(t)-
log*
\
1
(2 log t - log log t - log 47r) /2 J
= -(log*)2/2-logx.
Hint: For the last step use (and prove)
tp(0 = (21ogr - log log t - log4jr) 1/2 + o ((21og0~ 1 / 2 ) ,
t -> oo .
2.10. Check that the gamma distribution satisfies the second-order regular variation
condition with y = p = 0 and determine possible auxiliary functions a and A.
2.11. Prove the equivalence of (2.3.22) and (2.3.24) by noting that (2.3.22) is equivalent to

$$\lim_{t\to\infty}\frac{\dfrac{U\bigl(U^{\leftarrow}(t)\,x^{1/\gamma}\bigr)}{t}-x}{A\bigl(U^{\leftarrow}(t)\bigr)}=x\,\frac{x^{\rho/\gamma}-1}{\rho}$$

(with U^← the left-continuous inverse of U) and then applying Vervaat's lemma (Appendix A).
2.12. Let U(t) = t^γ − k/γ + t^{γ+τ}/(γ + τ) + o(t^{γ+τ}) for γ > 0 and τ < 0, t → ∞. Check that U(t) satisfies the second-order condition for γ positive (2.3.22) with A(t) = t^τ if k = 0 or γ + τ > 0, and A(t) = k t^{−γ} if γ + τ < 0. Discuss the case γ + τ = 0.
2.13. Let γ > 0. Check that for γ + ρ > 0, or γ + ρ < 0 and lim_{t→∞} U(t) − a(t)/γ = 0, if A*(t) is the auxiliary function in (2.3.22), then possible first- and second-order auxiliary functions for (2.3.5) are a(t) = γ U(t)(1 + A*(t)/γ) and A(t) = (γ + ρ) A*(t)/γ, respectively (and vice versa).
2.14. Verify that if U(t) = c_0 + c_1 t^γ (1 + c_2 t^ρ + o(t^ρ)) for γ < 0, ρ < 0, γ + ρ ≠ 0, c_0, c_2 ≠ 0, and c_1 < 0, as t → ∞, then the second-order condition (2.3.5) holds with A(t) = ρ γ^{−1}(γ + ρ) c_2 t^ρ and a(t) = γ c_1 t^γ (1 + ρ^{−1} A(t)).
2.15. The Student t-distribution with ν degrees of freedom satisfies

$$1-F(t)=c_{\nu}\,\nu^{\nu/2}\,t^{-\nu}+d_{\nu}\,\nu^{\nu/2+1}\,t^{-\nu-2}+O\bigl(t^{-\nu-4}\bigr)\,,\qquad t\to\infty\,,$$

where

ν :   1        2       3        4       5
c_ν : 1/π      1/4     2/(3π)   3/16    8/(15π)
d_ν : −1/(3π)  −3/16   −4/(5π)  −10/32  −8/(7π)

(Martins (2000)). Hence this model satisfies the second-order condition (2.3.5) with γ = 1/ν and ρ = −2/ν. Obtain the auxiliary functions.
2.16. Lemma 2.4.10 implies that
^('-'(HHJM;)
in D[1, ∞).
2.17. Let s be some fixed positive constant. Deduce under the conditions of Theorem 2.4.2 that the corresponding statistic converges to a normal random variable with mean zero and variance s^{−2γ−1} as n → ∞.
2.18. Formulate an analogous weak convergence result for the empirical distribution function in the situation of Theorem 2.4.2. Assume √k A_0(n/k) → 0. Prove the result.
2.19. Prove that under the conditions of Theorem 2.4.8 and √k A_0(n/k) → λ,
^n2k,n
3.1 Introduction
The alternative conditions of Theorem 1.1.6 (Section 1.1.3) serve as a basis for statistical applications of extreme value theory.
Consider relation (1.1.22) (Section 1.1.3): there exists a positive nondecreasing function f such that

$$\lim_{t\uparrow x^*}\frac{1-F\bigl(t+x f(t)\bigr)}{1-F(t)}=(1+\gamma x)^{-1/\gamma}\qquad(3.1.1)$$

for all x for which 1 + γx > 0, where x* = sup{x : F(x) < 1}.
Let X be a random variable with distribution function F and let F ∈ D(G_γ) for some real γ. Then (3.1.1) tells us that for x > 0, x < (0 ∨ (−γ))^{−1},

$$\lim_{t\uparrow x^*}P\!\left(\frac{X-t}{f(t)}>x\;\middle|\;X>t\right)=(1+\gamma x)^{-1/\gamma}\,.\qquad(3.1.2)$$

That is, the conditional distribution of (X − t)/f(t), given X > t, converges as t ↑ x* to the distribution function

$$H_{\gamma}(x):=1-(1+\gamma x)^{-1/\gamma}\,,\qquad 0<x<(0\vee(-\gamma))^{-1}\,,\qquad(3.1.3)$$

where for γ = 0 the right-hand side is interpreted as 1 − e^{−x}. This class of distribution functions is called the class of the generalized Pareto distributions (GP). Figure 3.1 illustrates this class for some values of γ.
Relation (3.1.1) means, loosely speaking, that from some high threshold t onward (i.e., X > t) the distribution tail can be written approximately as

$$1-F(x)\approx\bigl(1-F(t)\bigr)\left(1+\gamma\,\frac{x-t}{f(t)}\right)^{-1/\gamma},\qquad x>t\,,$$

which is a parametric family of distribution tails. One can expect this approximation to hold for intermediate and extreme order statistics. Let X_1, X_2, … be independent
Fig. 3.1. Family of GP distributions: for γ = −1 and γ = −2 the right endpoints are 1 and 0.5, respectively; for γ ≥ 0 the right endpoint equals infinity.
and identically distributed random variables with distribution function F, and F_n the corresponding empirical distribution function, i.e., F_n(x) := n^{−1} Σ_{i=1}^n 1_{{X_i ≤ x}}. Let us apply the last approximation with t := X_{n−k,n}, where we choose k = k(n) → ∞, k/n → 0, n → ∞. Then

$$1-F(x)\approx\bigl(1-F(X_{n-k,n})\bigr)\left\{1-H_{\gamma}\!\left(\frac{x-X_{n-k,n}}{\hat\sigma}\right)\right\}\approx\frac kn\left(1+\hat\gamma\,\frac{x-X_{n-k,n}}{\hat\sigma}\right)^{-1/\hat\gamma},\qquad x>X_{n-k,n}\,.\qquad(3.1.5)$$

Inverting this approximation, one obtains estimators for high quantiles and for small tail probabilities,

$$\hat x_p:=\hat U\!\left(\frac nk\right)+\hat a\!\left(\frac nk\right)\frac{\left(\frac{k}{np}\right)^{\hat\gamma}-1}{\hat\gamma}\qquad(3.1.8)$$

and

$$\hat p:=\frac kn\left(1+\hat\gamma\,\frac{x-\hat U(n/k)}{\hat a(n/k)}\right)^{-1/\hat\gamma}\,,\qquad(3.1.9)$$
based on suitable estimators γ̂, Û(n/k), and â(n/k). In the rest of this and in the next chapter we shall meet various estimators of these quantities for which the vector

$$\sqrt k\left(\hat\gamma-\gamma\,,\;\frac{\hat U(n/k)-U(n/k)}{a(n/k)}\,,\;\frac{\hat a(n/k)}{a(n/k)}-1\right)\qquad(3.1.10)$$

is asymptotically normal under suitable conditions. Using this relation we will prove in Chapter 4 that the resulting tail and quantile estimators, with γ̂ and â(n/k) suitable estimators, are asymptotically normal when properly normalized. Recall the relation f(t) = a(1/(1 − F(t))) (cf. Theorem 1.1.6), with a the positive function in (3.1.5). Then for t := X_{n−k,n} we get f(X_{n−k,n}) = a(1/(1 − F(X_{n−k,n}))) ≈ a(n/k).

Again we see that the estimation of γ is a crucial step, which is the main subject of the present chapter. Next, in Chapter 4, we shall prove asymptotic normality of the tail estimator, suitably normalized.
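Given the three estimated quantities, the high-quantile approximation described above is a one-line computation. The following sketch is our own illustration; all names are hypothetical, and the inputs Û, â, γ̂ are assumed to come from some estimation step:

```python
def quantile_estimate(u_hat, a_hat, gamma_hat, k, n, p):
    """Tail quantile estimate x_p with P(X > x_p) ~ p:
    x_p = U(n/k) + a(n/k) * ((k/(n p))^gamma - 1)/gamma,
    with the factor read as log(k/(n p)) when gamma = 0."""
    import math
    ratio = k / (n * p)
    if gamma_hat == 0.0:
        return u_hat + a_hat * math.log(ratio)
    return u_hat + a_hat * (ratio ** gamma_hat - 1.0) / gamma_hat

# sanity check on the exact Pareto(1) model, where U(t) = t, a(t) = t,
# gamma = 1, and the true quantile is x_p = 1/p
print(quantile_estimate(10.0, 10.0, 1.0, 100, 1000, 0.001))
```

For the Pareto(1) model the formula is exact, which makes it a convenient correctness check.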
Life Span
The life span of people born in the Netherlands in the years 1877-1881 is assumed to
be random. Based on life spans of about 10 000 people, we want to decide whether the
underlying distribution has a finite upper limit U(oo), and if so, we want to estimate
U(∞). The asymptotic normality of (3.1.10) provides a confidence interval for γ that enables us to test the hypothesis H_0 : γ ≥ 0 versus H_1 : γ < 0. It will turn out later that the null hypothesis is rejected. Then we want to estimate the finite value U(∞), and for that we use the limit relation (1.2.14) of Lemma 1.2.9:

$$\lim_{t\to\infty}\frac{U(\infty)-U(t)}{a(t)}=-\frac1\gamma\,,$$

i.e.,

$$U(\infty)\approx U(t)-\frac{a(t)}{\gamma}\,.$$

This suggests the estimator

$$\hat U(\infty):=\hat U\!\left(\frac nk\right)-\frac{\hat a(n/k)}{\hat\gamma}\,,$$

and we will prove, using the joint asymptotic normality in (3.1.10), that Û(∞) − U(∞), suitably normalized, is asymptotically normal. An asymptotic confidence interval for U(∞) ensues.
3.2 A Simple Estimator for the Tail Index: The Hill Estimator
= γ .
By partial integration,

$$\int_t^{\infty}\bigl(\log u-\log t\bigr)\,dF(u)=\int_t^{\infty}\bigl(1-F(s)\bigr)\,\frac{ds}{s}\,.$$

Hence we have

$$\lim_{t\to\infty}\frac{\int_t^{\infty}(\log u-\log t)\,dF(u)}{1-F(t)}=\gamma\,.\qquad(3.2.1)$$
In order to develop an estimator based on this asymptotic result, replace in (3.2.1) the parameter t by the intermediate order statistic X_{n−k,n} and F by the empirical distribution function F_n. We then get Hill's (1975) estimator γ̂_H, defined by

$$\hat\gamma_H:=\frac{\int_{X_{n-k,n}}^{\infty}\bigl(\log u-\log X_{n-k,n}\bigr)\,dF_n(u)}{1-F_n(X_{n-k,n})}\,,$$

or

$$\hat\gamma_H=\frac1k\sum_{i=0}^{k-1}\log X_{n-i,n}-\log X_{n-k,n}\,.\qquad(3.2.2)$$
For the proof of the following theorems we need this auxiliary result.

Lemma 3.2.1 Let Y_1, Y_2, … be i.i.d. random variables with distribution function 1 − 1/y, y ≥ 1, and let Y_{1,n} ≤ Y_{2,n} ≤ ⋯ ≤ Y_{n,n} be the nth order statistics. Then, with k = k(n),

$$\lim_{n\to\infty}Y_{n-k,n}=\infty\qquad a.s.\,,$$

provided k(n) = o(n).
Proof. For fixed y > 1 we have, by the strong law of large numbers,

$$\frac1n\sum_{i=1}^{n}1_{\{Y_i>y\}}\to\frac1y\qquad a.s.\,,\quad n\to\infty\,,$$

while (k(n) + 1)/n → 0. Since Y_{n−k,n} > y if and only if n^{−1} Σ_{i=1}^n 1_{{Y_i > y}} > k/n, it follows that eventually Y_{n−k,n} > y a.s., for every y > 1; hence Y_{n−k,n} → ∞ a.s. ∎
Theorem 3.2.2 Let X_1, X_2, … be i.i.d. random variables with distribution function F. Suppose F ∈ D(G_γ) with γ > 0. Then, as n → ∞, k = k(n) → ∞, k/n → 0,

$$\hat\gamma_H\xrightarrow{P}\gamma\,.$$
Proof. By Corollary 1.2.10, F ∈ D(G_γ) with γ > 0 implies

$$\lim_{t\to\infty}\frac{U(tx)}{U(t)}=x^{\gamma}$$

for x > 0, i.e. (Proposition B.1.9), for x ≥ 1 and t ≥ t_0,

$$(1-\varepsilon)\,x^{\gamma-\varepsilon}\le\frac{U(tx)}{U(t)}\le(1+\varepsilon)\,x^{\gamma+\varepsilon}\,,$$

or equivalently,

$$\log(1-\varepsilon)+(\gamma-\varepsilon)\log x\le\log U(tx)-\log U(t)\le\log(1+\varepsilon)+(\gamma+\varepsilon)\log x\,.\qquad(3.2.3)$$

Let Y_1, Y_2, … be independent and identically distributed with common distribution function 1 − 1/y, y ≥ 1. Note that U(Y_i) =_d X_i, i = 1, 2, … . So it is sufficient to prove the result for γ̂_H := k^{−1} Σ_{i=0}^{k−1} log U(Y_{n−i,n}) − log U(Y_{n−k,n}). Apply (3.2.3) with t = Y_{n−k,n} and x = Y_{n−i,n}/Y_{n−k,n}. Since by Lemma 3.2.1, Y_{n−k,n} → ∞ a.s. as n → ∞, we have eventually

$$\log(1-\varepsilon)+(\gamma-\varepsilon)\log\frac{Y_{n-i,n}}{Y_{n-k,n}}\le\log U(Y_{n-i,n})-\log U(Y_{n-k,n})\le\log(1+\varepsilon)+(\gamma+\varepsilon)\log\frac{Y_{n-i,n}}{Y_{n-k,n}}\,.$$
Averaging these inequalities over i = 0, 1, …, k − 1, we see that γ̂_H is bounded below and above by

$$\log(1\mp\varepsilon)+(\gamma\mp\varepsilon)\,\frac1k\sum_{i=0}^{k-1}\log\frac{Y_{n-i,n}}{Y_{n-k,n}}\,,$$

so that it suffices to prove

$$\frac1k\sum_{i=0}^{k-1}\log Y_{n-i,n}-\log Y_{n-k,n}\xrightarrow{P}1\,.$$

This is part of a separate lemma (note that log Y_i has a standard exponential distribution), which we give next.
Lemma 3.2.3 Let E, E_1, E_2, … be i.i.d. standard exponential and let E_{1,n} ≤ E_{2,n} ≤ ⋯ ≤ E_{n,n} be the nth order statistics. Let f be such that Var f(E) < ∞. Then

$$\sqrt k\left(\frac1k\sum_{i=0}^{k-1}f\bigl(E_{n-i,n}-E_{n-k,n}\bigr)-Ef(E)\right)$$

is independent of E_{n−k,n} and asymptotically normal with mean zero and variance Var f(E) as n → ∞, provided k = k(n) → ∞ and k/n → 0.
Proof. Rényi's (1953) representation implies the independence statement, and it gives for each n

$$\bigl\{E_{n-i,n}-E_{n-k,n}\bigr\}_{i=0}^{k-1}\stackrel{d}{=}\left\{\sum_{j=i+1}^{k}\frac{E^*_j}{j}\right\}_{i=0}^{k-1}\,,$$

with E*_1, E*_2, … independent and identically distributed standard exponential. It follows that the distribution of the left-hand side does not depend on n and that

$$\bigl\{E_{n-i,n}-E_{n-k,n}\bigr\}_{i=0}^{k-1}\stackrel{d}{=}\bigl\{E_{k-i,k}\bigr\}_{i=0}^{k-1}\,.$$

It follows that

$$\sqrt k\left(\frac1k\sum_{i=0}^{k-1}f\bigl(E_{n-i,n}-E_{n-k,n}\bigr)-Ef(E)\right)\stackrel{d}{=}\sqrt k\left(\frac1k\sum_{i=1}^{k}f(E_i)-Ef(E)\right),$$

since we take the average of all order statistics. The result follows from the central limit theorem. ∎
Theorem 3.2.4 (converse of Theorem 3.2.2) Let X_1, X_2, … be i.i.d. random variables with distribution function F, and suppose that, as n → ∞ with k = k(n) → ∞, k/n → 0,

$$\hat\gamma_H\xrightarrow{P}\gamma>0\,.$$

Then F ∈ D(G_γ).
Proof. Let F_n be the empirical distribution function of X_1, X_2, …, X_n and G_n the empirical distribution function of Y_1, Y_2, …, Y_n, which are independent and identically distributed with distribution function 1 − 1/x, x ≥ 1. Then for each s,

$$1-F_n\bigl(U(s)\bigr)=1-G_n(s)\,.$$
We write

$$\hat\gamma_H=\frac nk\int_{X_{n-k,n}}^{\infty}\bigl(\log u-\log X_{n-k,n}\bigr)\,dF_n(u)=\frac nk\int_{X_{n-k,n}}^{\infty}\bigl(1-F_n(u)\bigr)\,\frac{du}{u}=\frac nk\int_{Y_{n-k,n}}^{\infty}\bigl(1-G_n(s)\bigr)\,d\log U(s)\,,$$

with Y_{1,n} ≤ Y_{2,n} ≤ ⋯ ≤ Y_{n,n} the order statistics of Y_1, Y_2, …, Y_n. We are going to use the following results:
1. For b > 1,

$$P\left\{\sup_{s\ge1}s\,\bigl(1-G_n(s)\bigr)>b\right\}\le\frac1b\,;$$

2. for 0 < a < 1,

$$P\left\{\inf_{1\le s\le Y_{n,n}}s\,\bigl(1-G_n(s)\bigr)<a\right\}\le\frac ea\,e^{-1/a}\,;$$

3. the representation

$$\left\{\frac{Y_{n-i,n}}{Y_{n-k,n}}\right\}_{i=0}^{k-1}\stackrel{d}{=}\bigl\{Y^*_{k-i,k}\bigr\}_{i=0}^{k-1}\,,$$

where Y*_{1,k} ≤ ⋯ ≤ Y*_{k,k} are the order statistics of a sample Y*_1, …, Y*_k, independent of Y_{n−k,n} and equal in distribution to the Y's; hence the two factors in

$$\left(\inf_{1\le s\le Y^*_{k,k}}s\,\bigl(1-G^*_k(s)\bigr)\right)\cdot Y_{n-k,n}$$

are independent, G*_k denoting the empirical distribution function of Y*_1, …, Y*_k.
Consider, with a := inf_{1≤s≤Y_{n,n}} s(1 − G_n(s)) and b := sup_{s≥1} s(1 − G_n(s)), the chain of inequalities

$$Y_{n-k,n}\int_{Y_{n-k,n}}^{\infty}\frac{d\log U(s)}{s}
\le(1+\varepsilon)\,\frac nk\int_{Y_{n-k,n}}^{\infty}\frac{k/n}{s}\,d\log U(s)
\le\frac{1+\varepsilon}{a}\,\frac nk\int_{Y_{n-k,n}}^{\infty}\bigl(1-G_n(s)\bigr)\,d\log U(s)
=\frac{1+\varepsilon}{a}\,\hat\gamma_H
\le\frac{(1+\varepsilon)(\gamma+\varepsilon)}{a}\,.$$

The first inequality holds with probability tending to 1, since (k/n) Y_{n−k,n} →_P 1 (result (3)); the second holds on the event that a bounds s(1 − G_n(s)) from below up to Y_{n,n} and b controls the part of the integral beyond Y_{n,n}, which by results (1) and (2), with a = 1 − ε, has probability at least 1 − e(1 − ε)^{−1} exp(−1/(1 − ε)) > 0; the last inequality holds with probability tending to 1 by the assumption γ̂_H →_P γ.

Hence we reach the conclusion that for each ε > 0,

$$P\left\{Y_{n-k,n}\int_{Y_{n-k,n}}^{\infty}\frac{d\log U(s)}{s}\le\frac{(1+\varepsilon)(\gamma+\varepsilon)}{1-\varepsilon}\right\}>0$$

for n sufficiently large. Since t ∫_t^∞ s^{−1} d log U(s) is deterministic and Y_{n−k,n} → ∞ a.s., this implies that for sufficiently large t, t ∫_t^∞ s^{−1} d log U(s) ≤ (1+ε)(γ+ε)/(1−ε). We get the other inequality in a similar fashion. Hence

$$\lim_{t\to\infty}t\int_t^{\infty}\frac{d\log U(s)}{s}=\gamma\,.$$

Now by partial integration

$$t\int_t^{\infty}\frac{d\log U(s)}{s}=t\int_t^{\infty}\log U(s)\,\frac{ds}{s^2}-\log U(t)\,.$$

Hence by Remark B.2.14(2) we find that for x > 0,

$$\lim_{t\to\infty}\bigl(\log U(tx)-\log U(t)\bigr)=\gamma\log x\,.$$

That is, U is regularly varying with index γ, which implies (Proposition B.1.9) that the function 1 − F is regularly varying with index −1/γ. ∎
Theorem 3.2.5 Suppose X_1, X_2, … are i.i.d. with distribution function F, and suppose U satisfies the second-order condition

$$\lim_{t\to\infty}\frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{A(t)}=x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\,,\qquad(3.2.4)$$

or equivalently,

$$\lim_{t\to\infty}\frac{\dfrac{1-F(tx)}{1-F(t)}-x^{-1/\gamma}}{A\!\left(\dfrac1{1-F(t)}\right)}=x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho}\,,\qquad(3.2.5)$$

where γ > 0, ρ ≤ 0, and A is a positive or negative function with lim_{t→∞} A(t) = 0. Then

$$\sqrt k\,\bigl(\hat\gamma_H-\gamma\bigr)\xrightarrow{d}\gamma N+\frac{\lambda}{1-\rho}$$

with N standard normal, provided k = k(n) → ∞, k/n → 0, n → ∞, and

$$\lim_{n\to\infty}\sqrt k\,A\!\left(\frac nk\right)=\lambda\qquad(3.2.6)$$

with λ finite.
Remark 3.2.6 It may happen that the convergence of U(tx)/U(t) to x^γ is faster than any negative power of t, i.e.,

$$\lim_{t\to\infty}t^{\alpha}\left(\frac{U(tx)}{U(t)}-x^{\gamma}\right)=0$$

for all x > 0 and α > 0. In that case, the result of Theorem 3.2.5 holds with the bias part λ/(1 − ρ) replaced by zero. A similar remark can be made for all the other estimators.
Proof (of Theorem 3.2.5). We write the second-order condition as

$$\lim_{t\to\infty}\frac{\log U(tx)-\log U(t)-\gamma\log x}{A(t)}=\frac{x^{\rho}-1}{\rho}\,.$$

We apply the inequality given in Theorem B.2.18: for a possibly different function A_0, with A_0(t) ~ A(t), t → ∞, and for each ε > 0 there exists a t_0 such that for t ≥ t_0, x ≥ 1,

$$\left|\frac{\log U(tx)-\log U(t)-\gamma\log x}{A_0(t)}-\frac{x^{\rho}-1}{\rho}\right|\le\varepsilon\,x^{\rho+\varepsilon}\,.\qquad(3.2.7)$$
Applying (3.2.7) with t = Y_{n−k,n} and x = Y_{n−i,n}/Y_{n−k,n} and averaging over i = 0, …, k − 1, we obtain

$$\hat\gamma_H=\frac{\gamma}{k}\sum_{i=0}^{k-1}\log\frac{Y_{n-i,n}}{Y_{n-k,n}}+A_0(Y_{n-k,n})\,\frac1k\sum_{i=0}^{k-1}\frac{\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho}-1}{\rho}+o_P(1)\,\bigl|A_0(Y_{n-k,n})\bigr|\,\frac1k\sum_{i=0}^{k-1}\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho+\varepsilon}.$$

By Lemma 3.2.3 the first term, centered and multiplied by √k, is asymptotically N(0, γ²); moreover,

$$\frac1k\sum_{i=0}^{k-1}\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho}\xrightarrow{P}EY^{\rho}=\frac1{1-\rho}\,,\qquad\frac1k\sum_{i=0}^{k-1}\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho+\varepsilon}\xrightarrow{P}\frac1{1-\rho-\varepsilon}\,,$$

and A_0(Y_{n−k,n})/A_0(n/k) →_P 1. This follows from Lemma 2.2.3, the fact that the function |A_0| is regularly varying, and Potter's inequalities (Proposition B.1.9). Combining these statements with √k A_0(n/k) → λ yields the asserted limit γN + λ/(1 − ρ). ∎
Proof (second proof of Theorem 3.2.5, via the tail quantile process). By the second statement of Theorem 2.4.8, with {W_n(s)}_{s>0} a sequence of Brownian motions and for each ε > 0,

$$\log X_{n-[ks],n}-\log X_{n-k,n}=-\gamma\log s+\frac{\gamma}{\sqrt k}\bigl(s^{-1}W_n(s)-W_n(1)\bigr)+A_0\!\left(\frac nk\right)\frac{s^{-\rho}-1}{\rho}+o_P\!\left(\frac1{\sqrt k}\right)s^{-1/2-\varepsilon}\,,$$

where the o_P-term tends to zero uniformly for 0 < s ≤ 1. Hence

$$\hat\gamma_H=\int_0^1\bigl(\log X_{n-[ks],n}-\log X_{n-k,n}\bigr)\,ds=\gamma\int_0^1(-\log s)\,ds+\frac{\gamma}{\sqrt k}\int_0^1\bigl(s^{-1}W_n(s)-W_n(1)\bigr)\,ds+A_0\!\left(\frac nk\right)\frac1{1-\rho}+o_P\!\left(\frac1{\sqrt k}\right).$$

It follows that

$$\sqrt k\,(\hat\gamma_H-\gamma)=\gamma\int_0^1\bigl(s^{-1}W_n(s)-W_n(1)\bigr)\,ds+\sqrt k\,A_0\!\left(\frac nk\right)\frac1{1-\rho}+o_P(1)\,,$$

and

$$E\left(\int_0^1\bigl(s^{-1}W_n(s)-W_n(1)\bigr)\,ds\right)^{\!2}=1\,.\qquad\Box$$
Remark 3.2.7 A third proof of Theorem 3.2.5, using an expansion for the tail empirical distribution function, will be given in Section 5.1.
Examples of distributions satisfying the second-order condition are abundant. For example, the Cauchy distribution satisfies

$$1-F(x)=(x\pi)^{-1}-\frac1{3\pi}\,x^{-3}+O\bigl(x^{-5}\bigr)\,,\qquad x\to\infty\,.$$

More generally, if

$$1-F(x)=c_1x^{-1/\gamma}+c_2x^{-1/\gamma+\rho/\gamma}\bigl(1+o(1)\bigr)\,,\qquad x\to\infty\,,\qquad(3.2.9)$$

for constants c_1 > 0, c_2 ≠ 0, γ > 0, and ρ < 0, then the second-order condition (3.2.5) holds with the indicated γ and ρ.
The second-order framework used in Theorem 3.2.5 provides the most natural
approach to the asymptotic normality of estimators like Hill's estimator. However,
next we discuss some of the problems related to second-order conditions of this type.
1. Suppose √k |A(n/k)| → ∞. Then it is not difficult to see, using the inequalities in the proof of Theorem 3.2.5, that

$$\frac{\hat\gamma_H-\gamma}{A(n/k)}\xrightarrow{P}\frac1{1-\rho}\,.$$

Since for large n, k must be much larger than n^{−2ρ/(1−2ρ)}, we have for large n that n/k is much smaller than n^{1+2ρ/(1−2ρ)} = n^{1/(1−2ρ)}. Hence the convergence rate |A(n/k)| is slower than the rate n^{ρ/(1−2ρ)} found after (3.2.10).
2. Suppose √k A(n/k) → 0. Then k(n) = o(n^{−2ρ/(1−2ρ)}), and the convergence rate 1/√k is again slower than n^{ρ/(1−2ρ)}.
3. Suppose √k A(n/k) → λ ≠ 0, ∞. Then by (3.2.10) the convergence rate 1/√k is of order n^{ρ/(1−2ρ)}. This is the optimal situation.
The above discussion leads to the question, what is the best choice for k? Theorem 3.2.5 tells us that if √k A(n/k) → λ, then

$$\sqrt k\,(\hat\gamma_H-\gamma)\xrightarrow{d}\gamma N+\frac{\lambda}{1-\rho}\,,\qquad(3.2.11)$$

i.e., loosely speaking,

$$\hat\gamma_H-\gamma\approx\frac{\gamma N}{\sqrt k}+\frac{A(n/k)}{1-\rho}\,.\qquad(3.2.12)$$

We want to know for which choices of k = k(n) this approximation is best, i.e., for which k its mean square error

$$\frac{\gamma^2}{k}+\frac{A^2(n/k)}{(1-\rho)^2}$$
is minimal. For the time being we continue to consider the special case A(t) = c t^ρ. Write r := n/k. This leads us to

$$\operatorname*{arg\,min}_{r>0}\left(\frac{\gamma^2r}{n}+\frac{c^2r^{2\rho}}{(1-\rho)^2}\right)=\left(\frac{-2\rho\,c^2}{\gamma^2(1-\rho)^2}\,n\right)^{1/(1-2\rho)}=:r_0(n)\,,$$

i.e.,

$$k_0(n)=\frac{n}{r_0(n)}=\left(\frac{\gamma^2(1-\rho)^2}{-2\rho\,c^2}\right)^{1/(1-2\rho)}n^{-2\rho/(1-2\rho)}\,,\qquad(3.2.14)$$

and we call k_0(n) the optimal choice for the sequence k(n) under the given conditions.

Now we go back to (3.2.12). We would like to consider min_k E(γ̂_H − γ)², but since the expectation may not exist, we consider the minimum of the substitute expression

$$E\left(\frac{\gamma N}{\sqrt k}+\frac{A(n/k)}{1-\rho}\right)^{\!2}\,,\qquad(3.2.15)$$

and the sequence k_0 that optimizes (3.2.15) will serve as the optimal choice for the estimator γ̂_H too. Note that

$$\lim_{n\to\infty}\sqrt{k_0}\,A\!\left(\frac{n}{k_0}\right)=\operatorname{sign}(c)\,\frac{\gamma(1-\rho)}{\sqrt{-2\rho}}\,,\qquad(3.2.16)$$

where sign(c) = 1 if c > 0 and sign(c) = −1 if c < 0. Hence for this choice of k we have

$$\sqrt{k_0}\,(\hat\gamma_H-\gamma)\xrightarrow{d}N\!\left(\operatorname{sign}(c)\,\frac{\gamma}{\sqrt{-2\rho}}\,,\;\gamma^2\right).$$
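The trade-off behind (3.2.14) is easy to check numerically: minimize the asymptotic mean square error by brute force and compare with the closed form. This is our own illustrative sketch, assuming the special case A(t) = c t^ρ:

```python
def k0_optimal(gamma, rho, c, n):
    """Optimal k of (3.2.14) for A(t) = c * t**rho, rho < 0."""
    base = gamma**2 * (1.0 - rho) ** 2 / (-2.0 * rho * c**2)
    return base ** (1.0 / (1.0 - 2.0 * rho)) * n ** (-2.0 * rho / (1.0 - 2.0 * rho))

def amse(gamma, rho, c, n, k):
    # asymptotic mean square error: gamma^2/k + A(n/k)^2 / (1-rho)^2
    return gamma**2 / k + (c * (n / k) ** rho) ** 2 / (1.0 - rho) ** 2

# brute-force minimization for gamma = 1, rho = -1, c = 1, n = 10000
n = 10000
k_star = min(range(1, n), key=lambda k: amse(1.0, -1.0, 1.0, n, k))
print(k0_optimal(1.0, -1.0, 1.0, n), k_star)
```

For these parameters the formula gives k_0 ≈ 584.8, and the integer brute-force minimizer agrees with it after rounding.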
and

$$\lim_{n\to\infty}k_0\min_kE\left(\frac{\gamma N}{\sqrt k}+\frac{A(n/k)}{1-\rho}\right)^{\!2}=\lim_{n\to\infty}k_0\left(\frac{\gamma^2}{k_0}+\frac{A^2(n/k_0)}{(1-\rho)^2}\right)=\gamma^2\left(1+\frac1{-2\rho}\right)=\gamma^2\,\frac{1-2\rho}{-2\rho}$$

by (3.2.16).
So far we have considered only the special case A(t) = ctp with p < 0. This is
often assumed in applications of extreme value theory. However, such simplification
is not feasible in the case p = 0. So next we shall consider the optimality problem
in the more general case of the second-order condition, not just the special case
A(t) = ctp.
As it will turn out in the end, if a sequence ko(n) is optimal then any sequence
k(n) ~ ko(n), as n -> oo, is also optimal. This implies that we can replace the
function A by any function A* with A*(f) ~ A(t), as t -> oo, without loss of
generality.
Similarly as before, we are faced with finding

$$\operatorname*{arg\,min}_{k}\left\{\frac{\gamma^2}{k}+\frac{A^2(n/k)}{(1-\rho)^2}\right\}\,,\qquad(3.2.17)$$
where the function |A| is regularly varying with index ρ ≤ 0. For ρ = 0 it is reasonable to assume that there exists a function A* with |A*(t)| ~ |A(t)|, as t → ∞, and |A*| monotone decreasing (see Theorem C.1 in Dekkers and de Haan (1993)). In that case we can assume without loss of generality that the function A² satisfies

$$\lim_{t\to\infty}\frac{A^2(tx)-A^2(t)}{q(t)}=\log x\,,\qquad x>0\,,$$

for some auxiliary function q, and hence that

$$A^2(t)\sim\int_t^{\infty}s(u)\,du\qquad(3.2.18)$$

for a suitable positive decreasing function s. We have for c > 1 and sufficiently large t,

$$\frac{\gamma^2t}{n}+\frac1{c(1-\rho)^2}\int_t^{\infty}s(u)\,du\;\le\;\frac{\gamma^2t}{n}+\frac{A^2(t)}{(1-\rho)^2}\;\le\;\frac{\gamma^2t}{n}+\frac{c}{(1-\rho)^2}\int_t^{\infty}s(u)\,du\,,\qquad(3.2.19)$$

where t plays the role of n/k.
The infimum over t > 0 of the right- and left-hand sides can be calculated by just setting the derivative equal to zero. For the right-hand side we get

$$\frac{\gamma^2(1-\rho)^2}{cn}=s(t)\,,$$

which is equivalent to

$$t=s^{\leftarrow}\!\left(\frac{\gamma^2(1-\rho)^2}{cn}\right).$$

Substituting back and using the identity, for v > 0,

$$\int_{s^{\leftarrow}(v)}^{\infty}s(u)\,du=\int_0^{v}s^{\leftarrow}(u)\,du\,,\qquad(3.2.20)$$

one finds that the infimum of the right-hand side is asymptotically

$$\frac{c}{(1-\rho)^2}\int_0^{\gamma^2(1-\rho)^2/(cn)}s^{\leftarrow}(u)\,du\,.$$

For the left-hand side of (3.2.19) we have the same result but with c replaced by c^{−1}. Letting c ↓ 1, it follows that the infimum (3.2.17) is

$$\sim\frac1{(1-\rho)^2}\int_0^{\gamma^2(1-\rho)^2/n}s^{\leftarrow}(u)\,du\,,$$

and it is attained at

$$t\sim s^{\leftarrow}\!\left(\frac{\gamma^2(1-\rho)^2}{n}\right).$$

Hence an optimal sequence k_0 = k_0(n), in the sense of minimizing γ²/k + A²(n/k)/(1 − ρ)², is given by

$$k_0=\frac{n}{s^{\leftarrow}\bigl(\gamma^2(1-\rho)^2/n\bigr)}\,.$$
What can we say about the asymptotic distribution of √k_0 (γ̂_H − γ)? As before,

$$k_0\,A^2\!\left(\frac{n}{k_0}\right)\sim\frac{n}{s^{\leftarrow}(v_n)}\int_0^{v_n}s^{\leftarrow}(u)\,du=\gamma^2(1-\rho)^2\,\frac{\int_0^{v_n}s^{\leftarrow}(u)\,du}{v_n\,s^{\leftarrow}(v_n)}\,,\qquad v_n:=\frac{\gamma^2(1-\rho)^2}{n}\,,$$

and, since |A| is regularly varying with index ρ,

$$\lim_{v\downarrow0}\frac{\int_0^{v}s^{\leftarrow}(u)\,du}{v\,s^{\leftarrow}(v)}=\frac{1-2\rho}{-2\rho}\,.\qquad(3.2.21)$$

For ρ = 0 the limit in (3.2.21) must be interpreted as infinity. This means that for ρ = 0, by minimizing the mean square error we get an optimal sequence k_0 for which

$$\sqrt{k_0}\,(\hat\gamma_H-\gamma)+b_n\xrightarrow{d}N\bigl(0,\gamma^2\bigr)\,,$$

where b_n is a slowly varying sequence tending to plus or minus infinity. This statement is not useful for obtaining an asymptotic confidence interval for γ. All we can say is that

$$\frac{\sqrt{k_0}}{b_n}\,(\hat\gamma_H-\gamma)\xrightarrow{P}-1\,,\qquad b_n:=-\sqrt{k_0}\,\frac{A(n/k_0)}{1-\rho}\,.\qquad(3.2.22)$$
Equivalently, k_0 satisfies

$$\frac{\gamma^2(1-\rho)^2}{n}=s\!\left(\frac{n}{k_0}\right).$$

Now, the functions f(t) = λ²t/∫_t^∞ s(u) du and 1/s(t) are both RV_1, but by Theorem B.1.5,

$$\lim_{t\to\infty}s(t)\,f(t)=0\,.$$

Consider, for example, a distribution function for which U(t) = t^γ log t (so that ρ = 0). Then

$$\lim_{t\to\infty}\frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{1/\log t}=x^{\gamma}\log x\,.$$

Consequently,

$$A(t)=\frac1{\log t}$$

and

$$s(t)=\frac{2}{t(\log t)^3}\,.$$

It follows for the optimal sequence k_0(n) that

$$k_0(n)\sim\frac{\gamma^2}{2}\,(\log n)^3\,,$$

so that k_0(n) grows so slowly
that for even moderate sample sizes the estimator may give the wrong impression
since the bias takes over very rapidly.
Another disadvantage of the Hill estimator is the fact that the estimator is not shift invariant. A shift of the observations does not affect the first-order parameter γ, but it may affect the second-order parameter. Consider the special case

$$U(t)=c_0+c_1t^{\gamma}+c_2t^{\gamma+\tau}+o\bigl(t^{\gamma+\tau}\bigr)\qquad(3.2.23)$$

with c_1 > 0, c_2 ≠ 0, and τ < 0 < γ. Then, as t → ∞,

$$\frac{U(tx)}{U(t)}-x^{\gamma}\sim\frac{c_0}{c_1}\,t^{-\gamma}x^{\gamma}\bigl(x^{-\gamma}-1\bigr)+\frac{c_2}{c_1}\,t^{\tau}x^{\gamma}\bigl(x^{\tau}-1\bigr)\,.$$

For τ > −γ the second term dominates, and for τ < −γ the first term dominates. Hence in the second case (τ < −γ) one can improve the rate of convergence by applying a shift −c_0 to the observations, so that the first term of (3.2.23) disappears and the second-order parameter changes from −γ to τ < −γ. This simple trick (due to Holger Drees) works in surprisingly many cases and results in a much less disturbing bias. Of course the trick also works when t^τ is replaced by any regularly varying function of index τ.
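The effect of the shift is easy to see on a deterministic "sample" placed at the quantiles of a shifted Pareto model. This is our own illustration; U(t) = c_0 + t^γ with c_0 = 10 is a hypothetical choice of the special case above:

```python
import math

def hill(sample, k):
    xs = sorted(sample)
    n = len(xs)
    log_xk = math.log(xs[n - 1 - k])
    return sum(math.log(xs[n - 1 - i]) - log_xk for i in range(k)) / k

# deterministic sample at the quantiles of U(t) = c0 + t**gamma
gamma, c0, n, k = 0.5, 10.0, 2000, 100
sample = [c0 + (n / i) ** gamma for i in range(1, n + 1)]

h_raw = hill(sample, k)                         # biased downward by the shift c0
h_shifted = hill([x - c0 for x in sample], k)   # shift removes the constant term
print(h_raw, h_shifted)
```

The shifted version lands close to the true γ = 0.5, while the unshifted estimate is pulled far below it.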
3.3 Pickands's Estimator

Pickands's estimator applies to all γ ∈ ℝ and is defined by

$$\hat\gamma_P:=(\log2)^{-1}\log\frac{X_{n-k,n}-X_{n-2k,n}}{X_{n-2k,n}-X_{n-4k,n}}\,.\qquad(3.3.1)$$

Theorem 3.3.1 Let X_1, X_2, … be i.i.d. random variables with distribution function F ∈ D(G_γ), γ ∈ ℝ. Then, as n → ∞, k = k(n) → ∞, k/n → 0,

$$\hat\gamma_P\xrightarrow{P}\gamma\,.$$

For the asymptotic normality we need the joint limit behavior of the vector

$$\sqrt k\left(\frac12\,\frac{Y_{n-k,n}}{Y_{n-2k,n}}-1\,,\;\frac12\,\frac{Y_{n-2k,n}}{Y_{n-4k,n}}-1\right).\qquad(3.3.2)$$

Lemma 3.3.2 The random vector in (3.3.2) converges in distribution to a normal vector with independent components. This can be seen from the representation

$$\left\{\frac{Y_{n-i,n}}{Y_{n-k,n}}\right\}_{i=0}^{k-1}\stackrel{d}{=}\bigl\{Y^*_{k-i,k}\bigr\}_{i=0}^{k-1}\,,\qquad(3.3.3)$$

with Y*_{1,k} ≤ ⋯ ≤ Y*_{k,k} the order statistics of a sample Y*_1, …, Y*_k independent of Y_{n−k,n} and with common distribution function 1 − 1/x, x ≥ 1: the two components of the random vector in (3.3.2) are independent. By restricting attention to 0 ≤ i ≤ k in (3.3.3) one also sees that the distribution of

$$\sqrt k\left(\frac12\,\frac{Y_{n-k,n}}{Y_{n-2k,n}}-1\right)\qquad(3.3.4)$$

does not depend on n; its asymptotic normality follows as in Lemma 3.2.3.
Corollary 3.3.3 Denote the limit vector of (3.3.2) by (Q, R). Then

$$\sqrt k\left(\frac14\,\frac{Y_{n-k,n}}{Y_{n-4k,n}}-1\right)\xrightarrow{d}Q+R\,.$$

Proof.

$$\frac14\,\frac{Y_{n-k,n}}{Y_{n-4k,n}}-1=\left(\frac12\,\frac{Y_{n-k,n}}{Y_{n-2k,n}}-1\right)\left(\frac12\,\frac{Y_{n-2k,n}}{Y_{n-4k,n}}-1\right)+\left(\frac12\,\frac{Y_{n-k,n}}{Y_{n-2k,n}}-1\right)+\left(\frac12\,\frac{Y_{n-2k,n}}{Y_{n-4k,n}}-1\right).\qquad\Box$$
Proof (of Theorem 3.3.1). We use the domain of attraction condition (see Theorem 1.1.6, Section 1.1.3)

$$\lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma}-1}{\gamma}\,,\qquad x>0\,.$$

Since U is monotone, the relation holds locally uniformly. It follows that locally uniformly for 0 < x, y < ∞, y ≠ 1,

$$\lim_{t\to\infty}\frac{U(tx)-U(t)}{U(ty)-U(t)}=\frac{x^{\gamma}-1}{y^{\gamma}-1}\,.\qquad(3.3.6)$$

Next write

$$\frac{X_{n-k,n}-X_{n-2k,n}}{X_{n-2k,n}-X_{n-4k,n}}=\frac{U(Y_{n-k,n})-U(Y_{n-4k,n})}{U(Y_{n-2k,n})-U(Y_{n-4k,n})}-1$$

and apply (3.3.6) with t = Y_{n−4k,n}, noting that

$$\frac{Y_{n-k,n}}{Y_{n-4k,n}}\xrightarrow{P}4\qquad\text{and}\qquad\frac{Y_{n-2k,n}}{Y_{n-4k,n}}\xrightarrow{P}2\,.\qquad(3.3.8)$$

It follows that

$$\frac{X_{n-k,n}-X_{n-2k,n}}{X_{n-2k,n}-X_{n-4k,n}}\xrightarrow{P}\frac{\frac{4^{\gamma}-1}{\gamma}}{\frac{2^{\gamma}-1}{\gamma}}-1=\frac{4^{\gamma}-2^{\gamma}}{2^{\gamma}-1}=2^{\gamma}\,,$$

hence γ̂_P →_P γ. ∎
Theorem 3.3.5 Suppose the second-order condition of Theorem 2.3.8 holds:

$$\lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-D_{\gamma}(x)}{A(t)}=\Psi_{\gamma,\rho}(x):=\int_1^xs^{\gamma-1}\!\int_1^su^{\rho-1}\,du\,ds$$

for all x > 0, with D_γ(x) = (x^γ − 1)/γ, a(t) = f(U(t)), and A(t) = α(U(t)). Then, for k = k(n) → ∞, k/n → 0, and

$$\lim_{n\to\infty}\sqrt k\,A\!\left(\frac nk\right)=\lambda\qquad(3.3.9)$$

with λ finite,

$$\sqrt k\,(\hat\gamma_P-\gamma)\xrightarrow{d}N\bigl(\lambda\,b_{\gamma,\rho},\ \operatorname{var}_{\gamma}\bigr)\,,$$

where b_{γ,ρ} is an explicit asymptotic-bias constant, with separate expressions for the cases ρ < 0, ρ < 0 = γ, and ρ = 0, and

$$\operatorname{var}_{\gamma}:=\begin{cases}\dfrac{\gamma^2\bigl(2^{2\gamma+1}+1\bigr)}{4(\log2)^2\bigl(2^{\gamma}-1\bigr)^2}\,,&\gamma\neq0\,,\\[2ex]\dfrac{3}{4(\log2)^4}\,,&\gamma=0\,.\end{cases}$$
Proof. We repeat the inequalities of Theorem 2.3.6: there exist a_0 and A_0 such that for any ε, δ > 0 there exists t_0 such that for t, tx ≥ t_0,

$$\left|\frac{\dfrac{U(tx)-U(t)}{a_0(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A_0(t)}-\Psi_{\gamma,\rho}(x)\right|\le\varepsilon\max\bigl(x^{\gamma+\rho+\delta},x^{\gamma+\rho-\delta}\bigr)=:\varepsilon\,q_{\gamma,\rho,\delta}(x)\,,$$

where now

$$\Psi_{\gamma,\rho}(x)=\begin{cases}\dfrac{x^{\gamma+\rho}-1}{\gamma+\rho}\,,&\gamma+\rho\neq0\,,\ \rho<0\,,\\[1ex]\log x\,,&\gamma+\rho=0\,,\ \rho<0\,,\\[1ex]\dfrac1\gamma\,x^{\gamma}\log x\,,&\rho=0\neq\gamma\,,\\[1ex]\tfrac12(\log x)^2\,,&\rho=0=\gamma\,.\end{cases}$$
It follows that

$$\sqrt k\left(\frac{U(Y_{n-4k,n}\,x)-U(Y_{n-4k,n})}{a_0(Y_{n-4k,n})}-\frac{x^{\gamma}-1}{\gamma}\right)\bigg|_{x=\frac{Y_{n-k,n}}{Y_{n-4k,n}}}
=\sqrt k\left(\frac{\left(\frac{Y_{n-k,n}}{Y_{n-4k,n}}\right)^{\gamma}-1}{\gamma}-\frac{4^{\gamma}-1}{\gamma}\right)\qquad(3.3.10)$$

$$\qquad+\sqrt k\,A_0(Y_{n-4k,n})\,\Psi_{\gamma,\rho}\!\left(\frac{Y_{n-k,n}}{Y_{n-4k,n}}\right)+o_P(1)\,\sqrt k\,A_0(Y_{n-4k,n})\,q_{\gamma,\rho,\delta}\!\left(\frac{Y_{n-k,n}}{Y_{n-4k,n}}\right),\qquad(3.3.11)$$

up to the centering by (4^γ − 1)/γ. By Cramér's delta method and Corollary 3.3.3, we have for the first term in (3.3.10),

$$\sqrt k\left(\frac{\left(\frac{Y_{n-k,n}}{Y_{n-4k,n}}\right)^{\gamma}-1}{\gamma}-\frac{4^{\gamma}-1}{\gamma}\right)\xrightarrow{d}4^{\gamma}\,(Q+R)\,.$$
Recall from Corollary 2.3.5, Theorem 2.3.6, and assumption (3.3.9) that if ρ < 0,

$$\lim_{n\to\infty}\sqrt k\,A_0\!\left(\frac nk\right)=\lim_{n\to\infty}\sqrt k\,\frac{A(n/k)}{\rho}=\frac\lambda\rho\,,$$

and if ρ = 0,

$$\lim_{n\to\infty}\sqrt k\,A_0\!\left(\frac nk\right)=\lim_{n\to\infty}\sqrt k\,A\!\left(\frac nk\right)=\lambda\,.$$

Hence the second term in (3.3.10)–(3.3.11) converges to (1_{{ρ<0}} ρ^{−1} + 1_{{ρ=0}}) 4^{−ρ} λ Ψ_{γ,ρ}(4). Finally, it is easy to see that (3.3.11) is asymptotically negligible. Therefore

$$\sqrt k\left(\frac{U(Y_{n-k,n})-U(Y_{n-4k,n})}{a_0(Y_{n-4k,n})}-\frac{4^{\gamma}-1}{\gamma}\right)\xrightarrow{d}4^{\gamma}(Q+R)+\Bigl(1_{\{\rho<0\}}\tfrac1\rho+1_{\{\rho=0\}}\Bigr)4^{-\rho}\lambda\,\Psi_{\gamma,\rho}(4)\,.\qquad(3.3.12)$$

Similarly,

$$\sqrt k\left(\frac{U(Y_{n-2k,n})-U(Y_{n-4k,n})}{a_0(Y_{n-4k,n})}-\frac{2^{\gamma}-1}{\gamma}\right)\xrightarrow{d}2^{\gamma}R+\Bigl(1_{\{\rho<0\}}\tfrac1\rho+1_{\{\rho=0\}}\Bigr)4^{-\rho}\lambda\,\Psi_{\gamma,\rho}(2)\,.\qquad(3.3.13)$$
Next,

$$\sqrt k\left(2^{\hat\gamma_P}-2^{\gamma}\right)=\sqrt k\left(\frac{U(Y_{n-k,n})-U(Y_{n-4k,n})}{U(Y_{n-2k,n})-U(Y_{n-4k,n})}-\frac{4^{\gamma}-1}{2^{\gamma}-1}\right)$$

$$=\frac{\gamma}{2^{\gamma}-1}\,\sqrt k\left(\frac{U(Y_{n-k,n})-U(Y_{n-4k,n})}{a_0(Y_{n-4k,n})}-\frac{4^{\gamma}-1}{\gamma}\right)-\frac{\gamma\,(4^{\gamma}-1)}{(2^{\gamma}-1)^2}\,\sqrt k\left(\frac{U(Y_{n-2k,n})-U(Y_{n-4k,n})}{a_0(Y_{n-4k,n})}-\frac{2^{\gamma}-1}{\gamma}\right)+o_P(1)$$

$$\xrightarrow{d}\frac{\gamma}{2^{\gamma}-1}\left(4^{\gamma}(Q+R)-\frac{4^{\gamma}-1}{2^{\gamma}-1}\,2^{\gamma}R\right)+\frac{\gamma}{2^{\gamma}-1}\Bigl(1_{\{\rho<0\}}\tfrac1\rho+1_{\{\rho=0\}}\Bigr)4^{-\rho}\lambda\left(\Psi_{\gamma,\rho}(4)-\frac{4^{\gamma}-1}{2^{\gamma}-1}\,\Psi_{\gamma,\rho}(2)\right).$$

The result follows by one more application of the delta method, since γ̂_P = log(2^{γ̂_P})/log 2. The particular case γ = 0 and ρ < 0 is left to the reader (cf. Exercise 3.6). ∎
Proof (second proof of Theorem 3.3.5, via the tail quantile process). Rewrite (3.3.1) as

$$\hat\gamma_P=(\log2)^{-1}\log\frac{X_{n-[k'/4],n}-X_{n-[k'/2],n}}{X_{n-[k'/2],n}-X_{n-k',n}}$$

for some sequence of integers k' = k'(n), where k' = 4k. Using Theorem 2.4.2 with {W_n(s)}_{s>0} a sequence of Brownian motions, at s = ¼ and s = ½ one obtains expansions of the numerator and the denominator in terms of W_n(¼), W_n(½), and W_n(1), from which

$$\frac{X_{n-[k'/4],n}-X_{n-[k'/2],n}}{X_{n-[k'/2],n}-X_{n-k',n}}=2^{\gamma}+\frac{Z_n}{\sqrt{k'}}+o_P\!\left(\frac1{\sqrt{k'}}\right)$$

with Z_n asymptotically normal, so that as n → ∞, √k (γ̂_P − γ) is asymptotically normal with the mean and variance given in the theorem. ∎
3.4 Maximum Likelihood Estimators

For F ∈ D(G_γ) we have seen that, with f the positive function of Theorem 1.1.6,

$$\lim_{t\uparrow x^*}P\!\left(\frac{X-t}{f(t)}>x\;\middle|\;X>t\right)=1-H_{\gamma}(x):=(1+\gamma x)^{-1/\gamma}\,.\qquad(3.4.1)$$
This relation suggests that the larger observations (reflected in the condition X > t) approximately follow a generalized Pareto (GP) distribution. Since the class of GP distributions is parametrized by just one parameter γ, this suggests that if we apply the maximum likelihood procedure to the largest observations using the GP distribution as a model, we could obtain a useful estimator for γ.

This idea, which we are now going to explain in detail, will lead to what is generally called the maximum likelihood estimator of γ in extreme value theory (although sometimes a slightly different definition is used). After determining the estimator, we shall (and have to, since the general asymptotic theory of maximum likelihood estimators does not apply for this approximate model) prove asymptotic normality. In order to use the condition "X > t" in (3.4.1) properly we need the following lemma.
Lemma 3.4.1 Let X, X_1, X_2, …, X_n be i.i.d. random variables with common distribution function F, and let X_{1,n} ≤ X_{2,n} ≤ ⋯ ≤ X_{n,n} be the nth order statistics. The conditional distribution of (X_{n−k+1,n}, …, X_{n,n}) given {X_{n−k,n} = t} equals the distribution of the order statistics of k i.i.d. random variables with distribution function

$$\frac{F(x)-F(t)}{1-F(t)}\,,\qquad x>t\,.$$

Proof. Let E_{1,n} ≤ ⋯ ≤ E_{n,n} be the order statistics from an independent and identically distributed sample with standard exponential distribution, P(E > x) = e^{−x}, x > 0. Then it is easy to see that the conditional distribution of (E_{n−k+1,n}, …, E_{n,n}) given {E_{n−k,n} = t} equals the distribution of the order statistics of E*_1, …, E*_k with

$$P\bigl(E^*>x\bigr)=e^{-(x-t)}\,,\qquad x>t\,.\qquad(3.4.2)$$

Next let V(s) := F^←(1 − e^{−s}), so that (X_1, …, X_n) =_d (V(E_1), …, V(E_n)). Now for x > V(t),

$$P\bigl(V(E^*)>x\bigr)=e^{-(V^{\leftarrow}(x)-t)}=\frac{1-F(x)}{1-F(V(t))}\,.$$

Hence we have proved the lemma for any distribution function F that is continuous and strictly increasing. We leave the more general case to the reader. ∎
Let X_1, …, X_n be an independent and identically distributed sample with common distribution function F. As in the previous sections, to estimate γ we shall concentrate on some set of upper order statistics (X_{n−k,n}, X_{n−k+1,n}, …, X_{n,n}) or, equivalently, on (Z_0, Z_1, …, Z_k) := (X_{n−k,n}, X_{n−k+1,n} − X_{n−k,n}, …, X_{n,n} − X_{n−k,n}).

Now consider the usual asymptotic setting, where k = k(n) → ∞ and n/k → ∞, as n → ∞, and hence X_{n−k,n} → x* a.s. Then, in view of the generalized Pareto approximation, we apply the maximum likelihood procedure to the limiting generalized Pareto distribution, which is explicit. Hence, the maximum likelihood estimator of γ (and consequently of the scale) is obtained by maximizing with respect to γ (and σ) the approximate likelihood ∏_{i=1}^k h_{γ,σ}(z_i) with z_i = X_{n−i+1,n} − X_{n−k,n} and h_{γ,σ}(x) = dH_γ(x/σ)/dx.

Note that this approximate conditional likelihood function tends to ∞ if γ < −1 and −γ/σ ↓ (X_{n,n} − X_{n−k,n})^{−1}, so that a maximum over the full range of possible values for (γ, σ) does not exist. We shall concentrate on the region (γ, σ) ∈ (−1/2, ∞) × (0, ∞), since the maximum likelihood estimator behaves irregularly if γ ≤ −1/2.
The likelihood equations are given in terms of the partial derivatives
d\oghy,0(z)
dy
dloghyt(T(z)
da
i + S*
- i + 4,.
The resulting likelihood equations in terms of the excesses X n _i+i t
follows:
k
J ] log ( l + -(X_ / + i, _ / j_
Vy
'
Xn-k,n are as
Xn-k,n))
\
^(Xn-i+hn-Xn-k,n)
Y
) \+
-{xn-i+hn-xn-k,n)
E/l
I , = 1 VK
'
(3.4.3)
%(Xn-i+\,n-Xn-k,n)
-T-
I * ti
(3.4.4)
+
<*-'+i. - *-*>
y +!
Note that the maximum likelihood estimator of y is shift and scale invariant, and the
maximum likelihood estimator of a is shift invariant.
92
Theorem 3.4.2 Let X\, X2,... be i.i.d. random variables with distribution function
1}
F. Suppose F satisfies the second-order condition of Theorem 2.3.8 with y > ~v
or equivalently,
U(tx)-U(t) _
a{t)
lim
t-*oo
xY-\
A(t)
= f
sy~l
Jx
f up~x duds
(3.4.5)
Jx
for allx > 0 and with y > \. Then, for k = k(n) > 00, k/n -> 0 (n -> 00), and
lim VkA (r)
= *
(3.4.6)
with X finite, the system of likelihood equations (3.4.3) has a sequence of solutions
(YMLE, MLE) that satisfies
Vk (yuLE - y, ^jfy - l ) 4 N(kbYtP9 E) ,
(3.4.7)
Y,P = J ( ( W X l + y - p ) ( l - P ) ( H y - P ) )
(1,0),
/ ( 1 + y) 2 -(1 + y) \
V-(l + y ) l + (l + y ) 2 ; '
Moreover, for any sequence of solutions (/MLE* ^MLE) for which the convergence
(3.4.7) does not hold, one must have V^IKMLE ~ Y\~>P or Vk\a^LE/a(k/n)
l|-*poo.
Recall the relation between the parameter a and the function / in (3.4.1): this
function was first introduced in Theorem 1.1.6 (Section 1.1.3), from where it is known
that fit) can be chosen as a(l/(l F)(t)). Then we see that a must be close to a(n/ k)
as n - 00.
We now give the line of reasoning for proving Theorem 3.4.2. A detailed proof
will be given only for y > 0. The proof for the other cases is similar.
For the true y > 0 we rewrite equations (3.4.4), which we want to solve as
[ /^i
/ 1 , Yf Xn-[ks],n - Xn-k,n\
l o g ( l + -7
rl ( 1 +y' xn-[ksln-xn-k,nyl
L v ^5(f))
)ds = y,
ds=
(3.4.8)
vn>
where ao is a suitably chosen positive function (more specifically the one from Theorem 2.3.6) and <7Q := <x//aoin/k).
93
Under the second-order condition given in Theorem 3.4.2, from Corollary 2.4.6,
we have that
(xn-[ksU-xn-k,n\
/s~y - l | zn(S)\
(349)
flo(^)
/ f S~Y - 1
Z(j)\
~~
o\ Y
Vk )
(y'
\s~y-l
y' Zn(s)
L
+ a^T ^7 j~r^ (3A1
= *~Y + ( -7-Y)~ Y )
(3A10)
o Vk
GH
1+
- ^
(K-Y)+S>
_Yy'Zn(s)
0 V*
Hence
,^,(l
, ^ _ x ^ ) h ( ^ y ) i -y
s
\-sy
yy'Zn(s)
o Vk
Now
y = / log(l +
Jo
\ <*Q
)ds,
o(f)
and hence
y -y=l l o 8 ^ ( l + ^
(%-,)
I'1^,,+
~(Y'
\ X
\ -y)TT
V^o
/1 +y
))*>
rL ['trip*
Jfl s
yz^s)A
+
Y
~7Tds
Jo
Vk
Starting again from (3.4.10), for the second equation in (3.4.8) we have
Y'xnHksln - xn-k,nyl
_1
/ V
y _ /V
\\<Y-s y
2
_2vy'Zn(s)
and so
"1 XY -JY
+1
V^o / i o
y' + l
%
v>
fl , , Z(S)
^ / o Vk
~7f
U-yj(7TlK27TT)+5/ios^^-
94
Summing up, we show that equations (3.4.8) are equivalent to linear equations in
the unknown parameters y' and CTQ, which can be solved readily.
For the proof of Theorem 3.4.2 we start by proving some auxiliary results.
Lemma 3.4.3 Assume (3.4.5) with y > 0 and (3.4.6). Let (y', CTQ) := (y'(n), a^(n))
be such that
(3.4.11)
-
&
Then
P\\ +
Xn[ks],n
~~
^nk,n
s e [{2k)
>Cns-
(3.4.12)
-> 1
- " )
n > oo, for some random variables Cn > 0 such that \/Cn = Op (I).
Proof. Let Uk,n, k = 1 , . . . , n, denote the order statistics from an independent and
identically uniform (0, 1) sample of size n. By Shorack and Wellner (1986) (Chapter
10, Section 3, Inequality 2, p. 416),
sup
1/(2*)<J<1
nU[ks\+\tn
~ ^
= Op(l) ,
ks
ks
sup
0<s<l
0P{\)
(3.4.13)
nU[ks]+l,n
as n -> oo. Combining these bounds with the bounds given in Theorem 2.3.6, for
some functions ao(t) ~ a(t) and Ao(t) ~ A(f), t -> oo, for all JCO > 0 and 8 > 0,
we obtain
^(PEHu)^^)
sup
K+/0+(5
.1
4>(f)
J[l/(2*),1]
{nU[ks]+l,n) ~l
,P
\nU[ks]+l,n J
oP(l).
ao(?)
Hence
i/
rj(
o(r>
i/
* y
/n\
y' 1 /
/n\
fc
95
-^(i)^(s^:) + -(-'*-'")
=/ + // + /// + /V+V+W.
By (3.4.13), 5K//7 is bounded away from zero uniformly for s e [(2kn)~l, 1]. We
will show that all the other terms tend to 0 uniformly when multiplied by sy, so that
assertion (3.4.12) follows with Cn := inf se[(2k)~1 l] sYIH n for a suitable sequence
en | 0 .
By the asymptotic normality of intermediate order statistics (see Theorem 2.2.1),
part / is 0P(k~l/2), hence syI = oP{\). By (3.4.13) and assumption (3.4.11), part
// is Op(A: -1 / 2 ), so that syII = oP(l). Next note that s^Vy^s'1)
= o{s~1/2) as
y
s i 0. This combined with (3.4.6) and (3.4.13) gives that s IVand ^ V a r e oP(l).
Finally, s yVI = op(l), provided one chooses 8 < \. Hence we have proved (3.4.12).
Define
:=
^ /vw, *-M _ ^ 2 \
V
o(f)
(3A14)
(read (s K l ) / y as log s, when y = 0). Then, from Corollary 2.4.6, for suitably
chosen functions ao and Ao, and for all e > 0,
Zn(^)=^-1Wn(5)-Wn(l)
+ VkA0 ( ) *y,p(s~l)
+ oP(l)s-y-l/2~e
(3.4.15)
as n -> 00, where {Wn(5)}J>o is a sequence of Brownian motions and the op-term
is uniform for s (0,1]. Moreover, under the conditions of Theorem 3.4.2, for all
s >0,
Zn(s) = Op(l)s-y-^2-
,
(3.4.16)
as n 00, where the Op term is uniform for s (0,1].
Proposition 3.4.4 Assume condition (3.4.5) with y > 0 and (3.4.6). Then any solution ()/, (7Q) 0/ (3.4.8) satisfying (3.4.11) admits the approximation
Vk(yf -y)-
( y + 1)
96
Remark 3.4.5 Though we prove Proposition 3.4.4 only for y > 0, in fact the statement is true for any y > ^. For more details see Drees, Ferreira, and de Haan
(2003).
Proof (of Proposition 3.4.4). We start by obtaining an expansion for the left-hand
side of the first equation of (3.4.8). Rewrite it as
fl
+
= /i + y ( l - 0 ( t - 1 -l 1o g i k ) ) + / 2 .
First we prove that I\ is negligible. Since Xn-[ks],n is constant when s e
(0, (2A:)-1], from (3.4.12), with probability tending to 1,
i +
)/Xn-[ksln-Xn-k,n
*o
x +
- Xn.k,n
o(f)
-1
^o
y'Xn,n
a
"o(V
0Vp
(2k)YCn
o(f)
l
0P(kr+s).
y (
V^o
^ ^
n - i ^ n - X
Then, from (3.4.12) it follows that 0 < 1/(1 A (1 + *)) < 1 v 1 / C = 0P{\) with
probability tending to one. Moreover, note that relation (3.4.16) implies
r(2krl
/
syZn(s)
/ C(2kyl
ds = 0Pl
s-l/2~e
\
ds J = Op((2k)-l,2+s)
oP(l),
97
'(U($-')1T:^H,*)
=((^)^
+ o
'
<
*-"
2 < a ^ , ,
k-l(2k)-1/2+^
where for the last equality we choose e < | . To sum up, we have proved that
Jo
l>
flo(f)
= K + ( 4 - y ) TTT: + 4*~1/2 [
sYZ
*^ds + oP(k-x'2).
a
V^o
/ y +l
o
Jo
The second equation of (3.4.8) can be treated with somewhat similar arguments.
Then one gets
fVi
-l
,
1
(y + l)(2y + 1)
-to
Hence, under the given conditions, the system of likelihood equations (3.4.8) is
equivalent to
y + (K-y)
W
/ y +1
><>
4*-1/2 / ' * 2 ^ ) ^
(K-y)
Y+\
\aQ
) (y + l)(2y +1)
a
+ 0/,(A;-
Jo
/ ) = - T L - . (3.4.19)
y' + 1
1 2
98
y + (^7-y)
\o
/ Y+1
Jo
- T T - ( ^ - y ) f -u n o 4 . n ~ ^
Y +1
V^o
/ (y + n ( 2 K + n
0p{k X 2)
t ^ " ^ ds +
''
^o
= - 7 T T - (3-4.20)
K'+ 1
The first equation and (3.4.11) show that \y' y\ = Op(k~1^2); hence \y' y\2 =
o/>(*- 1/2 ). Therefore l / ( y + 1 ) - l / ( x ' + 1) = ( / - y)/(y + l ) 2 + o(ifc_1/2), and
so (3.4.20) implies
T 7 " k~1/2Y
Y'-r-(K-r)
rih
-(K-Y)
(y + 1 ) 2
V^o
I *yZn(s) ds + oP(k~1/2) = 0,
/ (Y + V(2Y +!)
Jo
= 0 . (3.4.21)
Straightforward calculations show that a solution of this linear system in y' y and
Y'/CTQ - y satisfies (3.4.17).
Proof (of Theorem 3.4.2). We shall prove the theorem only for the case y > 0. The
case \ < y < 0 requires somewhat similar arguments. The proof in the case y = 0
requires longer expansions but the arguments are also similar. For the complete proof
we refer to Drees, Ferreira, and de Haan (2003). A different proof but only for the
case y > 0 can be found in Drees (1998), and for a slightly different approach we
refer to Smith (1987).
Hence suppose $\gamma > 0$. Let $a_0(n/k)$ and $A_0(n/k)$ be as in Theorem 2.3.6. From Proposition 3.4.4 and (3.4.15), the sequence of solutions of (3.4.4), say $(\hat\gamma_{MLE}, \hat\sigma_{MLE})$, satisfies asymptotic expansions of the form
\[
\sqrt k\,(\hat\gamma_{MLE}-\gamma) = (\gamma+1)^2\int_0^1 (\cdots)\big(W(s)-W(1)\big)\,ds + \sqrt k\,A_0\Big(\frac nk\Big)\int_0^1(\cdots)\,ds + o_P(1),
\]
and similarly for $\sqrt k\,(\hat\sigma_{MLE}/a_0(n/k)-1)$, as $n \to \infty$, and the convergence holds jointly with the same limiting standard Brownian motion $W$.
Next from Corollary 2.3.5 and Theorem 2.3.6 it follows that
\[
\lim_{t\to\infty}\frac{\dfrac{a_0(t)}{a(t)}-1}{A_0(t)} =
\begin{cases}
\dfrac{1}{1-\rho}\,, & \rho<0,\\[4pt]
-\dfrac{1}{\gamma}\,, & \rho=0\neq\gamma,\\[4pt]
0, & \rho=0=\gamma
\end{cases}
\;=:\;L ,
\]
and corresponding expressions for the limits of the bias integrals, involving the factor $(\gamma+1)^2$ and the indicators $1_{\{\rho<0\}}$ and $1_{\{\rho=0\}}$.
To calculate the variance of the limiting normal random variable of $\sqrt k\,(\hat\gamma_{MLE}-\gamma)$, let $X(s) := (\gamma+1)(\cdots)\big(W(s)-W(1)\big)$ denote the integrand of the Brownian part of the expansion for $\hat\gamma_{MLE}$, and let $Z(s) := (\cdots)\big(W(s)-W(1)\big)$ denote the corresponding integrand for $\hat\sigma_{MLE}$. Then
\[
\operatorname{Var}\Big(\int_0^1 X(s)\,ds\Big) = \int_0^1\!\!\int_0^1 E\big(X(s)X(t)\big)\,ds\,dt = (\gamma+1)^2
\]
and
\[
\operatorname{Cov}\Big(\int_0^1 X(s)\,ds,\ \int_0^1 Z(s)\,ds\Big) = \int_0^1\!\!\int_0^1 E\big(X(s)Z(t)\big)\,ds\,dt = -\gamma . \qquad\square
\]
3.5 The Moment Estimator

Recall the notation
\[
M_n^{(1)} := \frac1k\sum_{i=0}^{k-1}\big(\log X_{n-i,n}-\log X_{n-k,n}\big) \tag{3.5.1}
\]
and
\[
M_n^{(2)} := \frac1k\sum_{i=0}^{k-1}\big(\log X_{n-i,n}-\log X_{n-k,n}\big)^2 ; \tag{3.5.2}
\]
note that $M_n^{(1)}$ is the Hill estimator.

Lemma 3.5.1 Suppose $F \in \mathcal D(G_\gamma)$ and $x^* > 0$. Write $X_i = U(Y_i)$, where $Y_1, Y_2, \ldots$ are i.i.d. with distribution function $1-1/x$, $x \ge 1$. If $k = k(n) \to \infty$ and $k/n \to 0$ as $n \to \infty$, then, with $q := a/U$ and $\gamma_- := \min(0,\gamma)$,
\[
\frac{M_n^{(j)}}{q^j(Y_{n-k,n})} \xrightarrow{P} \frac{j!}{\prod_{i=1}^{j}(1-i\gamma_-)}\,, \qquad j = 1,2 . \tag{3.5.3}
\]
Since for $x>0$,
\[
\lim_{t\to\infty}\frac{\log U(tx)-\log U(t)}{q(t)} = \frac{x^{\gamma_-}-1}{\gamma_-} \qquad (=\log x \ \text{when}\ \gamma_-=0),
\]
and
\[
\lim_{t\to\infty}\frac{a(t)}{U(t)} = \gamma_+ , \tag{3.5.4}
\]
it follows in particular that
\[
M_n^{(1)} \xrightarrow{P} \gamma_+ . \tag{3.5.5}
\]
Sketch of proof. For any $\varepsilon>0$, eventually
\[
\frac{\log U(Y_{n-i,n})-\log U(Y_{n-k,n})}{q(Y_{n-k,n})} \le \frac{\big(Y_{n-i,n}/Y_{n-k,n}\big)^{\gamma_-}-1}{\gamma_-} + \varepsilon\Big(\frac{Y_{n-i,n}}{Y_{n-k,n}}\Big)^{\gamma_-+\varepsilon},
\]
$i=0,1,\ldots,k-1$, so that, for instance, $M_n^{(2)}/q^2(Y_{n-k,n})$ is bounded above by averages whose limits, by the law of large numbers, are moments such as $E\big((Y^{\gamma_-}-1)/\gamma_-\big)^2$ and $\varepsilon\,E\,Y^{\gamma_-+\varepsilon} = \varepsilon/(1-\gamma_--\varepsilon)$, and a similar lower bound applies. Starting from these inequalities, we follow the same reasoning as before. This leads to (3.5.3) for $j=2$.
It follows from Lemma 3.5.1 (cf. (3.5.5)) that the Hill estimator converges to zero for $\gamma < 0$; hence this estimator is noninformative in this range. However, this lemma helps us to find a consistent estimator of $\gamma$ for $\gamma < 0$, since under its conditions,
\[
\frac{\big(M_n^{(1)}\big)^2}{M_n^{(2)}} \xrightarrow{P} \frac{1-2\gamma_-}{2(1-\gamma_-)} , \tag{3.5.7}
\]
hence
\[
1 - \frac12\left(1-\frac{\big(M_n^{(1)}\big)^2}{M_n^{(2)}}\right)^{-1} \xrightarrow{P} \gamma_- . \tag{3.5.8}
\]
This leads to the following combination of the Hill estimator and the statistic in (3.5.7):
\[
\hat\gamma_M := M_n^{(1)} + 1 - \frac12\left(1-\frac{\big(M_n^{(1)}\big)^2}{M_n^{(2)}}\right)^{-1} . \tag{3.5.9}
\]
Theorem 3.5.2 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$. Suppose $F \in \mathcal D(G_\gamma)$ and $x^* > 0$. Then, provided $k = k(n) \to \infty$ and $k/n \to 0$ as $n \to \infty$,
\[
\hat\gamma_M \xrightarrow{P} \gamma .
\]
Remark 3.5.3 The estimator $\hat\gamma_M$ is called the moment estimator. The name stems from the fact that the left-hand side of (3.5.3) converges to $E\big((Y^{\gamma_-}-1)/\gamma_-\big)^j$, $j = 1, 2$, which is the $j$th moment of the limiting generalized Pareto distribution. In contrast, remember that the Pickands estimator is a quantile estimator (Remark 3.3.4).
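The moment estimator is straightforward to compute from the order statistics. The following sketch (our own illustration, not code from the book; it assumes a positive sample so the logarithms are well defined) implements (3.5.1)–(3.5.2) and (3.5.9):

```python
import math

def moment_estimator(x, k):
    """Dekkers-Einmahl-de Haan moment estimator (3.5.9).

    x : sample of positive values; k : number of top order statistics,
    with 1 <= k < len(x).
    """
    xs = sorted(x)                        # xs[-1] = X_{n,n}
    ref = math.log(xs[-k - 1])            # log X_{n-k,n}
    logs = [math.log(v) - ref for v in xs[-k:]]
    m1 = sum(logs) / k                    # M_n^{(1)}, the Hill estimator
    m2 = sum(l * l for l in logs) / k     # M_n^{(2)}
    return m1 + 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
```

On an exact Pareto sample with $\gamma=1$ the estimate should be close to 1 for moderate $k$, since there is no second-order bias in that case.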
Next we prove that $\hat\gamma_M$ is asymptotically normal under appropriate conditions. In particular we need a second-order condition for the function $\log U(t)$. From Lemma B.3.16 (see Appendix B) we know that under the usual second-order condition for $U$,
\[
\lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)} = \int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\,du\,ds \tag{3.5.10}
\]
for all $x > 0$, and if $\gamma \neq \rho$, and $\rho < 0$ if $\gamma > 0$, a second-order condition for $\log U(t)$ holds:
\[
\lim_{t\to\infty}\frac{\dfrac{\log U(tx)-\log U(t)}{q(t)}-\dfrac{x^{\gamma_-}-1}{\gamma_-}}{Q(t)} = \Psi_{\gamma_-,\rho'}(x) , \tag{3.5.11}
\]
with $\gamma_- := \min(0,\gamma)$ as before, $q := a/U$ a positive function, and $Q$ not changing sign eventually with $Q(t) \to 0$, $t \to \infty$. When $\gamma > 0$ and $\rho = 0$ the limit (3.5.11) vanishes. One possible $Q(t)$ is given by the case distinction (3.5.13): it equals $A(t)$, or $c\,t^{\gamma_-}$ with $c := \lim_{t\to\infty} t^{-\gamma_-}q(t) > 0$, or a constant multiple of $A(t)$, depending on the relative position of $\gamma$ and $\rho$ and on whether $\ell := \lim_{t\to\infty}\big(U(t)-a(t)/\gamma\big)$ vanishes. Moreover, with $q_0$ and $Q_0$ as in Theorem 2.3.6, for each $\varepsilon,\delta > 0$ there exists $t_0 = t_0(\varepsilon,\delta) > 0$ such that for all $t, tx \ge t_0$,
\[
\left|\frac{\dfrac{\log U(tx)-\log U(t)}{q_0(t)}-\dfrac{x^{\gamma_-}-1}{\gamma_-}}{Q_0(t)} - \Psi_{\gamma_-,\rho'}(x)\right| \le \varepsilon\,x^{\gamma_-+\rho'}\max\big(x^{\delta},x^{-\delta}\big) .
\]
Theorem 3.5.4 (Dekkers, Einmahl, and de Haan (1989)) Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$ with $x^* > 0$. Suppose the second-order condition (3.5.10) holds with $\gamma \neq \rho$. If the sequence of integers $k = k(n)$ satisfies $k \to \infty$, $k/n \to 0$, and
\[
\lim_{n\to\infty}\sqrt k\,Q\Big(\frac nk\Big) = \lambda \tag{3.5.14}
\]
with $Q$ from (3.5.11) and $\lambda$ finite, then
\[
\sqrt k\,(\hat\gamma_M-\gamma) \to N\big(\lambda b_{\gamma,\rho},\,\mathrm{var}_\gamma\big) , \tag{3.5.15}
\]
where $b_{\gamma,\rho}$ is given by a case distinction over the sign configuration of $\gamma$ and $\rho$ — including, among others, the value $\dfrac{\gamma-\gamma\rho+\rho}{\rho(1-\rho)^2}$, a rational expression involving $\gamma(1+\gamma)$ and $(1-\gamma)(1-3\gamma)$ for $\rho<\gamma<0$, and the value $1$ when $\gamma>\rho=0$ — (3.5.16) and
\[
\mathrm{var}_\gamma := \begin{cases}
\gamma^2+1, & \gamma\ge0,\\[4pt]
(1-\gamma)^2(1-2\gamma)\left(4-8\,\dfrac{1-2\gamma}{1-3\gamma}+\dfrac{(5-11\gamma)(1-2\gamma)}{(1-3\gamma)(1-4\gamma)}\right), & \gamma<0 .
\end{cases} \tag{3.5.17}
\]
For the proof we start with an extension of Lemma 3.5.1. Since in the proof we use the result of Theorem 2.3.6 (the uniform inequalities connected to (3.5.11)), we use the function $q_0$ from that theorem in the formulation of the lemma.
Lemma 3.5.5 Assume the conditions of Theorem 3.5.4. Write $X_i = U(Y_i)$, $i = 1,2,\ldots$, where $Y_1, Y_2, \ldots$ are i.i.d. with distribution function $1-1/x$, $x \ge 1$. With the notation of Lemma 3.5.1:
1. if $\gamma<0$ or $\rho\neq0$, the random vector
\[
\sqrt k\left(\frac{M_n^{(1)}}{q_0(Y_{n-k,n})}-\frac{1}{1-\gamma_-}\,,\ \ \frac{M_n^{(2)}}{q_0^2(Y_{n-k,n})}-\frac{2}{(1-\gamma_-)(1-2\gamma_-)}\right) \tag{3.5.18}
\]
converges in distribution to a bivariate normal vector $(P,Q)$ with mean vector
\[
\frac{\lambda}{\rho'}\left(\frac{1}{1-\gamma_--\rho'}-\frac{1}{1-\gamma_-}\,,\ \ \frac{2(2-2\gamma_--\rho')}{(1-\gamma_-)(1-\gamma_--\rho')(1-2\gamma_--\rho')}-\frac{2}{(1-\gamma_-)(1-2\gamma_-)}\right) \tag{3.5.19}
\]
and covariance matrix
\[
\begin{pmatrix}
\dfrac{1}{(1-\gamma_-)^2(1-2\gamma_-)} & \dfrac{4}{(1-\gamma_-)^2(1-2\gamma_-)(1-3\gamma_-)}\\[8pt]
\dfrac{4}{(1-\gamma_-)^2(1-2\gamma_-)(1-3\gamma_-)} & \dfrac{4(5-11\gamma_-)}{(1-\gamma_-)^2(1-2\gamma_-)^2(1-3\gamma_-)(1-4\gamma_-)}
\end{pmatrix};
\]
2. in the remaining case ($\rho'=0=\gamma_-$) a corresponding statement holds. Here
\[
\Psi_{\gamma_-,\rho'}(x) := \begin{cases}
\dfrac{x^{\gamma_-+\rho'}}{\gamma_-+\rho'}\,, & \rho'<0,\\[4pt]
x^{\gamma_-}\log x, & \rho'=0>\gamma_-,\\[4pt]
\tfrac12\log^2 x, & \rho'=0=\gamma_-,
\end{cases}
\]
is the function appearing in the uniform inequalities used in the proof.
Proof. The proof is somewhat similar to that of the corresponding result for the Hill estimator (Theorem 3.2.5). Theorem 2.3.6 tells us that one can choose functions $q_0 > 0$ and $Q_0$ such that for any $\varepsilon > 0$ there exists $t_0$ such that for all $t \ge t_0$ and $x \ge 1$,
\[
\frac{x^{\gamma_-}-1}{\gamma_-} + Q_0(t)\Psi_{\gamma_-,\rho'}(x) - \varepsilon\big|Q_0(t)\big|\,x^{\gamma_-+\rho'+\varepsilon}
\le \frac{\log U(tx)-\log U(t)}{q_0(t)}
\le \frac{x^{\gamma_-}-1}{\gamma_-} + Q_0(t)\Psi_{\gamma_-,\rho'}(x) + \varepsilon\big|Q_0(t)\big|\,x^{\gamma_-+\rho'+\varepsilon} . \tag{3.5.20}
\]
Let us concentrate on the upper inequality. We apply this inequality with $t$ replaced by $Y_{n-k,n}$ (tending a.s. to infinity) and $x$ replaced by $Y_{n-i,n}/Y_{n-k,n}$ for $i = 0, 1, \ldots, k-1$. Then we eventually get that
\[
\frac{M_n^{(1)}}{q_0(Y_{n-k,n})} \le \frac1k\sum_{i=0}^{k-1}\frac{\big(Y_{n-i,n}/Y_{n-k,n}\big)^{\gamma_-}-1}{\gamma_-} + Q_0(Y_{n-k,n})\,\frac1k\sum_{i=0}^{k-1}\Psi_{\gamma_-,\rho'}\Big(\frac{Y_{n-i,n}}{Y_{n-k,n}}\Big) + \varepsilon\big|Q_0(Y_{n-k,n})\big|\,\frac1k\sum_{i=0}^{k-1}\Big(\frac{Y_{n-i,n}}{Y_{n-k,n}}\Big)^{\gamma_-+\rho'+\varepsilon} .
\]
Since $\{Y_{n-i,n}/Y_{n-k,n}\}_{i=0}^{k-1}$ has the same distribution as the order statistics of a sample $Y_1^*,\ldots,Y_k^*$ from the distribution $1-1/x$, this gives
\[
\sqrt k\left(\frac{M_n^{(1)}}{q_0(Y_{n-k,n})}-\frac{1}{1-\gamma_-}\right)
\le \sqrt k\left(\frac1k\sum_{i=1}^{k}\frac{(Y_i^*)^{\gamma_-}-1}{\gamma_-}-\frac{1}{1-\gamma_-}\right)
+ \sqrt k\,Q_0(Y_{n-k,n})\,\frac1k\sum_{i=1}^{k}\Psi_{\gamma_-,\rho'}(Y_i^*)
+ \varepsilon\sqrt k\,\big|Q_0(Y_{n-k,n})\big|\,\frac1k\sum_{i=1}^{k}(Y_i^*)^{\gamma_-+\rho'+\varepsilon} .
\]
One easily verifies that the conditions of the central limit theorem are fulfilled for
the first term and the conditions of the law of large numbers for the other two terms.
Then the last term vanishes in the limit. Moreover, since by Corollary 2.2.2
\[
\frac kn\,Y_{n-k,n} \xrightarrow{P} 1
\]
and since $Q_0$ is a regularly varying function, we have
\[
\frac{Q_0(Y_{n-k,n})}{Q_0(n/k)} \xrightarrow{P} 1 .
\]
Recall from Corollary 2.3.5, Theorem 2.3.6, and assumption (3.5.14) that if $\rho' < 0$,
\[
\lim_{n\to\infty}\sqrt k\,Q_0\Big(\frac nk\Big) = \lim_{n\to\infty}\frac{\sqrt k}{\rho'}\,Q\Big(\frac nk\Big) = \frac{\lambda}{\rho'} ,
\]
and if $\rho' = 0$,
\[
\lim_{n\to\infty}\sqrt k\,Q_0\Big(\frac nk\Big) = \lim_{n\to\infty}\sqrt k\,Q\Big(\frac nk\Big) = \lambda .
\]
Hence, since a similar lower bound applies, as $n \to \infty$,
\[
\sqrt k\left(\frac{M_n^{(1)}}{q_0(Y_{n-k,n})}-\frac{1}{1-\gamma_-}\right) \xrightarrow{d} P .
\]
For $M_n^{(2)}$ the upper inequality of (3.5.20), squared, gives
\[
\frac{M_n^{(2)}}{q_0^2(Y_{n-k,n})} \le \frac1k\sum_{i=1}^{k}\left(\frac{(Y_i^*)^{\gamma_-}-1}{\gamma_-}\right)^2 + 2\,Q_0(Y_{n-k,n})\,\frac1k\sum_{i=1}^{k}\frac{(Y_i^*)^{\gamma_-}-1}{\gamma_-}\,\Psi_{\gamma_-,\rho'}(Y_i^*) + 2\varepsilon\big|Q_0(Y_{n-k,n})\big|\,\frac1k\sum_{i=1}^{k}\frac{(Y_i^*)^{\gamma_-}-1}{\gamma_-}\,(Y_i^*)^{\gamma_-+\rho'+\varepsilon} + \cdots .
\]
Again we can apply the central limit theorem to the first term on the right-hand side and the law of large numbers to the other terms. The last two terms vanish in the limit. A similar lower bound applies. We conclude that as $n \to \infty$,
\[
\sqrt k\left(\frac{M_n^{(2)}}{q_0^2(Y_{n-k,n})}-\frac{2}{(1-\gamma_-)(1-2\gamma_-)}\right) \xrightarrow{d} Q ,
\]
jointly with the previous convergence; the joint limit can be found in a routine way by applying the Cramér–Wold device, Lyapunov's theorem, and the central limit theorem. The proof of the second statement is similar. $\square$
Corollary 3.5.6 Under the conditions of Theorem 3.5.4, as $n \to \infty$,
\[
\sqrt k\left(1-\frac12\left(1-\frac{(M_n^{(1)})^2}{M_n^{(2)}}\right)^{-1}-\gamma_-\right)
\xrightarrow{d} (1-2\gamma_-)(1-\gamma_-)^2\left(\tfrac12(1-2\gamma_-)\,Q-2P\right),
\]
where $(P,Q)$ is the limit vector of Lemma 3.5.5. Hence the limiting random variable is normal with mean $\lambda b_{\gamma,\rho}$ — where $b_{\gamma,\rho}$ is given by a case distinction over the relative position of $\gamma$ and $\rho$ (the cases $\rho<\gamma<0$; $0<\gamma<-\rho$ with $\ell\neq0$; $0<\gamma<-\rho$ with $\ell=0$, or $\gamma>-\rho>0$; and $\gamma>\rho=0$), taking, among others, the values $1/(1+\gamma)^2$, $1/(1-\rho)^2$, and $0$ — and variance, for $\gamma<0$ (the case relevant for the $\gamma_-$ part),
\[
(1-\gamma)^2(1-2\gamma)\left(4-8\,\frac{1-2\gamma}{1-3\gamma}+\frac{(5-11\gamma)(1-2\gamma)}{(1-3\gamma)(1-4\gamma)}\right).
\]
The bias of the limiting random variable in terms of $(\gamma_-,\rho')$ is given in Exercise 3.10.

Proof. The result is straightforward from Lemma 3.5.5 and application of Cramér's delta method. $\square$
Proof (of Theorem 3.5.4). It remains to study the first part of the estimator, i.e., $M_n^{(1)}$. We know that
\[
\sqrt k\left(\frac{M_n^{(1)}}{q_0(Y_{n-k,n})}-\frac{1}{1-\gamma_-}\right) \xrightarrow{d} P
\]
with $P$ the first component of the limit vector from Lemma 3.5.5. Therefore, using
\[
\lim_{t\to\infty}\frac{q(t)-\gamma_+}{Q(t)} = \begin{cases}-1, & \rho<\gamma<0,\\ 1, & \gamma>\rho=0,\\ 0, & \text{otherwise},\end{cases} \tag{3.5.24}
\]
and Lemma 3.5.5, one gets the asymptotic distribution by straightforward but lengthy calculations. $\square$
110
0<JC<
a
Ov(-y)
(1-H y t (jc)) dx = .
l - y
Jo
(3.6.1)
Moreover,
1
E [V (1 - HYta(V))}
rl/(0v(-y))
= -J
(I - Hy,a(x)f
dx =
2{2_yy
(3.6.2)
(3.6.4)
A sample analogue of the right-hand sides of (3.6.3) and (3.6.4) will provide estimators for $\gamma$ and $\sigma$.

Consider independent and identically distributed random variables $X_1, X_2, \ldots$ with distribution function $F$ and suppose that $F$ is in the domain of attraction of an extreme value distribution $G_\gamma$. Then we know (e.g., Section 3.1) that for $x > 0$ and a suitably chosen positive function $f$, the conditional distribution of $(X-t)/f(t)$ given $X>t$ converges to the generalized Pareto distribution as $t \uparrow x^*$. This suggests approximating $EV$ by
\[
\int_0^\infty\frac{1-F\big(t+x f(t)\big)}{1-F(t)}\,dx = \frac{1}{f(t)}\int_t^{x^*}\frac{1-F(u)}{1-F(t)}\,du \tag{3.6.5}
\]
and $E\big[V\big(1-H_{\gamma,\sigma}(V)\big)\big]$ by
\[
\frac12\int_0^\infty\left(\frac{1-F\big(t+x f(t)\big)}{1-F(t)}\right)^2 dx = \frac{1}{f(t)}\int_t^{x^*}\frac{(u-t)\big(1-F(u)\big)}{\big(1-F(t)\big)^2}\,dF(u) . \tag{3.6.6}
\]
Next we need sample analogues of (3.6.5) and (3.6.6). These can be obtained by replacing $t$ with the intermediate order statistic $X_{n-k,n}$ and replacing $F$ with the empirical distribution function $F_n$. Then (3.6.5) becomes, after normalization (note that $1-F_n(X_{n-k,n}) = k/n$),
\[
P_n := \frac1k\sum_{i=0}^{k-1} X_{n-i,n} - X_{n-k,n} , \tag{3.6.7}
\]
and (3.6.6) becomes
\[
Q_n := \frac1{k^2}\sum_{i=0}^{k-1} i\,\big(X_{n-i,n}-X_{n-k,n}\big) . \tag{3.6.8}
\]
This leads to the probability weighted moment estimators
\[
\hat\gamma_{PWM} := \frac{P_n-4Q_n}{P_n-2Q_n} \tag{3.6.9}
\]
and
\[
\hat\sigma_{PWM} := \frac{2P_nQ_n}{P_n-2Q_n} . \tag{3.6.10}
\]
Via the tail quantile process one obtains the expansions
\[
\sqrt k\left(\frac{P_n}{a_0(n/k)}-\frac{1}{1-\gamma}\right) = \int_0^1\big(s^{-\gamma-1}W_n(s)-W_n(1)\big)\,ds + \sqrt k\,A_0\Big(\frac nk\Big)\int_0^1\Psi_{\gamma,\rho}(s^{-1})\,ds + o_P(1) \tag{3.6.11}
\]
and
\[
\sqrt k\left(\frac{Q_n}{a_0(n/k)}-\frac{1}{2(2-\gamma)}\right) = \int_0^1\big(s^{-\gamma}W_n(s)-sW_n(1)\big)\,ds + \sqrt k\,A_0\Big(\frac nk\Big)\int_0^1 s\,\Psi_{\gamma,\rho}(s^{-1})\,ds + o_P(1) . \tag{3.6.12}
\]
For the above relations in terms of the functions a and A instead of ao and Ao,
see the proof of Theorem 3.4.2. Then by working out the probability distribution of
the corresponding right-hand sides and applying Cramér's delta method, one gets the
following result:
Theorem 3.6.1 Let $X_1, X_2, \ldots$ be i.i.d. with distribution function $F$.
1. If $F \in \mathcal D(G_\gamma)$ with $\gamma < 1$, and $k = k(n) \to \infty$, $k/n \to 0$ as $n \to \infty$, then
\[
\hat\gamma_{PWM} \xrightarrow{P} \gamma \qquad\text{and}\qquad \frac{\hat\sigma_{PWM}}{a_0(n/k)} \xrightarrow{P} 1 .
\]
2. If moreover the second-order condition (2.3.5) holds with $\gamma < \tfrac12$ and $\sqrt k\,A_0(n/k) \to \lambda$ finite, then $\sqrt k\,(\hat\gamma_{PWM}-\gamma)$ and $\sqrt k\,(\hat\sigma_{PWM}/a_0(n/k)-1)$ are jointly asymptotically normal, with mean vector $\lambda$ times constants depending on the sign of $\rho$ and covariance matrix whose entries involve the expression
\[
\frac{(2-\gamma)\big({-2}+6\gamma-7\gamma^2+2\gamma^3\big)}{(1-2\gamma)(3-2\gamma)} .
\]
Remark 3.6.2 For $\tfrac12 \le \gamma < 1$ the convergence of $\hat\gamma_{PWM}$ to $\gamma$ is slower than that for $\gamma < \tfrac12$.

Remark 3.6.3 The statistic $P_n$ is commonly called the empirical mean excess function. Its main use seems to be to distinguish between subexponential and superexponential distributions. It plays a role in insurance theory; see, e.g., Embrechts, Klüppelberg, and Mikosch (1997). The statistic is often discussed in books on extreme value theory; see, e.g., Falk, Hüsler, and Reiss (1994) and Beirlant, Teugels, and Vynckier (1996).
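Under our reading of (3.6.7)–(3.6.10), the PWM estimators can be sketched as follows (illustrative code, not from the book):

```python
def pwm_estimators(x, k):
    """Probability weighted moment estimators (3.6.9)-(3.6.10).

    Returns (gamma_hat, sigma_hat) based on the excesses over X_{n-k,n}.
    """
    xs = sorted(x)
    ref = xs[-k - 1]                              # X_{n-k,n}
    exc = [xs[-1 - i] - ref for i in range(k)]    # exc[i] = X_{n-i,n} - X_{n-k,n}
    p = sum(exc) / k                              # P_n, cf. (3.6.7)
    q = sum(i * e for i, e in enumerate(exc)) / k**2   # Q_n, cf. (3.6.8)
    gamma = (p - 4 * q) / (p - 2 * q)             # (3.6.9)
    sigma = 2 * p * q / (p - 2 * q)               # (3.6.10)
    return gamma, sigma
```

For an exponential sample ($\gamma=0$) the excesses over a high threshold are again standard exponential, so the estimates should be near $\gamma=0$ and $\sigma=1$.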
For $\gamma < -\tfrac12$ the distribution has a finite endpoint $x^*$, and $x^*-X$ has a regularly varying tail, so that it is in the domain of attraction of a Fréchet-type limit (cf. Theorem 1.2.1). Hence we could apply the Hill estimator to $x^*-X$, but this way we do not obtain a statistic, since $x^*$ is not known. Fortunately, for $\gamma < -\tfrac12$ the endpoint $x^*$ can be very well approximated by the largest order statistic $X_{n,n}$ (cf. Remark 4.5.5 below). This leads to what we call the negative Hill estimator
\[
\hat\gamma_F := \frac1k\sum_{i=1}^{k-1}\log\frac{X_{n,n}-X_{n-i,n}}{X_{n,n}-X_{n-k,n}} . \tag{3.6.13}
\]
Via the tail quantile process one obtains an expansion of the form
\[
\gamma\int_0^1 s^{-1}W_n(s)\,ds + \gamma\sqrt k\,A_0\Big(\frac nk\Big)\int_0^1 s^{\gamma}\,\Psi_{\gamma,\rho}(s^{-1})\,ds + o_P(1) ,
\]
which, together with a similar expansion with $X_{n-[ks],n}$ replaced by $X_{n-k,n}$, leads to the following result:
Theorem 3.6.4 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$.
1. If $F \in \mathcal D(G_\gamma)$ with $\gamma < -\tfrac12$, then $\hat\gamma_F \xrightarrow{P} \gamma$, provided $k = k(n) \to \infty$, $k/n \to 0$, $k^{\eta}/\log n \to \infty$ for some small $\eta$, as $n \to \infty$.
2. If the second-order condition (2.3.5) is fulfilled with $-1 < \gamma < -\tfrac12$, $k = k(n) \to \infty$, $k/n \to 0$, $k^{\eta}/\log n \to \infty$ for some small $\eta$, and $\sqrt k\,A(n/k) \to \lambda$ as $n \to \infty$, then $\sqrt k\,(\hat\gamma_F-\gamma)$ has asymptotically a normal distribution with variance $\gamma^2$ and mean
\[
\lambda\gamma\int_0^1 s^{\gamma}\,\Psi_{\gamma,\rho}(s^{-1})\,ds
\]
(which can be evaluated in closed form, with a separate expression in the case $\rho=0$).
We give the proof of the asymptotic normality. The proof of the consistency is
left to the reader (Exercise 3.11).
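A direct implementation of (3.6.13) — an illustrative sketch, not the authors' code:

```python
import math

def negative_hill(x, k):
    """Negative Hill estimator (3.6.13), for gamma < -1/2 (finite endpoint)."""
    xs = sorted(x)
    top = xs[-1]                 # X_{n,n}, proxy for the endpoint x*
    ref = xs[-k - 1]             # X_{n-k,n}
    s = 0.0
    for i in range(1, k):        # i = 1, ..., k-1
        s += math.log((top - xs[-1 - i]) / (top - ref))
    return s / k
```

For a uniform sample ($\gamma = -1$) the spacings from the maximum behave like $i/k$, so the average of the logarithms should be near $\int_0^1 \log s\,ds = -1$.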
Proof (via the tail quantile process (Holger Drees)). We shall first consider the sum in the definition of $\hat\gamma_F$ for $i = 1, \ldots, j$, where $j = j(n)$ is some sequence with $1 \le j(n) \le k(n)-1$. Note that, writing
\[
\frac{X_{n,n}-X_{n-i,n}}{X_{n,n}-X_{n-k,n}} = \frac{\dfrac{X_{n,n}-U(\infty)}{a(n)}-\dfrac{X_{n-i,n}-U(\infty)}{a(n)}}{\dfrac{X_{n,n}-U(\infty)}{a(n)}+\dfrac{U(\infty)-X_{n-k,n}}{a(n)}},
\]
the numerator terms are $O_P(1)$, while $\big(U(\infty)-U(n/k)\big)/a(n/k) \to -1/\gamma$ and $\big(X_{n-k,n}-U(n/k)\big)/a(n/k) \xrightarrow{P} 0$, so this part of the sum is of order $O_P\big((j/k)\log k\big)$ and will be negligible for the choice of $j$ below.
For the remaining terms,
\[
\frac1k\sum_{i=j+1}^{k-1}\log\frac{X_{n,n}-X_{n-i,n}}{X_{n,n}-X_{n-k,n}} = \int_{j/k}^{1}\log\frac{X_{n,n}-X_{n-[ks],n}}{X_{n,n}-X_{n-k,n}}\,ds , \tag{3.6.15}
\]
and take $j = [k^{1-\delta}]$, i.e., $j/k \approx k^{-\delta}$, with $0 < \delta < \big((-2\gamma)^{-1}\wedge(1+2\varepsilon)^{-1}\big)$ for some $\varepsilon < \tfrac12$. Then from Corollary 2.4.5, uniformly for $k^{-\delta} \le s \le 1$ we have
\[
\log\frac{X_{n,n}-X_{n-[ks],n}}{X_{n,n}-X_{n-k,n}} = \gamma\log s + \frac{\gamma}{\sqrt k}\big(s^{-1}W(s)-W(1)\big) + o_P\Big(\frac1{\sqrt k}\Big)
\]
where $\{W(s)\}$ is a sequence of standard Brownian motions and the $o_P$-term is uniform in $s$.

Hence if, moreover, $\delta > \tfrac12$, so that $\int_0^{k^{-\delta}}\log(s^{\gamma})\,ds = o(1/\sqrt k)$, then
\[
\int_{k^{-\delta}}^{1}\log\frac{X_{n,n}-X_{n-[ks],n}}{a_0(n/k)}\,ds = -\log(-\gamma) + \gamma + O_P\Big(\frac1{\sqrt k}\Big).
\]
The second term in (3.6.15) is similar but simpler. Again using Corollary 2.4.5, we obtain
\[
\log\frac{X_{n,n}-X_{n-k,n}}{a_0(n/k)} = -\log(-\gamma) + o_P\Big(\frac1{\sqrt k}\Big).
\]
Hence
\[
\int_0^1\log\frac{X_{n,n}-X_{n-[ks],n}}{X_{n,n}-X_{n-k,n}}\,ds = \gamma + \frac{\gamma}{\sqrt k}\int_0^1\big(s^{-1}W(s)-W(1)\big)\,ds + o_P\Big(\frac1{\sqrt k}\Big). \tag{3.6.16}
\]
Therefore
\[
\sqrt k\,(\hat\gamma_F-\gamma) - \int_0^1\big(\gamma s^{-1}W_n(s)-\gamma W_n(1)\big)\,ds - \sqrt k\,A_0\Big(\frac nk\Big)\,\gamma\int_0^1 s^{\gamma}\,\Psi_{\gamma,\rho}(s^{-1})\,ds = o_P(1) ,
\]
which yields the stated asymptotic normality. $\square$
3.7 Simulations and Applications

Under appropriate conditions, each of the estimators $\hat\gamma$ discussed in this chapter satisfies
\[
\sqrt k\,(\hat\gamma-\gamma) \xrightarrow{d} \sqrt{\mathrm{var}_\gamma}\;N + \lambda\,b_{\gamma,\rho}
\]
with $N$ standard normal, where the constants $\lambda$, $\mathrm{var}_\gamma$, and $b_{\gamma,\rho}$ are known (cf. Theorems 3.2.5, 3.3.5, 3.4.2, 3.5.4, 3.6.1, and 3.6.4).
For the asymptotic normality of the Pickands, MLE, PWM, and negative Hill estimators we require the second-order condition of Section 2.3, (2.3.5), with auxiliary second-order function $A \in RV_{\rho}$, $\rho \le 0$, and that the intermediate sequence $k$ satisfy $\sqrt k\,A(n/k) \to \lambda$ with $\lambda \in \mathbb R$. As is most common for the asymptotic normality of Hill's estimator, we require the second-order condition (2.3.22) of Section 2.3 with auxiliary second-order function $A^* \in RV_{\rho^*}$, $\rho^* \le 0$, say, and that the intermediate sequence $k$ satisfy $\sqrt k\,A^*(n/k) \to \lambda^*$, say, with $\lambda^* \in \mathbb R$.

Finally, for the asymptotic normality of the moment estimator we require the second-order condition (2.3.5) with $\rho \neq \gamma$. Then we have the second-order condition in terms of $\log U$ — cf. (3.5.11); recall $U := (1/(1-F))^{\leftarrow}$ with $F$ the underlying distribution function — with auxiliary second-order function $Q \in RV_{\rho'}$ with $\rho'$ known (cf. Theorem B.3.1), and the intermediate sequence $k = k(n)$ satisfying $\sqrt k\,Q(n/k) \to \lambda'$, say, $\lambda' \in \mathbb R$.
Therefore, to compare the estimators we should first of all compare the orders of $k$. Table 3.1 shows the relations among the several second-order auxiliary functions and respective indices, for some combinations of $\gamma$ and $\rho$. We have (i) for some cases the auxiliary functions are all of the same order, but (ii) for some other cases $A(t) = o(A^*(t))$ or $A(t) = o(Q(t))$, $t \to \infty$. In terms of the growth conditions for $k$, in case (i) they are the same for all the estimators, but in (ii), for instance, $\sqrt k\,A(n/k) \to \lambda > 0$ corresponds to $k$ of larger order than if $\sqrt k\,A^*(n/k) \to \lambda^* > 0$ or $\sqrt k\,Q(n/k) \to \lambda' > 0$, meaning a slower rate of convergence for the former. We shall come to this point later on, in the optimal mean square error analysis.
Table 3.1. Second-order items related to the nondegenerate behavior of the estimators; $\ell := \lim_{t\to\infty}\big(U(t)-a(t)/\gamma\big)$. For each of the cases $\gamma<\rho\le0$; $\rho<\gamma<0$; $0<\gamma<-\rho$ with $\ell\neq0$; $0<\gamma<-\rho$ with $\ell=0$; $\gamma>-\rho>0$; and $\gamma>-\rho=0$, the table lists — for $\hat\gamma_P$, $\hat\gamma_{MLE}$, $\hat\gamma_{PWM}$, $\hat\gamma_F$ (second-order condition (2.3.5), auxiliary function $A$), for the Hill estimator (condition (2.3.22), auxiliary function $A^*$), and for $\hat\gamma_M$ (condition (3.5.11), auxiliary function $Q$) — the auxiliary second-order function appearing in the growth condition for $k$, equal to $A$ or to the multiples $\gamma A/(\gamma+\rho)$ and $\rho A/(\gamma+\rho)$ of it, and the index of regular variation of that function, which equals $\rho$, $\gamma$, or $-\gamma$ according to the case.
Next, in Figure 3.2 we compare the asymptotic variances. The Hill estimator has systematically the smallest variance in its range of possible values. Hill's and the moment estimators have the smallest asymptotic variances for positive values of $\gamma$. The MLE and negative Hill estimators have the smallest asymptotic variances for negative values of $\gamma$.

It is more complicated to compare the bias of the estimators, since in general the bias depends on both parameters $\gamma$ and $\rho$, among other characteristics of the underlying distribution. Nonetheless, we state some general comments:
When $\gamma = 0$,
\[
b_{0,\rho}(\hat\gamma_{MLE}) = \frac{1}{(1-\rho)^2}\,,\qquad b_{0,\rho<0}(\hat\gamma_M) = 0\,,
\]
$b_{0,\rho}(\hat\gamma_P)$ is a rational expression in $2^{-\rho}$ divided by $\rho^2\log^2 2$ (with a separate value at $\rho=0$), and $b_{0,\rho}(\hat\gamma_{PWM})$ is given by a similar case distinction. When $\rho = 0$ several biases coincide:
\[
b_{(\gamma>0),0}(\hat\gamma_M) = b_{(\gamma<1/2),0}(\hat\gamma_{PWM}) = b_{(\gamma<-1/2),0}(\hat\gamma_F) = 1 .
\]
On the other hand, $b_{\gamma,\rho}(\hat\gamma_P) \to \infty$ as $\rho \to -\infty$, while for $\hat\gamma_M$ with $\gamma > 0$ the bias depends on whether $\ell \neq 0$ or $\ell = 0$, involving the factor $(1-\gamma)(1-3\gamma)$.
As above, write
\[
\sqrt k\,(\hat\gamma-\gamma) \approx \sqrt{\mathrm{var}_\gamma}\;N + \sqrt k\,A\Big(\frac nk\Big)\,b_{\gamma,\rho} ,
\]
so that the asymptotic mean square error is approximately
\[
\frac{\mathrm{var}_\gamma}{k} + \Big(A\Big(\frac nk\Big)\,b_{\gamma,\rho}\Big)^2 .
\]
If, say, $A(t) = c\,t^{\rho}$ with $\rho < 0$, minimizing this expression over $k$ gives an optimal intermediate sequence of order
\[
k_0(n) = \left(\frac{\mathrm{var}_\gamma}{-2\rho\,c^2\,b_{\gamma,\rho}^2}\right)^{1/(1-2\rho)} n^{-2\rho/(1-2\rho)}
\]
and a minimal mean square error of order $n^{2\rho/(1-2\rho)}$. Hence note that under the given assumptions, a comparison of this quantity for the different estimators reduces to a comparison of the asymptotic variances.
Applying this reasoning to all the estimators, we have that for some cases, depending mainly on $\gamma$ and $\rho$, the order of the optimal sequence is the same for all the estimators. But for some other cases, namely when $A(t) = o(A^*(t))$ or $A(t) = o(Q(t))$, $t \to \infty$, the optimal order $k_0$ is smaller (cf. also Table 3.1).
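In practice the optimal $k$ is often approximated by estimating the mean square error over a grid of $k$ values by simulation, as in the figures below. A minimal Monte Carlo sketch (our own, with the Hill estimator as example; for an exact Pareto sample there is no bias, so the MSE simply decreases in $k$):

```python
import math, random

def hill(x, k):
    """Hill estimator M_n^{(1)} based on the top k order statistics."""
    xs = sorted(x)
    ref = math.log(xs[-k - 1])
    return sum(math.log(v) for v in xs[-k:]) / k - ref

def empirical_mse(estimator, gamma_true, sample, ks, n, reps, seed=1):
    """Monte Carlo mean square error of `estimator` as a function of k."""
    random.seed(seed)
    errs = {k: [] for k in ks}
    for _ in range(reps):
        x = [sample() for _ in range(n)]
        for k in ks:
            errs[k].append((estimator(x, k) - gamma_true) ** 2)
    return {k: sum(v) / reps for k, v in errs.items()}

# Pareto(1): gamma = 1 and no second-order bias, so MSE ~ var/k.
mse = empirical_mse(hill, 1.0, lambda: 1.0 / random.random(),
                    ks=[50, 200], n=5000, reps=30)
```

With a distribution that does satisfy a nondegenerate second-order condition, the simulated MSE curve instead shows the variance–bias trade-off and a finite minimizing $k$.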
3.7.2 Simulations
To illustrate the finite-sample behavior of the estimators we give some simulation
results for the distribution functions given in Table 3.2, namely standard Cauchy, normal, and uniform. Note that the uniform distribution satisfies the first-order extreme
value condition (2.1.2) but does not satisfy the second-order condition (2.3.5) because
the rate of convergence in (2.1.2) is too fast (cf. Exercise 1.3).
Table 3.2. Extreme value index and second-order parameter.

Distribution    γ     ρ
Cauchy          1    −2
Normal          0     0
Uniform        −1    −∞
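A minimal version of such a simulation for the standard Cauchy case (our own sketch; the book's figures are based on many replications and a fine grid of $k$):

```python
import math, random

# Estimates of gamma as a function of k for one standard Cauchy sample
# (the setting of Figure 3.4).
random.seed(9)
n = 2000
x = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]

def moment_est(xs_sorted, k):
    # assumes the k+1 largest order statistics are positive (true here)
    ref = math.log(xs_sorted[-k - 1])
    logs = [math.log(v) - ref for v in xs_sorted[-k:]]
    m1 = sum(logs) / k
    m2 = sum(l * l for l in logs) / k
    return m1 + 1.0 - 0.5 / (1.0 - m1 * m1 / m2)

xs = sorted(x)
estimates = {k: moment_est(xs, k) for k in (25, 50, 100, 200)}
```

The estimates should fluctuate around the true value $\gamma = 1$, with smaller variance but potentially more bias as $k$ grows.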
Fig. 3.4. Standard Cauchy distribution: (a) diagram of estimates of y (the true value 1 is
indicated by the horizontal line); (b) mean square error (see the text for details).
Fig. 3.5. Standard normal distribution: (a) diagram of estimates of y (the true value 0 is
indicated by the horizontal line); (b) mean square error (see the text for details).
Fig. 3.6. Standard uniform distribution: (a) diagram of estimates of y (the true value - 1 is
indicated by the horizontal line); (b) mean square error (see the text for details).
From the asymptotic theory one obtains the corresponding asymptotic confidence intervals. The most common approach is to assume $\sqrt k\,A(n/k) \to 0$ (or $\sqrt k\,Q(n/k) \to 0$ in the case of the moment estimator), so that the limiting distribution has zero mean. This avoids the bias estimation, which generally requires the estimation of the second-order parameter $\rho$ (for more on this we refer to Ferreira and de Vries (2004); for the estimation of $\rho$ see, e.g., Fraga Alves, Gomes, and de Haan (2003)). The $(1-\alpha)\cdot100\%$ approximate confidence interval is then given by
\[
\hat\gamma \pm z_{\alpha/2}\,\sqrt{\frac{\widehat{\mathrm{var}}_{\hat\gamma}}{k}} ,
\]
where $\widehat{\mathrm{var}}_{\hat\gamma}$ is the respective asymptotic variance with $\gamma$ replaced by its estimate and $z_{\alpha/2}$ is the $1-\alpha/2$ quantile of the standard normal distribution.
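Such an interval is easy to compute once an estimate and a variance formula are available. A sketch for the moment estimator (the $\gamma<0$ variance branch follows our reading of Theorem 3.5.4):

```python
import math

def moment_ci(gamma_hat, k, alpha=0.05):
    """(1-alpha) asymptotic CI for gamma from the moment estimator,
    assuming the bias term is negligible (sqrt(k) Q(n/k) -> 0)."""
    if gamma_hat >= 0:
        var = 1.0 + gamma_hat ** 2          # cf. Theorem 3.5.4
    else:
        g = gamma_hat
        var = (1 - g) ** 2 * (1 - 2 * g) * (
            4 - 8 * (1 - 2 * g) / (1 - 3 * g)
            + (5 - 11 * g) * (1 - 2 * g) / ((1 - 3 * g) * (1 - 4 * g)))
    z = 1.959963984540054                   # 0.975 standard normal quantile
    h = z * math.sqrt(var / k)
    return gamma_hat - h, gamma_hat + h
```

For example, with $\hat\gamma = 0$ and $k = 100$ the half-width is $1.96/\sqrt{100} \approx 0.196$.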
In Table 3.3 we give the 95% asymptotic confidence intervals for some values of
k. The value zero belongs to all these confidence intervals, which does not contradict
the hypothesis that the extreme value index is zero.
Table 3.3. Sea level data: 95% asymptotic confidence intervals for γ.

              k = 25            k = 50             k = 100
Hill       (0.25, 0.45)      (0.24, 0.36)       (0.28, 0.36)
Pickands   (−0.90, 0.07)     (−0.72, −0.04)     (−0.70, −0.31)
Moment     (0.16, 0.77)      (0.24, 0.67)       (0.23, 0.47)
Exercises

3.1. Note that the generalized Pareto distributions $H_\gamma$ (Section 3.1) satisfy the following property: if $X$ is a random variable with probability distribution $H_\gamma$, there exists a positive function $a$ such that for $x > 0$ and all $t$ with $H_\gamma(t) < 1$,
\[
P\Big(\frac{X-t}{a(t)} > x \;\Big|\; X > t\Big) = P(X > x) .
\]
Prove that this property characterizes the class of generalized Pareto distributions (cf. proof of Theorem 1.1.3).
3.2. Let $X_1, X_2, \ldots$ be independent and identically distributed random variables with distribution function in the domain of attraction of some extreme value distribution with $\gamma > 0$. Let $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$ be the order statistics. Prove that if $k = k(n) \to \infty$, $k/n \to 0$, $n \to \infty$,
\[
k^{-\gamma}\,\frac{X_{n,n}}{X_{n-k,n}} \xrightarrow{P} 1 ,
\]
so that $\hat\gamma := \log\big(X_{n,n}/X_{n-k,n}\big)/\log k$ is a consistent estimator of $\gamma$.
3.3. Can you prove an asymptotic normality result for this estimator?
Hint: see Theorem 2.4.8.
3.4. Assume the conditions of Exercise 3.2. By using the methods in the proof of Theorem 3.2.2 and Lemma 3.2.3, prove that
\[
\frac{\log X_{n,n}-\log X_{n-k,n}}{\log k} \xrightarrow{P} \gamma
\]
and that the distribution of
\[
\frac{\log X_{n,n}-\log X_{n-k,n}-\gamma\log k}{\gamma}
\]
converges to $G_0$; note that in the notation of the proof of Theorem 3.2.2 the distribution of $\log Y_{k,k} - \log k$ converges to $G_0(x) = \exp(-e^{-x})$. Is this estimator better or worse than the one in Exercise 2.19?
3.5. Define
\[
\hat\gamma := 1 - \frac12\left(1-\big(m_n^{(1)}\big)^2\big/m_n^{(2)}\right)^{-1}
\]
with $m_n^{(j)} := k^{-1}\sum_{i=0}^{k-1}\big(X_{n-i,n}-X_{n-k,n}\big)^j$ for $j = 1, 2$. Prove that $\hat\gamma$ is consistent for $\gamma$ provided $\gamma < \tfrac12$, and that $\sqrt k\,(\hat\gamma-\gamma)$ is asymptotically normal for $\gamma < \tfrac14$ under appropriate conditions. Calculate the asymptotic variance and bias.
3.6. Prove Theorem 3.3.5 for $\gamma = 0$. Check that the variance and bias of the limiting random variable are the same as taking the limits of the given variance and bias of Theorem 3.3.5 when $\gamma$ converges to zero.
3.7. Check that when $\rho = -1$ the Pickands estimator, conveniently normalized as in Theorem 3.3.5, has asymptotic bias $\lambda b_{\gamma,-1}$, where $b_{\gamma,-1}$ involves the factor $1/\big((\gamma-1)(2\gamma-1)\log 2\big)$ for $\gamma \neq \tfrac12$ and equals $2$ for $\gamma = \tfrac12$.
3.8. Under the conditions of Theorem 3.5.4, prove that uniformly for $s \in (0,1]$,
\[
\min\big(1, s^{\gamma_-+1/2+\varepsilon}\big)\left(\sqrt k\,\frac{\log X_{n-[ks],n}-\log X_{n-k,n}}{q_0(n/k)} - \sqrt k\,\frac{s^{-\gamma_-}-1}{\gamma_-} - s^{-\gamma_--1}W_n(s) + W_n(1) - \sqrt k\,Q_0\Big(\frac nk\Big)\Psi_{\gamma_-,\rho'}(s^{-1})\right) \xrightarrow{P} 0 .
\]
3.9. Deduce that
\[
\sqrt k\left(\frac{M_n^{(1)}}{q_0(n/k)}-\frac{1}{1-\gamma_-}\right) = P(W_n) + o_P(1),
\]
with $P(W_n) = \int_0^1\Big(s^{-\gamma_--1}W_n(s)-W_n(1)+\sqrt k\,Q_0\big(\tfrac nk\big)\Psi_{\gamma_-,\rho'}(s^{-1})\Big)\,ds$.
3.10. (a) Show that the limit vector $(P,Q)$ of Lemma 3.5.5 has mean values and covariance matrix given in terms of $(\gamma_-,\rho')$, involving the factors $(1-\gamma_-)(1-2\gamma_-)$ and $(1-\gamma_--\rho')(1-2\gamma_--\rho')$.
(b) Use this to provide the asymptotic bias of the moment estimator in terms of $\gamma$ and $\rho'$.
3.11. Assume the first-order regular variation condition for some $\gamma < -\tfrac12$. For a sample of size $n$, prove the consistency of the negative Hill estimator for some intermediate sequence $k$ with $k^{\gamma}n^{2\varepsilon} \to 0$, $\varepsilon > 0$.
Hint: use the methods used in the proofs of the consistency of the Hill or the Pickands estimators.

3.12. Assume the second-order regular variation condition for some $-1 < \gamma < -\tfrac12$. For a sample of size $n$, prove the asymptotic normality of the negative Hill estimator for some intermediate sequence $k$ with $k^{\gamma-\varepsilon}n^{2\varepsilon} \to 0$, $\varepsilon > 0$, using the methods used, for example, in the first proof of Theorem 3.3.5.
4.1 Introduction
With the sea level case study introduced in Section 1.1.4 and further discussed in Section 3.1, we illustrated the role of extreme value theory in extreme quantile estimation.
In the sequel we explore this example a bit further.
The Dutch government requires the sea dikes to be so high that in a given year a flood occurs with probability 1/10,000. In order to estimate the height corresponding to that probability, 1877 high-tide water levels are available, monitored at the coast, one for each severe storm of a certain type, over a period of 111 years. The observations are considered as realizations of independent and identically distributed random variables. Figure 4.1 shows the empirical distribution function $F_n$ based on these observations, i.e., we assign mass $1/n$ to each observation, where $n$ represents the sample size.
One possibility to estimate a quantile is via the empirical quantile, that is, one of
the order statistics. As shown in Figure 4.1 for the 0.9 quantile, following the curve this
Fig. 4.2. Extrapolation function: (a) for U(t); (b) for - log(l - F).
quantile is just the level corresponding to the given probability. Now we are interested in estimating the sea level, say $u$, with probability $(111/1877)\times10^{-4}\approx6\times10^{-6}$ of being exceeded, that is, $1-F(u)\approx6\times10^{-6}$. So this is the probability of a flood during one windstorm of a certain (severe) type. But for the given data set the highest order statistic corresponds roughly to $F^{\leftarrow}(1-1/1878)$, that is, $1-F(X_{1877,1877})=1/1878\approx5\times10^{-4}$. Hence in order to give a nontrivial answer one needs somehow to extrapolate beyond the range of the available data.

Recall that we want to estimate the level $u$ such that $1-F(u)\approx6\times10^{-6}$. In terms of the function $U=(1/(1-F))^{\leftarrow}$ this means that $u = U\big(1/(6\times10^{-6})\big) \approx U(17\times10^4)$. Remember that (cf. (1.1.27), and also (3.1.6) and (3.1.8))
\[
U(17\times10^4) = U(tx) \approx U(t) + a(t)\,\frac{x^{\gamma}-1}{\gamma} . \tag{4.1.1}
\]
We see that the function $(x^{\gamma}-1)/\gamma$ plays a crucial role (cf. Theorem 1.1.6, Section 1.1.3). Basically, the extrapolation beyond the quantile $U(t)$ is via this function multiplied by the scale factor $a(t)$. Roughly speaking, apart from scale and shift factors, the function $(x^{\gamma}-1)/\gamma$ approximates $U=(1/(1-F))^{\leftarrow}$, or, equivalently, its inverse approximates $-\log(1-F)$. Figure 4.2 shows these functions for $\gamma = 1, 0, -1$. We see that the real parameter $\gamma$ determines their shape; for instance, in the $-\log(1-F)$ scale one gets a straight line when $\gamma$ equals zero.
Hence (4.1.1) motivates the quantile estimator
\[
\hat U(17\times10^4) = X_{n-k,n} + \hat a\Big(\frac nk\Big)\,\frac{\big(17\times10^4\times\frac kn\big)^{\hat\gamma}-1}{\hat\gamma} ,
\]
where $n$ is the sample size and $k$ is an intermediate sequence, i.e., $k \to \infty$ and $k/n \to 0$.
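The estimator can be sketched in a few lines (illustrative code; `gamma_hat` and `a_hat` are assumed to come from any of the estimators of Chapter 3):

```python
import math

def quantile_estimate(x, k, p, gamma_hat, a_hat):
    """Extreme quantile estimator motivated by (4.1.1):
    U_hat(1/p) = X_{n-k,n} + a_hat * ((k/(n p))**gamma - 1)/gamma."""
    n = len(x)
    xs = sorted(x)
    ref = xs[-k - 1]                  # X_{n-k,n}
    d = k / (n * p)                   # extrapolation factor
    if gamma_hat == 0.0:
        frac = math.log(d)            # limit of (d**g - 1)/g as g -> 0
    else:
        frac = (d ** gamma_hat - 1.0) / gamma_hat
    return ref + a_hat * frac
```

For a Pareto(1) sample, $U(t)=t$ and $a(t)=t$, so with $\hat\gamma=1$ and $\hat a \approx X_{n-k,n}$ the estimate of $U(1/p)$ should be close to $1/p$.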
Figure 4.3 displays the empirical distribution on a log-scale, i.e., the step function $-\log(1-F_n)$ (which is a convenient scale when one is interested in the largest values
Fig. 4.3. Step function $-\log(1-F_n)$ of the sea level sample and estimated model attached to the intermediate order statistic $X_{1699,1873}$.

of the sample) and the quantile we are interested in. Moreover, it shows one possible model fitted to the tail of the distribution, which gives the following estimate of the sea level for a failure probability of $6\times10^{-6}$: if we take $k = 174$,
\[
\hat U(17\times10^4) = X_{1873-174,1873} + \hat a\Big(\frac{1873}{174}\Big)\,\frac{\Big(\frac{17\times10^4\times174}{1873}\Big)^{\hat\gamma}-1}{\hat\gamma} .
\]
Then for instance, using the moment-type estimators discussed in Section 3.5 above and Section 4.2 below, with $k = 174$ we get $\hat\gamma = \hat\gamma_M = 0.02$ and $\hat a(1873/174) = \hat a_M = 40.3$; hence
\[
\hat U(17\times10^4) = 286 + 40.3\,\frac{\Big(\frac{17\times10^4\times174}{1873}\Big)^{0.02}-1}{0.02} = 715.6 .
\]
The adjusted model in Figure 4.3 represents this formula with the quantile as a function of the given probability and with the other components fixed. It is attached to the empirical function at the intermediate order statistic $X_{1873-174,1873} = X_{1699,1873} = 286$. More details on the data analysis are given in Section 4.6.
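The arithmetic of this estimate is easy to verify; with the rounded inputs $\hat\gamma = 0.02$ and $\hat a = 40.3$ one obtains approximately 715.8, matching the reported 715.6 up to the rounding of the estimates:

```python
# Sea level example: n = 1873, k = 174, failure probability 6e-6,
# moment estimates gamma = 0.02, a(n/k) = 40.3, X_{1699,1873} = 286.
gamma, a, ref = 0.02, 40.3, 286.0
d = 17e4 * 174 / 1873                  # extrapolation factor x = (k/n)/p
u_hat = ref + a * (d ** gamma - 1.0) / gamma
print(round(u_hat, 1))                 # -> 715.8 with these rounded inputs
```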
Thus we see that a key issue in quantile estimation is the estimation of the extreme
value index y. Moreover, we have to deal with the estimation of U(t) and the scale
function a(t). Recall that in Chapter 3 we have already discussed two estimators
of the scale: the maximum likelihood estimator (cf. Section 3.4) and the probability
weighted moment estimator (cf. Section 3.6). In the next section we discuss another
possibility, this time related to the moment estimator of Section 3.5.
In Section 4.3 we develop some limiting theory of extreme quantile estimation.
The dual problem of the estimation of tail probability is discussed in Section 4.4.
A related problem is the endpoint estimation, which we also address in Section 4.5.
Finally, in Section 4.6 we give some simulations and continue discussing the three
case studies: sea level, S&P 500, and life span.
4.2 Scale Estimation

Recall the second-order condition
\[
\lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)} = \frac1\rho\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\right) \tag{4.2.1}
\]
for $x > 0$ with $\rho < 0$. Then for $\gamma \neq \rho$, and $\rho < 0$ if $\gamma > 0$, we know that a second-order condition for $\log U(t)$ holds, namely
\[
\lim_{t\to\infty}\frac{\dfrac{\log U(tx)-\log U(t)}{q(t)}-\dfrac{x^{\gamma_-}-1}{\gamma_-}}{Q(t)} = \frac{1}{\rho'}\left(\frac{x^{\gamma_-+\rho'}-1}{\gamma_-+\rho'}-\frac{x^{\gamma_-}-1}{\gamma_-}\right) \tag{4.2.2}
\]
with $\gamma_- = \min(0,\gamma)$, $q := a/U$, and $Q$ not changing sign eventually with $Q(t) \to 0$, $t \to \infty$ (cf. Lemma B.3.16 in Appendix B). When $\gamma > 0$ and $\rho = 0$ the limit (4.2.2) vanishes for all $Q$ satisfying $A(t) = O(Q(t))$, $t \to \infty$.
We now study an estimator for the scale $a$, related to the moment estimator of $\gamma$ discussed in Section 3.5. Recall the notation introduced in that section,
\[
M_n^{(j)} := \frac1k\sum_{i=0}^{k-1}\big(\log X_{n-i,n}-\log X_{n-k,n}\big)^j,\qquad j = 1, 2, \tag{4.2.3}
\]
and define
\[
\hat\sigma_M := X_{n-k,n}\,M_n^{(1)}\,\big(1-\hat\gamma_-\big), \qquad \hat\gamma_- := 1-\frac12\left(1-\frac{\big(M_n^{(1)}\big)^2}{M_n^{(2)}}\right)^{-1}. \tag{4.2.4}
\]
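Under our reading of (4.2.4), the scale estimator can be sketched as follows (illustrative code, assuming a positive sample):

```python
import math

def moment_scale(x, k):
    """Scale estimator (4.2.4): X_{n-k,n} * M1 * (1 - gamma_minus_hat),
    with gamma_minus_hat = 1 - 0.5 / (1 - M1^2 / M2)."""
    xs = sorted(x)
    ref = xs[-k - 1]                              # X_{n-k,n}
    logs = [math.log(v) - math.log(ref) for v in xs[-k:]]
    m1 = sum(logs) / k                            # M_n^{(1)}
    m2 = sum(l * l for l in logs) / k             # M_n^{(2)}
    gm = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)         # estimates gamma_minus
    return ref * m1 * (1.0 - gm)
```

For a Pareto(1) sample, $a(n/k) = n/k$, so with $n = 20000$ and $k = 500$ the estimate should be near 40.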
In the next theorem we give the consistency and asymptotic normality of $\hat\sigma_M$. Note that for the nondegenerate limit the conditions are the same as those in Theorem 3.5.4, which states the asymptotic normality of $\hat\gamma_M$.

Theorem 4.2.1 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$.
1. If $F \in \mathcal D(G_\gamma)$ and $k = k(n) \to \infty$, $k/n \to 0$ as $n \to \infty$, then
\[
\frac{\hat\sigma_M}{a(n/k)} \xrightarrow{P} 1 .
\]
2. If moreover the second-order condition (4.2.1) holds with $\gamma \neq \rho$ and
\[
\lim_{n\to\infty}\sqrt k\,Q\Big(\frac nk\Big) = \lambda \tag{4.2.5}
\]
with $Q := A$ from (4.2.1) if $\gamma > 0$, and $\lambda$ finite, then
\[
\sqrt k\left(\frac{\hat\sigma_M}{a(n/k)}-1\right) \xrightarrow{d} N\big(\lambda b_{\gamma,\rho},\,\mathrm{var}_\gamma\big), \tag{4.2.6}
\]
where $b_{\gamma,\rho}$ is given by a case distinction over the relative position of $\gamma$ and $\rho$ — involving, among others, a rational expression with the factor $(1-2\gamma)(1-3\gamma)$ for $\rho < \gamma < 0$, the value $1/(1-\rho)^2$ in the case $\ell := \lim_{t\to\infty}\big(U(t)-a(t)/\gamma\big) = 0$, and the value $0$ for $\gamma > -\rho > 0$ or $\rho = 0$ — (4.2.7), and
\[
\mathrm{var}_\gamma = \gamma^2+2 \quad\text{for } \gamma \ge 0, \tag{4.2.8}
\]
with a corresponding rational expression in $\gamma$ for $\gamma < 0$.
Proof. The proof of the consistency is left to the reader. Next we prove the asymptotic normality.

First observe that, with $q := a/U$ and $X_i = U(Y_i)$,
\[
\frac{\hat\sigma_M}{a(n/k)} \stackrel{d}{=} \frac{M_n^{(1)}}{q_0(Y_{n-k,n})}\cdot\frac{q_0(Y_{n-k,n})}{q(Y_{n-k,n})}\cdot\frac{a(Y_{n-k,n})}{a(n/k)}\cdot\big(1-\hat\gamma_-\big),
\]
where the function $q_0$ is from Theorem 2.3.6, which gives the uniform inequalities connected with the second-order regular variation condition for the function $\log U$. Then for the first factor we know from Lemma 3.5.5 that
\[
\sqrt k\left((1-\gamma_-)\frac{M_n^{(1)}}{q_0(Y_{n-k,n})}-1\right) \xrightarrow{d} (1-\gamma_-)\,P , \tag{4.2.9}
\]
where the random variable $P$ is normally distributed. For the second factor use Corollary 2.3.5, Theorem 2.3.6, and (4.2.5) to conclude that $\sqrt k\,\big(q_0(Y_{n-k,n})/q(Y_{n-k,n})-1\big)$ converges in probability to a constant multiple of $\lambda$, depending on whether $\rho' < 0$ or $\rho' = 0$. For the third factor write
\[
\sqrt k\left(\frac{a(Y_{n-k,n})}{a(n/k)}-1\right) = \sqrt k\left(\frac{a(Y_{n-k,n})}{a(n/k)}-\Big(\frac kn\,Y_{n-k,n}\Big)^{\gamma}\right) + \sqrt k\left(\Big(\frac kn\,Y_{n-k,n}\Big)^{\gamma}-1\right). \tag{4.2.10}
\]
The second term in (4.2.10) converges to $\gamma B$ with $B$ a standard normal random variable (Corollary 2.2.2). By inequalities (2.3.18) of Theorem 2.3.6, the first term is bounded by expressions of the form
\[
\sqrt k\,A\Big(\frac nk\Big)\big(1+o(1)\big)\Big(\frac kn\,Y_{n-k,n}\Big)^{\gamma+\rho}\max\left(\Big(\frac kn\,Y_{n-k,n}\Big)^{\varepsilon},\Big(\frac kn\,Y_{n-k,n}\Big)^{-\varepsilon}\right).
\]
Hence, since $A = O(Q)$ by Lemma B.3.16 and $\frac kn\,Y_{n-k,n} \xrightarrow{P} 1$ by Corollary 2.2.2, the first term on the right-hand side of (4.2.10) tends to zero in probability as $n \to \infty$. Finally, for the fourth factor, from Corollary 3.5.6 we know that
\[
\sqrt k\left(\frac{1-\hat\gamma_-}{1-\gamma_-}-1\right) \xrightarrow{d} 2(1-2\gamma_-)(1-\gamma_-)\,P - \tfrac12(1-2\gamma_-)^2(1-\gamma_-)\,Q ,
\]
where the random variables $P$ and $Q$ are normally distributed, and $P$ is the same as in (4.2.9).

Hence the limiting distribution of $\sqrt k\,\big(\hat\sigma_M/a(n/k)-1\big)$ is the distribution of
\[
(1-\gamma_-)(3-4\gamma_-)\,P - \tfrac12(1-2\gamma_-)^2(1-\gamma_-)\,Q + \gamma B + c_{\gamma,\rho'}\,\lambda , \tag{4.2.11}
\]
where $c_{\gamma,\rho'}$ collects the nonrandom bias contributions (vanishing unless $\rho' < 0$ or $\ell \neq 0$), the random variables $P$ and $Q$ are from Lemma 3.5.5, and $B$ is standard normal, independent of $P$ and $Q$ (recall that $\{Y_{n-k+i,n}/Y_{n-k,n}\}_{i=0}^{k}$ is independent of $Y_{n-k,n}$; cf. proof of Lemma 3.2.3). $\square$
Moreover, the convergence holds jointly with that of $\hat\gamma_M$ and of the intermediate order statistic:
\[
\sqrt k\left(\hat\gamma_M-\gamma,\ \frac{\hat\sigma_M}{a(n/k)}-1,\ \frac{X_{n-k,n}-U(n/k)}{a(n/k)}\right) \xrightarrow{d} (R,S,T),
\]
where the random vector $(R,S,T)$ has a multivariate normal distribution with mean vector $\lambda\big(b^{(1)}_{\gamma,\rho}, b^{(2)}_{\gamma,\rho}, 0\big)$, where $b^{(1)}_{\gamma,\rho}$ and $b^{(2)}_{\gamma,\rho}$ are respectively given by (3.5.16) and (4.2.7), variances given by (3.5.17), (4.2.8), and 1, and
\[
\operatorname{Cov}(R,S) = \begin{cases}\gamma-1, & \gamma>0,\\[4pt] \dfrac{(1-\gamma)^2\big({-1}+4\gamma-12\gamma^2\big)}{(1-3\gamma)(1-4\gamma)}, & \gamma<0,\end{cases}
\qquad \operatorname{Cov}(R,T) = 0, \qquad \operatorname{Cov}(S,T) = \gamma .
\]
Proof. From Theorem 2.4.1,
\[
\sqrt k\,\frac{X_{n-k,n}-U(n/k)}{a(n/k)} \xrightarrow{d} B , \tag{4.2.12}
\]
where $B$ is a standard normal random variable. Moreover, from the proof of Theorem 3.5.4 we know that $\sqrt k\,(\hat\gamma_M-\gamma)$ has the distribution of a linear combination of $P$, $Q$, and $B$ plus a multiple of $\lambda$ (4.2.13), where the random variables $P$, $Q$, and $B$ are the same as those in the proof of Theorem 4.2.1. Combining (4.2.11), (4.2.12), and (4.2.13), the result follows. $\square$
4.3 Quantile Estimation

Let $F$ be in the domain of attraction of $G_\gamma$, where $G_\gamma(x) = \exp\big(-(1+\gamma x)^{-1/\gamma}\big)$, $1+\gamma x > 0$. Then for some positive function $a$, and moreover taking $b(t) = U(t) = F^{\leftarrow}(1-1/t)$, we have that
\[
\lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)} = \frac1\rho\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\right) \tag{4.3.1}
\]
holds for $x > 0$ with $\rho < 0$ and $A$ a function not changing sign and such that $A(t) \to 0$, $t \to \infty$.

Let $k$ be an intermediate sequence, i.e., $k = k(n) \to \infty$, $k(n)/n \to 0$ ($n \to \infty$). Suppose that for suitable estimators $\hat\gamma$, $\hat a(n/k)$, and $\hat b(n/k)$,
\[
\sqrt k\,\big(\hat\gamma-\gamma\big) \xrightarrow{d} \Gamma,\qquad
\sqrt k\left(\frac{\hat a(n/k)}{a(n/k)}-1\right) \xrightarrow{d} \Lambda,\qquad
\sqrt k\;\frac{\hat b(n/k)-b(n/k)}{a(n/k)} \xrightarrow{d} B,
\]
jointly, with $(\Gamma,\Lambda,B)$ jointly normal random variables with known mean vector possibly depending on $\gamma$ and $\rho$ and known covariance matrix depending on $\gamma$ (not on $\rho$).
Now we are ready to consider extreme quantile estimation. Note that in the examples we needed to estimate a $(1-p)$ quantile on the basis of a sample of size $n$, where in fact $p$ is much smaller than $1/n$. Let $x_p := U(1/p)$ be the quantile we want to estimate. We are particularly interested in the cases in which the mean number of observations above $x_p$, namely $np$, equals a very small number. This means that we are looking for a number $x_p$ that is to the right of all (or almost all) observations, or, what is the same, we want to extrapolate outside the range of the available observations. Since this is the central issue in our problem, we want to preserve in the asymptotic analysis the fact that $np$ should be much smaller than any positive constant. Hence we are "forced," when applying asymptotic methods, to assume that $p$ in fact depends on $n$, $p = p_n$, and that
\[
\lim_{n\to\infty} p_n = 0 .
\]
That is, we want to estimate $x_{p_n}$ with $1-F(x_{p_n}) = p_n$, or equivalently, $x_{p_n} = U(1/p_n)$, with $p_n \to 0$ as $n \to \infty$.
Theorem 4.3.1 Suppose that for some function $A$ with $A(t) \to 0$, $t \to \infty$, the second-order condition (4.3.1) holds. Suppose:
1. the estimators $\hat\gamma$, $\hat a(n/k)$, $\hat b(n/k)$ satisfy the joint convergence above;
2. $k = k(n) \to \infty$, $k/n \to 0$, and $\sqrt k\,A(n/k) \to \lambda$ finite;
3. $np_n = o(k)$ and $d_n := k/(np_n) \to \infty$;
4. $\log(np_n) = o(\sqrt k)$.
Define
\[
\hat x_{p_n} := \hat b\Big(\frac nk\Big) + \hat a\Big(\frac nk\Big)\,\frac{d_n^{\hat\gamma}-1}{\hat\gamma} .
\]
Then, as $n \to \infty$,
\[
\frac{\sqrt k\,\big(\hat x_{p_n}-x_{p_n}\big)}{a(n/k)\,q_\gamma(d_n)} \xrightarrow{d} \Gamma - \gamma_-\Lambda + \gamma_-^2\,B - \lambda\,\frac{\gamma_-}{\gamma_-+\rho} ,
\]
where
\[
q_\gamma(t) := \int_1^t s^{\gamma-1}\log s\;ds \;\sim\;
\begin{cases}
\gamma^{-1}\,t^{\gamma}\log t, & \gamma>0,\\[3pt]
\tfrac12(\log t)^2, & \gamma=0,\\[3pt]
1/\gamma^2, & \gamma<0,
\end{cases}
\qquad t\to\infty .
\]
Moreover, from the definition of $q_\gamma(t)$ it is clear that $q_\gamma(t)$ is increasing in $t$ and that $q_\gamma(t)$ is also an increasing function of $\gamma$ when $t > 1$.
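The function $q_\gamma$ and the limits above are easy to check numerically (a sketch, taking $q_\gamma$ in the integral form consistent with the stated asymptotics):

```python
import math

def q_gamma(gamma, t, steps=200000):
    """q_gamma(t) = integral_1^t s^(gamma-1) log(s) ds, by the midpoint
    rule after substituting s = e^u (numerical sketch, not from the book)."""
    top = math.log(t)
    h = top / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        total += math.exp(gamma * u) * u   # integrand e^(gamma u) * u
    return total * h

# gamma < 0: q_gamma(t) -> 1/gamma^2, e.g. limit 4 for gamma = -0.5;
# gamma = 0: q_gamma(t) = (log t)^2 / 2 exactly.
```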
Remark 4.3.4 Condition $np_n = o(k)$, i.e., $p_n \ll k/n$, is quite natural, since if it is not satisfied, nonparametric methods can be employed (Einmahl (1990)); see also Remark 4.3.7 below. In particular, when $np_n \to 0$ the condition $\log(np_n) = o(\sqrt k)$, i.e., $p_n \ge n^{-1}e^{-\varepsilon\sqrt k}$ for each $\varepsilon > 0$ and sufficiently large $n$, means that the extrapolation cannot be pushed too far.

Note that by checking the components of the asymptotic variance in the theorem, one sees that for $\gamma$ near zero the uncertainty in the estimation of $x_{p_n}$ is to a large extent caused by the uncertainty in the estimation of $\gamma$, and not so much by that in the estimation of $b(n/k)$ or $a(n/k)$.
For the proof we need the following lemma:
Lemma 4.3.5 If (4.3.1) holds with ρ < 0, or ρ = 0 and γ < 0, then
\[
\lim_{t\to\infty,\;x\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^\gamma-1}{\gamma}}{A(t)\,q_\gamma(x)}=\frac{\gamma_-}{\gamma_-+\rho}\,.
\]

Proof. Use the inequalities of Theorem B.3.10 for ρ < 0, or ρ = 0 and γ < 0. □
Proof (of Theorem 4.3.1). Write
\[
\hat x_{p_n}-x_{p_n}=\left(\hat b\!\left(\tfrac nk\right)-U\!\left(\tfrac nk\right)\right)+\hat a\!\left(\tfrac nk\right)\frac{d_n^{\hat\gamma}-1}{\hat\gamma}-\left(U\!\left(\tfrac nk\,d_n\right)-U\!\left(\tfrac nk\right)\right).
\]
Hence
\[
\frac{\sqrt k\left(\hat x_{p_n}-x_{p_n}\right)}{a(n/k)\,q_\gamma(d_n)}=\mathrm I+\mathrm{II}+\mathrm{III}-\mathrm{IV}\,,\tag{4.3.5}
\]
with
\[
\mathrm I:=\frac{\sqrt k}{q_\gamma(d_n)}\,\frac{\hat b(n/k)-U(n/k)}{a(n/k)}\quad(\text{the }b\text{ part}),\qquad
\mathrm{II}:=\frac{\hat a(n/k)}{a(n/k)}\,\frac{\sqrt k}{q_\gamma(d_n)}\left(\frac{d_n^{\hat\gamma}-1}{\hat\gamma}-\frac{d_n^{\gamma}-1}{\gamma}\right)\quad(\text{the }\gamma\text{ part}),
\]
\[
\mathrm{III}:=\sqrt k\left(\frac{\hat a(n/k)}{a(n/k)}-1\right)\frac{d_n^{\gamma}-1}{\gamma\,q_\gamma(d_n)}\quad(\text{the }a\text{ part}),\qquad
\mathrm{IV}:=\frac{\sqrt k}{q_\gamma(d_n)}\left(\frac{U(d_n\,n/k)-U(n/k)}{a(n/k)}-\frac{d_n^{\gamma}-1}{\gamma}\right)\quad(\text{the nonrandom bias}).
\]
For part I note that \(1/q_\gamma(d_n)\to(\gamma_-)^2\): indeed \(q_\gamma(d_n)\to\infty\) for γ ≥ 0 and \(q_\gamma(d_n)\to1/\gamma^2\) for γ < 0. Hence by (4.3.2), \(\mathrm I\to_d(\gamma_-)^2B\).

For part II, write
\[
\frac{d_n^{\hat\gamma}-1}{\hat\gamma}-\frac{d_n^{\gamma}-1}{\gamma}=\int_1^{d_n}s^{\gamma-1}\left(s^{\hat\gamma-\gamma}-1\right)\mathrm ds
=(\hat\gamma-\gamma)\int_1^{d_n}s^{\gamma-1}\log s\;\frac{e^{(\hat\gamma-\gamma)\log s}-1}{(\hat\gamma-\gamma)\log s}\,\mathrm ds\,.
\]
Since \(|(e^u-1)/u-1|\le|u|\,e^{|u|}\), for 1 ≤ s ≤ d_n,
\[
\left|\frac{e^{(\hat\gamma-\gamma)\log s}-1}{(\hat\gamma-\gamma)\log s}-1\right|
\le\left|\sqrt k\,(\hat\gamma-\gamma)\right|\frac{\log d_n}{\sqrt k}\,d_n^{|\hat\gamma-\gamma|}\to_p0\,,
\]
using \(\sqrt k(\hat\gamma-\gamma)=O_p(1)\) and \(\log d_n=o(\sqrt k)\) (condition 3). It follows that
\[
\frac{\sqrt k}{q_\gamma(d_n)}\left(\frac{d_n^{\hat\gamma}-1}{\hat\gamma}-\frac{d_n^{\gamma}-1}{\gamma}\right)
\]
has the same limit distribution as \(\sqrt k(\hat\gamma-\gamma)\), i.e., Γ; since also \(\hat a(n/k)/a(n/k)\to_p1\), \(\mathrm{II}\to_d\Gamma\).

For part III,
\[
\frac{d_n^{\gamma}-1}{\gamma\,q_\gamma(d_n)}\to-\gamma_-\,,\qquad n\to\infty
\]
(check the three cases γ > 0, γ = 0, γ < 0 with the asymptotic expressions for \(q_\gamma\)); hence \(\mathrm{III}\to_d-\gamma_-\Lambda\).

Finally, for part IV, Lemma 4.3.5 and \(\sqrt kA(n/k)\to\lambda\) give
\[
\mathrm{IV}=\sqrt kA\!\left(\tfrac nk\right)\cdot\frac{\dfrac{U(d_n\,n/k)-U(n/k)}{a(n/k)}-\dfrac{d_n^{\gamma}-1}{\gamma}}{A(n/k)\,q_\gamma(d_n)}\to\frac{\lambda\,\gamma_-}{\gamma_-+\rho}\,.
\]
Combining the four parts yields (4.3.4). □
Remark 4.3.7 For quantile estimation in a less extreme region (e.g., if \(np_n/k\to c\in(0,\infty)\)) one can use the results of Section 2.4 (in particular (2.4.2); see Exercise 2.17). On the other hand, it is possible to relax the condition \(np_n=o(k)\) to \(np_n=O(k)\) in Theorem 4.3.1. That is, under the same conditions of Theorem 4.3.1 but with \(d_n=k/(np_n)\to r>0\), one can show by similar arguments that
\[
\frac{\sqrt k\left(\hat x_{p_n}-x_{p_n}\right)}{a(n/k)\,q_\gamma(r)}
\]
is asymptotically normal, with a limit of the same form as in (4.3.4) but with \(d_n\) replaced by r throughout, as n → ∞. So one could follow the approach of Theorem 2.4.2 or that of Theorem 4.3.1. The rates of convergence are the same for both cases.
A simpler version is valid when γ is positive:

Theorem 4.3.8 Suppose that for some function A with A(t) → 0, t → ∞,
\[
\lim_{t\to\infty}\frac{\dfrac{U(tx)}{U(t)}-x^\gamma}{A(t)}=x^\gamma\,\frac{x^\rho-1}{\rho}\,.
\]
Suppose:

1. the second-order parameter ρ is negative,
2. k = k(n) → ∞, n/k → ∞, and \(\sqrt k\,A(n/k)\to\lambda\in\mathbb R\), n → ∞,
3. \(np_n=o(k)\) and \(\log np_n=o(\sqrt k)\), n → ∞.

Define
\[
\hat x_{p_n}:=X_{n-k,n}\left(\frac k{np_n}\right)^{\hat\gamma}\qquad\text{and}\qquad x_{p_n}:=U\!\left(\frac1{p_n}\right),
\]
with \(d_n:=k/(np_n)\). Then, as n → ∞,
\[
\frac{\sqrt k}{\log d_n}\left(\frac{\hat x_{p_n}}{x_{p_n}}-1\right)\to_d\Gamma\,.
\]
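The simple form of this limit makes a confidence statement easy to sketch numerically. The block below is our own illustration, not the book's code: it assumes the Hill estimator for γ̂ (so that, when the bias term vanishes, Γ is centered normal with variance γ²) and builds a one-sided lower bound from the limit above.

```python
import math
import random

def weissman_quantile(sample, k, p):
    """x_hat = X_{n-k,n} * d^g with d = k/(n p) and g the Hill estimate
    (the positive-gamma quantile estimator of Theorem 4.3.8)."""
    x = sorted(sample)
    n = len(x)
    b = x[n - k - 1]                                            # X_{n-k,n}
    g = sum(math.log(x[n - i - 1] / b) for i in range(k)) / k   # Hill estimate
    d = k / (n * p)
    return b * d ** g, g, d

random.seed(2)
n, k = 5000, 200
sample = [random.paretovariate(1.0) for _ in range(n)]   # gamma = 1, U(t) = t
p = 1.0 / (2 * n)
est, g, d = weissman_quantile(sample, k, p)
# sqrt(k)/log(d) * (xhat/x - 1) is approximately N(0, g^2) when the bias
# vanishes; this gives a one-sided 97.5% lower confidence bound for x_p:
half = 1.96 * g * math.log(d) / math.sqrt(k)
lo = est / (1.0 + half)
print(est, lo)
```

The width of the bound grows with log d, reflecting that extrapolation further beyond the sample (larger d) is intrinsically less precise.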
Proof. The proof is similar to that of Theorem 4.3.1. First note that by the second-order condition,
\[
\lim_{n\to\infty}\frac{\dfrac{U(1/p_n)}{U(n/k)}\,d_n^{-\gamma}-1}{A(n/k)}
\]
exists and is finite, so that \(U(1/p_n)/\left(U(n/k)\,d_n^{\gamma}\right)=1+O(A(n/k))\). Hence
\[
\frac{\hat x_{p_n}}{x_{p_n}}=\frac{X_{n-k,n}}{U(n/k)}\;d_n^{\hat\gamma-\gamma}\;\frac{U(n/k)\,d_n^{\gamma}}{U(1/p_n)}\,,
\]
and, using \(\sqrt k\left(X_{n-k,n}/U(n/k)-1\right)=O_p(1)\), \(\sqrt kA(n/k)\to\lambda\), \(\log d_n=o(\sqrt k)\), and
\[
d_n^{\hat\gamma-\gamma}-1=(\hat\gamma-\gamma)(\log d_n)(1+o_p(1))
\]
(valid since \((\hat\gamma-\gamma)\log d_n\to_p0\)), we obtain
\[
\frac{\sqrt k}{\log d_n}\left(\frac{\hat x_{p_n}}{x_{p_n}}-1\right)=\sqrt k\,(\hat\gamma-\gamma)(1+o_p(1))+o_p(1)\to_d\Gamma\,,\qquad n\to\infty\,.\ \square
\]
The previous results are quite general in the sense that they are valid for any estimators of γ, a(n/k), and b(n/k) satisfying (4.3.2). For the estimation of the location b(n/k) = U(n/k) the natural estimator is its empirical counterpart \(X_{n-k,n}\). Then, from Theorem 2.4.1,
\[
\sqrt k\;\frac{X_{n-k,n}-U(n/k)}{a(n/k)}\to_d B\,,
\]
with B a standard normal random variable independent of (Γ, Λ). Next we shall find the parameters of the limit distribution in Theorem 4.3.1 for some of the estimators for γ and a(n/k) introduced before. A similar exercise will be done in Sections 4.4 and 4.5.
4.3.1 Maximum Likelihood Estimators

We start with the maximum likelihood estimators of Section 3.4. Recall from Theorem 3.4.2 the joint limit distribution of (Γ, Λ) for γ > −1/2. Hence if \(\hat\gamma:=\hat\gamma_{MLE}\) and \(\hat a(n/k):=\hat\sigma_{MLE}\) in (4.3.3) are the maximum likelihood estimators, then the limiting random variable in (4.3.4) is normal, with mean (4.3.7), which equals 0 when γ < 0 = ρ, and variance
\[
\begin{cases}(1+\gamma)^2\,,&\gamma\ge0\,,\\[2pt]
1+4\gamma+5\gamma^2+2\gamma^3+2\gamma^4\,,&\gamma<0\,.\end{cases}\tag{4.3.8}
\]
4.3.2 Moment Estimators

Another possibility is to use in (4.3.3) the moment estimator of Section 3.5. To estimate the scale use \(\hat\sigma_M=X_{n-k,n}\,M_n^{(1)}\,(1-\hat\gamma_-)\), introduced earlier in Section 4.2, and take as usual \(\hat b(n/k)=X_{n-k,n}\).

When using the moment estimator in Theorem 4.3.1 one needs to take into account that the conditions of Theorem 3.5.4 (asymptotic normality of the moment estimator) and the conditions of Theorem 4.3.1 (asymptotic normality of the quantile estimator) are not the same. For the asymptotic normality of the moment estimator (and for the scale as well) the extra conditions are U(∞) > 0, so that the estimator is well defined, and γ ≠ ρ, so that a second-order condition for log U holds, with auxiliary second-order function Q. Besides, one needs
\[
\sqrt k\,Q\!\left(\frac nk\right)\to\lambda'\in\mathbb R\,,\qquad n\to\infty\,,\tag{4.3.9}
\]
instead of \(\sqrt kA(n/k)\to\lambda\) (for more details see Remark 4.3.10 below).
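For concreteness, the moment estimator and the scale estimator σ̂_M can be sketched as follows. This is our own minimal implementation of the standard formulas (the variable names are ours); it is checked on heavy-tailed data where γ = 1/2, so the estimate of γ_− should be near zero.

```python
import math
import random

def moment_estimates(sample, k):
    """Moment estimator gamma_M = M1 + gamma_minus, where
    gamma_minus = 1 - (1/2) / (1 - M1^2/M2) and
    Mj = (1/k) * sum_{i<k} (log X_{n-i,n} - log X_{n-k,n})^j.
    Scale estimate: sigma_M = X_{n-k,n} * M1 * (1 - gamma_minus)."""
    x = sorted(sample)
    n = len(x)
    b = x[n - k - 1]                                   # X_{n-k,n}
    logs = [math.log(x[n - i - 1] / b) for i in range(k)]
    m1 = sum(logs) / k
    m2 = sum(t * t for t in logs) / k
    g_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    g = m1 + g_minus
    sigma = b * m1 * (1.0 - g_minus)
    return g, g_minus, sigma

random.seed(4)
sample = [random.paretovariate(2.0) for _ in range(5000)]  # gamma = 1/2
g, g_minus, sigma = moment_estimates(sample, 300)
print(g, g_minus, sigma)
```

For Pareto data the log excesses are exactly exponential, so M1 ≈ γ, M1²/M2 ≈ 1/2, and γ̂_− ≈ 0, as the theory predicts for γ > 0.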
Consequently, under the given conditions, if \(\hat\gamma:=\hat\gamma_M\) and \(\hat a(n/k):=\hat\sigma_M\) in (4.3.3) are the moment estimators, then
\[
\frac{\sqrt k\left(\hat x_{p_n}-x_{p_n}\right)}{\hat\sigma_M\,q_{\hat\gamma_M}(d_n)}\tag{4.3.10}
\]
is asymptotically normal. Its mean (4.3.11) is a multiple of λ′ whose form depends on the relative position of γ and ρ (the cases γ < ρ ≤ 0 and 0 < γ < −ρ, for instance, involve \(\lim_{t\to\infty}\left(U(t)-a(t)/\gamma\right)\)); its variance is
\[
\begin{cases}\gamma^2+1\,,&\gamma>0\,,\\[4pt]
\dfrac{(1-\gamma)^2\left(1-3\gamma+4\gamma^2\right)}{(1-2\gamma)(1-3\gamma)(1-4\gamma)}\,,&\gamma<0\,.\end{cases}\tag{4.3.12}
\]
Remark 4.3.10 From Theorems 3.5.4 and 4.2.1 and (4.2.12) the joint distribution of (Γ, Λ, B) can be expressed in terms of a standard normal random variable B and the random vector (P, Q) from Lemma 3.5.5 (Section 3.5); further, B and (P, Q) are independent.

As mentioned above, Theorem 4.3.1 is not straightforward with respect to the moment estimators. Recall that for the asymptotic normality of \(\sqrt k(\hat\gamma_M-\gamma)\) and \(\sqrt k(\hat\sigma_M/a(n/k)-1)\) we require \(\sqrt kQ(n/k)=O(1)\), as n → ∞, where Q is the auxiliary second-order function in the second-order condition for log U. In contrast, in Theorem 4.3.1 we require \(\sqrt kA(n/k)=O(1)\), where A is the auxiliary second-order function in the second-order condition for U. From the proof of Theorem 4.3.1 we see that the second-order condition for U is necessary for the b part (I) and for the nonrandom bias part (IV). Hence, if one wants to use the moment estimator in quantile estimation, assume \(\sqrt kQ(n/k)\to\lambda'\in\mathbb R\), n → ∞; Lemma B.3.16 then provides the (finite) limit of A(t)/Q(t). The limiting distribution of (4.3.10) is that of
\[
\Gamma+(\gamma_-)^2B-\gamma_-\Lambda-\frac{\gamma_-\,\lambda'\,1_{\{\gamma\neq0\}}}{\gamma_-+\rho}\,.
\]
Use Corollary 4.2.2 to obtain the bias and the variance given in (4.3.11) and (4.3.12).
4.4 Tail Probability Estimation

We now consider the converse problem: given a (large) value \(x_n\), estimate the tail probability \(p_n:=1-F(x_n)\). In view of the quantile estimator (4.3.3), a natural estimator is
\[
\hat p_n:=\frac kn\left(\max\left(0,\,1+\hat\gamma\,\frac{x_n-\hat b(n/k)}{\hat a(n/k)}\right)\right)^{-1/\hat\gamma},\tag{4.4.1}
\]
with \(x_n\) known.
Theorem 4.4.1 Suppose that for γ > −1/2 and some function A with A(t) → 0, t → ∞, the second-order condition (4.3.1) holds. Write as before \(d_n=k/(np_n)\). Suppose:

1. the second-order parameter ρ is negative, or zero with γ negative;
2. k = k(n) → ∞, n/k → ∞, and \(\sqrt kA(n/k)\to\lambda\in\mathbb R\), n → ∞;
3. \(d_n\to\infty\) and \(w_\gamma(d_n)=o(\sqrt k)\), n → ∞, where for t > 1,
\[
w_\gamma(t):=t^{-\gamma}\int_1^t s^{\gamma-1}\log s\,\mathrm ds\;;
\]
4. condition (4.3.2) holds for some estimators of γ, a(n/k), and U(n/k), say \(\hat\gamma\), \(\hat a(n/k)\), and \(\hat b(n/k)\), respectively.

Then, as n → ∞,
\[
\frac{\sqrt k}{w_\gamma(d_n)}\left(\frac{\hat p_n}{p_n}-1\right)\to_d\Gamma+(\gamma_-)^2B-\gamma_-\Lambda-\frac{\lambda\,\gamma_-}{\gamma_-+\rho}\,.\tag{4.4.2}
\]
Moreover, since
\[
w_\gamma(t)=\int_1^ts^{-\gamma-1}\left(\log t-\log s\right)\mathrm ds=\int_1^ts^{-\gamma-1}\int_s^tu^{-1}\,\mathrm du\,\mathrm ds\,,
\]
one has
\[
w_\gamma(t)\sim\begin{cases}\gamma^{-1}\log t\,,&\gamma>0\,,\\[2pt]
\tfrac12(\log t)^2\,,&\gamma=0\,,\\[2pt]
t^{-\gamma}/\gamma^2\,,&\gamma<0\,,\end{cases}\qquad t\to\infty\,.
\]
Corollary 4.4.4 Condition (3) of Theorem 4.4.1 implies what we may call consistency:
\[
\frac{\hat p_n}{p_n}\to_p1\,.
\]

Corollary 4.4.5 The conditions of Theorem 4.4.1 imply
\[
\frac{w_{\hat\gamma}(\hat d_n)}{w_\gamma(d_n)}\to_p1\,,
\]
and hence, with \(\hat d_n:=k/(n\hat p_n)\),
\[
\frac{\sqrt k}{w_{\hat\gamma}(\hat d_n)}\left(\frac{\hat p_n}{p_n}-1\right)\to_d\Gamma+(\gamma_-)^2B-\gamma_-\Lambda-\frac{\lambda\,\gamma_-}{\gamma_-+\rho}\,.
\]
The latter form of the result is more useful for constructing a confidence interval for \(p_n\).

Note that the limiting random variable is the same as the one in Theorem 4.3.1, as could be expected from the Bahadur–Kiefer representation. For the proper nondegenerate limit distribution of the difference of the normalized left-hand sides of (4.3.4) and (4.4.2) we refer to Einmahl (1995).

The condition \(d_n\to\infty\) means that we extrapolate outside or near the boundary of the range of the available observations; cf. Remark 4.3.4 above.

Finally, note that condition (3) of Theorem 4.4.1 implies, for all real γ, that
\[
\log d_n=o\left(\sqrt k\right).\tag{4.4.4}
\]
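The tail probability estimator can be sketched the same way as the quantile estimator. Again this is an illustrative plug-in of our own (the Hill estimator, so γ > 0 is assumed), not code from the book:

```python
import math
import random

def tail_probability(sample, k, x):
    """Estimate p = 1 - F(x) for x beyond the data range:
    p_hat = (k/n) * (x / X_{n-k,n})^(-1/g), with g the Hill estimate
    (the positive-gamma form of the estimator, cf. Theorem 4.4.7)."""
    s = sorted(sample)
    n = len(s)
    b = s[n - k - 1]                                            # X_{n-k,n}
    g = sum(math.log(s[n - i - 1] / b) for i in range(k)) / k   # Hill estimate
    return (k / n) * (x / b) ** (-1.0 / g)

random.seed(3)
n, k = 5000, 200
sample = [random.paretovariate(2.0) for _ in range(n)]  # 1-F(x) = x^{-2}
x = 100.0                                               # true p = 1e-4 < 1/n
p_hat = tail_probability(sample, k, x)
print(p_hat)
```

The target probability here is smaller than 1/n, so the empirical distribution function would simply return zero; the estimator extrapolates instead.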
Proof (of Theorem 4.4.1). With \(x_n=x_{p_n}=U(1/p_n)\),
\[
\frac nk\,\hat p_n=\left(\max\left(0,\,1+\hat\gamma\,\frac{x_n-\hat b(n/k)}{\hat a(n/k)}\right)\right)^{-1/\hat\gamma},
\]
and since \(1+\hat\gamma\,(x_n-\hat b(n/k))/\hat a(n/k)=d_n^{\hat\gamma}\left(1+\hat\gamma\,d_n^{-\hat\gamma}\,(x_{p_n}-\hat x_{p_n})/\hat a(n/k)\right)\),
\[
\frac{\hat p_n}{p_n}=\left(\max\left(0,\,1+\hat\gamma\,d_n^{-\hat\gamma}\,\frac{x_{p_n}-\hat x_{p_n}}{\hat a(n/k)}\right)\right)^{-1/\hat\gamma}.
\]
By Theorem 4.3.1 (with \(\hat a(n/k)/a(n/k)\to_p1\)),
\[
\frac{\sqrt k\left(\hat x_{p_n}-x_{p_n}\right)}{\hat a(n/k)\,q_{\hat\gamma}(d_n)}\to_d\Gamma+(\gamma_-)^2B-\gamma_-\Lambda-\frac{\lambda\,\gamma_-}{\gamma_-+\rho}\,.
\]
Since \(w_{\hat\gamma}(d_n)=d_n^{-\hat\gamma}q_{\hat\gamma}(d_n)\) and \(w_\gamma(d_n)=o(\sqrt k)\) (condition 3), the argument of the maximum converges to 1 in probability and a Taylor expansion of \(u\mapsto(1+\hat\gamma u)^{-1/\hat\gamma}\) around u = 0 gives
\[
\frac{\sqrt k}{w_{\hat\gamma}(d_n)}\left(\frac{\hat p_n}{p_n}-1\right)=\frac{\sqrt k\left(\hat x_{p_n}-x_{p_n}\right)}{\hat a(n/k)\,q_{\hat\gamma}(d_n)}\left(1+o_p(1)\right)\to_d\Gamma+(\gamma_-)^2B-\gamma_-\Lambda-\frac{\lambda\,\gamma_-}{\gamma_-+\rho}\,.
\]
Finally, \(w_{\hat\gamma}(d_n)/w_\gamma(d_n)\to_p1\) by Corollary 4.3.2 and assumption (3) of Theorem 4.4.1 (cf. (4.4.4)), and \(w_\gamma(\hat d_n)/w_\gamma(d_n)\to_p1\) since \(\hat d_n/d_n\to_p1\) and the function \(w_\gamma(t)\) is regularly varying. This proves (4.4.2) and Corollary 4.4.5. □

Remark 4.4.6 For tail probability estimation in a less extreme region, i.e., if \(d_n\to r\in(0,\infty)\), one can show by similar arguments that \(\sqrt k\left(\hat p_n/p_n-1\right)/w_\gamma(r)\) is asymptotically normal, with a limit of the same form as in (4.4.2) but with \(w_\gamma(d_n)\) replaced by \(w_\gamma(r)\), as n → ∞. So one could follow the approach of Theorem 4.4.1 or that of Theorem 5.1.2. The rate of convergence is the same in both cases.
A simpler version is valid when γ is positive.

Theorem 4.4.7 Suppose that for some function A with A(t) → 0, t → ∞,
\[
\lim_{t\to\infty}\frac{\dfrac{U(tx)}{U(t)}-x^\gamma}{A(t)}=x^\gamma\,\frac{x^\rho-1}{\rho}\,.
\]
Under the remaining conditions of Theorem 4.3.8 define
\[
\hat p_n:=\frac kn\left(\frac{x_n}{X_{n-k,n}}\right)^{-1/\hat\gamma}\qquad\text{and}\qquad p_n:=1-F(x_n)\,,
\]
with \(d_n:=k/(np_n)\). Then, as n → ∞,
\[
\frac{\gamma\sqrt k}{\log d_n}\left(\frac{\hat p_n}{p_n}-1\right)\to_d\Gamma\,.
\]
The proof of the theorem is left to the reader (Exercise 4.5).
4.4.1 Maximum Likelihood Estimators

Recall that the limiting distributions of the suitably normalized quantile and tail probability estimators, i.e., the left-hand sides of (4.3.4) and (4.4.2) respectively, are the same. Therefore if the maximum likelihood estimators \(\hat\gamma_{MLE}\) and \(\hat\sigma_{MLE}\) are used in (4.4.1), then under the conditions of Theorem 4.4.1 the limiting random variable in (4.4.2) is normal with mean (4.3.7) and variance (4.3.8).

4.4.2 Moment Estimators

When using the moment estimators (cf. Sections 3.5 and 4.2) one needs extra conditions; the same considerations as in Section 4.3.2 apply here. Then, if the moment estimators \(\hat\gamma_M\) and \(\hat\sigma_M\) are used in (4.4.1), the limiting random variable in (4.4.2) is normal with mean (4.3.11) and variance (4.3.12).
4.5 Endpoint Estimation

For γ negative the distribution function F has a finite right endpoint \(x^*=U(\infty)\), which can be estimated by
\[
\hat x^*:=\hat b\!\left(\frac nk\right)-\frac{\hat a(n/k)}{\hat\gamma}\,.\tag{4.5.1}
\]
As in the previous sections, we assume in the following that \((\hat\gamma,\hat a,\hat b)\), when suitably normalized, are asymptotically normal. Denote the limiting random vector by (Γ, Λ, B) (cf. (4.3.2)).
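Before stating the limit theorem, here is a numerical sketch of (4.5.1) with moment-based plug-ins. The implementation is our own (moment estimator for γ̂, σ̂_M for â, X_{n−k,n} for b̂); it is tried on uniform data, for which γ = −1 and the true endpoint is 1.

```python
import math
import random

def endpoint_estimate(sample, k):
    """x*_hat = b_hat - a_hat / g_hat (formula (4.5.1)) with
    b_hat = X_{n-k,n}, g_hat the moment estimator, and
    a_hat = X_{n-k,n} * M1 * (1 - gamma_minus)."""
    x = sorted(sample)
    n = len(x)
    b = x[n - k - 1]                                   # X_{n-k,n}
    logs = [math.log(x[n - i - 1] / b) for i in range(k)]
    m1 = sum(logs) / k
    m2 = sum(t * t for t in logs) / k
    g_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    g = m1 + g_minus                                   # moment estimator
    a = b * m1 * (1.0 - g_minus)                       # scale estimator
    return b - a / g

random.seed(5)
sample = [random.random() for _ in range(5000)]  # uniform(0,1): gamma = -1, x* = 1
xstar = endpoint_estimate(sample, 500)
print(xstar)
```

Because γ̂ is negative here, −â/γ̂ is a positive correction that pushes the estimate beyond the sample maximum toward the true endpoint.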
Theorem 4.5.1 Suppose that for some function A with A(t) → 0, t → ∞, the second-order condition (4.3.1) holds with γ negative. Suppose k = k(n) → ∞, n/k → ∞, and \(\sqrt kA(n/k)\to\lambda\in\mathbb R\), n → ∞. Then,
\[
\gamma^2\sqrt k\;\frac{\hat x^*-x^*}{a(n/k)}\to_d\Gamma+\gamma^2B-\gamma\Lambda-\frac{\lambda\,\gamma}{\gamma+\rho}\,.\tag{4.5.2}
\]
This version is more useful for constructing asymptotic confidence intervals for x*.
Corollary 4.5.3 Under the conditions of Theorem 4.5.1,
\[
\hat x^*\to_px^*\,.
\]

Lemma 4.5.4 Under the conditions of Theorem 4.5.1,
\[
\lim_{t\to\infty}\frac{\dfrac{U(\infty)-U(t)}{a(t)}+\dfrac1\gamma}{A(t)}=\frac1{\gamma(\gamma+\rho)}\,.
\]

Proof. From the second-order condition (4.3.1) and the second-order condition (2.3.7) for the function a, it follows that
\[
\lim_{t\to\infty}\frac{\left(U(tx)-\dfrac{a(tx)}{\gamma}\right)-\left(U(t)-\dfrac{a(t)}{\gamma}\right)}{a(t)A(t)/\gamma}
=\lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^\gamma-1}{\gamma}}{A(t)/\gamma}
-\lim_{t\to\infty}\frac{\dfrac{a(tx)}{a(t)}-x^\gamma}{A(t)}
=\frac{x^{\gamma+\rho}-1}{\gamma+\rho}\,.
\]
Now Lemma 1.2.9(2) implies that \(\lim_{t\to\infty}\left(U(t)-a(t)/\gamma\right)\) exists. This limit must be U(∞), since the function a is regularly varying with negative index and hence tends to zero. Lemma 1.2.9(2) also implies
\[
\lim_{t\to\infty}\frac{U(\infty)-\left(U(t)-\dfrac{a(t)}{\gamma}\right)}{a(t)A(t)/\gamma}=\frac1{\gamma+\rho}\,;
\]
hence the result. □
Proof (of Theorem 4.5.1). Write
\[
\hat x^*-x^*=\left(\hat b\!\left(\tfrac nk\right)-U\!\left(\tfrac nk\right)\right)-\left(\frac{\hat a(n/k)}{\hat\gamma}-\frac{a(n/k)}{\gamma}\right)-\left(x^*-U\!\left(\tfrac nk\right)+\frac{a(n/k)}{\gamma}\right).
\]
Hence, by (4.3.2) and Lemma 4.5.4: the first term, normalized by \(\gamma^2\sqrt k/a(n/k)\), converges to \(\gamma^2B\); the second satisfies
\[
-\gamma^2\sqrt k\left(\frac{\hat a(n/k)}{a(n/k)\,\hat\gamma}-\frac1\gamma\right)=-\frac{\gamma^2}{\hat\gamma}\sqrt k\left(\frac{\hat a(n/k)}{a(n/k)}-1\right)+\frac{\gamma^2}{\hat\gamma\gamma}\sqrt k\,(\hat\gamma-\gamma)\to_d-\gamma\Lambda+\Gamma\,;
\]
and the third equals \(-\gamma^2\sqrt kA(n/k)\left(1+o(1)\right)/\left(\gamma(\gamma+\rho)\right)\to-\lambda\gamma/(\gamma+\rho)\). Combining the three terms yields (4.5.2). □
Remark 4.5.5 As pointed out in Aarssen and de Haan (1994), for estimating x* when γ < −1/2 it is more efficient to use different k's for the various estimators involved in \(\hat x^*\). For instance, the authors suggest, with \(k_1\) fixed and k = k(n) → ∞, k/n → 0 as usual, to take the estimator (4.5.3), where \(M^{(1)}_{k_1}\) is like \(M^{(1)}\) but with k fixed equal to \(k_1\), \(\hat b(n/k_1)=X_{n-k_1,n}\), and \(\hat\gamma_M\) is the moment estimator of Section 3.5 with the intermediate sequence k(n). When γ < −1/2 this estimator converges more quickly than the one from Theorem 4.5.1 (Exercise 4.6).

Another way to estimate x* when γ < −1/2 is simply to use the sample maximum \(X_{n,n}\), similarly as in Remark 4.3.6. Under the natural conditions the rates of convergence of \(\hat x^*\) in (4.5.3) and \(X_{n,n}\) are the same (cf. Corollary 1.2.4).
4.5.1 Maximum Likelihood Estimators

Note that for γ negative the limiting distribution in (4.5.2) is still the same as that of Theorem 4.3.1 on quantile estimation. Hence if the maximum likelihood estimators \(\hat\gamma_{MLE}\) and \(\hat\sigma_{MLE}\) are used in (4.5.1), then under the conditions of Theorem 4.5.1 with γ > −1/2 the limiting random variable in (4.5.2) is normal with mean (4.3.7) and variance (4.3.8).

4.5.2 Moment Estimators

When using the moment estimators (cf. Sections 3.5 and 4.2) one needs extra conditions; the same considerations as in Section 4.3.2 apply here. Then, if the moment estimators \(\hat\gamma_M\) and \(\hat\sigma_M\) are used in (4.5.1), the limiting random variable in (4.5.2) is normal with mean (4.3.11) and variance (4.3.12).
Another option would be to use \(\hat\gamma_-\), i.e., (4.2.3), to estimate γ, since we assume the latter negative, but it turns out that this is no better than using \(\hat\gamma_M\). The extra conditions needed are the same as for the moment estimator. Then, if \(\hat\gamma:=\hat\gamma_-\) and \(\hat a(n/k):=\hat\sigma_M\) are used in (4.5.1), the left-hand side of
\[
\gamma^2\sqrt k\;\frac{\hat x^*-x^*}{\hat\sigma_M}\tag{4.5.4}
\]
is asymptotically normal, with mean (4.5.5), whose form depends on the relative position of γ and ρ (the cases γ < ρ < 0 and ρ ≤ γ < 0), and variance
\[
\frac{(1-\gamma)^2\left(1-3\gamma+4\gamma^2\right)}{(1-2\gamma)(1-3\gamma)(1-4\gamma)}\,.\tag{4.5.6}
\]
The variance of the limiting variable is the same as when one uses the moment estimator \(\hat\gamma_M\). The bias is also the same for γ < ρ < 0; otherwise, the bias is larger (in absolute value) than the one with \(\hat\gamma_M\). Therefore this latter option of using \(\hat\gamma_-\) shows no advantage.
Fig. 4.4. Standard Cauchy distribution: (a) diagram of estimates of the quantile (the true quantile
2931.7 is indicated by the horizontal line); (b) mean square error (see the text for details).
Fig. 4.5. Standard normal distribution: (a) diagram of estimates of the quantile (the true quantile
3.62 is indicated by the horizontal line); (b) mean square error (see the text for details).
4.6.2 Case Studies

Sea Level

We continue the data analysis of Section 3.7. Recall that we want to estimate the quantile corresponding to a tail probability of 1/(1.7 × 10⁴), on the basis of 1873 observations of the sea level during severe storms. We give results for Mom (the quantile estimator from Theorem 4.3.1 with the moment estimator and the scale estimator from Section 4.2) and PWM (the quantile estimator from Theorem 4.3.1 with the probability-weighted moment estimators (3.6.9)–(3.6.10)). Moreover, from Section 3.7 we know that γ is close to zero. Then one can consider the following options: use the quantile estimator as given in Theorem 4.3.1, or assume γ = 0 and use
\[
\hat x_{p_n}:=\hat b\!\left(\frac nk\right)+\hat a\!\left(\frac nk\right)\log\frac k{np_n}\,,
\]
i.e., (4.3.3) with \(\hat\gamma\) replaced by 0.
Fig. 4.6. Standard uniform distribution: (a) diagram of estimates of the quantile (the true
quantile .99985 is indicated by the horizontal line); (b) mean square error (see the text for
details).
Fig. 4.7. Sea level data, diagram of estimates of the quantile (cm).
The corresponding diagrams of estimates for both options are shown in Figure 4.7. As expected, one finds less volatility when γ is fixed at 0.

In any case one has to estimate the scale. The corresponding diagram of estimates is shown in Figure 4.8.

Under the conditions of Theorem 4.3.1 with λ = 0, the quantity
\[
\frac{\sqrt k\left(\hat x_{p_n}-x_{p_n}\right)}{\hat a(n/k)\,q_{\hat\gamma}(\hat d_n)}
\]
divided by \(\sqrt{\widehat{\operatorname{var}}_\gamma}\), the square root of the estimated asymptotic variance, is approximately standard normal, which yields asymptotic confidence intervals for the quantile.
151
200
300
S&P 500

We continue the S&P 500 data analysis of Section 3.7. Recall that we focus on the log-loss returns, comprising 2643 observations. A short summary of the largest observations is in Table 4.2.

Table 4.2. S&P 500 data.

3rd quartile: 0.009    X_{n−1,n}: 0.086    X_{n,n}: 0.228
Next we show estimates of the probability that the log-loss return exceeds the value 0.20, using the estimator from Theorem 4.4.7 with the Hill estimator (we simply call it Hill) and the estimator from Theorem 4.3.1 with the moment estimator and the scale estimator from Section 4.2 (we call it Mom). The diagram of estimates is in Figure 4.9.

Fig. 4.9. S&P 500 data, diagram of estimates of the tail probability.
Fig. 4.10. Life span data, diagram of estimates of the right endpoint.
Life Span
We continue the life span data analysis of Section 3.7. The data set consists of the
total life span (in days) of 10 391 residents of the Netherlands, and we are interested
in the estimation of the right endpoint of the underlying distribution. In Section 3.7
we analyzed its existence by not rejecting the hypothesis that the underlying extreme
value index is negative. So now we assume that the right endpoint exists and proceed
with its estimation.
We give results for Mom (the quantile estimator from Theorem 4.3.1 with the moment estimator and the scale estimator from Section 4.2) and PWM (the quantile estimator from Theorem 4.3.1 with the probability-weighted moment estimators (3.6.9)–(3.6.10)). Figure 4.10 shows the diagram of estimates of the right endpoint.
153
Table 4.3. Life span data: upper limit of 95% asymptotic confidence intervals for the endpoint
(in years).
k
V k
where varp is the respective asymptotic variance with y replaced by its estimate and
ZQL is the 1 a quantile of the standard normal distribution. In Table 4.3 we give the
one-sided confidence intervals for the endpoint for k = 100, 200,400 (cf. Exercise
4.7 for PWM).
Exercises

4.1. Prove the consistency of \(\hat\sigma_M\), that is, the first part of Theorem 4.2.1(1).

4.2. If one replaces \(p_n\) by \(cp_n\) (c > 0) in Theorem 4.3.1, how does the limit result change?

4.3. Prove that \(w_\gamma(t)=t^{-\gamma}q_\gamma(t)=H_{\gamma,0}(t)\), with \(q_\gamma\) from Theorem 4.3.1, \(w_\gamma\) from Theorem 4.4.1, and \(H_{\gamma,\rho}\) from Corollary 2.3.4.

4.4. Let γ < 0 or ρ ≠ 0. Verify that the expected value of the limiting random variable of Theorem 4.2.1(2),
\[
(1-\gamma_-)(3-4\gamma_-)P-\tfrac12(1-\gamma_-)(1-2\gamma_-)^2Q-\lambda'\,1_{\{\lambda'\neq0\ \text{and}\ (\gamma<0\ \text{or}\ \rho'<0)\}}\left(\rho'+\gamma_-1_{\{\rho'=0\}}\right)^{-1},
\]
as a function of \((\gamma_-,\rho')\), equals \(-\lambda'\left(1-\gamma_--\rho'\right)^{-1}\left(1-2\gamma_--\rho'\right)^{-1}\) (consider the random variables (P, Q) of Lemma 3.5.5 and recall that in the statement of this lemma the mean values and covariance matrix are in terms of \(\gamma_-\) and \(\rho'\)).

4.5. Prove Theorem 4.4.7.

4.6. (Aarssen and de Haan (1994)) Let X₁, X₂, … be i.i.d. random variables with distribution function F. Suppose U(∞) > 0 and (4.2.2), i.e., the second-order condition for log U, with γ negative and auxiliary second-order function Q. Let \(k_1\) be fixed and k = k(n) → ∞, k/n → 0, and \(\sqrt kQ(n/k)\to0\), as n → ∞. Prove that, with the notation of Remark 4.5.5, the suitably normalized estimator (4.5.3) has a limit distribution that can be expressed in terms of Z₁, Z₂, …, i.i.d. random variables with a standard exponential distribution.

4.7. Consider the quantile estimator of Theorem 4.3.1 with \(\hat\gamma_{PWM}\) and \(\hat a(n/k)=\hat\sigma_{PWM}\) from Section 3.6.1. Check that the variance of the limiting random variable in (4.3.4) is, for γ > 0, given by
\[
\frac{(1-\gamma)(2-\gamma)^2\left(1-\gamma+2\gamma^2\right)}{(1-2\gamma)(3-2\gamma)}\,.
\]
5
Advanced Topics
Chapters 1–4 constitute the basic probabilistic and statistical theory of one-dimensional extremes. In this chapter we shall present additional material that can be skipped at first reading. It is not used in the rest of the book.
Section 5.1 is a mirror image of Sections 2.3 and 2.4: it offers an expansion of the
tail empirical distribution function rather than the tail quantile function as in Section
2.4.
Section 5.2 offers various ways of checking the extreme value condition when facing a data set. Some procedures use the tail quantile function and others the tail empirical distribution function.
Section 5.3 uses an expansion for the tail distribution function (not empirical
distribution function) developed in Section 5.1 in order to obtain uniform speed of
convergence results in the convergence of maxima toward the limit distribution. This
also leads to a large deviation result.
Some classical results are presented in Sections 5.3.1 and 5.4: convergence of
moments and weak (in probability) and strong (a.s.) behavior of the sequence of
maxima.
In Sections 5.5 and 5.6 the conditions "independent and identically distributed"
that we used throughout Chapters 1-4 are relaxed: in Section 5.5 the assumption
of independence of the initial random variables is relaxed and in Section 5.6 the
assumption of stationarity is relaxed.
5.1 Expansion of the Tail Distribution Function and Tail Empirical Process

Recall from Section 2.4 that for an intermediate sequence k = k(n) and a suitable normalizing function \(a_0\),
\[
\frac{X_{n-[ks],n}-U(n/k)}{a_0(n/k)}\to\frac{s^{-\gamma}-1}{\gamma}\,,\qquad n\to\infty\,,
\]
a.s. locally uniformly in s, and consequently
\[
\frac1k\sum_{i=1}^n1_{\left\{\left(X_i-U(n/k)\right)/a_0(n/k)>x\right\}}\to(1+\gamma x)^{-1/\gamma}
\]
a.s. locally uniformly for those x for which 1 + γx > 0. The conclusion is that
\[
\frac nk\left(1-F_n\!\left(U\!\left(\tfrac nk\right)+x\,a_0\!\left(\tfrac nk\right)\right)\right)\to(1+\gamma x)^{-1/\gamma}\tag{5.1.3}
\]
in D(0, 1/(max(0, −γ))).
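The convergence (5.1.3) is easy to visualize numerically. The sketch below is our own example, for the Gumbel case γ = 0: for the standard exponential distribution one may take U(t) = log t and a₀ ≡ 1, and the limit function is e^{−x}.

```python
import math
import random

random.seed(6)
n, k = 20000, 1000
sample = [random.expovariate(1.0) for _ in range(n)]
b0, a0 = math.log(n / k), 1.0   # U(t) = log t, a0 = 1 for the standard exponential

errs = []
for x in (0.0, 0.5, 1.0, 2.0):
    # n/k * (1 - F_n(b0 + x*a0)) = (1/k) * #{observations above b0 + x*a0}
    tail = sum(1 for v in sample if v > b0 + x * a0) / k
    errs.append(abs(tail - math.exp(-x)))   # compare with the limit e^{-x}
print(errs)
```

The deviations are of the stochastic order 1/√k, which is exactly what the refinements below (Theorems 5.1.1 and 5.1.2) quantify.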
The present section aims at obtaining a weighted uniform version of (5.1.3). That
is, we discuss expansions for the (empirical) distribution function similar to those
of Section 2.4. Since the proofs are quite technical and lengthy, some parts will be
sketched rather than carried out in detail. For a full account of the theory and full
proofs, see Drees, de Haan, and Li (2006). We begin with an expansion of the tail of
the distribution function analogous to Section 2.3.
Theorem 5.1.1 Let F be a probability distribution function. Suppose the second-order condition (2.3.5) holds for the function U, the inverse of 1/(1 − F). Let \(a_0\), \(b_0\), and \(A_0\) be the functions defined in Corollary 2.3.7.

1. If not γ = ρ = 0, then for all ε, c, δ > 0,
\[
\lim_{t\to\infty}\ \sup_{x\in D_{t,\rho,\delta,c}}\ \frac{e^{-\varepsilon\left|\log(1+\gamma x)^{1/\gamma}\right|}}{(1+\gamma x)^{-1/\gamma}}\left|\frac{t\left(1-F\!\left(b_0(t)+x\,a_0(t)\right)\right)-(1+\gamma x)^{-1/\gamma}}{A_0(t)}-(1+\gamma x)^{-1/\gamma-1}\,\Psi_{\gamma,\rho}\!\left((1+\gamma x)^{1/\gamma}\right)\right|=0\,,\tag{5.1.4}
\]
with
\[
\Psi_{\gamma,\rho}(y):=\begin{cases}\dfrac{y^{\gamma+\rho}}{\gamma+\rho}\,,&\rho<0\neq\gamma+\rho\,,\\[4pt]
\log y\,,&\rho<0=\gamma+\rho\,,\\[2pt]
y^{\gamma}\log y\,,&\rho=0\neq\gamma\,,\end{cases}
\]
and \(D_{t,\rho,\delta,c}\) a suitable t-dependent range of x; for ρ < 0 it corresponds to \(y:=1/\left(t\left\{1-F(b_0(t)+x\,a_0(t))\right\}\right)\ge ct^{\delta-1}\) (cf. the proof below).

2. If γ = ρ = 0, a corresponding weighted uniform statement (5.1.5) holds, with \((1+\gamma x)^{-1/\gamma}\) replaced by \(e^{-x}\) throughout.
Proof. Here is a proof for ρ < 0 and F continuous and strictly increasing. As usual, we revert to the properties of U rather than F to obtain the necessary approximations. Hence we define
\[
y:=\frac1{t\left\{1-F\!\left(b_0(t)+x\,a_0(t)\right)\right\}}\,,\qquad\text{so that}\qquad x=\frac{U(ty)-b_0(t)}{a_0(t)}\,,
\]
and use the notation \(g(x):=(1+\gamma x)^{-1/\gamma}\) and \(q_t(x):=\left(U(tx)-b_0(t)\right)/a_0(t)-(x^\gamma-1)/\gamma\). Expanding g, the difference \(t\{1-F(b_0(t)+x\,a_0(t))\}-y^{-1}\) can be written as \(-q_t(y)\) times an integrand of the form \(\left(1+\gamma\left((y^\gamma-1)/\gamma+u\right)\right)^{-1/\gamma-2}\), which always lies between its value at u = 0 and its value at \(u=q_t(y)\), i.e., between \(y^{-1-2\gamma}\) and \(y^{-1-2\gamma}\left(1+\gamma\,y^{-\gamma}q_t(y)\right)^{-1/\gamma-2}\).

By examining the uniform inequalities of Corollary 2.3.7 for the quantile function one sees that for any c, δ > 0 (recall ρ < 0),
\[
\lim_{t\to\infty}\ \sup_{y\ge ct^{\delta-1}}\ y^{-\gamma}\left|q_t(y)\right|=0\,.\tag{5.1.6}
\]
Hence
\[
\left|t\left(1-F\!\left(b_0(t)+x\,a_0(t)\right)\right)-y^{-1}+y^{-1-\gamma}q_t(y)\right|\le2\,|1+\gamma|\,y^{-1-2\gamma}\,q_t^2(y)\tag{5.1.7}
\]
for all \(y\ge ct^{\delta-1}\) and t sufficiently large. Upon multiplying the left- and right-hand sides in (5.1.7) by y, these two statements lead to
\[
\lim_{t\to\infty}\ \sup_{y\ge ct^{\delta-1}}\ \left|y\,t\left(1-F\!\left(b_0(t)+x\,a_0(t)\right)\right)-1\right|=0\,,\tag{5.1.8}
\]
and, with \(\Psi_{\gamma,\rho}\) as in the statement of the theorem,
\[
\lim_{t\to\infty}\ \sup_{y\ge ct^{\delta-1}}\ y^{1-\rho}e^{\varepsilon|\log y|}\left|(1+\gamma x)^{-1/\gamma-1}\,\Psi_{\gamma,\rho}\!\left((1+\gamma x)^{1/\gamma}\right)-y^{-(1+\gamma)}\,\Psi_{\gamma,\rho}(y)\right|=0\,.\tag{5.1.9}
\]
The difference whose supremum is taken in (5.1.4) is now bounded by three terms. The first is handled by (5.1.6) combined with the uniform boundedness, on \(y\ge ct^{\delta-1}\), supplied again by the uniform inequalities of Corollary 2.3.7 for the quantile function. The second term converges to zero uniformly on \(y\ge ct^{\delta-1}\): this is just Corollary 2.3.7. The third converges to zero uniformly on \(y\ge ct^{\delta-1}\) by (5.1.9).

Finally, note that \(y\ge ct^{\delta-1}\) if and only if \(x\in D_{t,\rho,\delta,c}\) for ρ < 0. □
This result will be used later in the chapter but also for the proof of the next result: an expansion of the tail empirical distribution function, that is, of the tail empirical process.

Theorem 5.1.2 Let X₁, X₂, … be i.i.d. random variables with distribution function F. Let \(F_n\) be the empirical distribution function based on X₁, X₂, …, Xₙ. Suppose that the function U, the inverse of 1/(1 − F), satisfies the second-order condition (2.3.5) with γ ∈ ℝ, ρ ≤ 0. Let k = k(n) be a sequence of integers such that k → ∞ and \(\sqrt kA_0(n/k)\) is bounded, as n → ∞, with \(A_0\) from Corollary 2.3.7. We also use \(a_0\) and \(b_0\) from that corollary. Then the underlying sample space can be enlarged to include a sequence of Brownian motions \(W_n\) such that for all \(x_0\) larger than the lower endpoint of the limiting extreme value distribution, \(-1/(\gamma\vee0)\):

1. If not γ = ρ = 0, then as n → ∞,
\[
\sup_{x_0<x<1/((-\gamma)\vee0)}\left((1+\gamma x)^{-1/\gamma}\right)^{-1/2+\varepsilon}\left|\sqrt k\left(\frac nk\left(1-F_n\!\left(b_0\!\left(\tfrac nk\right)+x\,a_0\!\left(\tfrac nk\right)\right)\right)-(1+\gamma x)^{-1/\gamma}\right)\right.
\]
\[
\left.-\,W_n\!\left((1+\gamma x)^{-1/\gamma}\right)-\sqrt kA_0\!\left(\tfrac nk\right)(1+\gamma x)^{-1/\gamma-1}\,\Psi_{\gamma,\rho}\!\left((1+\gamma x)^{1/\gamma}\right)\right|\to_p0\,.
\]
2. If γ = ρ = 0, the corresponding statement holds with \((1+\gamma x)^{-1/\gamma}\) replaced by \(e^{-x}\) throughout, and with the bias term adapted accordingly.
Proof. The result is well known in the case of the standard uniform distribution (see, e.g., Einmahl (1997), Corollary 3.3), and this will be our point of departure. Let \(U_n\) be the uniform empirical distribution function. Then the underlying sample space can be enlarged to include a sequence of Brownian motions \(W_n\) such that
\[
\sup_{t>0}\ t^{-1/2}e^{-\varepsilon|\log t|}\left|\sqrt k\left(\frac nk\,U_n\!\left(\frac{kt}n\right)-t\right)-W_n(t)\right|\to_p0\,,\tag{5.1.11}
\]
as n → ∞ with k → ∞, k/n → 0. By the well-known quantile transformation, \(1-F_n\) has the same distribution as \(U_n(1-F)\). Hence by (5.1.11), for suitable versions of \(F_n\),
\[
\sup_{\{x:z_n(x)>0\}}\left(z_n(x)\right)^{-1/2}e^{-\varepsilon|\log z_n(x)|}\left|\sqrt k\left(\frac nk\left(1-F_n\!\left(b_0\!\left(\tfrac nk\right)+x\,a_0\!\left(\tfrac nk\right)\right)\right)-z_n(x)\right)-W_n\!\left(z_n(x)\right)\right|\to_p0\,,\tag{5.1.12}
\]
with
\[
z_n(x):=\frac nk\left(1-F\!\left(b_0\!\left(\tfrac nk\right)+x\,a_0\!\left(\tfrac nk\right)\right)\right).
\]
In order to get the result of the theorem we are going to replace \(z_n(x)\) by \((1+\gamma x)^{-1/\gamma}\) in (5.1.12); we show how this can be done for ρ < 0.

First note that by (5.1.8), for 0 < δ < 1 and t sufficiently large, with \(x_0>-1/(\gamma\vee0)\) and c > 0, the x-range of the theorem is eventually contained in \(D_{t,\rho,\delta,c}\) (5.1.13), so that Theorem 5.1.1 applies and \(z_n(x)\) and \((1+\gamma x)^{-1/\gamma}\) are uniformly close there. It remains to justify replacing \(W_n(z_n(x))\) by \(W_n\left((1+\gamma x)^{-1/\gamma}\right)\) in the weighted supremum (5.1.14).

First note that by the law of the iterated logarithm, \(\lim_{t\to\infty}t^{-1/2-\varepsilon}W(t)=0\) a.s. for ε > 0. Hence by time reversal, for ε > 0,
\[
\lim_{s\downarrow0}\ s^{-1/2+\varepsilon}\,W(s)=0\qquad\text{a.s.}\tag{5.1.15}
\]
Hence it is sufficient to prove that if
\[
\lim_{n\to\infty}\ \sup_{0<s\le s_0}\left|\frac{t_n(s)}s-1\right|=0\,,\tag{5.1.16}
\]
then
\[
\lim_{n\to\infty}\ \sup_{0<s\le s_0}\frac{\left|W(t_n(s))-W(s)\right|}{\left(t_n(s)\right)^{1/2-\varepsilon}}=0\qquad\text{a.s.}\tag{5.1.17}
\]
Take a sequence \(s_n\to s_0\ge0\), n → ∞. Then by (5.1.16), also \(t_n(s_n)\to s_0\), n → ∞. For \(s_0>0\), (5.1.17) is true by the continuity of Brownian motion. For \(s_0=0\), by (5.1.15) and (5.1.16) both \(W(t_n(s_n))/(t_n(s_n))^{1/2-\varepsilon}\) and \(W(s_n)/(t_n(s_n))^{1/2-\varepsilon}\) converge to zero, and (5.1.17) follows. □
Remark 5.1.3 The Brownian motions in Theorem 5.1.2 (on tail empirical distribution
functions) are the same as the Brownian motions in Theorem 2.4.2 (on tail empirical
quantile functions). This can be seen most easily by applying Vervaat's lemma (see
Appendix A) to the functions in Theorem 5.1.2, restricted to a compact interval.
The result of Theorem 5.1.2 can be simplified in the case γ > 0 and reads as follows.

Theorem 5.1.4 Let X₁, X₂, … be i.i.d. random variables with distribution function F. Let \(F_n\) be the empirical distribution function based on X₁, X₂, …, Xₙ. Suppose that the function U, the inverse of 1/(1 − F), satisfies the second-order condition of Theorem 2.3.9, hence in particular γ > 0. Let k = k(n) be a sequence of integers such that k → ∞ and \(\sqrt kA_0(n/k)\) is bounded, n → ∞, with \(A_0\) from this theorem. Then the underlying sample space can be enlarged to include a sequence of Brownian motions \(W_n\) such that for all \(x_0>0\),
\[
\sup_{x\ge x_0}x^{\frac1{2\gamma}-\varepsilon}\left|\sqrt k\left(\frac nk\left(1-F_n\!\left(x\,U\!\left(\tfrac nk\right)\right)\right)-x^{-1/\gamma}\right)-W_n\!\left(x^{-1/\gamma}\right)-\sqrt kA_0\!\left(\tfrac nk\right)x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho}\right|\to_p0\,,\tag{5.1.18}
\]
as n → ∞.

Proof. The proof follows the lines of the proof of Theorem 5.1.2, but now uses the inequalities for the tail: for every ε, δ > 0 and t sufficiently large, for x ≥ 1,
\[
\left|\frac{\dfrac{1-F(tx)}{1-F(t)}-x^{-1/\gamma}}{A_0(t)}-x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho}\right|\le\varepsilon\,x^{-1/\gamma+\rho/\gamma}\max\left(x^{\delta},x^{-\delta}\right).\ \square
\]
Example 5.1.5 As an example let us apply this result to obtain another proof of the asymptotic normality of the Hill estimator. Write
\[
\hat\gamma_H=\frac1k\sum_{i=0}^{k-1}\log X_{n-i,n}-\log X_{n-k,n}
=\int_{X_{n-k,n}}^\infty\frac nk\left(1-F_n(s)\right)\frac{\mathrm ds}s
=\int_{X_{n-k,n}/U(n/k)}^\infty\frac nk\left(1-F_n\!\left(s\,U\!\left(\tfrac nk\right)\right)\right)\frac{\mathrm ds}s=:\mathrm I+\mathrm{II}\,,
\]
where I is the integral over \(\left(X_{n-k,n}/U(n/k),\,1\right]\) and II the integral over (1, ∞). For part I note that by Theorem 2.4.8,
\[
\sqrt k\left(\frac{X_{n-k,n}}{U(n/k)}-1\right)-\gamma\,W_n(1)\to_p0\,.\tag{5.1.19}
\]
Hence \(X_{n-k,n}/U(n/k)\to_p1\), the integrand in I is close to 1, and \(\sqrt k\,\mathrm I=-\gamma W_n(1)+o_p(1)\). For part II, using the approximation of Theorem 5.1.4 for the integrand,
\[
\sqrt k\left(\mathrm{II}-\gamma\right)-\int_1^\infty W_n\!\left(s^{-1/\gamma}\right)\frac{\mathrm ds}s-\sqrt kA_0\!\left(\tfrac nk\right)\int_1^\infty s^{-1/\gamma-1}\,\frac{s^{\rho/\gamma}-1}{\gamma\rho}\,\mathrm ds\to_p0\,,
\]
since \(\int_1^\infty s^{-1/\gamma-1}\,\mathrm ds=\gamma\). Substituting \(u=s^{-1/\gamma}\),
\[
\int_1^\infty W_n\!\left(s^{-1/\gamma}\right)\frac{\mathrm ds}s=\gamma\int_0^1W_n(u)\,\frac{\mathrm du}u\,,
\qquad
\int_1^\infty s^{-1/\gamma-1}\,\frac{s^{\rho/\gamma}-1}{\gamma\rho}\,\mathrm ds=\frac1{1-\rho}\,.
\]
Collecting terms,
\[
\sqrt k\,(\hat\gamma_H-\gamma)=\gamma\left(\int_0^1W_n(u)\,\frac{\mathrm du}u-W_n(1)\right)+\sqrt kA_0\!\left(\tfrac nk\right)\frac1{1-\rho}+o_p(1)\,.
\]
A direct computation with the covariance of Brownian motion gives
\[
\mathbb E\left(\int_0^1W(u)\,\frac{\mathrm du}u-W(1)\right)^2=2-2+1=1\,,
\]
so the limiting random variable is normal with variance γ² (and mean λ/(1 − ρ) when \(\sqrt kA_0(n/k)\to\lambda\)), in accordance with Theorem 3.2.5.
5.2 Checking the Extreme Value Condition

Recall the representation of the tail quantile process via a special construction (Corollary 2.4.6) that holds under the second-order condition: for ε > 0,
\[
\sqrt k\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{a_0(n/k)}-\frac{s^{-\gamma}-1}{\gamma}\right)
=s^{-\gamma-1}W_n(s)-W_n(1)+\sqrt kA_0\!\left(\tfrac nk\right)\left(\Psi_{\gamma,\rho}\!\left(s^{-1}\right)-\Psi_{\gamma,\rho}(1)\right)+o_p(1)\max\left(1,s^{-\gamma-1/2-\varepsilon}\right),\tag{5.2.1}
\]
where the \(o_p(1)\) term tends to zero uniformly for 0 < s ≤ 1.
Since the left-hand side of (5.2.1) is small uniformly in s, we use it for the test. However, it contains the two unknown quantities γ and \(a_0\). We replace these by estimators \(\hat\gamma\) and \(\hat a(n/k)\). The test statistic becomes
\[
I_{k,n}:=\int_0^1\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right)^2s^{2\hat\gamma_++1}\,\mathrm ds\,,\tag{5.2.2}
\]
with \(\hat\gamma_+:=\max(0,\hat\gamma)\). The weight function is necessary to ensure that all the integrals converge.

We are going to see that \(I_{k,n}\to_p0\) under the extreme value condition, so that it can be used as a goodness-of-fit criterion, and that \(k\,I_{k,n}\) has a nondegenerate limit distribution under the second-order condition.
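A numerical sketch of this statistic follows. All implementation choices here are ours, not the book's: the plug-ins γ̂, â are the moment-based estimators, the integral is approximated by a Riemann sum over s = j/k, and for |γ̂| near zero the term (s^{−γ̂} − 1)/γ̂ is replaced by its limit −log s.

```python
import math
import random

def gof_statistic(sample, k):
    """Approximate k * I_{k,n}, with I_{k,n} the goodness-of-fit integral
    int_0^1 ((X_{n-[ks],n}-X_{n-k,n})/a_hat - (s^{-g}-1)/g)^2 s^{2 g_+ + 1} ds,
    using moment-based plug-ins and a Riemann sum over s = j/k."""
    x = sorted(sample)
    n = len(x)
    b = x[n - k - 1]                                   # X_{n-k,n}
    logs = [math.log(x[n - i - 1] / b) for i in range(k)]
    m1 = sum(logs) / k
    m2 = sum(t * t for t in logs) / k
    g_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    g = m1 + g_minus                                   # moment estimator
    a = b * m1 * (1.0 - g_minus)                       # scale estimator
    gp = max(g, 0.0)
    total = 0.0
    for j in range(1, k):
        s = j / k
        model = -math.log(s) if abs(g) < 1e-8 else (s ** -g - 1.0) / g
        diff = (x[n - 1 - j] - b) / a - model          # X_{n-j,n} = x[n-1-j]
        total += diff * diff * s ** (2 * gp + 1)       # = k * (1/k)-Riemann sum
    return total

random.seed(7)
sample = [random.expovariate(1.0) for _ in range(10000)]  # in the Gumbel domain
stat = gof_statistic(sample, 500)
print(stat)
```

For data that do satisfy the extreme value condition, the value stays of order one (the limit distribution of k·I_{k,n}), whereas gross violations of the condition inflate it.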
Theorem 5.2.1 Let X₁, X₂, … be i.i.d. random variables and suppose that their distribution function F is in the domain of attraction of some extreme value distribution.

1. Let k = k(n) → ∞, k/n → 0, as n → ∞. Assume \(\hat\gamma\to_p\gamma\) and \(\hat a(n/k)/a(n/k)\to_p1\). Then \(I_{k,n}\to_p0\).

2. Suppose moreover that the second-order condition (2.3.5) holds and that \(\sqrt kA(n/k)\to0\), as n → ∞, where A is the second-order auxiliary function. Further assume that
\[
\sqrt k\left(\hat\gamma-\gamma,\ \frac{\hat a(n/k)}{a(n/k)}-1\right)-\left(\Gamma(W_n),\ \alpha(W_n)\right)\to_p0\,,\tag{5.2.3}
\]
where Γ and α are measurable real-valued functionals of the Brownian motion from (5.2.1). Then
\[
k\,I_{k,n}\to_dI_\gamma:=\int_0^1\left(s^{-\gamma-1}W(s)-W(1)-\alpha(W)\,\frac{s^{-\gamma}-1}{\gamma}+\Gamma(W)\int_s^1u^{-\gamma-1}\log u\,\mathrm du\right)^2s^{2\gamma_++1}\,\mathrm ds\,,
\]
with W Brownian motion.
Remark 5.2.2 The reader may want to check that (5.2.3) holds for all the estimators discussed in Chapter 3.

For the proof we need some lemmas. We have seen the following one previously:

Lemma 5.2.3 For ε > 0,
\[
\lim_{\delta\downarrow0}\ \sup_{0<s\le\delta}\ s^{-1/2+\varepsilon}\,W(s)=0\qquad\text{a.s.}
\]

Lemma 5.2.4 For θ, γ ∈ ℝ and 0 < s ≤ 1,
\[
\left|\frac{s^{-\theta}-1}{\theta}-\frac{s^{-\gamma}-1}{\gamma}-(\theta-\gamma)\int_s^1u^{-\gamma-1}|\log u|\,\mathrm du\right|
\le|\theta-\gamma|\left(s^{-|\theta-\gamma|}-1\right)\int_s^1u^{-\gamma-1}|\log u|\,\mathrm du\,.
\]

Proof. Start from the identity
\[
\frac{s^{-\theta}-1}{\theta}-\frac{s^{-\gamma}-1}{\gamma}=\int_s^1u^{-\gamma-1}\left(u^{-(\theta-\gamma)}-1\right)\mathrm du
\]
and apply, with \(v=(\theta-\gamma)|\log u|\), the inequality \(\left|e^{v}-1-v\right|\le|v|\left(e^{|v|}-1\right)\), together with \(u^{-|\theta-\gamma|}-1\le s^{-|\theta-\gamma|}-1\) for s ≤ u ≤ 1. □
Lemma 5.2.5 Under the conditions of Theorem 5.2.1(1),
\[
\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}=\frac{s^{-\gamma}-1}{\gamma}+o_P(1)\,s^{-\gamma-1/2-\varepsilon}\,,
\]
where the \(o_P(1)\) term tends to zero uniformly for 0 < s ≤ 1.

Proof. Write \(X_{n-i,n}=U(Y_{n-i,n})\), where Y₁, Y₂, … are i.i.d. with distribution function 1 − 1/y, y ≥ 1. Then
\[
\frac{X_{n-[ks],n}-X_{n-k,n}}{a(n/k)}=\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}\cdot\frac{a_0(Y_{n-k,n})}{a(n/k)}\,.\tag{5.2.8}
\]
We start with the first factor written in terms of the Y's and use Lemma 2.4.10; as there, the supremum can be taken over 0 < s ≤ 1. By combining the expansions for \(Y_{n-[ks],n}\) and \(Y_{n-k,n}\) in Lemma 2.4.10 we get
\[
\sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}\left|\sqrt k\left(\frac{\left(Y_{n-[ks],n}/Y_{n-k,n}\right)^{-\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right)-s^{-\gamma-1}W_n(s)+s^{-\gamma}W_n(1)\right|=o_P(1)\tag{5.2.9}
\]
(check separately for γ > 0, γ < 0, and γ = 0). It follows by Lemma 5.2.3 that
\[
\sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}\left|\frac{\left(Y_{n-[ks],n}/Y_{n-k,n}\right)^{-\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right|=o_P(1)\,.\tag{5.2.10}
\]
For the remaining comparison we use the uniform inequalities for extended regularly varying functions (Theorem B.2.18): there exists \(a_0(t)\sim a(t)\), t → ∞, such that for all ε > 0 there exists \(t_0(\varepsilon)\) such that for \(t\ge t_0\) and x ≥ 1,
\[
\left|\frac{U(tx)-U(t)}{a_0(t)}-\frac{x^\gamma-1}\gamma\right|\le\varepsilon\,x^{\gamma+\varepsilon}\,.
\]
We apply this with \(t:=Y_{n-k,n}\) (→ ∞ a.s., as n → ∞, cf. Lemma 3.2.1) and \(x:=Y_{n-[ks],n}/Y_{n-k,n}\) and get
\[
\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}=\frac{\left(Y_{n-[ks],n}/Y_{n-k,n}\right)^{\gamma}-1}{\gamma}+o_P(1)\left(\frac{Y_{n-[ks],n}}{Y_{n-k,n}}\right)^{\gamma+\varepsilon},
\]
with the \(o_P(1)\) term tending to zero uniformly for 0 < s ≤ 1. Next we apply (5.2.10) to get, for any real exponent θ,
\[
\left(\frac{Y_{n-[ks],n}}{Y_{n-k,n}}\right)^{-\theta}=s^{-\theta}+o_P(1)\,s^{-\theta-1/2-\varepsilon}\,.
\]
Combining the pieces,
\[
\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}-\frac{s^{-\gamma}-1}{\gamma}
=\left(1+o_P(1)\right)o_P(1)\,s^{-\gamma-1/2-\varepsilon}+o_P(1)\,s^{-\gamma-\varepsilon}\left|\log s\right|
=o_P(1)\,s^{-\gamma-1/2-\varepsilon}\,,
\]
and together with \(a_0(Y_{n-k,n})/a(n/k)\to_p1\) and \(\hat a(n/k)/a(n/k)\to_p1\) this proves the lemma. □
Lemma 5.2.6 Under the conditions of Theorem 5.2.1(2),
\[
\sqrt k\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right)
=s^{-\gamma-1}W_n(s)-W_n(1)-\alpha(W_n)\,\frac{s^{-\gamma}-1}{\gamma}+\Gamma(W_n)\int_s^1u^{-\gamma-1}\log u\,\mathrm du+o_P(1)\,s^{-\gamma_+-1/2-\varepsilon}\,,\tag{5.2.11}
\]
where the \(o_P(1)\) term tends to zero uniformly for 0 < s ≤ 1.

Proof. First note that since (2.3.5) holds with a replaced by \(a_0\), we have \(a_0(t)\sim a(t)\) and even \(a_0(t)/a(t)-1=o(A_0(t))\), t → ∞. Hence (5.2.3) implies
\[
\sqrt k\left(\frac{\hat a(n/k)}{a_0(n/k)}-1\right)-\alpha(W_n)\to_p0\,.
\]
The left-hand side of (5.2.11) decomposes, according to (5.2.1) and Lemma 5.2.4, into the tail quantile process (5.2.1), the term \(-\sqrt k\left(\hat a(n/k)/a_0(n/k)-1\right)(s^{-\gamma}-1)/\gamma\), and the term \(-\sqrt k(\hat\gamma-\gamma)\int_s^1u^{-\gamma-1}|\log u|\,\mathrm du=\sqrt k(\hat\gamma-\gamma)\int_s^1u^{-\gamma-1}\log u\,\mathrm du\), with error terms \(o_P(1)s^{-\gamma-1/2-\varepsilon}\), \(o_P(1)(s^{-\gamma}-1)/\gamma\), and, by Lemma 5.2.4,
\[
O_P(1)\left(s^{-|\hat\gamma-\gamma|}-1\right)\int_s^1u^{-\gamma-1}|\log u|\,\mathrm du\,.
\]
Now
\[
s^{-|\hat\gamma-\gamma|}-1\le|\hat\gamma-\gamma|\,s^{-|\hat\gamma-\gamma|}\,|\log s|\,,
\]
and since \(\sqrt k(\hat\gamma-\gamma)\) is bounded in probability, this last error is \(o_P(1)\,s^{-\varepsilon}|\log s|\). Collecting terms yields (5.2.11). □

Proof (of Theorem 5.2.1, continued). Squaring (5.2.11), the integrand of \(k\,I_{k,n}\) is dominated, uniformly in n, by an integrable function of the form
\[
\left(2\left(\alpha(W_n)\right)^2\,s^{-2\gamma_+-1-2\varepsilon}+\cdots\right)s^{2\gamma_++1}\,,
\]
(note that the distribution of \(\alpha(W_n)\) does not depend on n). Hence by Lebesgue's theorem on dominated convergence (and the Skorohod construction), \(k\,I_{k,n}\to_dI_\gamma\), as n → ∞. □
Simulations seem to tell us (cf. Hüsler and Li (2005)) that this quite natural test does not perform as well as a similar one involving the logarithms of the observations (Dietrich, de Haan, and Hüsler (2002)). The background is the following. The domain of attraction condition implies

$$\lim_{t\to\infty} \frac{\log U(tx) - \log U(t)}{a(t)/U(t)} = \frac{x^{\gamma_-} - 1}{\gamma_-}\,, \quad x > 0\,,$$

and

$$\lim_{t\to\infty} \frac{a(t)}{U(t)} = \gamma_+\,.$$
Now the moment estimator (Section 3.5) provides separate estimators $\hat\gamma_+$ and $\hat\gamma_-$ for $\gamma_+$ and $\gamma_-$. Lemma 3.5.1 states that for $\gamma \in \mathbb{R}$ the estimator $\hat\gamma_H$, which is in fact the Hill estimator from Section 3.2, satisfies

$$\frac{\hat\gamma_H}{a\left(\frac{n}{k}\right)\big/U\left(\frac{n}{k}\right)} \stackrel{P}{\to} \frac{1}{1-\gamma_-}\,,$$

provided $k = k(n) \to \infty$, $k/n \to 0$ as $n \to \infty$. This suggests that one use the following test statistic:
following test statistic:
(1 - P-)V ,2
7
rfjt
where y// and j?_ are as in Section 3.5, with y>H the Hill estimator and p_ the one in
Remark 3.5.7.
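The test statistics of this section are all built from the Hill estimator of Section 3.2, so it may help to see its sample form spelled out. The sketch below is our own illustration (the function name is not from the text): it averages the log-excesses of the $k$ largest order statistics, and on an exact Pareto sample with $\gamma = 1/2$ it recovers that value.

```python
import numpy as np

def hill_estimator(sample, k):
    """Hill estimator: mean of log X_{n-i,n} - log X_{n-k,n}
    over the k upper order statistics (cf. Section 3.2)."""
    xs = np.sort(np.asarray(sample, dtype=float))
    n = len(xs)
    return float(np.mean(np.log(xs[n - k:]) - np.log(xs[n - k - 1])))

rng = np.random.default_rng(0)
# Pareto tail 1 - F(x) = x^{-2}, i.e., gamma = 1/2:
x = rng.uniform(size=10_000) ** (-0.5)
print(hill_estimator(x, 500))  # close to 0.5
```

For $\gamma \le 0$ the Hill estimator no longer estimates $\gamma$ itself (by the lemma above it behaves like $(a(n/k)/U(n/k))/(1-\gamma_-)$), which is why the moment estimator supplies the separate estimate $\hat\gamma_-$.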
We have the following result.
Theorem 5.2.7 Let $X_1, X_2, \ldots$ be i.i.d. random variables and suppose that their distribution function $F$ is in the domain of attraction of some extreme value distribution.
2. Suppose that moreover the second-order condition for $\log U$ (3.5.11) (cf. also Lemma B.3.16) holds. Let $k = k(n)$ be such that

$$\lim_{n\to\infty} \sqrt{k}\, q\left(\frac{n}{k}\right) = 0\,.$$
P(W\
y2
02 U
du\] s ds
s-y--lW(s)-W(l)ds,
(s-y--lW(s)-W(l)}ds,
p      0.10  0.30  0.50  0.70  0.90  0.95  0.975  0.99
       .181  .174  .169  .168  .169  .169  .173   .176
       .222  .213  .208  .206  .207  .208  .212   .218
-s-^)
y- /
P
sup
-Y--X
Wn(s) + W(l) - 0 ,
replaced with
Jk(y_-y_)-R(Wn)^P0,
with
P(W,'n) := f
-yU
Jo
R(W) := (1 - y_) 2 (l - 2 K _)
x
S Y
- 2 P + (1 - 2y_) f
'~
(s~y-1
Wn(s) - W(l)) ds
fl
>(f)
l o g X-[ksln
~ log
Xn-k,n
90 (I)
- 1 a-y-
-1
Jo
+ OP(1)
rfi
-If
ds + ^={
I s-y--lWn(s)
- W(l) ds
/k [Jo
s-Y-M2-e ds
I_
Jo
It follows that
V^l^--!)-P(W)4-0.
Next note that by Remark B.3.2, (3.5.11) holds with q replaced by qo, since
(5.2.15)
Similarly,
M(n _ _
Hence
/
Mf>
V(o(f))
(1-Y-W-2Y-))
Cl s~y~ 1 /
(s~y lW(s)-Wn(l)j
-2
Since $\hat\gamma_- = 1 - 2^{-1}\left(1 - \hat\gamma_H^2/M_n^{(2)}\right)^{-1}$, this finishes the proof.
ds-^0.
(1 - Y-)
s~y- -
(1 " Y-)
Y-
u~y-~~llogudu
Proof (of Theorem 5.2.7). The proof is similar to that of Theorem 5.2.1, now with the use of Lemmas 5.2.8–5.2.10. It is left to the reader.
Remark 5.2.11 Here and in the next theorem a similar result can be proved when one replaces $s^2\,ds$ in the definition of $D_{k,n}$ with $s^{\eta}\,ds$, as long as $\eta > 0$. Hüsler and Li (2005) recommended the value $\eta = 2$.
In the special case that only positive values of $\gamma$ are possible (for example, if the distribution is not bounded above), a simpler test can be used.
Theorem 5.2.12 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$. Suppose $F$ is in the domain of attraction of an extreme value distribution $G_\gamma$ with $\gamma > 0$. Define
$$S_{k,n} := k \int_0^1 \left( \frac{\log X_{n-[ks],n} - \log X_{n-k,n}}{\hat\gamma_H} + \log s \right)^2 s^2\, ds\,, \qquad (5.2.16)$$

where $\hat\gamma_H$ is the Hill estimator from Section 3.2.
yp
M_1^(M)
rfw J
s^ds
^knogxn.[ksU-iogxn.k,n+io^
0<s<\
-+ 0. (5.2.17)
|
According to Hüsler and Li (2005) this test does not perform as well as the others.
Next we discuss the behavior of the test of Theorem 5.2.12 under two types of
alternatives.
Example 5.2.13 (Super-heavy tails) Let $F(x) = 1 - (\log x)^{-\beta}$ for $x \ge e$, with $\beta$ a positive parameter. Note that $\log X = Y^{1/\beta}$, where $Y$ has distribution function $1 - 1/x$, $x > 1$. Then with $\alpha := 1/\beta$ (cf. proof of Lemma 3.2.3),
$$S_{k,n} = k\int_0^1 \left( \frac{\left(Y_{n-[ks],n}/Y_{n-k,n}\right)^{\alpha} - 1}{\frac{1}{k}\sum_{i=0}^{k-1}\left\{\left(Y_{n-i,n}/Y_{n-k,n}\right)^{\alpha} - 1\right\}} + \log s \right)^2 s^2\, ds\,.$$
Since $\frac{1}{k}\sum_{i=0}^{k-1}\left(Y_{n-i,n}/Y_{n-k,n}\right)^{\alpha} \stackrel{P}{\to} E\,Y^{\alpha} = (1-\alpha)^{-1} > 1$, we get

$$\frac{S_{k,n}}{k} \stackrel{P}{\to} \int_0^1 \left( \left(s^{-\alpha}-1\right)\frac{1-\alpha}{\alpha} + \log s \right)^2 s^2\, ds > 0\,.$$
yHlogs
~ [En-k,n] + Qogs)-
= [(En-lks],n ~ E-k,n
( [ - , > ] - [-*,])
,=0
1*_1
+ (log s)- J2 [{En-i,H
i'=0
175
with $E_1, E_2, E_3, \ldots$ as before. Note that the expectation of $\lfloor E_1 + p \rfloor$ is $e^{p}/(e-1)$. Hence, as $n \to \infty$, and for this sequence $k = k(n)$,
$$\frac{S_{k,n}}{k} \stackrel{P}{\to} \int_0^1 \left( \lfloor -\log s \rfloor + (e-1)^{-1}\log s \right)^2 s^2\, ds > 0\,.$$
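The super-heavy tail in this example is easy to simulate, since $\log X$ is a power of a standard Pareto variable. The following check is our own illustration, with $\beta = 2$ chosen arbitrarily; it verifies the tail identity $1 - F(x) = (\log x)^{-\beta}$ empirically.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = 2.0
y = 1.0 / rng.uniform(size=200_000)  # standard Pareto: P(Y > t) = 1/t, t > 1
x = np.exp(y ** (1.0 / beta))        # log X = Y^{1/beta}, so P(X > x) = (log x)^{-beta}

emp = float(np.mean(x > np.exp(10.0)))
print(emp)  # theoretical value: 10^{-beta} = 0.01
```

Such distribution functions lie outside every extreme value domain of attraction, which is exactly what the divergence of $S_{k,n}$ detects.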
(Wn(x) + L(/\x)f
x"-2dx
L /\x)
:= I
-4 0
Here
r(Wn)xlogx
l+Y
y*o,
y=0.
Table 5.2. Quantiles of the asymptotic test statistic $kT_{k,n}$ with $\eta = 1$ and the maximum likelihood estimators.
γ        0.10  0.30  0.50  0.70  0.90  0.95  0.975  0.99
4        .086  .123  .161  .212  .322  .393  .462   .558
3        .085  .120  .156  .205  .307  .372  .440   .532
2        .083  .116  .150  .195  .286  .344  .402   .489
1.5      .082  .115  .148  .192  .282  .340  .400   .480
1        .082  .114  .146  .189  .276  .330  .388   .466
0.5      .083  .116  .149  .194  .285  .343  .404   .481
0.25     .085  .119  .153  .120  .295  .355  .415   .499
0        .089  .126  .163  .213  .319  .388  .455   .542
-0.1     .091  .129  .168  .221  .330  .400  .471   .569
-0.2     .093  .133  .174  .231  .350  .425  .500   .604
-0.3     .096  .139  .183  .242  .369  .449  .531   .653
-0.4     .100  .145  .192  .256  .393  .484  .576   .690
-0.45    .103  .150  .199  .320  .416  .511  .605   .735
-0.499   .107  .157  .210  .338  .439  .546  .652   .799
XY
- 1
y
$$\lim_{n\to\infty} E\left( \frac{X_{n,n} - U(n)}{a(n)} \right)^k = \int_{-\infty}^{\infty} x^k\, dG_\gamma(x)\,, \qquad (5.3.3)$$

where $G_\gamma(x) = \exp\left(-(1+\gamma x)^{-1/\gamma}\right)$
for all $x$ with $1 + \gamma x > 0$. Application of Lemma 1.1.1, Theorem 1.1.2, and Lemma 1.2.12 gives
$$\lim_{t\to\infty} \frac{V(tx) - V(t)}{a(t)} = \frac{x^{\gamma} - 1}{\gamma}$$
for all $x > 0$. By Theorem B.2.18, for $\varepsilon, \varepsilon' > 0$ there exists $t_0$ such that for $t, tx \ge t_0$,
$$\left| \frac{V(tx) - V(t)}{a_0(t)} - \frac{x^{\gamma} - 1}{\gamma} \right| \le \varepsilon\, x^{\gamma} e^{\varepsilon'|\log x|} \qquad (5.3.4)$$
for some $a_0$ satisfying $a_0(t) \sim a(t)$, $t \to \infty$. We write for $n \ge t_0$,
$$E\left( \frac{V(nZ) - V(n)}{a_0(n)} \right)^k
= E\left( \frac{V(nZ) - V(n)}{a_0(n)} \right)^k \mathbf{1}_{\{nZ \ge t_0\}}
+ E\left( \frac{V(nZ) - V(n)}{a_0(n)} \right)^k \mathbf{1}_{\{nZ < t_0\}}
=: I + II\,.$$
For $I$ we use inequalities (5.3.4). The upper bound is (note that $|a+b|^k \le 2^k\left(|a|^k + |b|^k\right)$)
(rrX
xK dGy(x) .
I V(nx) - V(n) \k
d(e-V*)
a0(n)
-L
V(x) - V(n) I
ao(n)
JO
< ne-(n~1)/t0
2k
\Jo
(-l)M)W
e-(n-D/to^
< (n
[ne
d {e~nlx)
a0(n)
ne-^~^xd(e-llx)
\ao(n)\
\ao(n)\ J0
//
W W
\ao(n)\k
Since the sequences V (n) and a(n) are of polynomial growth and since the first factor
tends to zero exponentially fast, part II tends to zero.
Since $(-\log F(x))/(1 - F(x)) \to 1$ as $x \uparrow U(\infty)$, we get $(U(n) - V(n))/a_0(n) \to 0$, $n \to \infty$ (Lemma 1.2.12, Chapter 1). Hence $E\left((X_{n,n} - U(n))/a_0(n)\right)^k = E\left((X_{n,n} - V(n))/a_0(n) + \varepsilon_n\right)^k$ with $\varepsilon_n \to 0$, $n \to \infty$. By going through the proof again with this modification, it is easy to see that also $E\left((X_{n,n} - V(n))/a_0(n) + \varepsilon_n\right)^k \to \int_{-\infty}^{\infty} x^k\, dG_\gamma(x)$ for any $\varepsilon_n \to 0$. It follows that (5.3.3) holds with $U(n)$ replaced by $V(n)$. Finally, note that changing from $a_0$ to $a$ does not affect the result. This finishes the proof.
For $\gamma \ne 0$ we have somewhat simpler results. The proof is very similar to the proof of Theorem 5.3.1 and is omitted.
Theorem 5.3.2 Suppose that the conditions of Theorem 5.3.1 hold.
1. If $\gamma > 0$,
lim E
w-oo
(m) -P'M-*"*)}
2. If $\gamma < 0$ (note that $U(\infty) := \lim_{x\to\infty} U(x)$ is finite in this case),
"-oo
\U(oo)-U(n)J
/J
$$\lim_{n\to\infty} P\left( \frac{\max(X_1, X_2, \ldots, X_n) - b_n}{a_n} \le x \right) = \exp\left( -(1+\gamma x)^{-1/\gamma} \right) =: G_\gamma(x)$$

for all $x$ with $1 + \gamma x > 0$. Then by the continuity of the limit distribution function,

$$\lim_{n\to\infty} \sup_{x\in\mathbb{R}} \left| F^n(a_n x + b_n) - G_\gamma(x) \right| = 0\,. \qquad (5.3.5)$$
The speed of convergence in (5.3.5) is not the same for all distributions in a given domain of attraction. For example, the convergence rate for the exponential distribution is of order $n^{-1}$ (Hall and Wellner (1979)), but for the normal distribution it is of order $(\log n)^{-1}$ (de Haan and Resnick (1996)). Rootzén (1984) proves that if the convergence rate is faster than exponential, the initial distribution must be an extreme value distribution. In fact, the convergence rate depends on the second-order behavior. This came out for the first time in a paper by Smith (1982). As we shall see, the second-order condition is sufficient for a uniformly weighted version of a second-order expansion for $F^n$.
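These rates can be observed numerically. The sketch below is our own illustration: it evaluates $\sup_x |F^n(a_n x + b_n) - G_0(x)|$ on a grid, with the classical choices $a_n = 1$, $b_n = \log n$ for the exponential distribution and $a_n = (2\log n)^{-1/2}$, $b_n = \sqrt{2\log n} - (\log\log n + \log 4\pi)/(2\sqrt{2\log n})$ for the normal.

```python
import math

def gumbel(x):
    return math.exp(-math.exp(-x))

def phi(x):  # standard normal distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def sup_error(F, a_n, b_n, n, grid):
    return max(abs(F(a_n * x + b_n) ** n - gumbel(x)) for x in grid)

grid = [i / 100.0 for i in range(-300, 601)]

def exp_error(n):
    return sup_error(lambda t: 1.0 - math.exp(-t) if t > 0 else 0.0,
                     1.0, math.log(n), n, grid)

def norm_error(n):
    c = math.sqrt(2.0 * math.log(n))
    b = c - (math.log(math.log(n)) + math.log(4.0 * math.pi)) / (2.0 * c)
    return sup_error(phi, 1.0 / c, b, n, grid)

print(exp_error(100), exp_error(10000))    # shrinks roughly like 1/n
print(norm_error(100), norm_error(10000))  # shrinks only like 1/log n
```

The exponential error drops by roughly two orders of magnitude between $n = 100$ and $n = 10{,}000$, while the normal error barely halves.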
We are going to assume that the function $V := \left(1/(-\log F)\right)^{\leftarrow}$ (so that $V(t) = F^{\leftarrow}(e^{-1/t})$) satisfies the second-order condition of Section 2.3. Consequently, for some $\gamma \in \mathbb{R}$, $\rho \le 0$, and all $\varepsilon, \delta > 0$ there exists $t_0 = t_0(\varepsilon, \delta) > 0$ such that for all $t, tx \ge t_0$,
$$\left| \frac{\frac{V(tx)-V(t)}{a_0(t)} - \frac{x^{\gamma}-1}{\gamma}}{A_0(t)} - \Psi_{\gamma,\rho}(x) \right| \le \varepsilon\, x^{\gamma+\rho}\, e^{\delta|\log x|} \qquad (5.3.6)$$

with

$$\Psi_{\gamma,\rho}(x) = \begin{cases} \dfrac{x^{\gamma+\rho}-1}{\gamma+\rho}\,, & \rho < 0\,,\\[1ex] \dfrac{1}{\gamma}\,x^{\gamma}\log x\,, & \rho = 0 \ne \gamma\,,\\[1ex] \dfrac{1}{2}\,(\log x)^2\,, & \rho = 0 = \gamma\,, \end{cases} \qquad (5.3.7)$$
and where the functions $a_0$ and $A_0$ are from Theorem B.3.10.
In order to get a uniform rate of convergence in (5.3.5) we have to choose the normalizing constants $a_n$ and $b_n$ in a special way. For the first take $a_0(n)$ with the function $a_0$ from (5.3.6), and for the second,

$$b_0(n) = \begin{cases} V(n)\,, & \gamma > 0\,,\\ V(\infty) + \gamma^{-1} a_0(n)\,, & \gamma < 0 = \rho\,,\\ V(\infty) + \gamma^{-1} a_0(n) + (\gamma+\rho)^{-1} a_0(n) A_0(n)\,, & \gamma < 0,\ \rho < 0\,, \end{cases}$$

with $A_0(n)$ again from (5.3.6). Further, define the function $\tilde\Psi_{\gamma,\rho}$ by
^
/rx
y,plj
Theorem 5.3.3 Suppose the function $V := \left(1/(-\log F)\right)^{\leftarrow}$ satisfies the second-order condition, so that (5.3.6) holds. Then

$$\lim_{n\to\infty} \frac{F^n(a_0(n)x + b_0(n)) - G_\gamma(x)}{A_0(n)} = J(x)\,, \qquad (5.3.8)$$

where

$$J(x) := -G_\gamma(x)\,(1+\gamma x)^{-1/\gamma}\,\tilde\Psi_{\gamma,\rho}\!\left((1+\gamma x)^{1/\gamma}\right),$$

and $J(0)$ and $J(\infty)$ are defined by continuity, i.e., $J(0) = J(\infty) = 0$.
Remark 5.3.4 The second-order condition for this theorem is imposed on the function $V := \left(1/(-\log F)\right)^{\leftarrow}$, whereas in Theorem 5.1.1, for example, it is imposed on the function $U := \left(1/(1-F)\right)^{\leftarrow}$. The relation between these two conditions is discussed in Drees, de Haan, and Li (2003).
For the proof we need some lemmas.
Lemma 5.3.5 Assume the conditions of Theorem 5.3.3. Define

$$p_{n,\gamma}(x) := \frac{V(nx) - b_0(n)}{a_0(n)} - \frac{x^{\gamma}-1}{\gamma}\,.$$

For any $\varepsilon, \delta > 0$ there exists $n_0 > 0$ such that for $n \ge n_0$ and $x > 0$,

$$\left| \frac{p_{n,\gamma}(x)}{A_0(n)} - \Psi_{\gamma,\rho}(x) \right| \le \varepsilon \max\left(x^{\gamma+\rho+\delta},\, x^{\gamma+\rho-\delta}\right). \qquad (5.3.9)$$
This gives (5.3.9) for $\gamma > 0$. For $\gamma < 0$ it is easily checked from the definitions of $a_0(n)$, $A_0(n)$, and $b_0(n)$ that

$$\frac{V(n) - b_0(n)}{a_0(n)A_0(n)} \to 0\,.$$
sup x -Q+Y)e~l/x
lim
-+ <Xn<S<Pn
lim
sup
PlyM
00
"-* Ctn<S<Pn
Ao(n)x2Y
= 0.
(5.3.10)
= 0,
(5.3.11)
x-^y)e~l/x
sup
(Xn<X<Pn
e-l/xm&x(x-l+p+8,x-l+p-8
= e sup
x>0
so that (5.3.10) holds by noting that $\sup_{x>0} e^{-1/x}\max\left(x^{-1+\rho+\delta},\, x^{-1+\rho-\delta}\right) < \infty$.
4,1 := A0(n)
sup x
2y
ttfi<*<Ai
^-%,i*)
Ao(n)
JC 2 *"-**)
Otn<X<Pn
-> 0 .
sup x
2y
=2
V Ax) -> 0 ,
n - oo .
<Xn<X<Pn
Pl.oM
Y,Py
sup
Reading y
T -
<2(/M+/n,2).
Define, for $x > 0$,

$$J_n(x) := G_0\left( \frac{1}{\gamma}\,\log\left( 1 + \gamma\, \frac{V(nx) - b_0(n)}{a_0(n)} \right)\right) - G_0(\log x)\,.$$

Moreover, for any function $f$ on $(a, b)$ with $-\infty \le a < b \le \infty$, define $f(a) := \lim_{t\downarrow a} f(t)$ and $f(b) := \lim_{t\uparrow b} f(t)$ if the limits exist; e.g., $J(0) = J(\infty) = 0$.
$$\lim_{n\to\infty} \sup_{0<x<\infty} \left| \frac{J_n(x)}{A_0(n)} - J(x) \right| = 0\,. \qquad (5.3.12)$$
Proof. We shall prove (5.3.12) only for the case that Ao is positive near infinity,
because the proof for the other case is similar.
Since for every positive integer $n$ and $x > 0$ there exists $\theta = \theta(n, x) \in [0, 1]$ such that

$$J_n(x) = G_0\left(\log x + q_n(x)\right) - G_0(\log x) = q_n(x)\, G_0'\left(\log x + \theta q_n(x)\right),$$
+ Oqn(x)) - J(x)\
+ (A0(n))~
x~ pniy(x)
G0(\ogx + 0qn(x))
Go(logx + Oqn(x)) -
pn,y(x)
G0(\ogx)\
VY,p(x))\
Pn,y(x)
__
xy +OoypniY(x)
- 1)]
X-Ypn,y(x)
1 +00yx-ypn,y(x)
'
Letting $M = \max\left(\sup_{x>0} G_0'(\log x),\, \sup_{x>0} \left|G_0''(\log x)\right|\right)$, we have from (5.3.11) that
sup
(Xn<X<Pn
(A0(n))
(qn(x)-x
pn,y(x))
0ln<X<Pn
= M \y\ sup
and that for some 6 e [0, 1]
PlyW
x2YA0(n)
\l+0oyx-rPn,y(x)\
-0
7,2(*)
(AoOz))"1 x~Ypn,y{x)
sup
<M
GQ (logx + 06qn(x)J
Oqn(x)\
(A0(n)r1JC-2yp2fyW[l+floy^"yPn,yWr1->0.
SUp
Cln<X<Pn
sup
\Jn(x) - /(JC)| = 0 .
(5.3.13)
sup
sup ||/ n (x)|
Pn<X<00
<(Ao(n)yl
sup (l-Go(y-l\og\l
Pn<X<OC
+ (Ao(n))" 1
ya-l(V(nx)-b0(n))}))
+
l
J / /
sup (1 - Go(logx))
Pn<X <00
n -> oo .
Noting that $J(\infty) = 0$, we have $\lim_{n\to\infty} \sup_{\beta_n < x < \infty} |J_n(x) - J(x)| = 0$. Similarly it may be shown that $\lim_{n\to\infty} \sup_{0 < x \le \alpha_n} |J_n(x) - J(x)| = 0$, completing the proof of the lemma.
Ji(Oyiu))
Aoin)
+ J((Dy(xn(u)))
_ Go(\ogxn(u)) - Gp(logcoyju))
Aoin)
+ {Ji(0y(u)) - J((Oy(Xn(u)))}
=: Kn,i(u) + Kn,2(u) .
In order to establish (5.3.8) we need only to prove

$$\lim_{n\to\infty} \sup_{0 < F(a_0(n)u + b_0(n)) < 1} |K_{n,i}(u)| = 0\,, \quad i = 1, 2\,, \qquad (5.3.14)$$

$$\lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 0} |K_n(u)| = 0\,, \qquad (5.3.15)$$
(5.3.15)
and

$$\lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 1} |K_n(u)| = 0\,. \qquad (5.3.16)$$
It follows from the definition of $V$ that if $0 < F(a_0(n)u + b_0(n)) < 1$, then

$$\frac{V(n x_n(u)) - b_0(n)}{a_0(n)} \le u \le \frac{V^+(n x_n(u)) - b_0(n)}{a_0(n)}\,, \quad n = 1, 2, \ldots$$

(recall that $V^+$ is the right-continuous version of $V$). Therefore for $u$ such that $0 < F(a_0(n)u + b_0(n)) < 1$, we have
- Go ( i log (l + y V + ( n t y ( W ) l )
A0(n)
< Kn,\(u)
Go(logxnm
Go(logxn(u)) - Go ( i log j l +
y ^ y ^ l )
A0(w)
Combining this with (5.3.12), we obtain (5.3.14) for $i = 1$. Since $J(\cdot)$ is continuous on $(0, \infty)$ and $J(0) = J(\infty) = 0$, it is easily seen that (5.3.14) for $i = 2$ is also true.
Since F(ao(n)u + oOO) = 0 implies u < (V(0) bo(n))/ao(n), using (5.3.12)
once again, we have
hm
sup
< hm
F(a0(n)u+b0(n))=0 A 0 ( w )
A0()
= lim |/((>)- 7 ( 0 ) 1 = 0 .
/i->>oo
sup
|/(toy(M))| <
"-oo F(ao(n)u+bo(n))=0
sup
u<(of(e)
0<jc<e
sup
\Kn(u)\
F(ao()K+*o(n))=0
GyW
< lim
n >0
sup
- F(a0(")w+M>0)=0 ^ 0 W
< sup |/(JC)| .
,-
+ hm
n
sup
/(a> y (i0)
F(a0(n)u+b0(n))=0
0<Jt<
Hence (5.3.15) is obtained by letting $\varepsilon \to 0$ in the above. Similarly we may prove (5.3.16), completing the proof of the theorem.
Remark 5.3.8 The uniform limit (5.3.8) gives an Edgeworth expansion as follows:

$$P\left(X_{n,n} \le a_0(n)u + b_0(n)\right) = G_\gamma(u) - A_0(n)\,G_\gamma(u)\left(-\log G_\gamma(u)\right)\tilde\Psi_{\gamma,\rho}\!\left(\left(-\log G_\gamma(u)\right)^{-1}\right) + o\left(A_0(n)\right)$$

holds uniformly on $\mathbb{R}$.
Remark 5.3.9 The uniform limit (5.3.8) also gives a rate of convergence, that is,

$$\limsup_{n\to\infty}\; \sup_{u} \frac{\left| F^n(a_0(n)u + b_0(n)) - G_\gamma(u) \right|}{A_0(n)} < \infty\,.$$
The surprising part is that under a weak extra condition the converse of Theorem
5.3.3 holds.
Theorem 5.3.10 (Cheng and Jiang (2001)) If there exist sequences $a_n > 0$, $b_n$ real, and $A_n > 0$ satisfying

$$\lim_{n\to\infty} A_n = 0 \quad \text{and} \quad \lim_{n\to\infty} \frac{A_{n+1}}{A_n} = 1$$

such that

$$\lim_{n\to\infty} \frac{P\left(X_{n,n} \le a_n x + b_n\right) - G_\gamma(x)}{A_n} = K(x) \qquad (5.3.17)$$
lim ^ 1 = 1
n-oo
(5.3.18)
An
lim 2
n-*oo
x^l
2L_ = K(x)
(5.3.19)
An
holds locally uniformly on $\mathbb{R}$, with $K$ not a multiple of $(x^{\gamma} - 1)/\gamma$, then $V$ satisfies the second-order condition.
Proof. Let [t] be the integer part of t e R and set
l-{*/(M+l)F
(t/[t])r-[t/([t]+l))y
, log(M+l)-lQg*
I log(M+l)-log[f] '
a(t) = {
'
^'
Q
u
X -
>
ft/MF-l
(t/[t])y-{t/([t]+i)}y
\ogt-\og[t]
log([f]+l)-log[r] '
o =
'(*)
(0 =
. %]
j8(0
%]+l
r . *(0=(o[^(o+^^/j(oi.
J
L%]
%]+l
J
and $A(t) := A_{[t]}$. It follows from (5.3.18), and locally uniformly from (5.3.19), that
f(tx)-b(t)
a(t)
XY-l
A(t)
/(**)-*([*]) _
a
[t)
= (0
(tx/[t])y-i
Y
[t]
f(tx)-bqt]+i)
+ /K0
-*0,
K(x)
{tx/m+w-i
K(x)
%]+!
A[t]
00
_ xZ^l
lim _ C 0
f-*oo
A(0
completing the proof of the lemma.
x_
= Jj:(
K(x)
Gy(x)
holds locally uniformly on R and therefore
If
1
An [ n\ogF(anx+bn)
_ n log F (anx + bn)co2y(x)K(x)
>
GvW
+ :
1
logGy(x)
holds locally uniformly on R. By Vervaat's lemma (Appendix A), the above gives
V(nx)-bn
_ gT-l
/ v
1 X
r_ = xY^eVxK(x_Z\ .
\ V J
_^L_^
n->oo
An
lim
This implies that $V$ satisfies the second-order condition by Lemma 5.3.11, completing the proof of the theorem.
l-Fn(anxn+bn)
n-+co
= 1
l-Gy(xn)
lim
*(ti/((-y)v0)
IV
(l + yx(t)yl'y
-TT^)^((l +
yx{t))l/y =
)} -
lim
t{l-F(bo(t)+x(t)ao(t))}
*->00
-lOg
Gy(x(t))
=1 .
5.4 Weak and Strong Laws of Large Numbers and Law of the
Iterated Logarithm
Let $X_1, X_2, X_3, \ldots$ be independent and identically distributed random variables with distribution function $F$. Define $X_{n,n} := \max(X_1, X_2, \ldots, X_n)$ for $n = 1, 2, \ldots$. We are going to discuss analogues, for partial maxima, of the weak and strong laws of large numbers and the law of the iterated logarithm for partial sums.

Whereas in the partial sum case the existence of moments is the right thing to consider, for partial maxima conditions of regular variation type turn out to play an important role. We start with a weak law of large numbers.
Theorem 5.4.1 Suppose $F(x) < 1$ for all real $x$. The following assertions are equivalent:

1. There exists a sequence $a_n$ of positive numbers such that
$$\frac{X_{n,n}}{a_n} \stackrel{P}{\to} 1\,. \qquad (5.4.1)$$
2. With $U$ the inverse function of $1/(1-F)$,
$$\frac{X_{n,n}}{U(n)} \stackrel{P}{\to} 1\,. \qquad (5.4.2)$$
3. For all $x > 1$,
$$\lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = 0\,. \qquad (5.4.3)$$
4. $$\lim_{t\to\infty} \frac{\int_t^{\infty} s\, dF(s)}{t\left(1 - F(t)\right)} = 1\,. \qquad (5.4.4)$$
Proof. Note that (5.4.1) is equivalent to

$$\lim_{n\to\infty} F^n(a_n x) = \begin{cases} 1\,, & x > 1\,,\\ 0\,, & 0 < x < 1\,, \end{cases} \qquad (5.4.5)$$

which in turn is equivalent to

$$\lim_{n\to\infty} n\left(1 - F(a_n x)\right) = \begin{cases} 0\,, & x > 1\,,\\ \infty\,, & 0 < x < 1\,. \end{cases} \qquad (5.4.6)$$
Next we prove that (5.4.1) implies (5.4.2). By inversion, (5.4.5) is equivalent to

$$\lim_{n\to\infty} \frac{U(nx)}{U(n)} = 1 \quad \text{for } x > 0\,, \qquad (5.4.7)$$

with $U$ the inverse function of $1/(1-F)$. For $x > 1$, by (5.4.6) and (5.4.7),
$$\limsup_{n\to\infty} \left| \log F^n(b_n x) \right| = \limsup_{n\to\infty} n\left| \log F(b_n x) \right| \le \limsup_{n\to\infty} \frac{n\left(1 - F(b_n x)\right)}{F(b_n x)} = 0\,.$$
For 0 < x < 1 one proceeds in a similar way.
Next we prove the equivalence of (5.4.3) and (5.4.4). First assume (5.4.3). For $\lambda > 1$ there exists $x_0(\lambda)$ such that for $x \ge x_0(\lambda)$,

$$1 - F(\lambda x) \le 2^{-1}\left(1 - F(x)\right);$$
hence

$$1 - F(\lambda^n x) \le 2^{-n}\left(1 - F(x)\right)$$
and, for $1 < \lambda < 2$,

$$\int_x^{\infty} \left(1 - F(t)\right) dt
= \sum_{n=0}^{\infty} \int_{x\lambda^n}^{x\lambda^{n+1}} \left(1 - F(t)\right) dt
= \sum_{n=0}^{\infty} x\lambda^n \int_1^{\lambda} \left(1 - F(tx\lambda^n)\right) dt$$
$$\le \sum_{n=0}^{\infty} x\lambda^n (\lambda - 1)\left(1 - F(x\lambda^n)\right)
\le x\left(1 - F(x)\right)(\lambda - 1) \sum_{n=0}^{\infty} \left(\frac{\lambda}{2}\right)^n ,$$

so that

$$\lim_{x\to\infty} \frac{\int_x^{\infty} \left(1 - F(t)\right) dt}{x\left(1 - F(x)\right)} = 0\,, \qquad (5.4.8)$$

which is (5.4.4) since $\int_t^{\infty} s\, dF(s) = t\left(1 - F(t)\right) + \int_t^{\infty} \left(1 - F(s)\right) ds$.
Conversely, suppose that (5.4.3) does not hold. Then there exist $x_0 > 1$, $c > 0$, and a sequence $t_n \to \infty$ such that

$$\liminf_{n\to\infty} \frac{1 - F(t_n x_0)}{1 - F(t_n)} \ge c > 0\,.$$

Hence

$$\liminf_{n\to\infty} \int_1^{x_0} \frac{1 - F(t_n s)}{1 - F(t_n)}\, ds \ge c\left(x_0 - 1\right) > 0\,,$$

which contradicts (5.4.4).
Remark 5.4.3 Clearly, if $F$ is in the domain of attraction of $G_\gamma(x)$ for some $\gamma > 0$, relation (5.4.3) does not hold (cf. Theorem 1.2.1(1)); hence (5.4.1) cannot hold.
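The dichotomy in Remark 5.4.3 shows up clearly in simulation. Below is our own illustration: for the standard normal distribution (rapidly varying tail) the ratio $X_{n,n}/U(n)$ concentrates near 1, while for the Pareto distribution with $\gamma = 1$ (so $U(n) = n$) it does not.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(2)
n, reps = 100_000, 100

u_norm = NormalDist().inv_cdf(1.0 - 1.0 / n)   # U(n) for the standard normal
ratio_norm = np.array([rng.standard_normal(n).max()
                       for _ in range(reps)]) / u_norm

# Pareto with 1 - F(x) = 1/x, x > 1, so U(n) = n:
ratio_par = np.array([(1.0 / rng.uniform(size=n)).max()
                      for _ in range(reps)]) / n

frac_norm = float(np.mean(np.abs(ratio_norm - 1.0) > 0.2))
frac_par = float(np.mean(np.abs(ratio_par - 1.0) > 0.2))
print(frac_norm, frac_par)  # small versus large
```

For the Pareto case $X_{n,n}/n$ converges in distribution to a nondegenerate Fréchet limit, so the spread never disappears, no matter how large $n$ is.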
Now we turn to strong (a.s.) laws. The validity of the strong law of large numbers depends on the finiteness of a certain integral. For the law of the iterated logarithm the second-order conditions are basically sufficient.

We provide proofs for all the results except for the necessity of the condition for the strong law, which is lengthy and complicated. We need the following lemma.
Lemma 5.4.4 Let $c_n$ be a sequence of positive constants and $b_n := \left(1/(1-F)\right)^{\leftarrow}(n)$. Suppose that $b_n + x c_n$ is an ultimately nondecreasing sequence for all real $x$.
1. For each distribution function $F$ we have almost surely

$$\liminf_{n\to\infty} \frac{X_{n,n} - b_n}{c_n} \le 0\,.$$
2. Let $c$ be a finite constant. We have almost surely

$$\limsup_{n\to\infty} \frac{X_{n,n} - b_n}{c_n} = c$$

if and only if

$$\sum_{n=1}^{\infty} \left(1 - F(c_n x + b_n)\right) \qquad (5.4.9)$$

converges for all $x > c$ and diverges for all $x < c$.
3. If for all $-1 < x < 0$,

$$\sum_{n=1}^{\infty} \left(1 - F(c_n x + b_n)\right) \exp\left(-n\left(1 - F(c_n x + b_n)\right)\right) < \infty\,, \qquad (5.4.10)$$

then almost surely

$$\liminf_{n\to\infty} \frac{X_{n,n} - b_n}{c_n} \ge 0\,. \qquad (5.4.11)$$
= e~l > 0 .
(3) Since $\sum_{n=1}^{\infty} \left(1 - F(b_n)\right) = \infty$, we have almost surely $X_{n,n} > b_n$ infinitely often. Hence also $X_{n,n} > c_n x + b_n$ infinitely often for all $x < 0$. So to prove (5.4.11) it is sufficient to show that

$$\sum_{n=1}^{\infty} P\left(X_{n,n} \le c_n x + b_n < X_{n+1,n+1}\right) \qquad (5.4.12)$$

converges. Now

$$P\left(X_{n,n} \le c_n x + b_n < X_{n+1,n+1}\right) \le \left(1 - F(c_{n+1}x + b_{n+1})\right) F^n(c_n x + b_n)$$

and

$$F^n(c_n x + b_n) = \exp\left(n \log F(c_n x + b_n)\right) \le \exp\left\{-n\left(1 - F(c_n x + b_n)\right)\right\};$$

hence the convergence of (5.4.12) is implied by (5.4.10).
Theorem 5.4.5 Let $F(x) < 1$ for all real $x$. The following are equivalent:

1. For some sequence $b_n$,
$$\frac{X_{n,n}}{b_n} \to 1 \quad a.s. \qquad (5.4.13)$$
2. For all $0 < v < 1$,
$$\int^{\infty} \frac{1 - F(vx)}{\left(1 - F(x)\right)^2}\, dF(x) < \infty\,. \qquad (5.4.14)$$
Proof. We prove that (5.4.14) implies (5.4.13). First note that (5.4.14) implies

$$\lim_{x\to\infty} \frac{1 - F(vx)}{1 - F(x)} = 0$$

for $0 < v < 1$. Hence (5.4.14) implies
$$\int^{\infty} \left\{1 - F(vU(t))\right\} \exp\left(-t\left\{1 - F(vU(t))\right\}\right) dt < \infty\,.$$
By applying $b_n \le U(t) \le b_{n+1}$ for $n \le t \le n+1$ we get

$$\sum_{n=1}^{\infty} \left(1 - F(v b_{n+1})\right) \exp\left\{-(n+1)\left(1 - F(v b_{n+1})\right)\right\} < \infty\,,$$

hence

$$\sum_{n=1}^{\infty} \left(1 - F(v b_n)\right) \exp\left\{-n\left(1 - F(v b_n)\right)\right\} < \infty$$
for $0 < v < 1$. This is the condition of Lemma 5.4.4(3) with $c_n = b_n$. Hence $\liminf_{n\to\infty} X_{n,n}/b_n \ge 1$ a.s. In order to get $\limsup_{n\to\infty} X_{n,n}/b_n = 1$ we need to prove that the sum in (5.4.9) with $c_n = b_n$ converges for $x > 0$ and diverges for $x < 0$. That is, $\sum_{n=1}^{\infty} 1 - F(vb_n)$ is finite for $v > 1$ and infinite for $v < 1$. First note that $\sum_{n=1}^{\infty} 1 - F(vb_n)$ is finite if and only if $\int^{\infty} 1 - F(vU(t))\, dt$ is finite. Clearly by the definition of $U(t)$ this integral is infinite for $v < 1$. For $v > 1$ by partial integration
Jl
Jl
1 - F(VJ)
We omit the proof that (5.4.13) implies (5.4.14), which can be found in Barndorff-Nielsen (1963).
For the law of the iterated logarithm we have conditions that are believed to be
new (cf. Pickands (1967) and de Haan and Hordijk (1972)).
Theorem 5.4.6 Let $F(x) < 1$ for all $x$. Define $V(x) := U(e^x)$ for real $x$. Suppose there is a positive function $p$ such that for all real $x$ and some real $\beta$,

$$\lim_{t\to\infty} \frac{V(t + x\log t) - V(t)}{p(t)} = \frac{e^{\beta x} - 1}{\beta}\,. \qquad (5.4.15)$$

Then almost surely

$$\limsup_{n\to\infty} \frac{X_{n,n} - V(\log n)}{p(\log n)} = \frac{e^{\beta} - 1}{\beta}$$

and

$$\liminf_{n\to\infty} \frac{X_{n,n} - V(\log n)}{p(\log n)} = 0\,.$$
Corollary 5.4.8 If for some positive function $a$ a distribution function satisfies the second-order condition of Section 2.3 with $\gamma = 0$ and $\rho < 0$, then the result of Theorem 5.4.6 is true with $b_n = U(n)$ and $c_n = a(n) \log\log n$.
Proof. The uniform inequalities of Theorem 2.3.6 imply, for $x \ge 0$,

$$\frac{V(t+x) - V(t)}{a_0(e^t)} = x + o(1)\,A_0(e^t)\,e^{(\rho+\varepsilon)x}\,,$$

where the $o(1)$ term is uniform for $x \ge 0$ as $t \to \infty$. Hence
ao(e{) logt
logt
logt
It is easily seen that the last two terms tend to zero as $t \to \infty$; hence (5.4.15) holds.
Since

$$1 - \Phi(t) = \varphi(t)\left(1/t - 1/t^3 + o(1/t^3)\right)$$

as $t \to \infty$, with $\Phi$ the standard normal distribution function and $\varphi$ its density, we have, with $V^{\leftarrow}$ the inverse function of $V$,
V*"(0 = log (_
x
2"
t)/\flt.
bn)exp(-nP(X
n=l
00
0,if^2
J^i, X2, X 3 , . . . :=
Yu...,Yn),
so that
P (max(Xi,..., X2n-i)
Hence $\max(X_1, \ldots, X_n)$ behaves approximately like the maximum of $n/2$ independent and identically distributed random variables $Y_i$.
The situation is more or less the same if $\{X_n\}_{n=1}^{\infty}$ is an infinite-order "max-moving average" process.
Example 5.5.4 Let $Y_1, Y_2, \ldots$ be as in the previous example and let $V$ be some random variable. Define for $n = 1, 2, \ldots$,

$$X_n := Y_n + V\,.$$

Then $\max(X_1, \ldots, X_n) = V + \max(Y_1, \ldots, Y_n)$ and

$$P\left(n^{-1}\max(X_1, \ldots, X_n) \le x\right)$$
Example 5.5.7 Let $X_1, X_2, \ldots$ be a stationary Gaussian sequence with $EX_1 = 0$, $EX_1^2 = 1$. Let $r_n$ be the correlation function, $r_n = E(X_1 X_{n+1})$. Consider also a sequence $Y_1, Y_2, \ldots$ of independent and identically distributed standard normal random variables. Then we know from Example 1.1.7 that sequences $a_n > 0$ and $b_n$ exist such that

$$\lim_{n\to\infty} P\left( \frac{\max(Y_1, \ldots, Y_n) - b_n}{a_n} \le x \right) = \exp\left(-e^{-x}\right)$$
for all $x$. If

$$\lim_{n\to\infty} r_n \log n = \gamma > 0\,,$$

then

$$\lim_{n\to\infty} P\left( \frac{\max(X_1, X_2, \ldots, X_n) - b_n}{a_n} \le x \right) = P\left( M + \sqrt{2\gamma}\, N - \gamma \le x \right),$$

where $M$ and $N$ are independent, $M$ has distribution function $\exp(-e^{-x})$, and $N$ is standard normal.
If $r_n \log n \to \infty$, then under certain conditions, with different normalizing constants, a normal limit distribution is obtained (Pickands (1967)).
One aim of extreme value theory for stationary dependent sequences is to formulate general conditions that allow most of the theory for sequences of independent
and identically distributed random variables to go through.
The Conditions D and D'
The direction followed in the basic book of LLR is to formulate mixing conditions,
as weak as possible, for probabilities of events connected with large values of the
random variables.
Suppose $X_1, X_2, X_3, \ldots$ is a stationary stochastic process and the distribution function $F$ of $X_1$ is in the domain of attraction of some extreme value distribution; that is, if $Y_1, Y_2, \ldots$ is an independent and identically distributed sequence with the same marginal distribution, there are sequences $a_n > 0$ and $b_n$ such that for all $x$ and some $\gamma \in \mathbb{R}$,
$$\lim_{n\to\infty} P\left( \frac{\max(Y_1, Y_2, \ldots, Y_n) - b_n}{a_n} \le x \right) = G_\gamma(x)\,. \qquad (5.5.1)$$
For fixed x define un := bn + anx. The mixing conditions for the process {Xn} will
be concerned with events above the level un, for fixed x and sample size n.
Let $l$ and $p$ be positive integers. For any set of indices $i_1, \ldots, i_p$ let $F_{i_1,\ldots,i_p}$ denote the joint distribution function of $(X_{i_1}, \ldots, X_{i_p})$. The condition $D(u_n)$ will be said to hold if for any integers

$$1 \le i_1 < \cdots < i_p < j_1 < \cdots < j_{p'} \le n$$

for which $j_1 - i_p \ge l$, we have
198
$$\left| F_{i_1,\ldots,i_p,j_1,\ldots,j_{p'}}(u_n,\ldots,u_n) - F_{i_1,\ldots,i_p}(u_n,\ldots,u_n)\, F_{j_1,\ldots,j_{p'}}(u_n,\ldots,u_n) \right| \le \alpha_{n,l}\,, \qquad (5.5.2)$$

where $\alpha_{n,l_n} \to 0$, $n \to \infty$, for some sequence $l_n = o(n)$. The condition $D'(u_n)$ holds if

$$\lim_{k\to\infty} \limsup_{n\to\infty}\; n \sum_{j=2}^{[n/k]} P\left(X_1 > u_n,\, X_j > u_n\right) = 0\,. \qquad (5.5.3)$$
Theorem 5.5.8 (LLR, Theorem 3.4.1) Let $X_1, X_2, \ldots$ be a stationary random sequence and let $Y_1, Y_2, \ldots$ be an independent and identically distributed sequence with the same marginal distribution for which (5.5.1) holds. Assume that the conditions $D(u_n)$ and $D'(u_n)$ hold with $u_n = b_n + a_n x$. Then

$$\lim_{n\to\infty} P\left( \frac{\max(X_1, X_2, \ldots, X_n) - b_n}{a_n} \le x \right) = G_\gamma(x)\,.$$
Next we describe a situation in which the normalizing constants are still the
same as under independence but the limit distribution is slightly different. This is the
situation of Examples 5.5.2 and 5.5.3.
Definition 5.5.9 Let $X_1, X_2, \ldots$ be a stationary sequence. Let $Y_1, Y_2, \ldots$ be an independent and identically distributed sequence with the same marginal distribution for which (5.5.1) holds. If for some $0 < \theta \le 1$,

$$\lim_{n\to\infty} P\left( \frac{\max(X_1, \ldots, X_n) - b_n}{a_n} \le x \right) = G_\gamma^{\theta}(x)$$

for all $x$, then the sequence $X_1, X_2, \ldots$ is said to have extremal index $\theta$.
Note that our definition is slightly more specific than the definition in LLR, p. 67.
An example of a sequence with arbitrary extremal index $\theta \in (0, 1]$ is the process

$$X_1 := Y_1\,, \qquad X_{n+1} := \max\left((1-\theta)X_n,\, \theta\, Y_{n+1}\right) \quad \text{for } n \ge 1\,,$$

where $Y_1, Y_2, \ldots$ are independent and identically distributed with distribution function $\exp(-1/x)$, $x > 0$. It is easy to see that $P(X_n \le x) = \exp(-1/x)$ and that $\theta$ is the extremal index of the sequence.
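The claimed extremal index is easy to check by simulation. The sketch below is our own (with $\theta = 0.5$ as an arbitrary choice): since $P(\max(X_1, \ldots, X_n) \le nx) \to \exp(-\theta/x)$, while the i.i.d. sequence $Y_i$ would give $\exp(-1/x)$, evaluating the empirical probability at $x = 1$ recovers $\theta$.

```python
import numpy as np

rng = np.random.default_rng(3)
theta, n, reps = 0.5, 1000, 2000

below = 0
for _ in range(reps):
    y = -1.0 / np.log(rng.uniform(size=n))  # i.i.d. with P(Y <= x) = exp(-1/x)
    x = np.empty(n)
    x[0] = y[0]
    for i in range(1, n):
        x[i] = max((1.0 - theta) * x[i - 1], theta * y[i])
    below += x.max() <= n

theta_hat = -np.log(below / reps)
print(theta_hat)  # close to theta = 0.5
```

The clusters are visible in the sample paths as well: each large $Y_i$ produces a short run of comparably large $X$-values that decays geometrically at rate $1-\theta$.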
Many processes have an extremal index between zero and one: well-known examples are a moving average process of stable random variables (LLR, Section 3.8) and a process satisfying a stochastic difference equation (de Haan, Resnick, Rootzén, and de Vries (1989)). The point process convergence of Section 2.1 can be generalized to this case: the epochs of the points in the limiting point process still form a homogeneous Poisson process. However, the two-dimensional points are now arranged on vertical lines, where the mean number of points on a vertical line is $1/\theta$.
Several estimators for the extremal index have been developed based on this interpretation of $\theta$. A general form for those estimators is given by (cf. Ancona-Navarrete and Tawn (2000))

$$\hat\theta := \frac{C(u_n)}{N(u_n)}\,, \qquad (5.5.4)$$

where $N(u_n)$ is the number of exceedances of a high threshold $u_n$ and $C(u_n)$ is the number of clusters. There are two general ways of identifying clusters. The runs estimator for $\theta$ is, for $1 \le l \le n$,

$$\hat\theta_R := \frac{1}{N(u_n)} \sum_{i=1}^{n-l} 1_{\{X_i > u_n\}}\, 1_{\{X_{i+1} \le u_n\}} \cdots 1_{\{X_{i+l} \le u_n\}}\,.$$

It recognizes two different clusters of exceedances when there are at least $l$ consecutive observations below the threshold between them.
The second kind of estimator for 0 is called a blocks estimator. By first dividing
the sample into k blocks of length m, so n approximately equals km, the number of
clusters C(un) in (5.5.4) is estimated as the number of blocks in which at least one
exceedance of un occurs.
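Both cluster-identification rules are easy to code. The sketch below is our own (function names and the parameter choices $l = 5$, $m = 50$ are illustrative, not from the text). Applied to an i.i.d. sample, whose extremal index is 1, both estimators should return values close to 1.

```python
import numpy as np

def runs_estimator(x, u, l):
    """Runs estimator: count exceedances followed by at least l
    observations below the threshold u, divided by all exceedances."""
    exc = x > u
    ends = 0
    for i in range(len(x) - l):
        if exc[i] and not exc[i + 1 : i + l + 1].any():
            ends += 1
    return ends / exc.sum()

def blocks_estimator(x, u, m):
    """Blocks estimator: number of length-m blocks containing at least
    one exceedance of u, divided by the total number of exceedances."""
    exc = x > u
    k = len(x) // m
    blocks = exc[: k * m].reshape(k, m)
    return blocks.any(axis=1).sum() / exc.sum()

rng = np.random.default_rng(4)
x = rng.exponential(size=10_000)
u = np.quantile(x, 0.995)
print(runs_estimator(x, u, 5), blocks_estimator(x, u, 50))  # both near 1
```

On a clustered series (such as the max-autoregressive example above) both estimates drop below 1, roughly toward the true $\theta$.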
This approach is mainly connected with the behavior of extreme order statistics and point process convergence (as in Section 2.1). For results concerning intermediate order statistics, convergence to Gaussian processes, and asymptotic behavior of estimators we need conditions other than the conditions $D$ and $D'$ and the existence of the extremal index. That is what we discuss next.
Mixing Conditions
A different way to deal with dependence has been explored by Rootzén (1995) and developed by Drees. The aim is to formulate rather general but quite specific conditions
under which the approximation of the tail empirical quantile process by Brownian
motion of Section 2.4 can be generalized. As before, this approximation can serve as
a basis for a wealth of statistical results. It can be proved that many specific stochastic
processes used in applications satisfy the conditions.
We consider a version of the conditions and the ensuing result. Let {Xn} be a
stationary sequence. The common distribution function is denoted by F and the
inverse function of 1/(1 F) is denoted by U. We assume that F is in the domain
of attraction of an extreme value distribution Gy.
Further, we assume that the sequence {Xn} is ^-mixing, i.e.,
P(f) := sup E I
meN
sup
sup
/ -> oo
\n*m+l+l
Condition 5.5.10
^ )
, i
-1/2,
2 i
ln
where $l_n$ and $k_n$ are sequences of integers, $l_n, k_n \to \infty$, $n \to \infty$. The growth of the sequence $k_n$ has to be restricted by Condition 5.5.13 below.
Condition 5.5.11 There exist $\varepsilon > 0$ and functions $c_m$, $m = 1, 2, \ldots$, such that

$$\lim_{n\to\infty} \frac{n}{k_n}\, P\left( X_1 > U\left(\frac{n}{k_n x}\right),\, X_{m+1} > U\left(\frac{n}{k_n y}\right) \right) = c_m(x, y)\,,$$
,y))<(y-x)(pm
D1^\
'-''> =-KsX).'
Condition 5.5.13
^te)-^fe) *-'-i
eX+l/2
-w
= 0,
where $a > 0$ is a suitable version of the auxiliary function of Theorem 1.1.6.
Theorem 5.5.14 Let $X_1, X_2, \ldots$ be a $\beta$-mixing stationary random sequence with common marginal distribution function $F \in \mathcal{D}(G_\gamma)$, for some real $\gamma$. Under Conditions 5.5.10–5.5.13, for some sequence $l_n = o(n/k_n)$, there exist versions of the tail quantile process, denoted by $\{X_{n-[k_n s],n}\}_{0<s\le 1}$, and a centered Gaussian process $E$ with covariance function $c$ defined by
00
such that
sup ^ + 1 / 2 ( l + | l o g j | ) " 1 / 2
0<5<1
Xn-[kns],n
~ Ai
- \
-(y+i)
E(s)
0,
n ~> oo .
(*)
Here $D_1, D_2, \ldots$ is a sequence of random variables which for $\gamma > -\frac{1}{2}$ can be chosen as $U(n/k_n)$ (hence nonrandom).
Under very general conditions, estimators of y, a(n/kn), large quantiles, etc. can
be expressed as functionals of the tail empirical quantile function, and an appropriate
invariance theorem entails asymptotic normality of such statistics, very much in the
same way as for the approximation developed in Section 2.4. For details we refer to
Drees (2000, 2002) and in particular to Drees (2003), where the case y > 0 receives
special attention. It is shown that under certain conditions a stochastic difference
equation (hence also the ARCH process) satisfies the stated conditions, as well as a
moving average process.
(5.6.1)
and

$$\begin{cases} a_{n+1}/a_n \to 1\,,\\ (b_{n+1} - b_n)/a_n \to 0\,. \end{cases} \qquad (5.6.2)$$
Then

$$\begin{cases} \log G(x) \ \text{is convex} & \text{if } x^*(G) = \infty\,,\\ \log G\left(x^* - e^{-x}\right) \ \text{is convex} & \text{if } x^*(G) < \infty\,. \end{cases} \qquad (5.6.3)$$
= l .
(5.6.4)
n->oo
By taking $k$ fixed in (5.6.4) one sees that (5.6.1) follows. Next take $k(n) = n$ in (5.6.4). This implies $\lim_{n\to\infty} \prod_{k=1}^{n} F_k(a_n x + b_n) = G(x)$ for continuity points $x$ of $G$. Since we also have $\lim_{n\to\infty} \prod_{k=1}^{n} F_k(a_{n+1} x + b_{n+1}) = G(x)$, (5.6.2) follows by the convergence of types theorem.
Proof (Balkema, de Haan, and Karandikar (1993)). First the direct statement. Define

$$\alpha_n x := a_n x + b_n\,,$$
= 6
I an+*(n)
Then along a subsequence n' we have convergence of <x~f+k,nf\<xn'to <*<?> say. Now
Mn+u
<Xn'+k(n')
Mn> =
Mn
= 2
-logFr
Also,
\ <Xn'+k(n')
(5.6.5)
for some $a > 0$ and $b \in \mathbb{R}$. Hence $M(\beta_t x) - M(x)$ is a tail function for all $t > 0$, i.e., either

$$M\left(a_t(x - x_0) + x_0\right) - M(x) \qquad \text{(with } a \ne 1\text{)} \qquad (5.6.6)$$
or

$$M(x + bt) - M(x)\,. \qquad (5.6.7)$$
Let $_*x := \inf\{x : G(x) > 0\}$. Then if (5.6.6) holds with $a > 1$, we have $x_0 \ge x^*$; if (5.6.6) holds with $a < 1$, then $x_0 \le {}_*x$; if (5.6.7) holds, then $b > 0$.
Let us consider the case $a > 1$. Then $x^* < \infty$. We know that for all $t > 0$ the function

$$M\left(a_t(x - x_0) + x_0\right) - M(x)$$

is nonnegative and nonincreasing for $x < x^*$. That is, for all $t > 0$,

$$M\left(x_0 - e^t e^y\right) - M\left(x_0 - e^y\right)$$

is nonnegative and nonincreasing for $e^y > x_0 - x^*$. It follows that $M(x_0 - e^{-y})$ is nondecreasing and convex for $e^{-y} > x_0 - x^*$, i.e. (with $M'$ the right one-sided derivative of $M$), $e^{-y}M'\left(x_0 - e^{-y}\right)$ is nonnegative and nondecreasing for $e^{-y} > x_0 - x^*$. Write $x_0 - e^{-y} = x^* - e^{-x}$; then also $e^{-x}M'\left(x^* - e^{-x}\right)$ is nonnegative and nondecreasing for $x \in \mathbb{R}$, since $1 + e^{x}(x_0 - x^*)$ is nonnegative and nonincreasing. It follows that $M(x^* - e^{x})$ is a convex function and hence also $M(x^* - e^{-x})$. Similar reasoning applies in the other two cases.
Conclusions: if (5.6.6) holds with $a > 1$, then $x^* < \infty$ and $M(x^* - e^{-x})$ is convex; if (5.6.6) holds with $0 < a < 1$, then $_*x > -\infty$ and $M(x_0 + e^{x})$ is convex for some $x_0 \le {}_*x$; if (5.6.7) holds, then $x^* = \infty$ and $M(x)$ is convex. Some reflection shows that in fact the convexity of $M(x_0 + e^{x})$ for some $x_0 \le {}_*x$ implies the convexity of $M(x)$. This proves the direct statement of the theorem.
Conversely, suppose that $G$ satisfies (5.6.3). We shall consider only the case $\log G(x)$ convex. The other case is similar. Relation (5.6.7) implies that the function
JO,
is a distribution function for n = 1, 2,
JC <*Jc(G)4-log(n + l),
Moreover,
Corollary 5.6.3 If (5.6.6) holds with $a > 1$, then the same relation holds with $x_0$ replaced by any $x_1$ satisfying $x^* \le x_1 \le x_0$. If (5.6.1) holds, then (5.6.6) holds for $a > 1$ and $x_0 \ge x^*$. If (5.6.6) holds with $0 < a < 1$, then the same relation holds with $x_0$ replaced by any $x_1 \le x_0$; moreover, (5.6.7) holds.

Proof. Let us consider the case $a > 1$. The previous proof shows that then (5.6.6) holds with $x_0$ replaced by $x^*$. The same proof also gives (5.6.6) with $x_0$ replaced by any $x_1$ satisfying $x^* \le x_1 \le x_0$. Similar reasoning applies in the other two cases.
Exercises
5.1. Let $F$ be twice differentiable and write $r(t) := \left((1-F)/F'\right)'(t)$. Recall that $\lim_{t\to\infty} r(t) = 0$ implies that $F \in \mathcal{D}(G_0)$. Prove that if $r(t)\,\log\log\left(1/(1-F(t))\right) \to 0$, then the condition of Theorem 5.4.6 holds.
5.2. Prove that Theorem 5.4.6 holds for any gamma distribution.
5.3. Prove that under the conditions of Theorem 5.1.1 with $\rho < 0$,

$$\lim_{n\to\infty} \sup_{0<x<1/((-\gamma)\vee 0)} \frac{n\left(1 - F(a_0(n)x + b_0(n))\right)}{-\log G_\gamma(x)} = 1\,.$$
( max
Xi+cbi-(l+c)bn
converges to
I
w r
P I sup
\->i
Ty-1
+ c
<xI
\
<x ,
hm P ( max
->oo
\l<i<n
Xi+cbi-(l+c)bn
an
=exp
< xJ
J
Hint: Use the point process convergence of Theorem 2.1.2, the interpretation of point process convergence given at the end of Section 2.1, and Theorem 1.1.6(2). Note that for a Poisson point process the probability that a certain set is empty (contains no points) equals $e^{-q}$, where $q$ is the mean measure of that set.
5.5. Let E_1, E_2, ... be i.i.d. standard exponential random variables. Define for x > 1,

    N := min{ n : E_k ≤ x log k for all k > n } .

Show that N < ∞ a.s. For what values of x does EN exist?
5.6. Let X_1, X_2, ... be i.i.d. positive random variables with distribution function F. Prove that max(X_1, X_2, ..., X_n)/√n →_P 0 if and only if lim_{x→∞} x²(1 − F(x)) = 0.
5.7. Let X_1, X_2, ... be i.i.d. positive random variables. Prove that max(X_1, X_2, ..., X_n)/n → 0 a.s. if and only if EX_1 < ∞.
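The dichotomy behind Exercises 5.6 and 5.7 is easy to observe numerically. The following sketch is an added illustration (Pareto-distributed samples and the sample size are choices made here, not part of the exercises): for tail index α = 2 the mean is finite and max/n collapses, while for α = 1/2 it blows up.

```python
import numpy as np

def pareto_max_over_n(alpha: float, n: int, seed: int = 0) -> float:
    """Return max(X_1,...,X_n)/n for i.i.d. Pareto(alpha) samples,
    i.e. P(X > x) = x**(-alpha) for x >= 1."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    x = u ** (-1.0 / alpha)      # inverse-cdf sampling of Pareto(alpha)
    return x.max() / n

n = 100_000
finite_mean = pareto_max_over_n(alpha=2.0, n=n)    # E X < infinity
infinite_mean = pareto_max_over_n(alpha=0.5, n=n)  # E X = infinity

print(finite_mean, infinite_mean)
```

With these parameters the first ratio is tiny while the second is enormous, in line with the a.s. convergence claimed in Exercise 5.7.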
Part II
Finite-Dimensional Observations
6
Basic Theory
Let (X_1, Y_1), (X_2, Y_2), ... be independent and identically distributed random vectors with distribution function F. Suppose that for suitable norming constants a_n, c_n > 0 and b_n, d_n ∈ R,

    lim_{n→∞} P( (max(X_1, ..., X_n) − b_n)/a_n ≤ x , (max(Y_1, ..., Y_n) − d_n)/c_n ≤ y ) = G(x, y)   (6.1.1)

for all continuity points (x, y) of a distribution function G with nondegenerate marginals.
In this section we are going to determine the class of all possible limit distributions G. In doing so we will heavily rely on the theory developed in Chapter 1. Since (6.1.1) implies convergence of the two one-dimensional marginal distributions, we have
    lim_{n→∞} P( (max(X_1, X_2, ..., X_n) − b_n)/a_n ≤ x ) = G(x, ∞)   (6.1.2)

and

    lim_{n→∞} P( (max(Y_1, Y_2, ..., Y_n) − d_n)/c_n ≤ y ) = G(∞, y) .   (6.1.3)
Now we choose the constants a_n, c_n > 0, b_n, and d_n such that (cf. Theorem 1.1.3) for some γ1, γ2 ∈ R,

    G(x, ∞) = exp( −(1 + γ1 x)^{−1/γ1} )   (6.1.4)

and

    G(∞, y) = exp( −(1 + γ2 y)^{−1/γ2} ) .   (6.1.5)

We note in passing that since the two marginal distributions of G are continuous, G must be continuous as well.
Next we are going to use the results of Lemma 1.2.9 and Corollary 1.2.10. Let F_i, i = 1, 2, be the marginal distribution functions of F. Define U_i(t) := F_i^←(1 − 1/t), t > 1, for i = 1, 2. Then according to Theorem 1.1.6 there are positive functions a_i(t), i = 1, 2, such that

    lim_{t→∞} ( U_i(tx) − U_i(t) ) / a_i(t) = ( x^{γ_i} − 1 )/γ_i   (6.1.6)

and

    lim_{t→∞} a_i(tx)/a_i(t) = x^{γ_i}

for i = 1, 2 and x > 0. Moreover,

    lim_{n→∞} ( U_1(nx) − b_n )/a_n = ( x^{γ1} − 1 )/γ1  and  lim_{n→∞} ( U_2(ny) − d_n )/c_n = ( y^{γ2} − 1 )/γ2 .   (6.1.7)
    lim_{n→∞} F^n( a_n x + b_n, c_n y + d_n ) = G(x, y) .   (6.1.8)

Note that if x_n → u, y_n → v, then by the continuity of G and the monotonicity of F,

    lim_{n→∞} F^n( a_n x_n + b_n, c_n y_n + d_n ) = G(u, v) .   (6.1.9)

Applying (6.1.9) with x_n := (U_1(nx) − b_n)/a_n → (x^{γ1} − 1)/γ1 and y_n := (U_2(ny) − d_n)/c_n → (y^{γ2} − 1)/γ2 (cf. (6.1.7)) yields the following result.
Theorem 6.1.1 Suppose that there are real constants a_n, c_n > 0, b_n, and d_n such that

    lim_{n→∞} F^n( a_n x + b_n, c_n y + d_n ) = G(x, y)

for all continuity points (x, y) of G, and the marginals of G are standardized as in (6.1.4) and (6.1.5). Then with F_1(x) := F(x, ∞), F_2(y) := F(∞, y), and U_i(x) := F_i^←(1 − 1/x), i = 1, 2,

    lim_{n→∞} F^n( U_1(nx), U_2(ny) ) = G_0(x, y) ,   (6.1.10)

i.e.,

    lim_{n→∞} P( max_{1≤i≤n} 1/(1 − F_1(X_i)) ≤ nx , max_{1≤i≤n} 1/(1 − F_2(Y_i)) ≤ ny ) = G_0(x, y)

for x, y > 0, where G_0(x, y) := G( (x^{γ1} − 1)/γ1 , (y^{γ2} − 1)/γ2 ). Hence, after a transformation of the marginal distributions to a standard distribution, namely F(x) := 1 − 1/x, x > 1, a simplified limit relation applies. This means that we have reformulated the problem of identifying the limit distribution in such a way that the marginal distributions no longer play a role. From now on we can focus solely on the dependence structure.
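The marginal standardization x ↦ 1/(1 − F_i(x)) has a simple empirical analogue based on ranks, which is also the idea behind the rank-based estimators of Chapter 7. The sketch below is only an added illustration; the (n + 1)/(n + 1 − rank) convention is one common finite-sample choice, not prescribed by the text.

```python
import numpy as np

def to_standard_pareto(sample: np.ndarray) -> np.ndarray:
    """Empirical analogue of X -> 1/(1 - F(X)): replace each observation
    by (n + 1)/(n + 1 - rank), which has approximate standard Pareto
    margins P(V > v) ~ 1/v regardless of the original distribution."""
    n = len(sample)
    ranks = sample.argsort().argsort() + 1   # ranks 1..n (continuous data: no ties)
    return (n + 1) / (n + 1 - ranks)

rng = np.random.default_rng(1)
x = rng.lognormal(size=1000)                 # arbitrary continuous margin
v = to_standard_pareto(x)

# P(V > 2) should be close to 1/2 and P(V > 10) close to 1/10.
p2 = (v > 2).mean()
p10 = (v > 10).mean()
print(p2, p10)
```

Because the transformation depends on the data only through ranks, it is invariant under monotone transformations of the marginal, exactly as required for the "dependence only" reformulation above.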
Corollary 6.1.3 For any (x, y) for which 0 < G_0(x, y) < 1,

    lim_{n→∞} n { 1 − F( U_1(nx), U_2(ny) ) } = −log G_0(x, y) .   (6.1.11)

Proof. Taking logarithms to the left and to the right of (6.1.10), we get

    lim_{n→∞} −n log { F( U_1(nx), U_2(ny) ) } = −log G_0(x, y) ,   (6.1.12)

and the statement follows since −log t ∼ 1 − t as t → 1. The limit also holds along continuous parameter values: for (x, y) for which 0 < G_0(x, y) < 1,

    lim_{t→∞} t { 1 − F( U_1(tx), U_2(ty) ) } = −log G_0(x, y) .   (6.1.13)

Next, for a > 0 the limit

    lim_{n→∞} ( 1 − F( U_1(nx), U_2(ny) ) ) / ( 1 − F( U_1(na), U_2(na) ) ) =: H_a(x, y)

exists for all x, y with max(x, y) > a. Hence by Billingsley (1979), Theorem 29.1, we have that H_a is the distribution function of a probability measure, P_a say, and it follows that

    lim_{n→∞} P_{n,a}(A) = P_a(A)

for all Borel sets A ⊂ R_+^2 \ [0, a]² with P_a(∂A) = 0. Now clearly

    ν_n := n { 1 − F( U_1(na), U_2(na) ) } P_{n,a}
is a measure on R_+^2 \ [0, a]², not depending on a, such that for all Borel sets A ⊂ R_+^2 \ [0, a]²,

    lim_{n→∞} ν_n(A) = ν(A)

with

    ν := −log G_0(a, a) · P_a .

Note that since a > 0 is arbitrary, ν_n(A) and ν(A) are defined for all Borel sets A with

    inf_{(x,y)∈A} max(x, y) > 0 ,   (6.1.14)

and in particular

    ν{ (s, t) ∈ R_+^2 : s > x or t > y } = −log G_0(x, y) .

Finally, for all Borel sets A such that (6.1.14) holds we have lim_{n→∞} ν_n(A) = ν(A).
We formulate these results in the following theorem.

Theorem 6.1.5 Let F and G_0 be probability distribution functions for which (6.1.11) holds, i.e., for x, y > 0 with 0 < G_0(x, y) < 1,

    lim_{n→∞} n { 1 − F( U_1(nx), U_2(ny) ) } = −log G_0(x, y) ,

where U_i(1/(1 − x)) is the inverse function of the ith marginal distribution, i = 1, 2. Then there are set functions ν, ν_1, ν_2, ... defined for all Borel sets A ⊂ R_+^2 with

    inf_{(x,y)∈A} max(x, y) > 0

such that:

1. ν_n{ (s, t) ∈ R_+^2 : s > x or t > y } = n { 1 − F( U_1(nx), U_2(ny) ) } ;
2. ν{ (s, t) ∈ R_+^2 : s > x or t > y } = −log G_0(x, y) ;
3. lim_{n→∞} ν_n(A) = ν(A) provided ν(∂A) = 0.
One can write this space as a product space by using the transformation

    (x, y) ↦ ( max(x, y) , (x, y)/max(x, y) ) .   (6.1.15)

Then

    R_+^{2*} = (0, ∞) × Q ,   (6.1.16)

where Q := { (x, y) ∈ R_+^2 : max(x, y) = 1 }.
Remark 6.1.8 Relation (6.1.15) does not hold for all Borel sets Ax,y.
The characterizing property of the exponent measure is the following homogeneity
relation.
Theorem 6.1.9 For any Borel set A ⊂ R_+^2 with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0, and any a > 0,

    ν(aA) = a^{−1} ν(A) ,   (6.1.17)

where aA is the set obtained by multiplying all elements of A by a.

Proof. Taking t_n = na for some a > 0 in (6.1.13) we obtain

    lim_{n→∞} n { 1 − F( U_1(nax), U_2(nay) ) } = −a^{−1} log G_0(x, y) .   (6.1.18)

On the other hand, by direct application of (6.1.11),

    lim_{n→∞} n { 1 − F( U_1(nax), U_2(nay) ) } = −log G_0(ax, ay) .

Hence

    −log G_0(ax, ay) = −a^{−1} log G_0(x, y) ,   (6.1.19)

and the statement of the theorem holds for all sets A_{x,y} defined by (6.1.16). It is then clear that this relation must also hold for the generated σ-field.
Remark 6.1.10 Relation (6.1.17) implies that ν(A) is finite for all sets A with positive distance from the origin, but ν is not bounded. Note in particular that

    G_0(ax, ay) = G_0^{1/a}(x, y) ,  for a, x, y > 0 .   (6.1.20)
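Relation (6.1.20) is easy to check numerically once a concrete simple max-stable distribution function is fixed. The sketch below is an added illustration using G_0(x, y) = exp{−(x^{−2} + y^{−2})^{1/2}}, a choice that reappears in Example 6.1.17.

```python
import math

def G0(x: float, y: float) -> float:
    """A simple max-stable distribution function (cf. Example 6.1.17):
    G0(x, y) = exp(-(x**-2 + y**-2) ** 0.5)."""
    return math.exp(-(x ** -2 + y ** -2) ** 0.5)

# Homogeneity consequence (6.1.20): G0(ax, ay) = G0(x, y) ** (1/a),
# equivalently -log G0(ax, ay) = a**-1 * (-log G0(x, y)).
for a in (0.5, 2.0, 7.0):
    for (x, y) in ((1.0, 1.0), (0.3, 2.0), (5.0, 0.7)):
        lhs = G0(a * x, a * y)
        rhs = G0(x, y) ** (1.0 / a)
        assert abs(lhs - rhs) < 1e-12
print("homogeneity verified")
```

The same check applied to any other candidate df immediately reveals whether it can be simple max-stable.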
A nice intuitive background for the role of the exponent measure is provided by
the following theorem. The proof is very similar to the proof of Theorem 2.1.2 and
is omitted.
Theorem 6.1.11 Let (X_1, Y_1), (X_2, Y_2), ... be i.i.d. random vectors with distribution function F. Suppose (6.1.1) holds with a_n, b_n, c_n, and d_n as in (6.1.2)–(6.1.5), i.e.,

    lim_{n→∞} n P( (1 + γ1 (X_1 − b_n)/a_n)^{1/γ1} > x  or  (1 + γ2 (Y_1 − d_n)/c_n)^{1/γ2} > y ) = −log G_0(x, y) ,

that is, more generally, for each Borel set A ⊂ R_+^{2*} (for the definition of R_+^{2*} see Remark 6.1.6) with ν(∂A) = 0,

    lim_{n→∞} n P( ( (1 + γ1 (X_1 − b_n)/a_n)^{1/γ1} , (1 + γ2 (Y_1 − d_n)/c_n)^{1/γ2} ) ∈ A ) = ν(A) .

Define the point processes

    N_n(B) := Σ_{i=1}^n 1_B( ( i/n , ( (1 + γ1 (X_i − b_n)/a_n)^{1/γ1} , (1 + γ2 (Y_i − d_n)/c_n)^{1/γ2} ) ) ) .

Define also a Poisson point process N on the same space with mean measure λ × ν, with λ Lebesgue measure and ν the measure defined in Theorem 6.1.5. Then N_n converges in distribution to N, i.e., for Borel sets B_1, B_2, ..., B_r ⊂ R_+ × R_+^{2*} with (λ × ν)(∂B_i) = 0, i = 1, ..., r,

    ( N_n(B_1), ..., N_n(B_r) ) →_d ( N(B_1), ..., N(B_r) ) .

This theorem opens the way to estimating the measure ν by just counting the number of observations in certain sets, as we shall see later on (Sections 7.2 and 8.2).
6.1.4 The Spectral Measure

The homogeneity property (6.1.17) of the exponent measure ν suggests a coordinate transformation in order to capitalize on it. Recall R_+^{2*} from Remark 6.1.6. Take any one-to-one transformation R_+^{2*} → (0, ∞) × [0, c] for some c > 0,

    (x, y) ↦ ( r(x, y) , d(x, y) ) ,

with r homogeneous of order 1 and d homogeneous of order 0. Natural choices are

    r(x, y) = √(x² + y²) ,  d(x, y) = arctan(y/x) ;
    r(x, y) = x + y ,  d(x, y) = x/(x + y) ;

and

    r(x, y) = x ∨ y ,  d(x, y) = arctan(x/y) .   (6.1.21)
It will turn out that the measure ν has a simple structure when expressed in the new coordinates.
Let us start with the first transformation. Define for constants r > 0 and θ ∈ [0, π/2] the set

    B_{r,θ} := { (x, y) ∈ R_+^2 : √(x² + y²) > r and arctan(y/x) ≤ θ } .

Clearly

    B_{r,θ} = r B_{1,θ} .   (6.1.22)

This relation means that after transformation to the new coordinates r(x, y) and d(x, y) the measure ν becomes a product measure. Set for 0 ≤ θ ≤ π/2,

    Ψ(θ) := ν(B_{1,θ}) .   (6.1.23)

Clearly Ψ is the distribution function of a finite measure on [0, π/2]. This finite measure is called the spectral measure of the limit distribution G. The spectral measure determines the distribution function G in the following way. Write s = r cos θ, t = r sin θ. Take x, y > 0; then

    −log G_0(x, y) = ν{ (s, t) : s > x or t > y }
                   = ν{ (r, θ) : r cos θ > x or r sin θ > y }
                   = ν{ (r, θ) : r > (x/cos θ) ∧ (y/sin θ) } .   (6.1.24)
We consider two subsets in order to evaluate (6.1.24). First take the subset where x/cos θ ≤ y/sin θ. There r > (x/cos θ) ∧ (y/sin θ) translates into r > x/cos θ. Hence by (6.1.22) and (6.1.23) the ν measure of this subset is the integral

    ∫_{(cos θ)/x ≥ (sin θ)/y} ((cos θ)/x) Ψ(dθ) .

The integral over the other subset, namely where x/(cos θ) > y/(sin θ), can be evaluated similarly and yields

    ∫_{(cos θ)/x < (sin θ)/y} ((sin θ)/y) Ψ(dθ) .

Adding the two integrals we obtain

    −log G_0(x, y) = ∫_0^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) .
The term "spectral" can be seen as an analogue to the light spectrum, which highlights the contribution of each color separately. Here the spectral measure highlights
the contribution of each direction separately. The terminology comes from corresponding results in the theory of partial sums rather than partial maxima, see, e.g.,
Breiman (1968), Section 11.6.
We have proved the direct statement of the following.
Proposition 6.1.12 For any extreme value distribution function G from (6.1.1) with (6.1.4) and (6.1.5) there exists a finite measure on the set [0, π/2], called the spectral measure, with the property that if Ψ is the distribution function of this measure, then for x, y > 0,

    G( (x^{γ1} − 1)/γ1 , (y^{γ2} − 1)/γ2 ) = exp( − ∫_0^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) ) ,   (6.1.25)

where γ1 and γ2 are the extreme value indices of the marginal distributions of G. Moreover, we have the side conditions

    ∫_0^{π/2} cos θ Ψ(dθ) = ∫_0^{π/2} sin θ Ψ(dθ) = 1 .   (6.1.26)

Conversely, any finite measure represented by its distribution function Ψ gives rise to a limit distribution function G in (6.1.1) via (6.1.25), provided the side conditions (6.1.26) are fulfilled.
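For a discrete spectral measure, both the side conditions (6.1.26) and the evaluation of (6.1.25) reduce to finite sums, which makes the converse direction easy to experiment with. The two measures below (atoms at the endpoints, and a single atom at π/4 with mass √2) are choices made for this added illustration.

```python
import math

def neg_log_G0(x, y, atoms, masses):
    """-log G0(x, y) = sum_i m_i * max(cos(t_i)/x, sin(t_i)/y)
    for a discrete spectral measure with atoms t_i in [0, pi/2]."""
    return sum(m * max(math.cos(t) / x, math.sin(t) / y)
               for t, m in zip(atoms, masses))

def side_conditions(atoms, masses):
    """Return (integral of cos, integral of sin) against the measure."""
    c = sum(m * math.cos(t) for t, m in zip(atoms, masses))
    s = sum(m * math.sin(t) for t, m in zip(atoms, masses))
    return c, s

# Mass 1 at each endpoint: side conditions hold and G0 factorizes
# (-log G0 = 1/x + 1/y, i.e. independence).
ind = ([0.0, math.pi / 2], [1.0, 1.0])
assert all(abs(v - 1) < 1e-12 for v in side_conditions(*ind))
assert abs(neg_log_G0(2.0, 3.0, *ind) - (1 / 2.0 + 1 / 3.0)) < 1e-12

# Mass sqrt(2) at pi/4: side conditions hold and -log G0 = 1/min(x, y)
# (complete dependence).
dep = ([math.pi / 4], [math.sqrt(2.0)])
assert all(abs(v - 1) < 1e-12 for v in side_conditions(*dep))
assert abs(neg_log_G0(2.0, 3.0, *dep) - 1 / min(2.0, 3.0)) < 1e-12
print("side conditions and both extreme cases verified")
```

These two discrete measures are exactly the boundary cases that reappear (for general dimension d) in Remark 6.1.16.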
Proof. We have already proved the direct statement. The side conditions (6.1.26) stem from the fact that G_0(x, ∞) = G_0(∞, x) = exp(−1/x) for x > 0.

For the converse we first prove that G_0 defined by (6.1.25) is the distribution function of a probability measure. First note that

    exp( − Σ_{i=1}^n ( (cos θ_i)/x ∨ (sin θ_i)/y ) Ψ_i )   (6.1.27)

is a distribution function for any 0 ≤ θ_1 < θ_2 < ··· < θ_n ≤ π/2 and Ψ_i ≥ 0, i = 1, 2, ..., n. Now the expression on the right-hand side in (6.1.25) can be approximated by a sequence of type (6.1.27). This proves that G_0 is a distribution function.

Next we prove that G can serve as a limit distribution in (6.1.1). Note that for all x, y > 0 and n = 1, 2, ...,

    G_0^n(nx, ny) = G_0(x, y) .

Hence for all x, y with 1 + γ1 x > 0, 1 + γ2 y > 0,

    G^n( n^{γ1} x + (n^{γ1} − 1)/γ1 , n^{γ2} y + (n^{γ2} − 1)/γ2 ) = G(x, y) .   (6.1.28)
Definition 6.1.13 We call the class of limit distribution functions G in (6.1.1) the class of max-stable distributions, as suggested by relation (6.1.28). Hence any extreme value distribution is max-stable and vice versa. The class of limit distribution functions G_0 in (6.1.10) is called the class of simple max-stable distributions, "simple" meaning that the marginal distributions are fixed as follows:

    G_0(x, ∞) = G_0(∞, x) = exp(−1/x) ,  x > 0 .
So far we have considered only the first transformation from (6.1.21). A similar
analysis of the other transformations yields the following result. The proof is left to
the reader.
Theorem 6.1.14 For each limit distribution G from (6.1.1), (6.1.4), and (6.1.5) there exist:

1. A finite measure (denoted by its distribution function Ψ) on [0, π/2] such that for x, y > 0,

    G( (x^{γ1} − 1)/γ1 , (y^{γ2} − 1)/γ2 ) = G_0(x, y)
        = exp( − ∫_0^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) ) .   (6.1.29)

2. A finite measure (denoted by its distribution function H) on [0, 1] such that for x, y > 0,

    G_0(x, y) = exp( − 2 ∫_0^1 ( w/x ∨ (1 − w)/y ) H(dw) ) .   (6.1.30)

3. A finite measure (denoted by its distribution function Φ) on [0, π/2] such that for x, y > 0,

    G_0(x, y) = exp( − ∫_0^{π/2} ( (1 ∧ tan θ)/x ∨ (1 ∧ cot θ)/y ) Φ(dθ) ) .   (6.1.31)

Each measure satisfies the side conditions induced by G_0(x, ∞) = G_0(∞, x) = exp(−1/x); in particular H is a probability measure with ∫_0^1 w H(dw) = 1/2.
The measures Ψ, H, and Φ determine one another. For instance, to express Φ in terms of H, note that in the coordinates of the second transformation a point (x, y) with x ∨ y > 1 and arctan(x/y) ≤ θ corresponds to w := x/(x + y) ≤ tan θ/(1 + tan θ) = 1/(1 + cot θ), while the ν measure of { (x, y) : x ∨ y > 1, x/(x + y) ∈ dw } equals 2 (w ∨ (1 − w)) H(dw). That is,

    Φ(θ) = 2 ∫_0^{1/(1+cot θ)} ( w ∨ (1 − w) ) H(dw) .
Remark 6.1.16 We discuss two extreme cases of spectral measures. For simplicity we formulate the results only for H, in the d-dimensional setting, where any simple max-stable distribution function can be written as

    G_0(x_1, ..., x_d) = exp( − d ∫ ( w_1/x_1 ∨ ··· ∨ w_d/x_d ) H(dw) )

with H a probability measure on the unit simplex

    { w : w_1 + ··· + w_d = 1, w_i ≥ 0, i = 1, 2, ..., d }   (6.1.32)

satisfying the side conditions ∫ w_i H(dw) = 1/d for i = 1, 2, ..., d.
1. Let the spectral measure be concentrated at the point (1/d, 1/d, ..., 1/d) with mass 1. If (X_1, X_2, ..., X_d) is a random vector with distribution function G( (x_1^{γ1} − 1)/γ1 , ..., (x_d^{γd} − 1)/γd ), then X_1 = X_2 = ··· = X_d a.s.

2. Let the spectral measure be concentrated at the extreme points of the set (6.1.32), i.e., the d points (1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, ..., 0, 1), with masses 1/d. If (X_1, X_2, ..., X_d) is a random vector with distribution function G( (x_1^{γ1} − 1)/γ1 , ..., (x_d^{γd} − 1)/γd ), then X_1, X_2, ..., X_d are independent.
Let us consider some examples of limit distributions and their spectral measures. It is useful to note first that the transformation theory for integrals implies that if, for example, the spectral measure has density Ψ', then −log G_0(x, y) has a density q(x, y), say. For a Borel set A ⊂ R_+^2 with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0, and with r = √(x² + y²) and θ = arctan(y/x),

    ν(A) = ∫∫_A q(x, y) dx dy = ∫∫ r^{−2} Ψ'(θ) dr dθ ;

hence

    Ψ'(θ) = r³ q(r cos θ, r sin θ) .

Similarly for the other spectral measures (cf. Exercises 6.6 and 6.7).
Example 6.1.17 (Geffroy (1958)) Take Ψ(θ) = θ for 0 ≤ θ ≤ π/2, i.e., let the spectral measure be Lebesgue measure on [0, π/2]. Then the side conditions are fulfilled and

    G_0(x, y) = exp{ −( x^{−2} + y^{−2} )^{1/2} } ,  x > 0 , y > 0 .

Example 6.1.18 Here

    G_0(x, y) = exp{ −( x^{−1} + y^{−1} − (x + y)^{−1} ) } ,  x > 0 , y > 0 .
Example 6.1.19 Take Φ(θ) = θ/(π/4 + (1/2) log 2) for 0 ≤ θ ≤ π/2. Then the side conditions hold, and for 0 < x ≤ y,

    ( π/4 + (1/2) log 2 ) ( −log G_0(x, y) )
        = (1/x) ( π/4 + (1/2) log 2 + log( y/√(x² + y²) ) ) + (1/y) arctan(x/y) ,

with the roles of x and y interchanged when x > y. To see this note that (1 ∧ tan θ)/x ≥ (1 ∧ cot θ)/y exactly when θ ≥ arctan(x/y), hence

    ∫_0^{π/2} ( (1 ∧ tan θ)/x ∨ (1 ∧ cot θ)/y ) dθ
        = (1/x) ∫_{arctan(x/y)}^{π/2} (1 ∧ tan θ) dθ + (1/y) ∫_0^{arctan(x/y)} (1 ∧ cot θ) dθ
        = (1/x) ∫_{(π/4)∧arctan(x/y)}^{π/4} tan θ dθ + (1/x) ∫_{(π/4)∨arctan(x/y)}^{π/2} dθ
          + (1/y) ∫_0^{(π/4)∧arctan(x/y)} dθ + (1/y) ∫_{(π/4)∧arctan(x/y)}^{arctan(x/y)} cot θ dθ .

Finally note that ∫_0^z tan θ dθ = −log cos z for 0 ≤ z < π/2.
Example 6.1.20 This example is based on the normal distribution: for c > 0,

    −log G_0(x, y) = E( ( e^{cN − c²/2}/x ) ∨ ( 1/y ) )

with N a standard normal random variable. Then

    G_0(x, y) = exp{ −(1/x) Φ( c/2 + (1/c) log(y/x) ) − (1/y) Φ( c/2 + (1/c) log(x/y) ) } ,

where Φ here denotes the standard normal distribution function. Again, when c → ∞ we have independence and when c ↓ 0 we have complete dependence. This distribution function can be obtained in several other ways (Eddy and Gale (1981), Hüsler and Reiss (1989), and de Haan and Pereira (2005)).
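The expectation formula of Example 6.1.20 can be checked against the closed form by Monte Carlo. The sketch below is an added illustration assuming the representation −log G_0(x, y) = E((e^{cN − c²/2}/x) ∨ (1/y)) stated above; the standard normal distribution function is implemented through the error function.

```python
import math
import numpy as np

def Phi(z: float) -> float:
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def neg_log_G0_closed(x: float, y: float, c: float) -> float:
    """Closed form of Example 6.1.20:
    (1/x) Phi(c/2 + (1/c) log(y/x)) + (1/y) Phi(c/2 + (1/c) log(x/y))."""
    return (Phi(c / 2 + math.log(y / x) / c) / x
            + Phi(c / 2 + math.log(x / y) / c) / y)

def neg_log_G0_mc(x: float, y: float, c: float, n: int = 1_000_000,
                  seed: int = 3) -> float:
    """Monte Carlo version of E( (e^{cN - c^2/2}/x) v (1/y) )."""
    rng = np.random.default_rng(seed)
    N = rng.standard_normal(n)
    return float(np.maximum(np.exp(c * N - c * c / 2) / x, 1.0 / y).mean())

x, y, c = 1.0, 2.0, 1.0
mc = neg_log_G0_mc(x, y, c)
exact = neg_log_G0_closed(x, y, c)
print(mc, exact)
```

The agreement follows from splitting the expectation at the event {cN − c²/2 > log(x/y)} and using the Gaussian shift E(e^{cN − c²/2} 1{N > a}) = Φ(c − a).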
6.1.5 The Sets Q_c and the Functions L, χ, and A

Finally, we discuss a few other ways to characterize the max-stable distributions. Since the dependence structure is quite general, one could describe the dependence using copulas. If F is the distribution function of the random vector (X, Y), the copula C associated with F is a distribution function that satisfies F(x, y) = C( F_1(x), F_2(y) ) with F_1(x) := F(x, ∞) and F_2(y) := F(∞, y). It contains complete information about the joint distribution of F apart from the marginal distributions (for more details see Nelsen (1998) and Joe (1997)).
Define for 0 < x, y < 1,

    C(x, y) := G_0( −1/log x , −1/log y ) .

Then C is a copula, and relation (6.1.20) translates into the following: for 0 < x, y < 1 and a > 0,

    C(x^a, y^a) = C^a(x, y) .
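This copula max-stability relation can be verified numerically once a concrete G_0 is fixed; the sketch below is an added check using the distribution function of Example 6.1.17 (an arbitrary choice for this illustration).

```python
import math

def G0(x: float, y: float) -> float:
    # Simple max-stable df from Example 6.1.17 (Geffroy).
    return math.exp(-math.hypot(1.0 / x, 1.0 / y))

def C(u: float, v: float) -> float:
    """Extreme value copula C(u, v) = G0(-1/log u, -1/log v), 0 < u, v < 1."""
    return G0(-1.0 / math.log(u), -1.0 / math.log(v))

# Max-stability of the copula: C(u^a, v^a) = C(u, v)^a for all a > 0.
for a in (0.5, 2.0, 3.7):
    for (u, v) in ((0.2, 0.9), (0.55, 0.55), (0.95, 0.1)):
        assert abs(C(u ** a, v ** a) - C(u, v) ** a) < 1e-12
print("copula max-stability verified")
```

The check works because substituting u^a, v^a rescales both arguments of G_0 by 1/a, so (6.1.20) applies directly.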
Since this relation is not very tractable for analysis, it is usual to consider instead the function L defined by

    L(x, y) := −log G_0( 1/x , 1/y )

for x, y ≥ 0. We can express the function L in terms of the exponent measure ν (cf. Section 6.1.3) as

    L(x, y) = ν{ (s, t) ∈ R_+^2 : s > 1/x or t > 1/y } .   (6.1.33)

Note that

    L(x, 0) = −log G_0( 1/x , ∞ ) = −log G( (x^{−γ1} − 1)/γ1 , ∞ ) = x ,

and similarly L(0, y) = y. Since { s > 1/x or t > 1/y } ⊂ { s > 1/x } ∪ { t > 1/y },

    L(x, y) ≤ x + y

for all x, y ≥ 0. On the other hand, { s > 1/x or t > 1/y } ⊃ { s > 1/x }, so that

    L(x, y) ≥ x ∨ y .
Moreover, by (6.1.25),

    L(x, y) = −log G_0( 1/x , 1/y ) = ∫_0^{π/2} ( x cos θ ∨ y sin θ ) Ψ(dθ) ,

so that L, being a mixture of convex functions, is convex.
For c > 0 define the level sets

    Q_c := { (x, y) ∈ R_+^2 : L(x, y) ≤ c } .   (6.1.34)

The set Q_1 is closed and convex, and the points (0, 0), (0, 1), and (1, 0) are vertices. Conversely, any closed convex set Q_1 with vertices (0, 0), (0, 1), and (1, 0) gives rise to a limit distribution G_0 for which (6.1.34) holds. The mapping is one-to-one.
Proof. Let G_0 be a simple max-stable distribution. The convexity of Q_1 is an immediate consequence of the convexity of the function L (cf. Proposition 6.1.21(6)). The statement about the vertices follows from the side conditions for Ψ.

Conversely, let Q_1 satisfy the stated properties. The closed convex set Q_1 can be approximated from below and above by sequences of sets Q_n^− and Q_n^+ that satisfy the same properties and have a polygonal boundary:

    Q_n^− ⊂ Q_1 ⊂ Q_n^+ ,   (6.1.35)

with

    Q_n^− = { (x, y) ∈ R_+^2 : Σ_{i=1}^{m(n)} ( (A_i x) ∨ (B_i y) ) ≤ 1 } ,
    Q_n^+ = { (x, y) ∈ R_+^2 : Σ_{i=1}^{r(n)} ( (C_i x) ∨ (D_i y) ) ≤ 1 }   (6.1.36)

for some sequences m(n), r(n) and positive constants A_i, B_i, C_i, D_i. Define

    G_0^{(n),−}(x, y) := exp( − Σ_{i=1}^{m(n)} ( A_i/x ∨ B_i/y ) ) ,
    G_0^{(n),+}(x, y) := exp( − Σ_{i=1}^{r(n)} ( C_i/x ∨ D_i/y ) ) .

Clearly G_0^{(n),−} and G_0^{(n),+} are simple max-stable distributions, and there exist discrete spectral measures Ψ_n^− and Ψ_n^+ with

    G_0^{(n),±}(x, y) = exp( − ∫_0^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ_n^±(dθ) ) .

As n → ∞ both sequences converge for x, y > 0, and hence there is a distribution function, G_0 say, with a spectral measure Ψ, such that for x, y > 0,

    lim_{n→∞} G_0^{(n),−}(x, y) = lim_{n→∞} G_0^{(n),+}(x, y) = G_0(x, y) .

Since for

    Q* := { (x, y) ∈ R_+^2 : ∫_0^{π/2} ( (x cos θ) ∨ (y sin θ) ) Ψ(dθ) ≤ 1 }

we have Q_n^− ⊂ Q* ⊂ Q_n^+ for all n, it follows that Q* = Q_1. Hence

    Q_1 = { (x, y) ∈ R_+^2 : L(x, y) ≤ 1 }

with

    G_0(x, y) := exp( − ∫_0^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) ) .
Since the convexity of the level set Q_1 is typical for a limit distribution, this property can be used to check whether the tail of a given distribution function resembles a limit distribution. Details will be given later.

A function related to L (or to the measure ν) is the function R:

    R(x, y) := x + y − L(x, y) = ν{ (s, t) ∈ R_+^2 : s > 1/x and t > 1/y } .

Note that the function R is the distribution function of a measure.
Finally, we review two other ways of characterizing the limit distribution G_0 in the two-dimensional context.

Sibuya (1960), see also Geffroy (1958), introduced for t > 0,

    χ(t) := −log G_0(1/t, 1) + log G_0(1/t, ∞) + log G_0(∞, 1)
          = L(t, 1) − L(t, 0) − L(0, 1) = −R(t, 1) .   (6.1.37)

By the homogeneity of the function −log G_0, the function χ determines the function G_0. The determining properties of the function χ are as follows:

1. χ is convex;
2. ((−t) ∨ (−1)) ≤ χ(t) ≤ 0 for t > 0.
Pickands (1981) introduced for 0 ≤ t ≤ 1,

    A(t) := L(1 − t, t) .   (6.1.38)

By the homogeneity of the function −log G_0, the function A determines the function G_0. The determining properties of the function A are the following:

1. A is convex;
2. A(0) = A(1) = 1;
3. ((1 − t) ∨ t) ≤ A(t) ≤ 1.

Any function A satisfying properties (1)–(3) leads to a unique limit function G_0. The convexity of A can be proved using

    A(t) = 2 ∫_0^1 ( w(1 − t) ) ∨ ( (1 − w)t ) H(dw)

with H as in Theorem 6.1.14: in case H has a density, break the integral into two integrals according to w(1 − t) > (1 − w)t or w(1 − t) ≤ (1 − w)t and differentiate. If H does not have a density, one needs to approximate the measure H.

For the characterization of multivariate max-stable distributions one can use the distribution function of the spectral measure or any of the functions χ and A. Since the functions χ and A are more complicated objects (convex functions rather than monotone functions), we concentrate on the use of the spectral measure, which also has a simple intuitive meaning. Moreover, the generalization to higher-dimensional spaces is straightforward in that case.
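The determining properties (1)–(3) of A are easy to verify numerically from the H-representation when H is discrete. The sketch below is an added illustration using the two extreme measures of Remark 6.1.16 in dimension two: mass 1/2 at 0 and 1 (independence, A ≡ 1) and mass 1 at 1/2 (complete dependence, A(t) = max(t, 1 − t)).

```python
def A(t: float, atoms, masses) -> float:
    """Pickands dependence function from a discrete measure H on [0, 1]:
    A(t) = 2 * sum_i m_i * max(w_i * (1 - t), (1 - w_i) * t)."""
    return 2.0 * sum(m * max(w * (1.0 - t), (1.0 - w) * t)
                     for w, m in zip(atoms, masses))

indep = ([0.0, 1.0], [0.5, 0.5])  # H with mass 1/2 at the endpoints
dep = ([0.5], [1.0])              # H with mass 1 at 1/2

ts = [i / 100.0 for i in range(101)]
for t in ts:
    assert abs(A(t, *indep) - 1.0) < 1e-12
    assert abs(A(t, *dep) - max(t, 1.0 - t)) < 1e-12
    # Properties (2)-(3): boundary values and bounds.
    for h in (indep, dep):
        a = A(t, *h)
        assert max(t, 1.0 - t) - 1e-12 <= a <= 1.0 + 1e-12
# Property (1): convexity via nonnegative second differences on a grid.
for h in (indep, dep):
    vals = [A(t, *h) for t in ts]
    assert all(vals[i - 1] + vals[i + 1] - 2 * vals[i] >= -1e-12
               for i in range(1, 100))
print("Pickands function properties verified")
```

Any intermediate discrete H (with mean 1/2) produces a convex A strictly between the two bounds, which is what makes A a convenient one-dimensional summary of bivariate tail dependence.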
for all x, y ∈ R. This is the same as convergence in distribution, since any max-stable distribution is continuous.
Theorem 6.2.1 Let G be a max-stable distribution. Let the marginal distribution functions be exp(−(1 + γ_i x)^{−1/γ_i}) for i = 1, 2, and let Ψ, H, or Φ be its spectral measure according to the representations of Theorem 6.1.14. Write V := 1/(1 − F_1(X)) and W := 1/(1 − F_2(Y)).

1. If the distribution function F of the random vector (X, Y) with continuous marginal distribution functions F_1 and F_2 is in the domain of attraction of G, then the following equivalent conditions are fulfilled:

(a) For all x, y > 0,

    lim_{t→∞} ( 1 − F( U_1(tx), U_2(ty) ) ) / ( 1 − F( U_1(t), U_2(t) ) ) = S(x, y) ,   (6.2.1)

with S(x, y) := log G( (x^{γ1} − 1)/γ1 , (y^{γ2} − 1)/γ2 ) / log G(0, 0).

(b) (Via the circle) For all r > 1 and all θ ∈ [0, π/2] that are continuity points of Ψ,

    lim_{t→∞} P( V² + W² > t²r² and W/V ≤ tan θ | V² + W² > t² ) = r^{−1} Ψ(θ)/Ψ(π/2) .   (6.2.2)

(c) (Via the line) For all r > 1 and all s ∈ [0, 1] that are continuity points of H,

    lim_{t→∞} P( V + W > tr and V/(V + W) ≤ s | V + W > t ) = r^{−1} H(s) .   (6.2.3)

(d) (Via the square) For all r > 1 and all θ ∈ [0, π/2] that are continuity points of Φ,

    lim_{t→∞} P( V ∨ W > tr and arctan(V/W) ≤ θ | V ∨ W > t ) = r^{−1} Φ(θ)/Φ(π/2) .   (6.2.4)

2. Conversely, each of the conditions (1a)–(1d), combined with the domain of attraction conditions for the marginal distributions, implies that F is in the domain of attraction of G.

Proof. Each of the conditions (1b)–(1d) expresses the convergence

    lim_{t→∞} P( (V, W) ∈ tB | V ∨ W > ta ) = P_a(B)

for P_a-continuity Borel sets B in R_+^2 \ [0, a]². Since this is true for all a > 0, in particular (6.2.1) holds. Hence the statements (1a)–(1d) are equivalent. We proceed with statement (1a).

Since the function S is homogeneous of order −1, the statement implies that the function 1 − F(U_1(t), U_2(t)) is regularly varying with index −1. Hence there exists a sequence a_n > 0, a_n → ∞ as n → ∞, with

    lim_{n→∞} n { 1 − F( U_1(a_n), U_2(a_n) ) } = −log G(0, 0) .
It follows that

    lim_{n→∞} F^n( U_1(nx), ∞ ) = G( (x^{γ1} − 1)/γ1 , ∞ ) = exp(−1/x) ,

which also holds directly since the distribution function F(U_1(x), ∞) is in fact 1 − 1/x, x > 1. Similarly for the second marginal, and consequently

    lim_{n→∞} F^n( U_1(nx), U_2(ny) ) = G( (x^{γ1} − 1)/γ1 , (y^{γ2} − 1)/γ2 ) .   (6.2.5)

We now proceed as in the proof of Theorem 6.1.1. The convergence of the marginal distributions implies, for suitable a_n, c_n > 0,

    lim_{n→∞} ( U_1(nx) − U_1(n) )/a_n = ( x^{γ1} − 1 )/γ1  and  lim_{n→∞} ( U_2(ny) − U_2(n) )/c_n = ( y^{γ2} − 1 )/γ2 ,   (6.2.6)

and the statement of the theorem follows.
Remark 6.2.2 In fact, if the marginal distributions are in some domains of attraction, if for all x, y,

    lim_{t→∞} ( 1 − F( U_1(tx), U_2(ty) ) ) / ( 1 − F( U_1(t), U_2(t) ) )

exists and is positive, and if the regularly varying function 1 − F(U_1(t), U_2(t)) has index −1, then F is in the domain of attraction of some max-stable distribution.

A particular case is the domain of attraction of a max-stable distribution with independent components, i.e., one that is the product of its marginal distributions. A random vector (X_1, X_2, ..., X_d) whose distribution is in the domain of attraction of such a max-stable distribution is said to have the property of asymptotic independence. A simple criterion for this to happen is given in the next theorem.
Theorem 6.2.3 Let F : R^d → [0, 1] be a probability distribution function. Suppose that its marginal distribution functions F_i : R → [0, 1] satisfy

    lim_{n→∞} F_i^n( a_n^{(i)} x + b_n^{(i)} ) = exp( −(1 + γ_i x)^{−1/γ_i} )

for all x for which 1 + γ_i x > 0, where a_n^{(i)} > 0 and b_n^{(i)} are sequences of real constants, i = 1, 2, ..., d. Let (X_1, X_2, ..., X_d) be a random vector with distribution function F. If

    lim_{t→∞} P( X_i > U_i(t), X_j > U_j(t) ) / P( X_i > U_i(t) ) = 0

for all 1 ≤ i < j ≤ d, then

    lim_{n→∞} F^n( a_n^{(1)} x_1 + b_n^{(1)}, ..., a_n^{(d)} x_d + b_n^{(d)} ) = exp( − Σ_{i=1}^d (1 + γ_i x_i)^{−1/γ_i} ) ,   (6.2.7)

i.e., X_1, X_2, ..., X_d are asymptotically independent.
Since this is true for all pairs (i, j), the exponent measure must be concentrated on the lines

    l_i = { (s_1, s_2, ..., s_d) ∈ R_+^d : s_i > 0 and s_j = 0 for j ≠ i } .

This is the same as saying that the spectral measure is concentrated on the extreme points, i.e., the limit distribution has independent components (cf. Remark 6.1.16).
Example 6.2.6 (Sibuya (1960)) Consider the random vector (X, Y), normally distributed with mean zero, variances one, and correlation coefficient ρ < 1. We shall prove asymptotic independence in this case, i.e.,

    lim_{n→∞} n P( X > b_n, Y > b_n ) = 0 ,

where b_n is such that n P(X > b_n) → 1. First note that

    n P( X > b_n, Y > b_n ) ≤ n P( (X + Y)/2 > b_n ) .

Now (X + Y)/2 has a normal distribution with variance (1 + ρ)/2. If ρ = −1 the result is immediate. If |ρ| < 1,

    lim_{n→∞} n P( (X + Y)/2 > b_n ) = lim_{n→∞} n P( X > √(2/(1 + ρ)) b_n )
        = lim_{n→∞} n P(X > b_n) · P( X > √(2/(1 + ρ)) b_n ) / P( X > b_n ) = 0 ,

since √(2/(1 + ρ)) > 1 and the normal tail decreases fast enough that P(X > c b_n)/P(X > b_n) → 0 for any c > 1.
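The asymptotic independence of Example 6.2.6 is visible in simulation: for a bivariate normal vector the conditional exceedance probability P(Y > t | X > t) decays to zero as t grows. The correlation ρ = 0.5, sample size, and thresholds below are choices made only for this added illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2_000_000
rho = 0.5

# Bivariate standard normal with correlation rho.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1.0 - rho * rho) * rng.standard_normal(n)

def cond_exceed(t: float) -> float:
    """Estimate P(Y > t | X > t)."""
    hit = x > t
    return float(np.mean(y[hit] > t))

p1, p2, p3 = cond_exceed(1.0), cond_exceed(2.0), cond_exceed(3.0)
print(p1, p2, p3)
```

Contrast this with an asymptotically dependent pair (e.g., Y = X), for which the same conditional probability would stay at 1 for all thresholds.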
Exercises
6.1. Let A_1, A_2 be positive random variables with EA_i = 1 for i = 1, 2. Prove that

    G_0(x, y) := exp( − E( A_1/x ∨ A_2/y ) )

is a distribution function that is simple max-stable.
6.2. Let F_n be two-dimensional normal distribution functions with means zero, variances one, and covariances ρ_n. Define a_n := (2 log n)^{−1/2} and b_n := (2 log n − log log n − log(4π))^{1/2}. Then we have (Example 1.1.7) lim_{n→∞} n(1 − F(a_n x + b_n)) = e^{−x} for x ∈ R, where F is the standard normal distribution function. Take ρ_n such that a_n²/(1 − ρ_n) → λ > 0 as n → ∞. Prove that n (∂²/(∂x ∂y)) (1 − F_n(a_n x + b_n, a_n y + b_n)) converges to

    2^{−1} log( λ/(4π) ) − (4λ)^{−1} − λ 4^{−1} (x − y)² − 2^{−1} (x + y)

for x, y ∈ R. Conclude that

    lim_{n→∞} n ( 1 − F_n(a_n x + b_n, a_n y + b_n) ) = −log G_0(x, y)

with G_0 from Theorem 6.1.1. Cf. Example 6.1.20.
6.3. Prove that if (X, Y) is a random vector with distribution function F with continuous marginals, then the following are equivalent:

(a) F is in the domain of attraction of some max-stable distribution G.

(b) For any Borel set A ⊂ R_+^2 with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{n→∞} n P( ( (1 + γ1 (X − b_n)/a_n)^{1/γ1} , (1 + γ2 (Y − d_n)/c_n)^{1/γ2} ) ∈ A ) = ν(A) ,

where the sequences a_n, c_n > 0, b_n, d_n ∈ R are chosen so that G(x, ∞) is as in (6.1.4), G(∞, y) is as in (6.1.5), and ν is the exponent measure defined in Section 6.1.3.

(c) For any Borel set A ⊂ R_+^2 with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{n→∞} n P( ( 1/(1 − F_1(X)) , 1/(1 − F_2(Y)) ) ∈ nA ) = ν(A) ,

where F_i : R → [0, 1], i = 1, 2, are the marginal distribution functions of F and satisfy, for some sequences a_n, c_n > 0, b_n, d_n ∈ R, lim_{n→∞} F_1^n(a_n x + b_n) = exp(−(1 + γ1 x)^{−1/γ1}) and lim_{n→∞} F_2^n(c_n x + d_n) = exp(−(1 + γ2 x)^{−1/γ2}) for all x for which 1 + γ_i x > 0, i = 1, 2.

(d) For any Borel set A ⊂ R_+^2 with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{t↓0} t^{−1} P( ( 1 − F_1(X) , 1 − F_2(Y) ) ∈ t A^{−1} ) = ν(A) ,

with F_i, i = 1, 2, as before and A^{−1} := { (1/x, 1/y) : (x, y) ∈ A }.
6.4. Prove Theorem 6.1.14.
6.5. A distribution function F in d-dimensional space is called max-infinitely divisible if for all n there is a distribution function F_n with F_n^n = F, i.e., if for each n the random vector can be written as the maximum of n independent and identically distributed random vectors. Using the method of Section 6.1.3, prove that F is max-infinitely divisible if and only if

    −log F(x_1, x_2, ..., x_d) = ν{ (s_1, s_2, ..., s_d) : s_1 > x_1 or ··· or s_d > x_d }

for all (x_1, x_2, ..., x_d) with 0 < F(x_1, x_2, ..., x_d) < 1, where ν is a measure (not necessarily homogeneous).
6.6. With H' the density of the spectral measure H and q(x, y) the density of −log G_0(x, y), verify that for r = x + y and θ = x/(x + y), H'(θ) = r³ q(θr, r(1 − θ)).
6.7. With Φ' the density of the spectral measure Φ and q(x, y) the density of −log G_0(x, y), verify that for r = max(x, y) and θ = arctan(x/y), Φ'(θ) = r³ sec²θ · q(r tan θ, r) for 0 < θ < π/4, and Φ'(θ) = r³ csc²θ · q(r, r cot θ) for π/4 < θ < π/2.
6.8. If (X, Y) is a random vector with some simple max-stable distribution function, then L(x, y) = x + y for all x, y > 0 if and only if X and Y are independent.
6.9. Discuss properties of the function R (cf. Proposition 6.1.21).
6.10. Prove that R(x, y) > 0 for all x, y > 0 or R(x, y) = 0 for all x, y > 0.
6.11. Let V_1, V_2, ..., V_d be independent and identically distributed random variables with distribution function exp(−1/x), x > 0. Let {r_{ij}}_{i,j=1}^d be a matrix with positive entries. Show that the random vector ( ⋁_{j=1}^d r_{1j} V_j, ..., ⋁_{j=1}^d r_{dj} V_j ) has a simple max-stable distribution. Find the distribution function. Show that any two-dimensional simple max-stable distribution function can be obtained as a limit of elements of this class.
6.12. Let X_1, X_2 be independent positive random variables with common distribution function F. Suppose lim_{n→∞} n P(X_1 > x a(n)) = x^{−α} for some α > 0 and all x > 0. Show that for λ_1, λ_2, v_1, v_2 positive, the random vector (λ_1 X_1 + λ_2 X_2, v_1 X_1 + v_2 X_2) is in the domain of attraction of the extreme value distribution

    exp( − ∫∫_{B^c} α² s^{−α−1} t^{−α−1} ds dt ) ,

where B := { (s, t) ∈ R_+^2 : λ_1 s + λ_2 t ≤ x, v_1 s + v_2 t ≤ y }. Conclude that the distribution of (λ_1 X_1 + λ_2 X_2, v_1 X_1 + v_2 X_2) is in the domain of attraction of an extreme value distribution with nondiscrete spectral measure. The marginal distributions have extreme value index 2α.

Hint: Apply Theorem 6.1.5 to ν_n(·) := n² P( (X_1, X_2) ∈ a(n)· ).
7
Estimation of the Dependence Structure
7.1 Introduction
In Chapter 6 we have seen that a multivariate extreme value distribution is characterized by the marginal extreme value indices plus a homogeneous exponent measure
or alternatively a spectral measure. In particular, there is no finite parametrization
for extreme value distributions. This suggests the use of nonparametric methods for
estimating the dependence structure, and in fact we are going to emphasize those
methods.
In Sections 7.2 and 7.3 we shall consider estimation of the exponent measure v
exemplified by the function L and the sets Qc of Section 6.1.5, as well as estimation
of the spectral measure introduced in Section 6.1.4.
Further, in Section 7.4 we shall discuss a simple coefficient that summarizes the
amount of dependence between components of the random vector.
Finally, in Sections 7.5–7.6 we shall discuss, for the case of asymptotic independence of the components, a submodel that allows for a more precise analysis.
7.2 Estimation of the Function L

If F is in the domain of attraction of an extreme value distribution G, then, in the notation of Chapter 6,

    G( (x^{γ1} − 1)/γ1 , (y^{γ2} − 1)/γ2 ) = G_0(x, y) ,

where γ1, γ2 are the extreme value indices of the marginal distributions. The relation between G_0 and the exponent measure ν from Section 6.1.3 is (cf. Theorem 6.1.5)

    G_0(x, y) = exp( − ν{ (s, t) ∈ R_+^2 : s > x or t > y } ) ,  x, y > 0 .
Recall the function

    L(x, y) := −log G_0( 1/x , 1/y ) ,

for which

    t { 1 − F( U_1(t/x), U_2(t/y) ) } → L(x, y) ,  t → ∞ ,   (7.2.1)

hence also

    (n/k) { 1 − F( U_1(n/(kx)), U_2(n/(ky)) ) } → L(x, y) ,  n → ∞ ,   (7.2.2)

where k may depend on n but we need to have k = o(n). We shall see that in order to get consistency for our estimators we also need to assume k = k(n) → ∞, n → ∞. Replacing F by its empirical distribution function, U_1(n/(kx)) by X_{n−[kx]+1,n}, and U_2(n/(ky)) by Y_{n−[ky]+1,n} in the left-hand side of (7.2.2), we get
    L̂(x, y) := (1/k) Σ_{i=1}^n 1{ X_i ≥ X_{n−[kx]+1,n} or Y_i ≥ Y_{n−[ky]+1,n} } ,   (7.2.3)

or equivalently

    L̂(x, y) = (1/k) Σ_{i=1}^n 1{ R(X_i) > n − kx + 1 or R(Y_i) > n − ky + 1 } ,   (7.2.4)

where R(X_i) is the rank of X_i among (X_1, X_2, ..., X_n), i.e.,

    R(X_i) := Σ_{j=1}^n 1{ X_j ≤ X_i } ,

and R(Y_i) is the rank of Y_i among (Y_1, Y_2, ..., Y_n).

Indeed, this estimator is invariant under monotone transformations of the components of the random vector; hence it does not depend on the marginal distributions.
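The rank-based estimator (7.2.3)–(7.2.4) is straightforward to implement. The sketch below is an added illustration (sample size, k, and the data-generating models are arbitrary choices): on independent data L(1, 1) = 2, while on completely dependent data L(1, 1) = 1.

```python
import numpy as np

def L_hat(x_s: np.ndarray, y_s: np.ndarray, xx: float, yy: float, k: int) -> float:
    """Rank-based estimator (7.2.4) of the stable tail dependence function:
    (1/k) * #{ i : R(X_i) > n - k*x + 1  or  R(Y_i) > n - k*y + 1 }."""
    n = len(x_s)
    rx = x_s.argsort().argsort() + 1   # ranks 1..n
    ry = y_s.argsort().argsort() + 1
    hit = (rx > n - k * xx + 1) | (ry > n - k * yy + 1)
    return hit.sum() / k

rng = np.random.default_rng(5)
n, k = 5000, 200

# Independent components: L(x, y) = x + y, so L_hat(1, 1) should be near 2.
x = rng.standard_normal(n)
y = rng.standard_normal(n)
L_ind = L_hat(x, y, 1.0, 1.0, k)

# Completely dependent components (Y = X): L(x, y) = max(x, y) = 1.
L_dep = L_hat(x, x, 1.0, 1.0, k)

print(L_ind, L_dep)
```

Because only ranks enter, replacing x by exp(x) or any other increasing transform leaves both estimates unchanged, in line with the invariance noted above.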
We will establish consistency and asymptotic normality for L̂. We start with the consistency.
Theorem 7.2.1 Let (X_1, Y_1), (X_2, Y_2), ... be i.i.d. random vectors with distribution function F. Suppose F is in the domain of attraction of an extreme value distribution G. Define

    L(x, y) := −log G( (x^{−γ1} − 1)/γ1 , (y^{−γ2} − 1)/γ2 )

for x, y > 0, where γ1, γ2 are the marginal extreme value indices. Let G be such that L(x, 0) = L(0, x) = x (this means that the marginal distribution functions of G are exactly exp(−(1 + γ_i x)^{−1/γ_i}) for i = 1, 2). Then for T > 0, as n → ∞, k = k(n) → ∞, k/n → 0,

    sup_{0≤x,y≤T} | L̂(x, y) − L(x, y) | →_P 0 .
Proof. First we show that it is sufficient to prove pointwise convergence. Fix ε > 0. Select

    (0, 0) = (x_0, y_0); (x_1, y_1), ..., (x_r, y_r) ∈ [0, T] × [0, T]; (x_{r+1}, y_{r+1}) = (T, T) ,

such that for i = 0, 1, 2, ..., r,

    0 ≤ L(x_{i+1}, y_{i+1}) − L(x_i, y_i) ≤ ε/2 .

Then, by the monotonicity of L and L̂ in each of their arguments,

    { sup_{0≤i≤r+1} | L̂(x_i, y_i) − L(x_i, y_i) | ≤ ε/2 } ⊂ { sup_{(0,0)≤(x,y)≤(T,T)} | L̂(x, y) − L(x, y) | ≤ ε } ,   (7.2.5)

so that pointwise convergence at the finitely many points (x_i, y_i) suffices.
Define

    V_{n,k}(x, y) := (1/k) Σ_{i=1}^n 1{ U_i ≤ kx/n or W_i ≤ ky/n } ,   (7.2.6)

where U_i := 1 − F_1(X_i) and W_i := 1 − F_2(Y_i). Its characteristic function is

    E exp( it V_{n,k}(x, y) ) = { 1 − p_{n,k} ( 1 − e^{it/k} ) }^n

with p_{n,k} := P( U_i ≤ kx/n or W_i ≤ ky/n ). We know by (7.2.2) that n p_{n,k}/k → L(x, y), n → ∞. It follows that the characteristic function converges to exp(itL(x, y)), i.e.,

    V_{n,k}(x, y) →_P L(x, y) .

Again by continuity and monotonicity the convergence is locally uniform. Next note that

    L̂(x, y) = V_{n,k}( (n/k) U_{[kx],n} , (n/k) W_{[ky],n} )

(the random objects U_{[kx],n}, W_{[ky],n}, and V_{n,k} are dependent, but this is not relevant for the present consistency proof). Since

    (n/k) U_{[kx],n} →_P x  and  (n/k) W_{[ky],n} →_P y ,

we conclude that

    L̂(x, y) →_P L(x, y) .
Theorem 7.2.2 Suppose that, in addition to the conditions of Theorem 7.2.1, for some α > 0,

    t { 1 − F( U_1(t/x), U_2(t/y) ) } = L(x, y) + O(t^{−α}) ,  t → ∞ ,   (7.2.8)

where U_i := (1/(1 − F_i))^←, i = 1, 2, uniformly on the set { x² + y² = 1, x ≥ 0, y ≥ 0 }. Suppose further that the function L has continuous first-order partial derivatives

    L_1(x, y) := (∂/∂x) L(x, y)  and  L_2(x, y) := (∂/∂y) L(x, y)

for x, y > 0. Then for k = k(n) → ∞, k(n) = o( n^{2α/(1+2α)} ), as n → ∞,

    √k ( L̂(x, y) − L(x, y) ) →_d B(x, y)

in D([0, T] × [0, T]), for every T > 0, where

    B(x, y) = W(x, y) − L_1(x, y) W(x, 0) − L_2(x, y) W(0, y)

with W a continuous mean-zero Gaussian process.
The first step of the proof is to show that

    lim_{t→∞} t { 1 − F( U_1(t/x), U_2(t/y) ) } = L(x, y)

implies, provided k = k(n) → ∞, k/n → 0 as n → ∞, the convergence of the finite-dimensional distributions, i.e., of

    Σ_{r=1}^d t_r W_n(x_r, y_r)

for d = 1, 2, ..., all real numbers t_1, ..., t_d, and all (x_1, y_1), ..., (x_d, y_d). This can be done conveniently by applying Lyapunov's form of the central limit theorem (Chung (1974), Theorem 7.1.2).
In order to establish tightness we define subrectangles

    I_{ij} := [ i/m, (i+1)/m ] × [ j/m, (j+1)/m ] ,  i, j = 0, 1, ..., m − 1 ,

and

    W_n(x, y) := √k ( V_{n,k}(x, y) − (n/k) { 1 − F( U_1(n/(kx)), U_2(n/(ky)) ) } )

for 0 ≤ x, y ≤ 1. The main tool is an inequality from Einmahl (1987), which for our purposes can be written as follows. Define for a rectangle S ⊂ [0, 1]² the empirical count (1/k) Σ_{i=1}^n 1{ (n/k)(U_i, W_i) ∈ S } and

    ν_{n,k}(S) := (n/k) P( (n/k) ( 1 − F_1(X), 1 − F_2(Y) ) ∈ S ) .

Einmahl's inequality: Let R be a rectangle in [0, 1]² with a fixed corner x_R; then for λ > 0,

    P( sup_{x∈R} | W_n(x) − W_n(x_R) | ≥ λ ) ≤ C exp( − ( λ²/(2 ν_{n,k}(R)) ) ψ( λ/( √k ν_{n,k}(R) ) ) ) .

The function ψ satisfies the following conditions: ψ(x) is continuous and decreasing, x ψ(x) is increasing, and ψ(0) = 1. In particular, this implies that ψ(x) > 0 for x > 0.
We shall apply the inequality to the rectangles
L
m \
\_m
\m
m J
|_
m J
i, 7 = 1,2, . . . , m - 1.
First note that if for some x = (x_1, x_2) and y = (y_1, y_2) with |x − y| ≤ δ (consider the Euclidean norm) we have |W_n(x) − W_n(y)| > ε, then there exist i, j such that x and y are in I_{ij} with m = ⌈√2/δ⌉ and

|W_n(x) − W_n(i/m, j/m)| > ε/2  or  |W_n(y) − W_n(i/m, j/m)| > ε/2 .

Hence

P( sup_{|x−y|≤δ/2} |W_n(x) − W_n(y)| > 4ε )
≤ P( max_{i,j=0,1,...,m−1} sup_{x∈I_{ij}} |W_n(x) − W_n(i/m, j/m)| > 2ε )
≤ Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} P( sup_{x∈I_{ij}} |W_n(x) − W_n(i/m, j/m)| > 2ε )
≤ Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} { P( sup_{x∈I_{ij}} |W_n(x_1, x_2) − W_n(x_1, j/m)| > ε ) + P( sup_{x∈I_{ij}} |W_n(x_1, j/m) − W_n(i/m, j/m)| > ε ) }
≤ Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} { C exp( −(ε²/(2 v_{n,k}(K_{ij}))) ψ( ε/(√k v_{n,k}(K_{ij})) ) ) + C exp( −(ε²/(2 v_{n,k}(J_{ij}))) ψ( ε/(√k v_{n,k}(J_{ij})) ) ) } ,   (7.2.9)

where for the last inequality we apply Einmahl's inequality with R replaced by J_{ij} and K_{ij}. Note that W_n(x_1, 0) = W_n(0, x_2) = 0 for x_1, x_2 ≥ 0.
Next note that, eventually, v_{n,k}(J_{ij}) ≤ c/m and v_{n,k}(K_{ij}) ≤ c/m for some constant c. Hence by the monotonicity of xψ(x), expression (7.2.9) is at most

2C m² exp( −(ε² m/(2c)) ψ( ε m/(c √k) ) ) .

Since ψ(0) = 1, this clearly converges to zero as k → ∞, m → ∞, m = o(√k).
Corollary 7.2.4 If moreover (7.2.8) holds, k(n) → ∞ and k(n) = o(n^{2α/(1+2α)}) as n → ∞, then (in a Skorohod construction)

sup_{0≤x,y≤T} | √k ( V_{n,k}(x,y) − (n/k){ 1 − F( U_1(n/(kx)), U_2(n/(ky)) ) } ) − W(x,y) | → 0  a.s.
Then

sup_{0≤x,y≤T} | √k ( V_{n,k}(x,y) − L(x,y) ) − W(x,y) |
≤ sup_{0≤x,y≤T} | √k ( V_{n,k}(x,y) − (n/k){ 1 − F( U_1(n/(kx)), U_2(n/(ky)) ) } ) − W(x,y) |
+ sup_{0≤x,y≤T} √k | (n/k){ 1 − F( U_1(n/(kx)), U_2(n/(ky)) ) } − L(x,y) | .   (7.2.10)

The first term on the right tends to zero almost surely. For the second term, write x = (x_1, x_2) and use the homogeneity of L: by (7.2.8) applied with t replaced by t/|x|, the difference

t { 1 − F( U_1(t/x_1), U_2(t/x_2) ) } − L(x_1, x_2)

can be written as |x|^{1+α} times a factor that, by (7.2.8), remains bounded uniformly for |x| ≤ 1. This expression remains bounded uniformly for |x| ≤ 1 since |x|^{1+α} ≤ 1 and since by (7.2.8) the second factor remains bounded uniformly for |x| ≤ 1.
Proof (of Theorem 7.2.2). Once again it is sufficient to prove the result for T = 1. Again we invoke a Skorohod construction (but keep the same notation) and we start from

sup_{0≤x,y≤1} | √k ( V_{n,k}(x,y) − L(x,y) ) − W(x,y) | → 0  a.s.

In particular,

sup_{0≤x≤1} | √k ( V_{n,k}(x,0) − x ) − W(x,0) | → 0  a.s.   (7.2.11)

with

V_{n,k}(x,0) = (1/k) Σ_{i=1}^n 1{ 1 − F_1(X_i) ≤ kx/n } .

Let U_i := 1 − F_1(X_i), i = 1, 2, ..., n. The function V_{n,k}(x,0) is a nondecreasing function of x. Its inverse function is (n/k) U_{[kx],n}, with U_{[kx],n} the [kx]th order statistic from U_1, ..., U_n. Vervaat's lemma (Appendix A) allows us to invert relation (7.2.11), and we get

sup_{0≤x≤1} | √k ( (n/k) U_{[kx],n} − x ) + W(x,0) | → 0  a.s.   (7.2.12)

Similarly we get

sup_{0≤y≤1} | √k ( (n/k) W_{[ky],n} − y ) + W(0,y) | → 0  a.s.   (7.2.13)

with W_i := 1 − F_2(Y_i), i = 1, 2, ..., n.

Since we have uniform convergence in

√k ( V_{n,k}(x,y) − L(x,y) ) − W(x,y) → 0

and since by (7.2.12) and (7.2.13)
sup_{0≤x≤1} | (n/k) U_{[kx],n} − x | → 0  a.s.  and  sup_{0≤y≤1} | (n/k) W_{[ky],n} − y | → 0  a.s. ,   (7.2.14)

we have

sup_{0≤x,y≤1} | √k { V_{n,k}( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) − L( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) } − W( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) | → 0  a.s.

and

sup_{0≤x,y≤1} | W( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) − W(x,y) | → 0  a.s.

Finally, relations (7.2.12) and (7.2.13) imply, when combined with Cramér's delta method and the differentiability conditions for L, that

sup_{0≤x,y≤1} | √k { L( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) − L(x,y) } + L_1(x,y) W(x,0) + L_2(x,y) W(0,y) | → 0  a.s.
The boundary of the set Q_1 := {(x,y) : L(x,y) ≤ 1} can be written in polar coordinates: since L is homogeneous of order 1, the boundary point in the direction θ has radius

ρ(θ) := 1 / L(cos θ, sin θ) .   (7.2.15)

We can estimate the set Q_1 by estimating ρ(θ), 0 ≤ θ ≤ π/2. A natural estimator for ρ(θ) is

ρ̂(θ) := 1 / L̂(cos θ, sin θ) .
In Figure 7.2 we find the estimate of Q_1 via ρ̂(θ). Again the concavity of the Q-curve seems to hold, and there is some indication of dependence between (high values of) the variables.
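The estimators L̂ and ρ̂ are simple rank statistics. A minimal sketch in Python (function names are ours; L̂ is the rank-based estimator of Section 7.2, counting observations that exceed a high marginal rank threshold in at least one coordinate):

```python
import numpy as np

def L_hat(x, y, X, Y, k):
    """Rank-based estimator of the tail dependence function L(x, y):
    the fraction (out of k) of sample points whose X-rank exceeds n - kx
    or whose Y-rank exceeds n - ky."""
    n = len(X)
    rx = np.argsort(np.argsort(X)) + 1   # ranks 1..n
    ry = np.argsort(np.argsort(Y)) + 1
    exceed = (rx > n - k * x) | (ry > n - k * y)
    return exceed.sum() / k

def rho_hat(theta, X, Y, k):
    """Estimated radius of the Q1-curve: rho(theta) = 1 / L(cos t, sin t)."""
    return 1.0 / L_hat(np.cos(theta), np.sin(theta), X, Y, k)
```

For fully dependent data L̂(1,1) equals 1, while for independent data it is close to 2, so ρ̂(θ) traces out the concavity of the Q-curve.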
The asymptotic normality of ρ̂(θ) follows straightforwardly from Theorem 7.2.2.

Corollary 7.2.5 Let (X_1,Y_1), (X_2,Y_2), ... be i.i.d. random vectors with distribution function F. Suppose F is in the domain of attraction of an extreme value distribution G with standard marginals. Suppose that for some α > 0 and for all x, y > 0 the relation (7.2.8) holds uniformly on the set

{ (x,y) : x² + y² = 1, x ≥ 0, y ≥ 0 } .

Suppose further that the function L has continuous first-order partial derivatives L_1(x,y) := ∂L(x,y)/∂x and L_2(x,y) := ∂L(x,y)/∂y for x, y > 0. Then for k = k(n) → ∞, k(n) = o(n^{2α/(1+2α)}), as n → ∞,

√k ( ρ̂(θ) − ρ(θ) ) →_d −ρ²(θ) B(cos θ, sin θ) .
Analogously, in R³ we consider the set Q_1 = {(x,y,z) : L(x,y,z) ≤ 1}. A sample graph of the Q-curve in R³ is in Figure 7.3. The variables involved are wave height (HmO) and still water level (SWL) as before, and wave period (Tpb), measured in seconds. The picture indicates no asymptotic independence, since for asymptotic independence one expects a flat convex function.

Fig. 7.3. Estimated Q-curve for the sea state data (k = 21).
Adapting the arguments in the beginning of Section 6.1.4 for this specific case, we consider the sets

D_{r,θ} := { (x,y) ∈ R₊² : x ∨ y > r and x/y ≤ tan θ }

for some r > 0 and θ ∈ [0, π/2]. Then

Φ(θ) := r ν(D_{r,θ}) = ν(D_{1,θ}) ,   (7.3.1)

where ν is the exponent measure of Section 6.1.3. Since it is easier in this context to work with the uniform distribution as the basic distribution rather than with the distribution function 1 − 1/x, x > 1, we reformulate (7.3.1) in terms of the measure μ, defined (as in Section 7.2) by

μ{ (s,u) ∈ [0,∞]² \ {(∞,∞)} : s ≤ x or u ≤ y } := lim_{t→∞} t P( 1 − F_1(X) ≤ x/t or 1 − F_2(Y) ≤ y/t ) = L(x,y) .   (7.3.2)

In particular,

Φ(θ) = lim_{t→∞} t P( (1 − F_1(X)) ∧ (1 − F_2(Y)) ≤ 1/t and 1 − F_2(Y) ≤ (1 − F_1(X)) tan θ ) .   (7.3.3)

We have to choose t in relation to the sample size n. Since we want to deal with the tail of the distribution only, the choice t = n/k imposes itself with k = k(n), k → ∞, and k/n → 0.

Next we replace the measure P by its empirical counterpart. Then the left-hand side of (7.3.3) becomes
(n/k) · (1/n) Σ_{i=1}^n 1{ (1 − F_1(X_i)) ∧ (1 − F_2(Y_i)) ≤ k/n and 1 − F_2(Y_i) ≤ (1 − F_1(X_i)) tan θ } .

Since F_1 and F_2 are unknown as well, we replace them by their empirical counterparts, e.g.,

1 − F_1^{(n)}(x) := (1/n) Σ_{j=1}^n 1{X_j > x} ,   (7.3.4)

so that 1 − F_1(X_i) is replaced by (n + 1 − R(X_i))/n, where R(X_i) is the rank of the ith observation X_i, i = 1, ..., n, among (X_1, X_2, ..., X_n). Similarly we replace 1 − F_2(Y_i) by (n + 1 − R(Y_i))/n, where R(Y_i) is the rank of Y_i among (Y_1, Y_2, ..., Y_n). Taking everything together we get the following estimator for Φ:

Φ̂(θ) := (1/k) Σ_{i=1}^n 1{ R(X_i) ∨ R(Y_i) > n + 1 − k and n + 1 − R(Y_i) ≤ (n + 1 − R(X_i)) tan θ } .   (7.3.5)
The estimator is nonparametric in that the statistic is invariant under monotone transformations of the marginals of the observations. So it does not depend on the marginal
distributions.
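In code, (7.3.5) amounts to a few rank comparisons. A Python sketch (function and variable names are ours):

```python
import numpy as np

def phi_hat(theta, X, Y, k):
    """Nonparametric estimator (7.3.5) of the spectral measure Phi(theta),
    built only from the ranks R(X_i), R(Y_i), hence invariant under
    monotone transformations of the marginals."""
    n = len(X)
    rx = np.argsort(np.argsort(X)) + 1   # ranks 1..n
    ry = np.argsort(np.argsort(Y)) + 1
    extreme = np.maximum(rx, ry) > n + 1 - k            # R(X_i) v R(Y_i) > n+1-k
    angle_ok = (n + 1 - ry) <= (n + 1 - rx) * np.tan(theta)
    return np.sum(extreme & angle_ok) / k
```

For fully dependent data all extreme points sit on the diagonal, so Φ̂(θ) jumps from 0 to its full mass at θ = π/4.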
In Figure 7.4 we find Φ̂(θ) for the 828 observations of wave height (HmO) and still water level (SWL). Note that there is some indication of dependence between the variables, since the angular coordinates are not clustered in the neighborhood of 0 and π/2.
Another way of displaying the estimator of the spectral measure makes use of its discrete character: it gives equal weight to a limited number of points. Hence we can just display on the line segment [0, π/2] the points

arctan( (n + 1 − R(Y_i)) / (n + 1 − R(X_i)) )

for those observations (X_i, Y_i) for which R(X_i) ∨ R(Y_i) > n + 1 − k. This is done in Figure 7.4 too.
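The angular display takes one line per retained point. A sketch (our naming), which returns the angles Θ_i to be plotted on [0, π/2]:

```python
import numpy as np

def angular_coordinates(X, Y, k):
    """Angles Theta_i = arctan((n+1-R(Y_i)) / (n+1-R(X_i))) for the
    observations with R(X_i) v R(Y_i) > n+1-k (cf. Figure 7.4)."""
    n = len(X)
    rx = np.argsort(np.argsort(X)) + 1
    ry = np.argsort(np.argsort(Y)) + 1
    keep = np.maximum(rx, ry) > n + 1 - k
    return np.arctan((n + 1.0 - ry[keep]) / (n + 1.0 - rx[keep]))
```

Angles clustered near 0 and π/2 indicate asymptotic independence; for fully dependent data all angles equal π/4.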
Fig. 7.4. Estimated spectral measure and angular coordinates Θ_i (shown as "+" signs); the solid line represents the corresponding distribution function Φ̂ scaled down from 39/27 to 1.

One can do this similarly in higher dimensions. For example, in R³ one can display the intersections of the lines through the points

( n + 1 − R(X_i), n + 1 − R(Y_i), n + 1 − R(Z_i) )

and the origin with the plane {x, y, z ≥ 0 : x + y + z = 1}. To display the intersection points on this triangle we do the following. Figure 7.5 shows the situation in which
P is the point

( (n+1−R(X_i)) / (3n+3−R(X_i)−R(Y_i)−R(Z_i)) , (n+1−R(Y_i)) / (3n+3−R(X_i)−R(Y_i)−R(Z_i)) , (n+1−R(Z_i)) / (3n+3−R(X_i)−R(Y_i)−R(Z_i)) ) ,

and

EF/DB = OF/OB .

It follows that

DB = √2 (n + 1 − R(X_i)) / (2n + 2 − R(X_i) − R(Y_i)) .

Similar relations hold for the lines connecting P with the other edges, A and C.
Fig. 7.5. The point P in the triangle and its projection in the plane.
Figure 7.6 displays the empirical spectral measure for the three-dimensional sea state data (HmO, SWL, Tpb). Again the picture indicates no asymptotic independence, since for asymptotic independence one expects the points to be concentrated near the three vertices.

Fig. 7.6. Trivariate angular coordinates representing the spectral measure; scatter plots shown correspond to k = 54, 27, 14.
L(x,y) = ∫_0^{π/2} ( x ∨ y cot θ ) Φ(dθ)   (7.3.6)

for x, y ≥ 0 (see Theorem 6.1.14(3)). After splitting the integration interval into several parts and applying partial integration, we obtain (cf. proof of Theorem 7.3.1 below) the alternative expression

L(x,y) = x Φ(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Φ(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ .   (7.3.7)

This suggests the estimator

L̂_Φ(x,y) := x Φ̂(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Φ̂(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ .   (7.3.8)
This estimator is somewhat more complicated than the one in Section 7.2. On the other hand, the present estimator has the advantage that it is homogeneous, i.e.,

L̂_Φ(ax, ay) = a L̂_Φ(x,y) .

Theorem 7.3.1 Let (X_1,Y_1), (X_2,Y_2), ... be i.i.d. random vectors with distribution function F satisfying

lim_{t→∞} t [ 1 − F( U_1(t/x), U_2(t/y) ) ] = L(x,y) .

Let k = k(n) be a sequence of integers such that k → ∞, k/n → 0, n → ∞. Then

Φ̂(θ) →_P Φ(θ)   (7.3.9)

for θ = π/2 and each θ ∈ [0, π/2) that is a continuity point of Φ. Moreover,

L̂_Φ(x,y) →_P L(x,y)   (7.3.10)

for x, y ≥ 0.
Corollary 7.3.2 The statements of Theorem 7.3.1 imply the seemingly stronger statements

λ(Φ̂, Φ) →_P 0 ,   (7.3.11)

where λ is the Lévy metric,

λ(Φ̂, Φ) = inf{ δ > 0 : Φ(θ − δ) − δ ≤ Φ̂(θ) ≤ Φ(θ + δ) + δ for all 0 ≤ θ ≤ π/2 } ,

and for all T > 0,

sup_{0≤x,y≤T} | L̂_Φ(x,y) − L(x,y) | →_P 0 .   (7.3.12)
Proof (of Theorem 7.3.1). Define the measures μ̂ and μ as follows: for a Borel set A in [0,∞]² \ {(∞,∞)},

μ(A) := lim_{t→∞} t P( t (1 − F_1(X), 1 − F_2(Y)) ∈ A ) ,   (7.3.13)

where (X,Y) is a random vector with distribution function F, and, as before Theorem 7.3.1, we define

μ̂(A) := (1/k) Σ_{i=1}^n 1{ (n+1−R(X_i), n+1−R(Y_i)) ∈ kA } .   (7.3.14)

As in the proof of Theorem 7.2.1 one obtains, for x, y ≥ 0,

μ̂([0,x] × [0,∞]) →_P μ([0,x] × [0,∞])  and  μ̂([0,∞] × [0,y]) →_P μ([0,∞] × [0,y]) ,   (7.3.15)

μ̂([0,x] × [0,y]) →_P μ([0,x] × [0,y]) .   (7.3.16)

By subtracting two sets as in (7.3.16) we get for 0 ≤ x_1 < x_2 ≤ ∞, y ≥ 0,

μ̂([x_1, x_2] × [0,y]) →_P μ([x_1, x_2] × [0,y]) .   (7.3.17)

This is also true with x_2 = ∞, y = ∞. Let θ be a continuity point of Φ. Clearly for ε > 0 we can find two finite unions of sets as in (7.3.17), L_ε and U_ε, such that L_ε ⊂ E_{1,θ} ⊂ U_ε, with E_{1,θ} the set satisfying Φ(θ) = μ(E_{1,θ}), and

μ(U_ε) − ε ≤ μ(E_{1,θ}) ≤ μ(L_ε) + ε ;

consistency of Φ̂(θ) follows.
Next, from the spectral representation (7.3.6),

L(x,y) = ∫_0^{π/2} ( x ∨ y cot θ ) Φ(dθ) = y ∫_0^{arctan(y/x)} cot θ Φ(dθ) + x ∫_{arctan(y/x)}^{π/2} Φ(dθ) .

Splitting the integration interval at π/4, integrating by parts, and collecting the terms with 1/sin²θ and 1/cos²θ yields the representation (7.3.7); the same computation applied to Φ̂ gives

L̂_Φ(x,y) = x Φ̂(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Φ̂(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ .   (7.3.18)

Since Φ̂(θ) →_P Φ(θ) for θ = π/2 and for all θ in some set S for which [0, π/2] ∩ S^c is a countable set, dominated convergence gives

lim_{n→∞} { x ( Φ̂(π/2) − Φ(π/2) ) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} ( Φ̂(θ) − Φ(θ) ) ( 1/sin²θ ∧ 1/cos²θ ) dθ } = 0

in probability, which is (7.3.10).
Proof (of Corollary 7.3.2). We have already shown that (7.3.10) is sufficient for (7.3.12) (cf. proof of Theorem 7.2.1). Now we show that (7.3.9) is sufficient for (7.3.11). Fix ε > 0. Take 0 = θ_0 < θ_1 < ··· < θ_r < θ_{r+1} = π/2 such that θ_i is a continuity point of Φ for i = 1, 2, ..., r and θ_{i+1} − θ_i < ε for i = 0, 1, ..., r. Then, as n → ∞,

P( Φ(θ_i) − ε ≤ Φ̂(θ_i) ≤ Φ(θ_i) + ε for i = 0, 1, ..., r+1 ) → 1 .

For any θ ∈ [0, π/2] there exists θ_k such that θ ≤ θ_k ≤ θ + ε. Then, on the above event,

Φ(θ + ε) ≥ Φ(θ_k) ≥ Φ̂(θ_k) − ε ≥ Φ̂(θ) − ε .

It follows that

P( Φ̂(θ − ε) ≤ Φ(θ) + ε and Φ̂(θ + ε) ≥ Φ(θ) − ε for 0 ≤ θ ≤ π/2 )
≥ P( Φ(θ_i) − ε ≤ Φ̂(θ_i) ≤ Φ(θ_i) + ε for i = 0, 1, ..., r+1 ) → 1

as n → ∞.
For the asymptotic normality of Φ̂ and L̂_Φ we need two conditions, both of which strengthen the domain of attraction condition

lim_{t→∞} t P( 1 − F_1(X) ≤ x/t or 1 − F_2(Y) ≤ y/t ) = L(x,y)

considerably. Also we need to impose a further restriction on the growth of the sequence k(n).

Let δ ∈ {1, 1/2, 1/3, ...}, p = 0, 1, 2, ..., 1/δ − 1, and define I_δ(p) := [pδ/tan θ, (p+1)δ/tan θ], θ ∈ [0, π/4]. Let A be the class containing all the following sets:

1. ∪_{p=0}^{1/δ−1} { (x,y) : x ∈ I_δ(p), 0 ≤ y ≤ x tan θ + C_p (x tan θ)^{1/16} }, for some θ ∈ [0, π/4] and C_0, C_1, ..., C_{1/δ−1} ∈ [−1, 1];
2. { (x,y) : y ≤ b }, for some b ≤ 2;
3. { (x,y) : x ≤ a }, { (x,y) : x ≤ M, y ≤ 2 }, for some a ≤ M (later on M will be taken large);
4. { (x,y) : x ≥ 1/tan θ, y ≤ b }, for some θ ∈ [0, π/4] and b ≤ 2.

Next define A^s := { A^s : A ∈ A }, where for A ∈ A, A^s := { (x,y) : (y,x) ∈ A }. Finally define Ā := A ∪ A^s.
Condition 7.3.3 For each δ ∈ {1, 1/2, 1/3, ...} and each M > 1,

sup_{A∈Ā} | (n/k) P( (1 − F_1(X), 1 − F_2(Y)) ∈ (k/n) A ) − μ(A) | → 0 ,  n → ∞ .

For β > 0 and θ ∈ [0, π/4] put

b(x) := β (x tan θ)^{1/16} for 0 ≤ x ≤ 2/tan θ  and  b(x) := b(2/tan θ) for x > 2/tan θ ,

and let C_1 = C_1(β) be the class of sets C ∈ B([0,∞]² \ {(∞,∞)}) that are contained, for some θ ∈ [0, π/4], in the strip { (x,y) : |y − x tan θ| ≤ b(x) }, where B([0,∞]² \ {(∞,∞)}) denotes the class of Borel sets on [0,∞]² \ {(∞,∞)}. The class of sets C_2 = C_2(β) is like C_1 but with x and y interchanged.

Condition 7.3.4 For some β > 0,

D(t) := sup_{C∈C_1∪C_2} | t^{−1} P( (1 − F_1(X), 1 − F_2(Y)) ∈ tC ) − μ(C) | → 0 ,  t ↓ 0 .

Condition 7.3.5 √k D(k/n) → 0 as n → ∞.
Theorem 7.3.6 Let (X,Y), (X_1,Y_1), (X_2,Y_2), ... be independent and identically distributed random vectors with continuous distribution function F. Let F_1(x) := F(x,∞) and F_2(x) := F(∞,x). Suppose that for x, y ≥ 0,

lim_{t→∞} t P( 1 − F_1(X) ≤ x/t or 1 − F_2(Y) ≤ y/t ) = L(x,y) ,

and moreover, the uniform extensions Conditions 7.3.3 and 7.3.4 hold. Suppose that μ has a continuous density λ on [0,∞)² \ {(0,0)}. Let k = k(n) → ∞, n → ∞, and suppose Condition 7.3.5 holds for k. Then, as n → ∞,

√k ( Φ̂(θ) − Φ(θ) ) →_d Z(θ) := ∫_0^∞ λ(x, x tan θ) { W_1(x) tan θ − W_2(x tan θ) } dx ,

with W_1(x) := W_μ([0,x] × [0,∞]) and W_2(y) := W_μ([0,∞] × [0,y]), where W_μ is a mean-zero Gaussian process with covariance structure E W_μ(A) W_μ(B) = μ(A ∩ B). Note that W_1 and W_2 are also standard Wiener processes. Finally,

√k ( L̂_Φ(x,y) − L(x,y) ) →_d Q(x,y) := x Z(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Z(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ .
Proof. The proof is very intricate. We give here a sketch of the reasoning and refer to the paper Einmahl, de Haan, and Piterbarg (2001) for full details. We can write

Φ̂(θ) = (n/k) P̂_n( (k/n) C_θ ) ,

where, with Û_i := 1 − F_1^{(n)}(X_i) and V̂_i := 1 − F_2^{(n)}(Y_i), i = 1, 2, ..., n (F_1^{(n)} and F_2^{(n)} the marginal empirical distribution functions, cf. (7.3.4)),

P̂_n(C) := (1/n) Σ_{i=1}^n 1{ (Û_i, V̂_i) ∈ C }  and  C_θ := { (x,y) ∈ [0,∞]² : x ∧ y ≤ 1 and y ≤ x tan θ } .

Now it is important to notice that (n/k) P̂_n((k/n) C_θ) is close to (n/k) P_n((k/n) C_θ) with

P_n(C) := (1/n) Σ_{i=1}^n 1{ (U_i, V_i) ∈ C } ,

where U_i := 1 − F_1(X_i), V_i := 1 − F_2(Y_i), i = 1, 2, ..., n. We now have the decomposition

√k ( Φ̂(θ) − Φ(θ) ) = √k ( (n/k) P̂_n((k/n) C_θ) − (n/k) P( (U,V) ∈ (k/n) C_θ ) ) + √k ( (n/k) P( (U,V) ∈ (k/n) C_θ ) − μ(C_θ) ) ,

the second term being a bias term that is controlled via Conditions 7.3.4 and 7.3.5.

In the d-dimensional situation a natural summary of the strength of the dependence is the coefficient

κ := lim_{t→∞} Σ_{j=1}^d P( X_j > U_j(t) ) / P( ∪_{j=1}^d { X_j > U_j(t) } )
= ( L(1,0,...,0) + L(0,1,...,0) + ··· + L(0,...,0,1) ) / L(1,1,...,1)
= d / L(1,1,...,1) .
One possible interpretation for this coefficient is that κ quantifies, on average, how many disasters will happen given that one disaster is sure to happen.

The case of asymptotic independence corresponds to κ = 1, and the case of full dependence corresponds to κ = d (cf. Proposition 6.1.21). So in order to make things somewhat reminiscent of the correlation coefficient, in that the case of asymptotic independence corresponds to 0 and the case of full dependence to 1, we define the following dependence coefficient (Embrechts, de Haan, and Huang (2000)):

H := (κ − 1)/(d − 1) = ( d − L(1,1,...,1) ) / ( (d − 1) L(1,1,...,1) ) .

When dealing with observations from the domain of attraction of an extreme value distribution one can estimate H by

Ĥ := ( d − L̂(1,1,...,1) ) / ( (d − 1) L̂(1,1,...,1) ) .
Here

L̂ := L̂(1,1,...,1) := (1/k) Σ_{i=1}^n 1{ X_i^{(1)} ≥ X_{n−k+1,n}^{(1)} or ... or X_i^{(d)} ≥ X_{n−k+1,n}^{(d)} } ,

where (X_1^{(1)}, ..., X_1^{(d)}), ..., (X_n^{(1)}, ..., X_n^{(d)}) are independent and identically distributed observations from the distribution function F. Similarly as in Theorem 7.2.1 (cf. Exercise 7.1) we have that under the domain of attraction condition and k = k(n) → ∞, k/n → 0, n → ∞,

Ĥ →_P H .   (7.4.1)
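A direct implementation of Ĥ takes a few lines; the following Python sketch (our naming; `data` is an n × d array of observations) estimates L(1,...,1) by counting points that exceed the (n−k+1)th order statistic in at least one coordinate:

```python
import numpy as np

def H_hat(data, k):
    """Dependence coefficient estimator
    H^ = (d - L^(1,...,1)) / ((d-1) L^(1,...,1)),
    where L^(1,...,1) is the fraction (out of k) of observations exceeding
    the marginal (n-k+1)th order statistic in at least one coordinate."""
    n, d = data.shape
    thresholds = np.sort(data, axis=0)[n - k]          # X^{(j)}_{n-k+1,n} per column
    exceed = (data >= thresholds).any(axis=1)
    L = exceed.sum() / k
    return (d - L) / ((d - 1) * L)
```

Since L̂ always lies between 1 (full dependence, Ĥ = 1) and d (asymptotic independence, Ĥ = 0), the estimate automatically falls in [0, 1].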
Let W be a d-dimensional continuous Gaussian process with mean zero and covariance structure given by the natural extension from the two-dimensional case considered in Theorem 7.2.2. For simplicity of notation define W(1) := W(1,1,...,1), W^{(i)} := W(0,...,1,...,0), where the 1 is in the x_i coordinate, and L^{(i,j)} := L(0,...,1,...,1,...,0), where the 1's are in the x_i and x_j coordinates. Then, under the conditions of Theorem 7.2.2, with the obvious extensions to d dimensions, we have

√k ( L̂ − L ) →_d W(1) − Σ_{i=1}^d L_i(1) W^{(i)}

with

Var(W(1)) = L ,  Var(W^{(i)}) = 1 ,  E W(1) W^{(i)} = 1 ,  i = 1, ..., d ,
E W^{(i)} W^{(j)} = 2 − L^{(i,j)} ,  i, j = 1, ..., d ,  i ≠ j ,

where L_i(1) denotes the ith first-order partial derivative of L at (1,1,...,1). Consequently the limit is normal with mean zero and variance

σ² := L + Σ_{i=1}^d ( L_i²(1) − 2 L_i(1) ) + 2 Σ_{i<j} L_i(1) L_j(1) ( 2 − L^{(i,j)} ) .

In order to be able to apply this limit result for testing, one needs to estimate L_j(1) consistently, j = 1, ..., d. A consistent estimator is
L̂_j(1) := k^{1/4} { (1/k) Σ_{i=1}^n 1{ X_i^{(1)} ≥ X_{n−k+1,n}^{(1)} or ... or X_i^{(j)} ≥ X_{n−[k(1+k^{−1/4})]+1,n}^{(j)} or ... or X_i^{(d)} ≥ X_{n−k+1,n}^{(d)} } − L̂(1,1,...,1) } ,

a difference quotient of L̂ in the jth coordinate with spacing k^{−1/4}.
Consider next the problem of estimating, from the observations, the probability

1 − F(w, z) ,

where w > max_{1≤i≤n} X_i and z > max_{1≤i≤n} Y_i.

One may think, for example, of an athlete who wants to compete in the Olympic Games in two disciplines. Her past records in the two disciplines are the observations above. The values w and z are the thresholds that one has to reach, at least one of them, in order to qualify. The athlete has never reached the thresholds.

This problem is a simple multivariate version of the problem of tail estimation (Section 4.4). We want to consider here the simplest situation just for the sake of exposition. We assume that both marginal distributions of F are 1 − 1/x, x > 1. A much more general situation will be considered in Chapter 8. We want to look at the problem from an asymptotic point of view, hence with n → ∞, and assuming that F is in the domain of attraction of an extreme value distribution. Since the condition w > max_{1≤i≤n} X_i and z > max_{1≤i≤n} Y_i is an essential feature of the problem, we let w = w_n and z = z_n depend on n in such a way that

n ( 1 − F(w_n, z_n) )   (7.5.1)

is bounded.
The aim is to estimate p_n* := 1 − F(w_n, z_n). We further assume for simplicity that w_n = c r_n and z_n = d r_n, for some positive sequence r_n → ∞ and c, d positive constants. The domain of attraction condition is

lim_{t→∞} t ( 1 − F(tx, ty) ) = −log G_0(x,y) = L(1/x, 1/y) .   (7.5.2)

Hence

p_n* = 1 − F(w_n, z_n) = 1 − F(c r_n, d r_n) ~ (1/r_n) L(1/c, 1/d) .

This limit relation suggests that we estimate p_n* by

p̂_n* := (1/r_n) L̂(1/c, 1/d) = (1/(r_n k)) Σ_{i=1}^n 1{ X_i > nc/k or Y_i > nd/k } ,

and indeed

lim_{n→∞} p̂_n* / p_n* = 1   (7.5.3)

in probability.
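For standard Pareto marginals, p̂_n* is again a counting estimator. A Python sketch (names ours; w = c r_n and z = d r_n as in the text):

```python
import numpy as np

def p_star_hat(X, Y, w, z, r, k):
    """Estimator p*^ = (1/(r k)) #{i : X_i > nc/k or Y_i > nd/k}
    of p* = 1 - F(w, z), with w = c r, z = d r and standard Pareto(1)
    marginals (the simplifying assumption of this section)."""
    n = len(X)
    c, d = w / r, z / r
    count = np.sum((X > n * c / k) | (Y > n * d / k))
    return count / (r * k)
```

The count is taken at the moderate threshold (nc/k, nd/k), where data are available, and the homogeneity of L transports it out to the far threshold (w_n, z_n).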
This is straightforward. But let us now look at the problem of how to estimate

p_n := P( X > w_n, Y > z_n ) ,

where (X,Y) has distribution function F with the same simplifications as before. Suppose moreover that the distribution function F is in the domain of attraction of an extreme value distribution with independent components. One can try to estimate p_n as before by

(1/(r_n k)) Σ_{i=1}^n 1{ X_i > nc/k and Y_i > nd/k }
= (1/(r_n k)) Σ_{i=1}^n 1{ X_i > nc/k } + (1/(r_n k)) Σ_{i=1}^n 1{ Y_i > nd/k } − (1/(r_n k)) Σ_{i=1}^n 1{ X_i > nc/k or Y_i > nd/k } ,

but under asymptotic independence L(1/c, 1/d) = 1/c + 1/d, so the three terms on the right cancel in the limit, and the relation does not tell
In fact, the condition of Theorem 7.2.2 on the asymptotic normality of L̂(x,y) gives a clue about where to look for further conditions. The condition is in our case

lim_{t→∞} [ t ( 1 − F(tx, ty) ) − L(1/x, 1/y) ] / A(t) = Q(x,y) ,   (7.5.4)

with A(t) → 0, which for the marginals gives

lim_{t→∞} [ t ( 1 − F(tx, ∞) ) − 1/x ] / A(t) = Q(x, ∞)   (7.5.5)

and

lim_{t→∞} [ t ( 1 − F(∞, ty) ) − 1/y ] / A(t) = Q(∞, y) .   (7.5.6)
Since 1 − F(tx, ty) = P(X > tx or Y > ty), combining (7.5.4)–(7.5.6) gives, under asymptotic independence,

P( X > tx and Y > ty ) = ( A(t)/t ) ( Q(x,∞) + Q(∞,y) − Q(x,y) ) + o( A(t)/t )   (7.5.7)

as t → ∞ for 0 < x, y < ∞. Comparing this relation with (7.5.2), we see that P(X > t or Y > t) is a regularly varying function of order −1 and P(X > t and Y > t) is of lower order in case of asymptotic independence. In fact, P(X > t and Y > t) is a regularly varying function of order ρ − 1, where ρ ≤ 0 is the index of the regularly varying function |A|.

We now show that condition (7.5.7) allows us to estimate p_n consistently. It is common to write (7.5.7) as

lim_{t→∞} P( X > tx, Y > ty ) / P( X > t, Y > t ) =: S(x,y) .   (7.5.8)

In particular, q(t) := P(X > t, Y > t) is a regularly varying function with index less than or equal to −1. In the original papers (Ledford and Tawn (1996, 1997, 1998)) the index is written as −1/η with η ≤ 1. Clearly if there is no asymptotic independence, A(t) can be taken constant and hence η = 1. Also S is the distribution function of a measure, say ρ̃, that is,

S(x,y) = ρ̃{ (s,t) ∈ R₊² : s > x, t > y } .
This suggests the estimator

p̂_n := ( n/(k r_n) )^{1/η} (1/n) Σ_{i=1}^n 1{ X_i > nc/k, Y_i > nd/k } ,   (7.5.9)

provided that

lim_{n→∞} n q(n/k) = ∞ .   (7.5.10)

This condition sets a lower bound for the sequence k = k(n). We now write

p̂_n / p_n = [ (1/n) Σ_{i=1}^n 1{ X_i > nc/k, Y_i > nd/k } / ( S(c,d) q(n/k) ) ] × [ ( n/(k r_n) )^{1/η} q(n/k) / q(r_n) ] × [ q(r_n) S(c,d) / P( X > w_n, Y > z_n ) ] .

By (7.5.8),

P( X > w_n, Y > z_n ) = P( X > c r_n, Y > d r_n ) ~ q(r_n) S(c,d) ,   (7.5.11)

and, provided t^{1/η} q(t) converges to a positive constant as t → ∞,

( n/(k r_n) )^{1/η} q(n/k) / q(r_n) → 1 .   (7.5.12)

The first factor converges to one in probability by (7.5.10).
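The joint-tail extrapolation can be sketched in a few lines of Python (a toy version under our naming, with standard Pareto(1) marginals and η supplied, e.g. estimated as in Section 7.6):

```python
import numpy as np

def p_n_hat(X, Y, w, z, r, k, eta):
    """Sketch of the joint-tail estimator: the empirical joint exceedance
    frequency at the intermediate threshold (nc/k, nd/k) is extrapolated
    to the far threshold (w, z) = (c r, d r) using the regular variation
    of q(t) = P(X > t, Y > t) with index -1/eta."""
    n = len(X)
    c, d = w / r, z / r
    joint = np.mean((X > n * c / k) & (Y > n * d / k))   # ~ q(n/k) S(c, d)
    return joint * (n / (k * r)) ** (1.0 / eta)          # extrapolate q(n/k) -> q(r)
```

For independent Pareto(1) coordinates q(t) = t^{-2}, so η = 1/2 and the extrapolation factor is the square of the threshold ratio.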
Suppose that

lim_{t↓0} P( 1 − F_1(X) < tx, 1 − F_2(Y) < ty ) / q(t) =: S(x,y)   (7.6.1)

exists and is positive, where q(t) := P(1 − F_1(X) < t, 1 − F_2(Y) < t). Then P(1 − F_1(X) < t, 1 − F_2(Y) < t) is a regularly varying function with index 1/η, say, and as in Theorem 6.1.9, for a, x, y > 0,

S(ax, ay) = a^{1/η} S(x,y) .   (7.6.2)

The residual dependence index η ∈ (0, 1] was introduced by Ledford and Tawn (1996, 1997, 1998).
Note that the domain of attraction condition implies that

lim_{t↓0} t^{−1} P( 1 − F_1(X) < tx, 1 − F_2(Y) < ty )

exists. Moreover, by (7.6.1) and (7.6.2),

lim_{t→∞} P( 1/((1 − F_1(X)) ∨ (1 − F_2(Y))) > tx ) / P( 1/((1 − F_1(X)) ∨ (1 − F_2(Y))) > t ) = S(1/x, 1/x) = x^{−1/η} S(1,1) = x^{−1/η}

for x > 0, i.e., the probability distribution of the random variable ((1 − F_1(X)) ∨ (1 − F_2(Y)))^{−1} is regularly varying with index −1/η. This suggests that we use a Hill-type estimator as in Section 3.2.
This leads to

η̂ := (1/k) Σ_{i=0}^{k−1} log V_{n−i,n} − log V_{n−k,n}

as an estimator, where {V_{i,n}} are the order statistics of the independent and identically distributed sequence V_i := 1/( (1 − F_1(X_i)) ∨ (1 − F_2(Y_i)) ), i = 1, 2, ..., n.
Since F_1 and F_2 are not known, we replace them with their empirical counterparts F_1^{(n)} and F_2^{(n)} as defined in (7.3.4) (to prevent division by 0). This leads to the random variables

T_i^{(n)} := 1/( (1 − F_1^{(n)}(X_i)) ∨ (1 − F_2^{(n)}(Y_i)) ) = 1/( ((n+1−R(X_i))/n) ∨ ((n+1−R(Y_i))/n) ) = n/( n + 1 − R(X_i) ∧ R(Y_i) ) ,

where R(X_i) is the rank of X_i among X_1, X_2, ..., X_n and R(Y_i) that of Y_i among Y_1, Y_2, ..., Y_n. The Hill-type estimator then becomes

η̂ := (1/k) Σ_{i=0}^{k−1} log T_{n−i,n}^{(n)} − log T_{n−k,n}^{(n)} ,

where {T_{i,n}^{(n)}} are the order statistics of the (dependent) sequence T_i^{(n)}, i = 1, 2, ..., n.
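The rank-based η̂ can be sketched in Python as follows (our naming):

```python
import numpy as np

def eta_hat(X, Y, k):
    """Hill-type estimator of the residual dependence index eta, applied to
    T_i = n / (n + 1 - min(R(X_i), R(Y_i))), the ranks replacing the
    unknown marginal distribution functions."""
    n = len(X)
    rx = np.argsort(np.argsort(X)) + 1
    ry = np.argsort(np.argsort(Y)) + 1
    T = np.sort(n / (n + 1.0 - np.minimum(rx, ry)))
    # mean of log T_{n-i,n} - log T_{n-k,n} over the top k order statistics
    return np.mean(np.log(T[n - k:]) - np.log(T[n - k - 1]))
```

For fully dependent data the estimate is close to 1; for independent coordinates it is close to 1/2 (cf. Exercise 7.6).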
Asymptotic normality can be proved under a refinement of condition (7.6.1).

Theorem 7.6.1 Let (X_1,Y_1), (X_2,Y_2), ... be i.i.d. random vectors with distribution function F. Suppose (7.6.1) and (7.6.2) hold for some η ∈ (0, 1]. We also assume that the following second-order refinement of (7.6.1) holds:

lim_{t↓0} [ P( 1 − F_1(X) < tx, 1 − F_2(Y) < ty ) / q(t) − S(x,y) ] / q_1(t) =: Q(x,y)

exists for all x, y ≥ 0 with x + y > 0, where q_1 is some positive function and Q is neither a constant nor a multiple of S. Moreover, we assume that the convergence is uniform on {(x,y) ∈ R₊² : x² + y² = 1} and that the function S has first-order partial derivatives S_x := ∂S(x,y)/∂x and S_y := ∂S(x,y)/∂y. Finally, we assume that

lim_{t↓0} t^{−1} P( 1 − F_1(X) < t and 1 − F_2(Y) < t ) =: l

exists. Let q^← be the inverse of the function q(t) = P(1 − F_1(X) < t and 1 − F_2(Y) < t). For a sequence k = k(n) of integers with k → ∞, k/n → 0, and √k q_1(q^←(k/n)) → 0, n → ∞,

√k ( η̂ − η )

is asymptotically normal with mean zero and variance

η² (1 − l) ( 1 − 2 l S_x(1,1) S_y(1,1) ) .
Proof. We provide a sketch of the proof. The original elaborate proof (Draisma, Drees, Ferreira, and de Haan (2004)) is beyond the scope of this book.

Similarly to Section 7.2 one obtains, with m := nq(k/n) and m → ∞,

√m ( (1/m) Σ_{i=1}^n 1{ X_i > X_{n−[kx]+1,n} and Y_i > Y_{n−[ky]+1,n} } − S(x,y) ) →_d W(x,y)

in D([0,T] × [0,T]), for every T > 0, where W is a zero-mean Gaussian process with, in case l = 0,

E W(x_1,y_1) W(x_2,y_2) = S( x_1 ∧ x_2, y_1 ∧ y_2 ) ,

and in case l > 0,

W(x,y) = (1/√l) ( W_1(x,0) + W_1(0,y) − W_1(x,y) ) − √l S_x(x,y) W_1(x,0) − √l S_y(x,y) W_1(0,y)

and

E W_1(x_1,y_1) W_1(x_2,y_2) = x_1 ∧ x_2 + y_1 ∧ y_2 − l S(x_1,y_1) − l S(x_2,y_2) + l S( x_1 ∨ x_2, y_1 ∨ y_2 ) .

Next note that

1{ X_i > X_{n−[kx]+1,n} and Y_i > Y_{n−[kx]+1,n} } = 1{ 1 − F_1^{(n)}(X_i) < [kx]/n and 1 − F_2^{(n)}(Y_i) < [kx]/n } = 1{ T_i^{(n)} > n/[kx] } .

Hence with

S_n(x) := Σ_{i=1}^n 1{ T_i^{(n)} > n/[kx] } ,

we obtain

√m ( (1/m) S_n(x) − x^{1/η} ) →_d W(x,x)

in D([0,T]), for every T > 0. This relation is somewhat similar to Theorem 5.1.4, which led to the asymptotic normality of the "usual" Hill estimator. For further details we refer to the mentioned paper.
Exercises
7.1. Prove Theorem 7.2.1 for the d-dimensional case, i.e., if (X_1^{(1)}, ..., X_1^{(d)}), ..., (X_n^{(1)}, ..., X_n^{(d)}) are independent and identically distributed random vectors with distribution function F in the domain of attraction of an extreme value distribution G, with

L(x_1, x_2, ..., x_d) := −log G( (x_1^{−γ_1} − 1)/γ_1 , (x_2^{−γ_2} − 1)/γ_2 , ..., (x_d^{−γ_d} − 1)/γ_d )

for (x_1, x_2, ..., x_d) ∈ R₊^d, where γ_1, γ_2, ..., γ_d are the marginal extreme value indices and L(x, 0, ..., 0) = L(0, x, ..., 0) = ··· = L(0, 0, ..., x) = x, then for T > 0, as n → ∞, k = k(n) → ∞, k/n → 0,

sup_{0≤x_1,x_2,...,x_d≤T} | L̂(x_1, x_2, ..., x_d) − L(x_1, x_2, ..., x_d) | →_P 0 .
7.2. Consider Example 5.5.3. Determine the dependence coefficient H of the random vector (X_n, X_{n+1}).
7.3. Prove Theorem 7.2.2 with the natural extension to d dimensions, i.e., that under the given conditions and for k = k(n) → ∞, k(n) = o(n^{2α/(1+2α)}), α > 0, as n → ∞,

√k ( L̂(x_1, x_2, ..., x_d) − L(x_1, x_2, ..., x_d) ) →_d B(x_1, x_2, ..., x_d)

in D([0,T]^d), for every T > 0, and for (x_1, x_2, ..., x_d) ∈ R₊^d, where

B(x_1, x_2, ..., x_d) = W(x_1, x_2, ..., x_d) − L_1(x_1, ..., x_d) W(x_1, 0, ..., 0) − L_2(x_1, ..., x_d) W(0, x_2, 0, ..., 0) − ··· − L_d(x_1, ..., x_d) W(0, 0, ..., x_d)

and W is a continuous Gaussian process with mean zero and covariance structure

E W(x_1, ..., x_d) W(x̄_1, ..., x̄_d) = μ( R(x_1, ..., x_d) ∩ R(x̄_1, ..., x̄_d) )

with

R(x_1, ..., x_d) := { (u_1, ..., u_d) ∈ R₊^d : 0 ≤ u_1 ≤ x_1 or ... or 0 ≤ u_d ≤ x_d } .
7.4. Show that under the conditions of Theorem 7.2.2 the proposed estimator of Sibuya's dependence coefficient λ (Section 7.4) satisfies

√k ( λ̂ − λ ) →_d N( 0 , L(1 − 2L_1L_2) + (L_1 + L_2)² + 2(1 − L_1)(1 − L_2) − 2 ) ,

where L := L(1,1), L_1 := L_1(1,1), L_2 := L_2(1,1), and N(0, σ²) denotes a normal random variable with mean zero and variance σ².
7.5. Let F(x_1, x_2) be a probability distribution function in the domain of attraction of an extreme value distribution G(x_1, x_2), i.e., there are functions a_1, a_2 > 0, b_1, and b_2 such that for all x_1, x_2 for which 0 < G(x_1, x_2) < 1,

lim_{t→∞} t { 1 − F( b_1(t) + x_1 a_1(t), b_2(t) + x_2 a_2(t) ) } = −log G(x_1, x_2) =: Φ(x_1, x_2) .

Suppose that the following second-order condition holds: there exists a positive or negative function A with lim_{t→∞} A(t) = 0 and a function Ψ not a multiple of Φ such that for each (x_1, x_2) for which 0 < G(x_1, x_2) < 1,

lim_{t→∞} [ t { 1 − F( b_1(t) + x_1 a_1(t), b_2(t) + x_2 a_2(t) ) } − Φ(x_1, x_2) ] / A(t) = Ψ(x_1, x_2) ,

locally uniformly for (x_1, x_2) ∈ (0, ∞] × (0, ∞]. Show that this second-order condition implies the second-order condition of Section 2.3 for the two marginal distributions. Show that the function A is regularly varying. Show that if the index of A is smaller than zero, condition (7.2.8) of Theorem 7.2.2 holds pointwise.
7.6. Show that if X and Y are independent, the residual dependence index η from Section 7.6 for (X, Y) is 1/2.

7.7. Let X, Y, U be independent random variables, where X and Y have distribution function 1 − 1/x, x > 1, and U has distribution function 1 − 1/(x^a log x), x ≥ e, for some a ∈ [1, 2). Show that the distribution function of the random vector (X ∨ U, Y ∨ U) is in the domain of attraction of an extreme value distribution with independent components and residual dependence index 1/a.
8
Estimation of the Probability of a Failure Set
8.1 Introduction
In this chapter we are going to deal with methods to solve the problem posed in a
graphical way in Chapter 6. The wave height (HmO) and still water level (SWL) have
been recorded during 828 storm events that are relevant for the Pettemer Zeewering.
Engineers of RIKZ (Institute for Coastal and Marine Management) have determined
failure conditions, that is, those combinations of HmO and SWL that result in overtopping the seawall, thus creating a dangerous situation. The set of those combinations
forms a failure set C.
Figure 8.1 displays the failure set
C = {(HmO, SWL) : 0.3HmO + SWL > 7.6} ,
as well as 828 independent and identically distributed observations of HmO and SWL.
This set is such that if a new, independent observation were to fall into C, it could lead to a disaster. The problem is how to determine the probability that an independent observation falls into this set.
In order to develop statistical methods to deal with this problem, we use the theory
developed in Chapters 6 and 7. We start by assuming that there exist normalizing
functions a\, a2 > 0 and b\, b2 real, and a distribution function G with nondegenerate
marginals, such that for all continuity points (x, y) of G,
lim_{t→∞} F^t( a_1(t)x + b_1(t), a_2(t)y + b_2(t) ) = G(x,y)   (8.1.1)

for

1 + γ_1 x > 0   (8.1.2)

and

1 + γ_2 y > 0 ,   (8.1.3)

where γ_1 and γ_2 are the marginal extreme value indices. Then from Section 6.1 (e.g., Theorem 6.1.11),
Fig. 8.1. Failure set C, boundary point (w_n, w_n) and observations.

lim_{t→∞} t P( ( 1 + γ_1 (X − b_1(t))/a_1(t) )^{1/γ_1} > x or ( 1 + γ_2 (Y − b_2(t))/a_2(t) )^{1/γ_2} > y )
= −log G( (x^{γ_1} − 1)/γ_1 , (y^{γ_2} − 1)/γ_2 ) ,

and more generally

lim_{t→∞} t P( ( ( 1 + γ_1 (X − b_1(t))/a_1(t) )^{1/γ_1} , ( 1 + γ_2 (Y − b_2(t))/a_2(t) )^{1/γ_2} ) ∈ Q ) = ν(Q)   (8.1.4)
for all Borel sets Q ⊂ R₊² with inf_{(x,y)∈Q} max(x,y) > 0 and ν(∂Q) = 0. Then, for any a > 0 we know that

ν(aQ) = a^{−1} ν(Q) ,   (8.1.5)

where

aQ := { (ax, ay) : (x,y) ∈ Q }
(cf. Theorem 6.1.9), and this property will be our main tool. Note that v is the approximate mean measure of the point process formed by the observations (Theorem
6.1.11). Hence in principle v(Q) can be estimated by just counting the number of
observations in the set Q.
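The counting idea, combined with the homogeneity (8.1.5), can be sketched as follows (a toy version under our naming; the sample is assumed already transformed to the standard Fréchet-type scale of (8.1.4)):

```python
import numpy as np

def nu_hat(Z, in_Q, n, k, c):
    """Sketch of nu(Q) estimation: nu(A) is approximated by
    (1/k) #{i : Z_i in (n/k) A}; since the target set Q is empty of
    observations, it is first pulled back by a factor c, and the
    homogeneity nu(aQ) = nu(Q)/a gives nu(Q) = nu(Q/c)/c.
    Z: (n, 2) array of observations on standard Frechet-type scale;
    in_Q(x, y): boolean indicator of the target set Q."""
    t = n / k
    # Z_i in t*(Q/c)  <=>  (c/t) * Z_i in Q
    hits = sum(in_Q(c * x / t, c * y / t) for x, y in Z)
    return hits / (c * k)
```

The pull-back factor c plays the role of c_n below: it is chosen so that Q/c does contain a small fraction of the sample.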
Now recall that we want to estimate P((X,Y) ∈ C_n). Clearly there is no observation in the failure set. In fact, the observations are all some distance away from the failure set. There has been no dangerous situation around the dike during the observation period. This suggests that, in a first approximation, P(C) < 1/n. This particular feature is essential for extreme value problems and we want to capture this in our approach, based on a limit situation in which the number of observations grows to infinity.

We have n independent observations (X_1,Y_1), (X_2,Y_2), ..., (X_n,Y_n) with common distribution function F and we have a failure set C with P(C) < 1/n. This means that if we assume n → ∞, in order to preserve the extreme value situation, we have to assume that the failure set is not fixed but depends on n: C = C_n with P(C_n) → 0, n → ∞.
Now we write the probability we want to estimate in terms of the transformed variables:

p_n := P( (X,Y) ∈ C_n )   (8.1.6)

with

Q_n := { ( ( 1 + γ_1 (x − b_1(n/k))/a_1(n/k) )^{1/γ_1} , ( 1 + γ_2 (y − b_2(n/k))/a_2(n/k) )^{1/γ_2} ) : (x,y) ∈ C_n } .   (8.1.7)

Since the set Q_n, like the set C_n, does not contain any observations, we divide the set Q_n by a large positive constant c_n such that Q_n/c_n contains a small portion of the observations. This way we can estimate ν(Q_n/c_n) and hence ν(Q_n) = c_n^{−1} ν(Q_n/c_n).

Summing up, the procedure involves the following steps:
1. Marginal transformations of the observations and of the failure set, as in (8.1.7);
2. pulling back the transformed failure set by the factor c_n;
3. estimating ν of the pulled-back set by counting observations.
Fig. 8.2. Transformed failure set (8.1.7) (area above the curved line), boundary point (q_n, r_n) and data set (8.1.8).

Fig. 8.3. Transformed data set (8.1.8), boundary point (s_1, s_2) := (q_n/c_n, r_n/c_n) and pulled-back set S (area above the curved line) from (8.1.9).
where c_n is a positive sequence (generally c_n → ∞, n → ∞) and S is a fixed open set of R², and the marginal transformations (8.1.8) applied to C_n give the set c_n S (called Q_n above). Note that n/k is playing the role of the running variable t considered before. Then, for some fixed open Borel set S ⊂ R₊² with inf_{(x,y)∈S} max(x,y) > 0 and ν(∂S) = 0, we can write (8.1.6) as

p_n = P( ( ( 1 + γ_1 (X − b_1(n/k))/a_1(n/k) )^{1/γ_1} , ( 1 + γ_2 (Y − b_2(n/k))/a_2(n/k) )^{1/γ_2} ) ∈ c_n S )   (8.1.10)
~ ( k/(n c_n) ) ν(S) ,   (8.1.11)

where the last equality follows from (8.1.5). This leads to the estimator (defined in more detail below; cf. Theorems 8.2.1 and 8.3.1)

p̂_n := ( k/(n c_n) ) ν̂(S) .

Note that S is not known, since γ_1, γ_2, a_1, a_2, b_1, b_2 are not known.
Up to this point we have dealt with cn as if it were known. That is, it has played
a similar role to that of the intermediate sequence k = k(n) in the univariate estimation. This way it is to be chosen (under certain bounds) by the statistician. An
alternative way to deal with cn is to incorporate it in the problem itself, and consequently to estimate it along with the other unknown quantities. We shall discuss these
two approaches in two separate subsections.
We add some comments at this point. In the above discussion we assumed ν(S) positive, and this will be the case considered in the next section. In fact this is the case if the random variables X and Y are not asymptotically independent or S contains (at least part of) the axes

{ (x,y) : x > 0 and y = 0 } ∪ { (x,y) : x = 0 and y > 0 } .   (8.1.12)

The notion of asymptotic independence was first introduced in Section 6.2. Recall that a random vector (X,Y) is said to be asymptotically independent if its distribution function is in the domain of attraction of some extreme value distribution with independent components, i.e., the limiting distribution is the product of its marginals.
In this case we know that the exponent measure from Section 6.1.3 is concentrated on the positive axes given in (8.1.12). In terms of the spectral representation discussed in Section 6.1.4, recall that for 0 ≤ θ_1 < θ_2 ≤ π/2, either for all 0 < r_1 < r_2 < ∞ the set

{ (x,y) : r_1 < √(x² + y²) ≤ r_2 , θ_1 ≤ arctan(y/x) ≤ θ_2 }

has positive μ-mass, or for no choice of 0 < r_1 < r_2 < ∞ does it have positive μ-mass, depending on whether the spectral measure has positive or zero mass in [θ_1, θ_2].

This means that ν(S) is always positive as long as we do not have asymptotic independence. But ν(S) can be positive even under asymptotic independence, e.g., in case S ⊃ (x_1, x_2) × [0, y) for some 0 ≤ x_1 < x_2, y > 0.
Note that the proposed transformations of the failure set Cn that lead to the set S are
such that certain features of the original set Cn are preserved after the transformation
to S. For instance, if Cn C [x, 00) x [y, 00) then S will also satisfy this, for possibly
some other x, y, or if Cn D (jq, xi) x (-00, y) then S D (JCI, X2) x [0, y), for
possibly some other JCI , JC2, y.
The case ν(S) = 0 is discussed in Section 8.3. Clearly ν(S) = 0 under asymptotic independence and if S is contained in a set of the form (x, ∞) × (y, ∞), for some x, y > 0. The procedure is quite similar to that for ν(S) > 0, but additionally it involves the residual independence index η introduced in Section 7.6. For testing asymptotic independence we refer to Section 7.6.
Suppose the failure set C_n is an open set and that there exists a boundary point (v_n, w_n) of C_n such that
$$C_n\subset[v_n,\infty)\times[w_n,\infty) \tag{8.2.1}$$
for all n. Note that this is a rather weak assumption. For instance, in Figure 8.1 we took the diagonal point (w_n, w_n) for (v_n, w_n).
Now let us see what happens to this point after the marginal transformations (8.1.8), with t replaced by n/k, where k = k(n) → ∞, k/n → 0, n → ∞. They are illustrated in Figure 8.2. Define
$$q_n:=\left(1+\gamma_1\,\frac{v_n-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\qquad r_n:=\left(1+\gamma_2\,\frac{w_n-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}, \tag{8.2.2}$$
and assume that lim_{n→∞} q_n/r_n exists and is finite; this avoids the predominance of one marginal over the other, so that the problem does not become a univariate one in the limit.
8.2.1 First Approach: cn Known
In the next theorem we state sufficient conditions for the consistency of p̂_n. We opted for a long theorem, which in turn is mostly self-contained in all its conditions and definitions.
Theorem 8.2.1 Let (X₁, Y₁), ..., (X_n, Y_n) be an i.i.d. sample from F. Suppose F is in the domain of attraction of an extreme value distribution with normalizing functions a_i > 0, b_i real, marginal extreme value indices γ_i, i = 1, 2, and exponent measure ν (cf. (8.1.1)–(8.1.4)).
Consider some estimators γ̂_i, â_i(n/k) > 0, b̂_i(n/k) such that for some sequence k = k(n) → ∞, k/n → 0, n → ∞,
$$\sqrt k\left(\hat\gamma_i-\gamma_i\right)=O_P(1),\quad\sqrt k\left(\frac{\hat a_i(n/k)}{a_i(n/k)}-1\right)=O_P(1),\quad\sqrt k\,\frac{\hat b_i(n/k)-b_i(n/k)}{a_i(n/k)}=O_P(1), \tag{8.2.3}$$
i = 1, 2.
Suppose the failure set C_n is an open set for which (8.2.1) holds. Suppose further that C_n can be written as
$$C_n=\left\{(x,y):\left(\left(1+\gamma_1\frac{x-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{y-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}\right)\in c_nS\right\}, \tag{8.2.4}$$
where S is an open Borel set in ℝ₊² with ν(∂S) = 0 and ν(S) > 0, and c_n a sequence of positive numbers with c_n → ∞, n → ∞.
Finally, suppose 0 < q_n/r_n < ∞ (our conditions imply that q_n/r_n does not depend on n),
$$\lim_{n\to\infty}\frac{w_{\gamma_1\wedge\gamma_2}(c_n)}{\sqrt k}=0, \tag{8.2.5}$$
where
$$w_\gamma(t):=t^{-\gamma}\int_1^t s^{\gamma-1}\log s\,ds,\qquad t>1$$
(the function w_γ(t) has been defined in Theorem 4.4.1), and that (8.1.4) holds with Q replaced by c_nS, i.e.,
$$\lim_{n\to\infty}\frac nk\,P\left(\left(\left(1+\gamma_1\frac{X-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}\right)\in c_nS\right)\Big/\nu(c_nS)=1. \tag{8.2.6}$$
Then, with
$$\hat\nu(\hat S):=\frac1k\sum_{i=1}^n 1_{\hat S}\left(\left(1+\hat\gamma_1\frac{X_i-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1},\left(1+\hat\gamma_2\frac{Y_i-\hat b_2(n/k)}{\hat a_2(n/k)}\right)^{1/\hat\gamma_2}\right), \tag{8.2.7}$$
where
$$\hat S:=\left\{\frac1{c_n}\left(\left(1+\hat\gamma_1\frac{x-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1},\left(1+\hat\gamma_2\frac{y-\hat b_2(n/k)}{\hat a_2(n/k)}\right)^{1/\hat\gamma_2}\right):(x,y)\in C_n\right\},$$
we have
$$\frac{\hat p_n}{p_n}\xrightarrow{P}1. \tag{8.2.8}$$
We postpone the proof to Section 8.2.3. We want to add some comments at this
point.
The estimation of γ_i, a_i(n/k), and b_i(n/k), i = 1, 2, is known from univariate extreme value statistics (cf. Chapters 3 and 4). For instance, one can use the moment-type estimators of Sections 3.5 and 4.2:
$$M_n^{(j)}:=\frac1k\sum_{i=1}^k\left(\log X_{n-i+1,n}-\log X_{n-k,n}\right)^j,\qquad j=1,2,$$
$$\hat\gamma_1:=M_n^{(1)}+1-\frac12\left(1-\frac{\left(M_n^{(1)}\right)^2}{M_n^{(2)}}\right)^{-1},$$
$$\hat b_1(n/k):=X_{n-k,n},\qquad \hat a_1(n/k):=X_{n-k,n}\,M_n^{(1)}\left(1-\hat\gamma_-\right),\quad\text{with }\ \hat\gamma_-:=1-\frac12\left(1-\frac{\left(M_n^{(1)}\right)^2}{M_n^{(2)}}\right)^{-1},$$
and where for γ̂₂, â₂(n/k), and b̂₂(n/k) one replaces X by Y in the previous formulas.
Then under the second-order regular variation condition (cf. Definition 2.3.1) for both marginals with auxiliary functions A_i, i = 1, 2, and provided k = k(n) → ∞, k/n → 0, √k A_i(n/k) = O(1), i = 1, 2, as n → ∞, the O_P-property for the individual terms in (8.2.3) follows from Sections 2.2, 3.5, and 4.2. Then they are also jointly O_P(1).
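As an illustration, the moment-type estimators above are straightforward to compute. The following minimal sketch (function name and interface are ours, not the book's) implements them for one marginal sample:

```python
import numpy as np

def moment_estimators(x, k):
    """Moment-type estimators of (gamma, a(n/k), b(n/k)) from an i.i.d.
    sample x, using the k upper order statistics; a sketch following the
    formulas of Sections 3.5 and 4.2, assuming X_{n-k,n} > 0."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    logs = np.log(xs[n - k:]) - np.log(xs[n - k - 1])   # log-excesses over X_{n-k,n}
    m1 = logs.mean()                                    # M_n^{(1)} (Hill estimator)
    m2 = (logs ** 2).mean()                             # M_n^{(2)}
    gamma_minus = 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)      # estimate of min(gamma, 0)
    gamma = m1 + gamma_minus                            # moment estimator of gamma
    b = xs[n - k - 1]                                   # bhat(n/k) = X_{n-k,n}
    a = b * m1 * (1.0 - gamma_minus)                    # ahat(n/k)
    return gamma, a, b
```

For a Pareto-type sample with γ = 1/2, the estimate of γ concentrates around 1/2 as k grows while k/n stays small.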
Remark 8.2.2 Note that p̂_n and Ŝ may not be defined if 1 + γ̂₁(X_i − b̂₁(n/k))/â₁(n/k) ≤ 0 for some X_i, and similarly for the second component. However, when checking the proofs one sees that as n → ∞, the probability that this happens tends to zero.
Remark 8.2.3 Note that the relation between k = k(n) and c_n may restrict the range of possible values of the marginal extreme value indices. For γ₁ ∧ γ₂ < 0, condition (8.2.5) implies
$$\lim_{n\to\infty}\frac{c_n^{-(\gamma_1\wedge\gamma_2)}}{\sqrt k}=0.$$
For instance, if we want to allow k/c_n = O(1), we must have k^{-1/2-(γ₁∧γ₂)} → 0, which is true only if γ₁ ∧ γ₂ > −1/2.
8.2.2 Alternative Approach: Estimate cn
Define, for some r > 0,
$$c_n:=\frac{\sqrt{q_n^2+r_n^2}}{r}\,, \tag{8.2.9}$$
where q_n and r_n are as in (8.2.2). According to (8.1.9) the point (s₁, s₂) := (q_n/c_n, r_n/c_n) is on the boundary of S. Moreover, from (8.2.9) we have s₁² + s₂² = (q_n/c_n)² + (r_n/c_n)² = r², that is, (s₁, s₂) is on a circle of radius r and hence close enough to the observations (cf. Figure 8.3).
Let x₁* and x₂* be the right endpoints of the marginal distributions. Note that v_n ↑ x₁* and w_n ↑ x₂* imply q_n → ∞ and r_n → ∞, respectively, under the domain of attraction condition.
Corollary 8.2.4 Under the conditions of Theorem 8.2.1, with γ₁ ∧ γ₂ > −1/2, define the estimators
$$\hat q_n:=\left(1+\hat\gamma_1\,\frac{v_n-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1}, \tag{8.2.10}$$
$$\hat r_n:=\left(1+\hat\gamma_2\,\frac{w_n-\hat b_2(n/k)}{\hat a_2(n/k)}\right)^{1/\hat\gamma_2}, \tag{8.2.11}$$
$$\hat c_n:=\frac{\sqrt{\hat q_n^2+\hat r_n^2}}{r} \tag{8.2.12}$$
for some r > 0 (to be chosen by the statistician), and
$$\hat p_n:=\frac{k}{n\hat c_n}\,\hat\nu(\hat S), \tag{8.2.13}$$
with
$$\hat S:=\left\{\frac1{\hat c_n}\left(\left(1+\hat\gamma_1\frac{x-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1},\left(1+\hat\gamma_2\frac{y-\hat b_2(n/k)}{\hat a_2(n/k)}\right)^{1/\hat\gamma_2}\right):(x,y)\in C_n\right\}. \tag{8.2.14}$$
Then
$$\frac{\hat p_n}{p_n}\xrightarrow{P}1.$$
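For a rectangular failure set C_n = [v_n, ∞) × [w_n, ∞) the transformed set Ŝ of (8.2.14) is again a product set, and the whole procedure of Corollary 8.2.4 fits in a short script. The following sketch (our own names and simplifications, with a built-in moment estimator for the margins) is illustrative only, not the book's code:

```python
import numpy as np

def _moment(x, k):
    # moment-type marginal estimates (gamma, a, b), cf. Sections 3.5 and 4.2
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    logs = np.log(xs[n - k:]) - np.log(xs[n - k - 1])
    m1, m2 = logs.mean(), (logs ** 2).mean()
    gm = 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)
    return m1 + gm, xs[n - k - 1] * m1 * (1.0 - gm), xs[n - k - 1]

def failure_prob(x, y, k, vn, wn, r=1.0):
    """Estimate p_n = P(X > vn, Y > wn) along the lines of Corollary 8.2.4."""
    n = len(x)
    g1, a1, b1 = _moment(x, k)
    g2, a2, b2 = _moment(y, k)
    # marginal transformations (8.1.8) of the data and of the boundary point
    u = np.maximum(1.0 + g1 * (np.asarray(x) - b1) / a1, 1e-12) ** (1.0 / g1)
    v = np.maximum(1.0 + g2 * (np.asarray(y) - b2) / a2, 1e-12) ** (1.0 / g2)
    qn = (1.0 + g1 * (vn - b1) / a1) ** (1.0 / g1)      # (8.2.10)
    rn = (1.0 + g2 * (wn - b2) / a2) ** (1.0 / g2)      # (8.2.11)
    cn = np.sqrt(qn ** 2 + rn ** 2) / r                 # (8.2.12)
    # S_hat = [qn/cn, inf) x [rn/cn, inf); nu_hat counts transformed points in it
    nu_hat = np.sum((u >= qn / cn) & (v >= rn / cn)) / k
    return k / (n * cn) * nu_hat                        # (8.2.13)
```

On completely dependent Pareto(1) margins (X = Y = 1/U), P(X > 100, Y > 100) = 0.01, and the estimate is of that order.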
Remark 8.2.5 Under much more stringent conditions it can be proved that in case ν(S) > 0, √k (p̂_n/p_n − 1) is asymptotically normal.
Lemma 8.2.6 Let f_n(x) and g_n(x) be strictly increasing continuous functions for all n, with lim_{n→∞} f_n(x) = x and lim_{n→∞} g_n(x) = x for x > 0. For an open set O, let
$$O_n:=\{(f_n(x),g_n(y)):(x,y)\in O\}.$$
Then
$$1_{O_n}(x,y):=1_{\{(x,y)\in O_n\}}\ \to\ 1_O(x,y):=1_{\{(x,y)\in O\}}$$
for (x, y) ∈ O.
Proof. Take (x, y) ∈ O and ε > 0 such that (x − ε, y − ε) ∈ O. For n ≥ n₀ we have f_n^←(x) > x − ε and g_n^←(y) > y − ε. Hence (f_n^←(x), g_n^←(y)) ∈ O for n ≥ n₀. It follows that
$$1_O\left(f_n^\leftarrow(x),\,g_n^\leftarrow(y)\right)\to1$$
for (x, y) ∈ O. Now (x, y) ∈ O_n if and only if (f_n^←(x), g_n^←(y)) ∈ O, so 1_{O_n}(x, y) = 1 eventually. □
Proposition 8.2.7 Let (X₁, Y₁), (X₂, Y₂), ... be an i.i.d. sample from F. Suppose F is in the domain of attraction of an extreme value distribution with normalizing functions a_i > 0, b_i real, marginal extreme value indices γ_i, i = 1, 2, and exponent measure ν. Let S be an open Borel set in ℝ₊² with inf_{(x,y)∈S} max(x, y) > 0, ν(∂S) = 0, and ν(S) > 0. For the random variable
$$\nu_n(S):=\frac1k\sum_{i=1}^n 1_S\left(\left(1+\gamma_1\frac{X_i-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y_i-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}\right)$$
we have
$$\lim_{n\to\infty}E\,e^{it\nu_n(S)}=e^{it\nu(S)}$$
for all t.
Next we define
$$\hat\nu(S):=\frac1k\sum_{i=1}^n 1_S\left(\left(1+\hat\gamma_1\frac{X_i-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1},\left(1+\hat\gamma_2\frac{Y_i-\hat b_2(n/k)}{\hat a_2(n/k)}\right)^{1/\hat\gamma_2}\right).$$
Note that
$$\hat\nu(S)=\nu_n(S_n), \tag{8.2.15}$$
where
$$S_n:=\{(f_n(x),g_n(y)):(x,y)\in S\}, \tag{8.2.16}$$
$$f_n(x):=\left(1+\gamma_1\,\frac{\hat a_1(n/k)\,\frac{x^{\hat\gamma_1}-1}{\hat\gamma_1}+\hat b_1(n/k)-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},$$
and g_n(x) is defined analogously with γ₂, â₂, b̂₂, a₂, b₂ in place of γ₁, â₁, b̂₁, a₁, b₁.
Proposition 8.2.8 Under the conditions of Theorem 8.2.1,
$$\hat\nu(S)\xrightarrow{P}\nu(S).$$
Proof. Under the domain of attraction condition,
$$\left(\nu_n(S),\,\hat\gamma_1,\,\hat\gamma_2,\,\frac{\hat a_1(n/k)}{a_1(n/k)},\,\frac{\hat a_2(n/k)}{a_2(n/k)},\,\frac{\hat b_1(n/k)-b_1(n/k)}{a_1(n/k)},\,\frac{\hat b_2(n/k)-b_2(n/k)}{a_2(n/k)}\right)$$
converges, in probability, to (ν(S), γ₁, γ₂, 1, 1, 0, 0). Next invoke a Skorohod construction, so that we may pretend that this relation holds almost surely. Let S_n be as in (8.2.16). By Lemma 8.2.6 we have
$$1_{S_n}(x,y)\to1_S(x,y) \tag{8.2.17}$$
for (x, y) ∈ S. Note that the given conditions for C_n imply that there exist s₁, s₂ > 0, (s₁, s₂) ∈ ∂S, such that x > s₁ or y > s₂ for all (x, y) ∈ S. It follows that
$$S\subset\{(x,y):x>s_1\ \text{or}\ y>s_2\}=:D.$$
Define D_n as
$$D_n:=\{(f_n(x),g_n(y)):(x,y)\in D\}\supset\{(x-\varepsilon,y-\varepsilon):(x,y)\in D\} \tag{8.2.18}$$
for n large. Set
$$\nu^*:=\sum_{n=0}^\infty2^{-n}\nu_n$$
with the convention ν₀ = ν. Let h_n be the density of ν_n with respect to ν*. We know that
$$\int1_S\,h_0\,d\nu^*=\nu(S)\qquad a.s.$$
By Lemma 8.2.6, Proposition 8.2.7, (8.2.17), (8.2.18), and Pratt's (1960) lemma (summarizing: if g_n → g pointwise, |g_n| ≤ f_n for all n, and ∫f_n → ∫f for some functions f_n → f, then ∫g_n → ∫g), we have
$$\hat\nu(S)=\nu_n(S_n)=\int1_{S_n}h_n\,d\nu^*\ \to\ \int1_S\,h_0\,d\nu^*=\nu(S).$$
It follows that
$$\hat\nu(S)\xrightarrow{P}\nu(S).\qquad\square$$
Proposition 8.2.9 Let (X₁, Y₁), (X₂, Y₂), ... be an i.i.d. sample from F. Suppose F is in the domain of attraction of an extreme value distribution with normalizing functions a_i > 0, b_i real, and marginal extreme value indices γ_i, i = 1, 2. Suppose (8.2.3) holds for some sequence k = k(n) → ∞, k/n → 0, n → ∞. Redefine f_n(x) and g_n(x) as
$$f_n(x):=\frac1{c_n}\left(1+\hat\gamma_1\,\frac{a_1(n/k)\,\frac{(c_nx)^{\gamma_1}-1}{\gamma_1}+b_1(n/k)-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1},$$
and g_n(x) analogously with the index 2. If
$$\lim_{n\to\infty}\frac{w_{\gamma_1\wedge\gamma_2}(c_n)}{\sqrt k}=0, \tag{8.2.19}$$
then for x > 0,
$$f_n(x)\to x\qquad\text{and}\qquad g_n(x)\to x.$$
Proof. Using (8.2.3),
$$f_n(x)=\frac1{c_n}\left\{1+\left(1+O_P\left(\frac1{\sqrt k}\right)\right)\hat\gamma_1\,\frac{(c_nx)^{\gamma_1}-1}{\gamma_1}+O_P\left(\frac1{\sqrt k}\right)\right\}^{1/\hat\gamma_1}.$$
For γ₁ ≠ 0 this gives
$$f_n(x)=x^{\gamma_1/\hat\gamma_1}\,c_n^{\gamma_1/\hat\gamma_1-1}\left(1+o_P(1)\right).$$
Now we deal with both factors separately, and for that we use condition (8.2.19). Note that (8.2.19) implies w_{γ_i}(c_n)/√k → 0, i = 1, 2; hence regardless of whether γ₁ > 0 or γ₁ < 0 we have c_n^{-γ₁}/√k → 0 and (log c_n)/√k → 0, n → ∞. Since γ̂₁ − γ₁ = O_P(1/√k), it follows that
$$c_n^{\gamma_1/\hat\gamma_1-1}=\exp\left(\frac{\gamma_1-\hat\gamma_1}{\hat\gamma_1}\,\log c_n\right)\xrightarrow{P}1\qquad\text{and}\qquad x^{\gamma_1/\hat\gamma_1}\xrightarrow{P}x.$$
The result follows for γ₁ ≠ 0.
Next consider γ₁ = 0. A similar expansion applies, now involving the terms log(c_nx) and log²(c_nx) from
$$\frac{(c_nx)^{\hat\gamma_1}-1}{\hat\gamma_1}=\log(c_nx)+\frac{\hat\gamma_1}2\log^2(c_nx)+\cdots.$$
Then use (8.2.3) with the Skorohod construction and assumption (8.2.19), which for γ = 0 implies log²(c_n)/√k → 0, n → ∞. Hence lim_{n→∞} f_n^←(x) = x, and hence also lim_{n→∞} f_n(x) = x. □
Proof. From Proposition 8.2.8 we know that ν̂(S) →_P ν(S). Invoke a Skorohod construction, so that
$$\hat\nu(S)-\nu(S)=o(1)\qquad a.s.$$
and
$$\sqrt k\left(\hat\gamma_i-\gamma_i\right)=O(1),\quad\sqrt k\left(\frac{\hat a_i(n/k)}{a_i(n/k)}-1\right)=O(1),\quad\sqrt k\,\frac{\hat b_i(n/k)-b_i(n/k)}{a_i(n/k)}=O(1)\qquad a.s.$$
for i = 1, 2. Then from Lemma 8.2.6 and Proposition 8.2.9 we have
$$1_{\hat S}(x,y)\to1_S(x,y)$$
for (x, y) ∈ S. The rest of the proof is like that of Proposition 8.2.8: it yields
$$\hat\nu(\hat S)=\nu(S)\left(1+o_P(1)\right),$$
hence
$$\frac{\hat p_n}{p_n}=\frac{\frac k{nc_n}\,\hat\nu(\hat S)}{\frac k{nc_n}\,\nu(S)\left(1+o(1)\right)}=\frac{\hat\nu(\hat S)}{\nu(S)}\left(1+o(1)\right)\xrightarrow{P}1.\qquad\square$$
Proof (of Corollary 8.2.4). It is enough to prove that ĉ_n/c_n →_P 1, n → ∞. For this, note that
$$\frac{\hat c_n}{c_n}=\sqrt{\left(\frac{\hat q_n}{q_n}\right)^2\frac{q_n^2}{q_n^2+r_n^2}+\left(\frac{\hat r_n}{r_n}\right)^2\frac{r_n^2}{q_n^2+r_n^2}}\,,$$
and that q̂_n/q_n →_P 1 and r̂_n/r_n →_P 1. □
Remark 8.2.11 The estimation of q_n and r_n is practically the same as the tail estimation discussed in Section 4.4. Since the conditions of Corollary 4.4.5 are satisfied (note that q_n → ∞ corresponds to d_n → ∞ there), one could alternatively invoke that result to prove the consistency of ĉ_n.
$$\lim_{t\to\infty}tP\left(\frac{X-b_1(t)}{a_1(t)}>x\ \text{ or }\ \frac{Y-b_2(t)}{a_2(t)}>y\right)=-\log G(x,y),$$
and hence
$$\lim_{t\to\infty}tP\left(\frac{X-b_1(t)}{a_1(t)}>x\ \text{ and }\ \frac{Y-b_2(t)}{a_2(t)}>y\right)=\log G(x,y)-\log G(x,\infty)-\log G(\infty,y),$$
and in case of asymptotic independence the right-hand side is identically zero. More generally, if Q is any Borel set contained in [u, ∞) × [v, ∞), with u, v > 0 and ν(∂Q) = 0, then under asymptotic independence of (X, Y),
$$\lim_{t\to\infty}tP\left(\left(\left(1+\gamma_1\frac{X-b_1(t)}{a_1(t)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y-b_2(t)}{a_2(t)}\right)^{1/\gamma_2}\right)\in Q\right)=0.$$
This gives too little information on the probability of the set Q.
To estimate
$$p_n=P\left((X,Y)\in C_n\right)$$
when ν(S) = 0, we propose the following refinement of (8.1.4), which will lead to a new limit measure ν: for x, y > 0 and some functions a₁, a₂, r positive and b₁, b₂ real, r(t) → ∞,
$$\lim_{t\to\infty}r(t)\,P\left(\left(1+\gamma_1\frac{X-b_1(t)}{a_1(t)}\right)^{1/\gamma_1}>x\ \text{ and }\ \left(1+\gamma_2\frac{Y-b_2(t)}{a_2(t)}\right)^{1/\gamma_2}>y\right) \tag{8.3.1}$$
exists, and it is positive and finite.
Then, similarly as in Section 6.1.3, one can define the measure ν as follows: for any Borel set Q in ℝ₊² with inf_{(x,y)∈Q} max(x, y) > 0 and ν(∂Q) = 0, let
$$\nu(Q):=\lim_{t\to\infty}r(t)\,P\left(\left(\left(1+\gamma_1\frac{X-b_1(t)}{a_1(t)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y-b_2(t)}{a_2(t)}\right)^{1/\gamma_2}\right)\in Q\right). \tag{8.3.2}$$
Moreover, it follows that the function r is regularly varying with index greater than or equal to 1. Using the notation introduced in Section 7.6, the index of the regularly varying function r is 1/η, where η ∈ (0, 1] is the residual independence index. Also, as in the proof of Theorem 6.1.9, it follows that ν is homogeneous of order −1/η, i.e.,
$$\nu(aQ)=a^{-1/\eta}\,\nu(Q) \tag{8.3.3}$$
for any a > 0, where aQ is the set obtained by multiplying all elements of Q by a.
Note that (8.3.2) is also valid if X and Y are not asymptotically independent. In this case r(t) = t, η = 1, and ν coincides with the exponent measure of Section 6.1.3.
We are now ready to proceed with the estimation of p_n, which closely follows the reasoning developed in the previous section. Using again (8.1.8),
$$p_n=P\left((X,Y)\in C_n\right)=P\left(\left(\left(1+\gamma_1\frac{X-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}\right)\in c_nS\right)\approx\frac{\nu(c_nS)}{r(n/k)}=\frac{\nu(S)}{c_n^{1/\eta}\,r(n/k)}\,,$$
where the last equality follows from (8.3.3). Comparing with the previous section, apart from estimating S and ν, we now have to deal with the parameter η. But this was the subject of Section 7.6, from which we know how to estimate η.
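To illustrate the idea from Section 7.6, η can be estimated by applying a Hill-type estimator to the minimum of the two margins after an empirical transformation to the standard Pareto scale. This is a sketch with our own names, not the book's code:

```python
import numpy as np

def eta_hill(x, y, k):
    """Hill-type estimator of the residual independence index eta:
    the structure variable T = min(1/(1-F1(X)), 1/(1-F2(Y))) has a
    regularly varying tail with index 1/eta; estimate eta by the Hill
    estimator applied to the k upper order statistics of T (sketch,
    with empirical marginal distribution functions)."""
    n = len(x)
    rx = np.argsort(np.argsort(x)) + 1          # marginal ranks 1..n
    ry = np.argsort(np.argsort(y)) + 1
    t = np.minimum((n + 1.0) / (n + 1.0 - rx), (n + 1.0) / (n + 1.0 - ry))
    ts = np.sort(t)
    return float(np.mean(np.log(ts[n - k:]) - np.log(ts[n - k - 1])))
```

For independent components η = 1/2; for completely dependent components η = 1.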
In the next theorem we state sufficient conditions for the consistency of p̂_n. As in the previous section we opted for a long theorem that is mostly self-contained in all its conditions and definitions. The proof is left to the reader (Exercises 8.1–8.3).
Theorem 8.3.1 Let (X₁, Y₁), ..., (X_n, Y_n) be i.i.d. random vectors with distribution function F, satisfying (8.3.1) for some positive function r, with marginal extreme value indices γ_i and normalizing functions a_i > 0, b_i real, i = 1, 2.
Consider some estimators γ̂_i, â_i(n/k) > 0, b̂_i(n/k), i = 1, 2, and η̂ such that for some sequence k = k(n) with k/n → 0 and r(n/k)/n → 0 (this implies k → ∞), n → ∞,
$$\sqrt{\frac n{r(n/k)}}\left(\hat\gamma_i-\gamma_i\right)=O_P(1),\quad\sqrt{\frac n{r(n/k)}}\left(\frac{\hat a_i(n/k)}{a_i(n/k)}-1\right)=O_P(1),\quad\sqrt{\frac n{r(n/k)}}\,\frac{\hat b_i(n/k)-b_i(n/k)}{a_i(n/k)}=O_P(1), \tag{8.3.4}$$
i = 1, 2, and
$$\sqrt{\frac n{r(n/k)}}\left(\hat\eta-\eta\right)=O_P(1). \tag{8.3.5}$$
Suppose Cn is an open set and that there exists some boundary point of Cn>
(vn,wn) such that
Cn C [Un.OO) X
[Wn,00)
287
where S is an open Borel set in [0, oo) 2 with v(dS) = 0 and v(S) > 0, and cn a
sequence of positive numbers with cn -> oo, n -> oo.
Finally, suppose 0 < qn/rn < oo with qn and rn as in (8.2.2) (our conditions
imply that qn/rn does not depend on n),
$$\lim_{n\to\infty}\frac{w_{\gamma_1\wedge\gamma_2}(c_n)}{\sqrt{n/r(n/k)}}=0, \tag{8.3.7}$$
$$\lim_{n\to\infty}\frac{\log q_n}{\sqrt{n/r(n/k)}}=0, \tag{8.3.8}$$
and
$$\lim_{n\to\infty}r\left(\frac nk\right)P\left(\left(\left(1+\gamma_1\frac{X-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}\right)\in c_nS\right)\Big/\nu(c_nS)=1. \tag{8.3.9}$$
Then, with p̂_n defined in analogy with Section 8.2 and
$$\hat S:=\left\{\frac1{\hat c_n}\left(\left(1+\hat\gamma_1\frac{x-\hat b_1(n/k)}{\hat a_1(n/k)}\right)^{1/\hat\gamma_1},\left(1+\hat\gamma_2\frac{y-\hat b_2(n/k)}{\hat a_2(n/k)}\right)^{1/\hat\gamma_2}\right):(x,y)\in C_n\right\},$$
we have
$$\frac{\hat p_n}{p_n}\xrightarrow{P}1. \tag{8.3.10}$$
[Fig. 8.5 residue: scatter panels of the transformed data with the transformed failure sets S(r=10), S(r=50), S(r=100); the accompanying table of still-water-level (SWL) estimates is not recoverable.]
Fig. 8.5. Transformed data set and transformed failure sets (area above the curved line).
for r = 10, 50, 100 (we use the approach of Section 8.2.2, where, recall, r denotes the radius of the circle to which the boundary point (ŝ₁, ŝ₂) belongs, in the picture illustrated with the diamond point). One sees that there is quite a difference in the number of points belonging to each set Ŝ, but the effect of this on the estimates of p_n is quite negligible. In Figure 8.6 one finds the diagram of estimates of p̂_n over the window 40 ≤ k ≤ 110 and for r = 10, 50, 100.
[Fig. 8.6 residue: plot of the estimates p̂_n against k for r = 10, 50, 100; vertical scale 0 to 8×10⁻⁷.]
Exercises
In the next three exercises one gradually proves Theorem 8.3.1.
8.1. Let (X₁, Y₁), ..., (X_n, Y_n) be i.i.d. random vectors with distribution function F, satisfying (8.3.1) for some positive function r and limit measure ν, with normalizing functions a_i > 0, b_i real, and marginal extreme value indices γ_i, i = 1, 2. Let S be an open Borel set in ℝ₊² with inf_{(x,y)∈S} max(x, y) > 0, ν(∂S) = 0, and ν(S) > 0. Introduce the random variable
$$\nu_n(S):=\frac{r(n/k)}n\sum_{i=1}^n1_S\left(\left(1+\gamma_1\frac{X_i-b_1(n/k)}{a_1(n/k)}\right)^{1/\gamma_1},\left(1+\gamma_2\frac{Y_i-b_2(n/k)}{a_2(n/k)}\right)^{1/\gamma_2}\right).$$
Prove that
$$\nu_n(S)\xrightarrow{P}\nu(S)\,.$$
Hint: See Proposition 8.2.8.
8.3. Prove Theorem 8.3.1.
8.4. Under the conditions of Theorem 8.3.1 and with ĉ_n as in (8.2.10)–(8.2.12) prove
Part III
9
Basic Theory in C[0,1]
$$P\left(\max_{i\le n}\frac{X_i(s)-b_s(n)}{a_s(n)}\le x\right)\ \to\ P\left(Y(s)\le x\right) \tag{9.2.1}$$
as n → ∞, uniformly for s ∈ [0, 1] and locally uniformly for x. In particular, note that it means that the distribution functions of X(s) and Y(s) are continuous in s.
We can choose the functions a_s(n) and b_s(n) in such a way that P(Y(s) ≤ x) is an extreme value distribution in the von Mises form (cf. Section 1.1.3). Then
$$P(Y(s)\le x)=\exp\left(-\left(1+\gamma(s)x\right)^{-1/\gamma(s)}\right) \tag{9.2.2}$$
for s ∈ [0, 1] and all x with 1 + γ(s)x > 0, where γ is a continuous function.
From (9.2.1) and (9.2.2) we get
$$\lim_{n\to\infty}n\left\{1-F_s\left(a_s(n)x+b_s(n)\right)\right\}=\left(1+\gamma(s)x\right)^{-1/\gamma(s)} \tag{9.2.3}$$
uniformly for s ∈ [0, 1] and locally uniformly for x with 1 + γ(s)x > 0. Since convergence of a sequence of monotone functions is equivalent to convergence of their inverses (Lemma 1.1.1), (9.2.3) is equivalent to
$$\lim_{n\to\infty}\frac{U_s(nu)-b_s(n)}{a_s(n)}=\frac{u^{\gamma(s)}-1}{\gamma(s)}$$
uniformly for s ∈ [0, 1] and locally uniformly for u ∈ (0, ∞), where U_s is the left-continuous inverse of 1/(1 − F_s) for s ∈ [0, 1].
We have
$$\left\{\max_{i\le n}\frac{X_i(s)-b_s(n)}{a_s(n)}\right\}_{s\in[0,1]}\xrightarrow{d}\{Y(s)\}_{s\in[0,1]} \tag{9.2.4}$$
in C[0, 1] and (9.2.3). By combining the two and using the uniformity in both statements we get
$$\left\{\max_{i\le n}\frac1{n\left(1-F_s(X_i(s))\right)}\right\}_{s\in[0,1]}\xrightarrow{d}\left\{\left(1+\gamma(s)Y(s)\right)^{1/\gamma(s)}\right\}_{s\in[0,1]}$$
in C[0, 1], where a_s(n) and b_s(n) are chosen such that −log P(Y(s) ≤ x) = (1 + γ(s)x)^{−1/γ(s)} for all x with 1 + γ(s)x > 0.
2.
$$\left\{\max_{i\le n}\frac1{n\left(1-F_s(X_i(s))\right)}\right\}_{s\in[0,1]}\xrightarrow{d}\{\eta(s)\}_{s\in[0,1]} \tag{9.2.5}$$
in C[0, 1], and for all u ∈ (0, ∞),
$$\lim_{n\to\infty}\frac{U_s(nu)-b_s(n)}{a_s(n)}=\frac{u^{\gamma(s)}-1}{\gamma(s)}, \tag{9.2.6}$$
with
$$\{\eta(s)\}_{s\in[0,1]}:=\left\{\left(1+\gamma(s)Y(s)\right)^{1/\gamma(s)}\right\}_{s\in[0,1]}\,.$$
Remark 9.2.2 Relation (9.2.6) means that the function U_s(t) is extended regularly varying with an extra parameter (see Section B.4). We call the continuous function γ = γ(s) the index function.
Since relation (9.2.6) is not difficult to handle (cf. Section B.4 in Appendix B), this theorem reduces our problem to studying the limit relation
$$\left\{\frac1n\bigvee_{i=1}^n\xi_i(s)\right\}_{s\in[0,1]}\xrightarrow{d}\{\eta(s)\}_{s\in[0,1]}\,. \tag{9.2.7}$$
Theorem 9.2.3 Suppose η₁, η₂, ... are i.i.d. copies of the process η from (9.2.7). Then for all positive integers k,
$$\frac1k\bigvee_{j=1}^k\eta_j\stackrel{d}{=}\eta\,. \tag{9.2.9}$$
Proof. Write
$$\frac1{nk}\bigvee_{i=1}^{nk}\xi_i=\frac1k\bigvee_{j=1}^k\left(\frac1n\bigvee_{i=(j-1)n+1}^{jn}\xi_i\right)=\frac1k\bigvee_{j=1}^k\tau_j\,,$$
where the τ_j are independent and they all have the distribution of n^{-1}⋁_{i=1}^n ξ_i. Now keep k fixed and let n tend to infinity. Then the left-hand side tends to η in distribution and the right-hand side tends to k^{-1}⋁_{j=1}^k η_j in distribution, where the η_j are independent and have the same distribution as η. □
$$\left\{\bigvee_{i=1}^n\frac{Y_i(s)-b_s(n)}{a_s(n)}\right\}_{s\in[0,1]}\stackrel{d}{=}\{Y(s)\}_{s\in[0,1]} \tag{9.2.8}$$
for n = 1, 2, ....
Hence the class of max-stable processes coincides with the class of limit processes in (9.2.4).
For the study of the limit relation (9.2.7) we introduce, for Borel sets A,
$$\nu_n(A):=nP\left(n^{-1}\xi\in A\right) \tag{9.3.1}$$
and the one-to-one transformation
$$f\mapsto\left(|f|_\infty,\ \frac f{|f|_\infty}\right). \tag{9.3.2}$$
The space C₁⁺[0, 1] equipped with the supremum norm is a CSMS, and we turn (0, ∞] into a CSMS by introducing the metric ρ(x, y) := |1/x − 1/y|. Hence finally we introduce
$$C_\rho[0,1]:=(0,\infty]\times C_1^+[0,1], \tag{9.3.3}$$
with the lower index ρ meaning that the space (0, ∞] is equipped with the metric ρ, and we consider ν_n as a measure on C_ρ[0, 1] for each n. Despite the introduction of the new normed space, for convenience we still use the notation |f|_∞ for sup_{0≤s≤1} f(s).
Theorem 9.3.1 Let ξ, ξ₁, ξ₂, ... be i.i.d. stochastic processes in C⁺[0, 1]. If
$$\left\{\frac1n\bigvee_{i=1}^n\xi_i(s)\right\}_{s\in[0,1]}\xrightarrow{d}\{\eta(s)\}_{s\in[0,1]}$$
in C⁺[0, 1], with η simple max-stable, then ν_n converges to a measure ν on C_ρ[0, 1].
The relation between the probability distribution of η and the measure ν is that for m = 1, 2, ...,
$$P\left(\eta\in A_{K,x}\right)=\exp\left(-\nu\left(A_{K,x}^c\right)\right)$$
with, for K = (K₁, ..., K_m) compact and x = (x₁, ..., x_m) positive,
$$A_{K,x}:=\left\{f : f(s)\le x_j\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right\}. \tag{9.3.4}$$
For t ≥ 1,
$$\frac t{\lfloor t\rfloor}\,\lfloor t\rfloor\,P\left(\lfloor t\rfloor^{-1}\xi\in B\right)\ \ge\ t\,P\left(t^{-1}\xi\in B\right)\ \ge\ \frac t{1+\lfloor t\rfloor}\,\left(1+\lfloor t\rfloor\right)P\left(\left(1+\lfloor t\rfloor\right)^{-1}\xi\in B\right),$$
and both the right- and left-hand sides converge to ν(B), for all sets B of the form (9.3.4), when t → ∞. Since the measure ν is determined by its value on sets of that form, the proof is complete.
Remark 9.3.3 Note (Daley and Vere-Jones (1988), A.2.6) that the conclusion of the theorem amounts to ν_n →_d ν in the "weak hash" topology (w#), or equivalently to weak convergence in any subspace of the form {f : |f|_∞ > a}, a > 0.
For the proof of the theorem we need two lemmas.
Lemma 9.3.4 Let η be a simple max-stable process on [0, 1]. Then
$$P\left(|\eta|_\infty\le x\right)=\exp\left(-\frac cx\right),\qquad x>0,$$
for some c > 0.
Proof. Since η is simple max-stable,
$$\eta\stackrel{d}{=}\frac1n\bigvee_{i=1}^n\eta_i$$
for all n. Hence
$$P\left(|\eta|_\infty\le x\right)=P^n\left(|\eta|_\infty\le nx\right).\qquad\square$$
The most important step in the proof of Theorem 9.3.1 is the following result.
Lemma 9.3.5 Under the conditions of Theorem 9.3.1, for each ε > 0 the sequence of measures {ν_{n,ε}} defined by
$$\nu_{n,\varepsilon}(A):=\nu_n\left\{f\in A : |f|_\infty>\varepsilon\right\}$$
for each Borel set A ⊂ C_ρ[0, 1] is relatively compact.
Proof. We need to prove two things. First, that the sequence
$$\nu_{n,\varepsilon}\left(C^+[0,1]\right)=\nu_n\left\{f : |f|_\infty>\varepsilon\right\} \tag{9.3.5}$$
converges. Second, that the sequence {ν_{n,ε}} is tight. Note that since ν_{n,ε}(C_ρ[0, 1]) has a finite limit as n → ∞, we can check tightness for the sequence {ν_{n,ε}} as if it were a sequence of probability measures. According to Billingsley (1968), Theorem 15.3, this is equivalent to the following:
1. For each positive β there exists an a > 0 such that
$$\nu_{n,\varepsilon}\left(S_a\right)<\beta$$
for all n, where
$$S_a:=\left\{f\in C^+[0,1] : |f|_\infty>a\right\}$$
for each a > 0.
2. For each positive a and β, there exist a δ, 0 < δ < 1, and an integer n₀ such that
(a)
$$\nu_{n,\varepsilon}\left\{f : w_f''(\delta)\ge a\right\}\le\beta$$
for n ≥ n₀, with
$$w_f''(\delta):=\sup_{\substack{s_1\le s\le s_2\\ s_2-s_1\le\delta}}\min\left(|f(s)-f(s_1)|,\ |f(s_2)-f(s)|\right);$$
(b)
$$\nu_{n,\varepsilon}\left\{f : \sup_{0\le s,t\le\delta}|f(s)-f(t)|\ge a\right\}\le\beta\qquad\text{and}\qquad\nu_{n,\varepsilon}\left\{f : \sup_{1-\delta\le s,t\le1}|f(s)-f(t)|\ge a\right\}\le\beta\,.$$
Now (1) follows from the first part of the proof. Next we prove (2a); the other parts are similar. Relation (9.2.7) implies convergence in distribution, hence tightness, of {M_n ∨ (a/2)}_{n=1}^∞ with M_n := n^{-1}⋁_{i=1}^n ξ_i. Consequently, for any β* > 0,
$$P\left(w''_{M_n\vee(a/2)}(\delta)\ge\frac a2\right)<\beta^*$$
for n ≥ n₀*. Define
$$Q_{n,a}:=\left(M_n\vee\frac a2\right)1_{\left\{|\xi_i|_\infty>n\frac a2\ \text{for some }i,\ |\xi_j|_\infty\le n\frac a2\ \text{for }j\ne i\right\}}\,.$$
Proof (of Theorem 9.3.1). Note that since we have convergence in C⁺[0, 1], for m = 1, 2, ..., K₁, K₂, ..., K_m compact sets in [0, 1], and positive x₁, x₂, ..., x_m,
$$\lim_{n\to\infty}\nu_n\left\{f : f(s)\le x_j\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right\}^c=-\log P\left(\eta(s)\le x_j,\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right).$$
Since for any ε > 0 the sequence {ν_{n,ε}} is relatively compact by Lemma 9.3.5, every convergent subsequence has this same limit. □
Definition 9.3.6 We call the measure ν the exponent measure of the simple max-stable process. This is analogous to the exponent measure in finite-dimensional space (Section 6.1.3).
The characterizing property of the exponent measure is the following homogeneity relation.
Theorem 9.3.7 For any Borel set A in {f ∈ C[0, 1] : f ≥ 0} such that inf{|f|_∞ : f ∈ A} > 0 and ν(∂A) = 0, and any a > 0,
$$\nu(aA)=a^{-1}\nu(A). \tag{9.3.8}$$
Remark 9.3.8 Hence by Theorems 9.3.1 and 9.3.7, for any compact subset K of [0, 1] and each x > 0,
$$P\left(\sup_{s\in K}\eta(s)\le x\right)=\exp\left(-\nu\left\{f : f(s)>x\ \text{for some }s\in K\right\}\right)=\exp\left(-\frac1x\,\nu\left\{f : f(s)>1\ \text{for some }s\in K\right\}\right),$$
i.e., sup_{s∈K} η(s) has an extreme value distribution.
As in the finite-dimensional case, a nice intuitive background for the role of the exponent measure is provided by the following theorem.
Theorem 9.3.9 Assume the conditions of Theorem 9.3.1. Define the random measures N_n on C_ρ[0, 1] as follows: for any Borel set A with ν(∂A) = 0,
$$N_n(A):=\sum_{i=1}^n1_{\left\{n^{-1}\xi_i\in A\right\}}\,.$$
Then N_n converges in distribution to a Poisson point process N with mean measure ν: for disjoint Borel sets A₁, ..., A_m with ν(∂A_j) = 0 and λ₁, ..., λ_m ≥ 0,
$$E\exp\left(-\sum_{j=1}^m\lambda_j\,N_n(A_j)\right)\ \to\ \exp\left(-\sum_{j=1}^m\left(1-e^{-\lambda_j}\right)\nu(A_j)\right)=E\exp\left(-\sum_{j=1}^m\lambda_j\,N(A_j)\right).$$
Define
$$C_1^+[0,1]:=\left\{f\in C[0,1] : f\ge0,\ |f|_\infty=1\right\}.$$
In Section 9.3 we proved that the exponent measure ν satisfies a homogeneity property: for a > 0 and any Borel set A in C_ρ⁺[0, 1] = (0, ∞] × C₁⁺[0, 1],
$$\nu(aA)=a^{-1}\nu(A)\,. \tag{9.4.1}$$
This suggests defining the spectral measure
$$\rho(A):=\nu\left\{f : |f|_\infty>1,\ \frac f{|f|_\infty}\in A\right\} \tag{9.4.2}$$
for each Borel set A in C₁⁺[0, 1]. This finite measure is called the spectral measure of the limiting process η in the relation
$$\left\{\frac1n\bigvee_{i=1}^n\xi_i(s)\right\}_{s\in[0,1]}\xrightarrow{d}\{\eta(s)\}_{s\in[0,1]}\,.$$
Theorem 9.4.1 (Giné, Hahn, and Vatan (1990)) Suppose ξ, ξ₁, ξ₂, ... are i.i.d. stochastic processes in C⁺[0, 1],
$$\left\{\frac1n\bigvee_{i=1}^n\xi_i(s)\right\}_{s\in[0,1]}\xrightarrow{d}\{\eta(s)\}_{s\in[0,1]} \tag{9.4.3}$$
in C⁺[0, 1], and P(η(s) ≤ 1) = e^{-1} for s ∈ [0, 1], i.e., η is simple max-stable in C⁺[0, 1]. Then there exists a finite measure ρ on C₁⁺[0, 1] with
$$\int_{C_1^+[0,1]}f(s)\,d\rho(f)=1 \tag{9.4.4}$$
for all s ∈ [0, 1], such that for m = 1, 2, ..., K₁, K₂, ..., K_m compact sets in [0, 1], and x₁, x₂, ..., x_m > 0,
$$-\log P\left(\eta(s)\le x_j,\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right)=\int_{C_1^+[0,1]}\max_{1\le j\le m}\sup_{s\in K_j}\frac{g(s)}{x_j}\,d\rho(g)\,. \tag{9.4.5}$$
Proof. Let η be simple max-stable in C⁺[0, 1]. We have already obtained the measure ρ in (9.4.2). Next we prove (9.4.5) for this measure ρ. We proceed as in the proof of Theorem 9.3.1. On the one hand, as in the mentioned proof,
$$\lim_{n\to\infty}\nu_n\left\{f : f(s)\le x_j,\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right\}^c=\lim_{n\to\infty}nP\left(\left\{\xi(s)\le nx_j,\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right\}^c\right)=\nu\left\{f : f(s)\le x_j,\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right\}^c.$$
On the other hand, writing f = |f|_∞ · (f/|f|_∞) and using the homogeneity (9.4.1),
$$\nu\left\{f : f(s)\le x_j,\ \text{for }s\in K_j,\ j=1,2,\ldots,m\right\}^c=\nu\left\{f : |f|_\infty>\left(\max_{1\le j\le m}\sup_{s\in K_j}\frac{f(s)/|f|_\infty}{x_j}\right)^{-1}\right\}=\int_{C_1^+[0,1]}\max_{1\le j\le m}\sup_{s\in K_j}\frac{g(s)}{x_j}\,d\rho(g)\,.$$
For (9.4.4) note that P(η(s) ≤ 1) = e^{-1}, s ∈ [0, 1]. Hence for each s ∈ [0, 1],
$$1=-\log P\left(\eta(s)\le1\right)=\nu\left\{f : f(s)>1\right\}=\nu\left\{f : |f|_\infty>\left(\frac{f(s)}{|f|_\infty}\right)^{-1}\right\}=\int g(s)\,d\rho(g)\,.$$
For the converse statement of the theorem assume that ρ is a finite measure on C₁⁺[0, 1] satisfying (9.4.4). The measure ν on C_ρ[0, 1] is defined by
$$\nu\left\{f : |f|_\infty>r\ \text{and}\ \frac f{|f|_\infty}\in A\right\}=r^{-1}\rho(A)$$
for r > 0 and A a Borel set in C₁⁺[0, 1]. Let N be a Poisson point process on C_ρ[0, 1] with mean measure ν (cf. Theorem 9.3.9). Let
$$f_1,\ f_2,\ f_3,\ \ldots$$
be a realization of the point process. Define
$$\eta:=\bigvee_{i=1}^\infty f_i\,. \tag{9.4.6}$$
Consider, for k = 1, 2, ..., independent copies η₁, ..., η_k of η. On the one hand,
$$\bigvee_{j=1}^k\eta_j=\bigvee_{j=1}^k\bigvee_{i=1}^\infty f_i^{(j)},$$
and it is clear that this process has the same structure as the process η except that the measure ν is replaced by kν. On the other hand we write
$$k\eta=\bigvee_{i=1}^\infty kf_i\,,$$
and again the process has the same structure as the process η except that for a Borel set A ⊂ C_ρ[0, 1] the mean measure is now
$$\nu\left\{f : kf\in A\right\}=\nu\left(k^{-1}A\right)=k\nu(A)$$
by Theorem 9.3.7. Since the two processes ⋁_{j=1}^k η_j and kη have the same distribution, the process η is max-stable.
Finally, we prove that η is in C⁺[0, 1]; that is, we prove that η has continuous sample paths and that P(η > 0) = 1. In order to prove continuity we show that (1) liminf_{s→s₀} η(s) ≥ η(s₀) and (2) limsup_{s→s₀} η(s) ≤ η(s₀), for each s₀ ∈ [0, 1], with probability one.
1. Take any realization f₁, f₂, f₃, ... ∈ C_ρ[0, 1]. Since η := ⋁_{i=1}^∞ f_i, for each ε > 0 there is an f_i such that f_i(s₀) > η(s₀) − ε. Since f_i is continuous, lim_{s→s₀} f_i(s) = f_i(s₀). Hence
$$\liminf_{s\to s_0}\eta(s)\ \ge\ \lim_{s\to s_0}f_i(s)\ >\ \eta(s_0)-\varepsilon\,.$$
2. For y > η(s₀) define shrinking neighborhoods I_n of s₀ and the sets
$$A_{y,n}:=\left\{f : \sup_{s\in I_n}f(s)>y\right\}. \tag{9.4.7}$$
It follows that N(A_{y,n}) = 0 for n ≥ n₀, and hence sup_{s∈I_n} ⋁_i f_i(s) ≤ y for n ≥ n₀. In particular, limsup_{s→s₀} ⋁_i f_i(s) ≤ y. This proves the continuity.
The last statement we need to prove is
$$P(\eta>0)=1\,.$$
We have
$$\eta\stackrel{d}{=}\frac1n\bigvee_{i=1}^n\eta_i$$
with η, η₁, η₂, ..., η_n independent and identically distributed. Note that for s ∈ [0, 1] we have n^{-1}⋁_{i=1}^n η_i(s) = 0 if and only if η_i(s) = 0 for i = 1, 2, ..., n.
Define A := {s ∈ [0, 1] : η(s) = 0} and A_i := {s ∈ [0, 1] : η_i(s) = 0}, i = 1, 2, ..., n. We have
$$P(A\ne\emptyset)=P\left(\cap_{i=1}^nA_i\ne\emptyset\right)\le P\left(\cap_{i=1}^n\{A_i\ne\emptyset\}\right)=P^n(A\ne\emptyset)$$
for all n. Hence either P(A ≠ ∅) = 1, i.e., there is some s with η(s) = 0, which is impossible since P(η(s) ≤ x) = exp(−1/x) for x > 0, or P(A ≠ ∅) = 0, and that is what we want to prove. □
$$\eta\stackrel{d}{=}\bigvee_{i=1}^\infty f_i\,,$$
where f_i = Z_iπ_i and the (Z_i, π_i) form a realization of a Poisson point process on (0, ∞] × C₁⁺[0, 1] with mean measure ν satisfying dν = (dr/r²) × dρ.
Conversely, every stochastic process with the given representation is simple max-stable in C⁺[0, 1].
Example 9.4.3 Consider a Poisson point process on ℝ² \ {(0, 0)} with mean measure (x² + y²)^{-3/2} dx dy. Let {(X_i, Y_i)} be an enumeration of the points of the point process. Note that there are only finitely many points outside the unit circle. We show that the simple max-stable process {2^{-1}⋁_{i=1}^∞ (X_i cos θ + Y_i sin θ)}_{0≤θ≤2π} has the representation of Corollary 9.4.2. With x = r cos φ and y = r sin φ we have (x² + y²)^{-3/2} dx dy = r^{-2} dr dφ. Write X_i = R_i cos Φ_i and Y_i = R_i sin Φ_i. Note that for each θ the half-plane {(x, y) : x cos θ + y sin θ > 0} contains infinitely many points of the point process. Hence for 0 ≤ θ ≤ 2π,
$$\frac12\bigvee_{i=1}^\infty\left(X_i\cos\theta+Y_i\sin\theta\right)=\frac12\bigvee_{i=1}^\infty R_i\cos\left(\Phi_i-\theta\right),$$
where only points with cos(Φ_i − θ) > 0 contribute to the supremum.
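A quick way to check Example 9.4.3 numerically is to simulate the point process in polar coordinates: since the intensity is r^{-2} dr dφ on (0, ∞) × [0, 2π), the radii can be generated as R_i = 2π/Γ_i with Γ_i the partial sums of standard exponentials, and the angles Φ_i uniform. A sketch (truncating the point process at finitely many atoms; our own code, not the book's):

```python
import numpy as np

def cosine_max_stable(theta, n_atoms=200, rng=None):
    """One realization of eta(theta) = (1/2) sup_i R_i cos(Phi_i - theta)
    over a grid of angles theta, with (R_i, Phi_i) from the Poisson process
    of Example 9.4.3; only the n_atoms largest radii are kept."""
    rng = np.random.default_rng() if rng is None else rng
    radii = 2.0 * np.pi / np.cumsum(rng.exponential(size=n_atoms))  # R_i = 2*pi/Gamma_i
    phi = rng.uniform(0.0, 2.0 * np.pi, size=n_atoms)
    theta = np.atleast_1d(np.asarray(theta, dtype=float))
    vals = 0.5 * radii[:, None] * np.cos(phi[:, None] - theta[None, :])
    return vals.max(axis=0)
```

Each marginal η(θ) should then be approximately standard Fréchet, P(η(θ) ≤ x) = exp(−1/x).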
Corollary 9.4.4 With probability one there exists a finite collection f₁, ..., f_k (hence k is random) from Corollary 9.4.2 such that
$$\eta(s)=\bigvee_{i=1}^kf_i(s)\,.$$
$$\{\eta(s)\}_{s\in[0,1]}\stackrel{d}{=}\left\{\bigvee_{i=1}^\infty Z_i\,W_i(s)\right\}_{s\in[0,1]}, \tag{9.4.9}$$
where {Z_i}_{i=1}^∞ is a realization of a Poisson point process on (0, ∞] with mean measure dr/r², independent of {W_i}_{i=1}^∞. The process η is stationary (cf. Section 9.8 below).
For the proof of Corollary 9.4.5 we use the following result.
Lemma 9.4.7 Suppose P is a Poisson point process on the product space S₁ × S₂, with S₁ and S₂ metric spaces, whose intensity measure is ν = ν₁ × ν₂, where ν₁ is not bounded and ν₂ is a probability measure. The process can be generated in the following way: let {U_i} be an enumeration of the points of a Poisson point process on S₁ with intensity measure ν₁, and let V₁, V₂, ... be independent and identically distributed random elements of S₂ with probability distribution ν₂, independent of {U_i}. Then the counting measure N defined by
$$N(\cdot):=\sum_{i=1}^\infty1_{\left\{(U_i,V_i)\in\,\cdot\,\right\}}$$
is the counting measure of a Poisson point process with intensity measure ν₁ × ν₂.
Proof. We need to prove that the numbers of points of the set {(U_i, V_i)}_{i=1}^∞ in two disjoint Borel sets are independent (which is trivial) and that the number of points N(A₁ × A₂) in a Borel set A₁ × A₂, with A₁ ⊂ S₁ and A₂ ⊂ S₂, has a Poisson distribution with mean ν₁(A₁)ν₂(A₂). Now
$$P\left(N(A_1\times A_2)=k\right)=\sum_{m=k}^\infty\frac{\left(\nu_1(A_1)\right)^me^{-\nu_1(A_1)}}{m!}\binom mk\left(\nu_2(A_2)\right)^k\left(1-\nu_2(A_2)\right)^{m-k}$$
$$=\frac{\left(\nu_1(A_1)\nu_2(A_2)\right)^k}{k!}\,e^{-\nu_1(A_1)}\sum_{m=k}^\infty\frac{\left(\nu_1(A_1)\left(1-\nu_2(A_2)\right)\right)^{m-k}}{(m-k)!}=\frac{\left(\nu_1(A_1)\nu_2(A_2)\right)^k}{k!}\,e^{-\nu_1(A_1)\nu_2(A_2)}\,.\qquad\square$$
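The computation in the proof can be read as a thinning: given m base points in A₁, each carries an independent mark landing in A₂ with probability ν₂(A₂), so the count in A₁ × A₂ is a binomial mixture over a Poisson number of trials. A minimal numeric illustration (names are ours):

```python
import numpy as np

def marked_counts(nu1_a1, nu2_a2, n_rep, rng=None):
    """Draw n_rep copies of N(A1 x A2): a Poisson(nu1_a1) number of base
    points in A1, each independently marked into A2 with probability
    nu2_a2. By Lemma 9.4.7 the result is Poisson with mean nu1_a1*nu2_a2."""
    rng = np.random.default_rng() if rng is None else rng
    m = rng.poisson(nu1_a1, size=n_rep)    # base points in A1
    return rng.binomial(m, nu2_a2)         # marked points in A1 x A2
```

The Poisson signature is that the sample mean and sample variance of the counts both approach ν₁(A₁)ν₂(A₂).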
Proof (of Corollary 9.4.5). In order to establish the representation we start from the result of Corollary 9.4.2. Let {(Z_i, π_i)}_{i=1}^∞ be an enumeration of the points of a Poisson point process on (0, ∞] × C₁⁺[0, 1] with mean measure
$$\rho\left(C_1^+[0,1]\right)\,\frac{dr}{r^2}\times\frac{d\rho}{\rho\left(C_1^+[0,1]\right)}\,.$$
Then the Poisson point process represented by {Z_iπ_i}_{i=1}^∞ has the same distribution as that represented by the points of Corollary 9.4.2. Next define, for i = 1, 2, ...,
$$W_i:=\pi_i\,\rho\left(C_1^+[0,1]\right),\qquad Z_i^*:=\frac{Z_i}{\rho\left(C_1^+[0,1]\right)}\,,$$
so that Z_i^*W_i = Z_iπ_i. The intensity measure of the first component is again of the form dr/r² (for a Borel set A of (0, ∞] the factor ρ(C₁⁺[0, 1]) cancels), and the probability distribution of the second component is
$$Q(\cdot):=\frac{\rho\left\{f : f\ge0,\ |f|_\infty=1,\ f\,\rho\left(C_1^+[0,1]\right)\in\,\cdot\,\right\}}{\rho\left(C_1^+[0,1]\right)}\,.$$
In order to prove that conversely the stated construction represents a simple max-stable process, just follow the steps of this proof backwards.
It remains to prove that for the converse the requirement sup_{0≤s≤1} V(s) = c a.s. can be relaxed to E sup_{0≤s≤1} V(s) < ∞. Note that the former is used to ensure the finiteness of the process η. But this also follows from the following weaker assumption: we consider now a probability measure Q on the space
$$C^*:=\left\{f\in C[0,1] : f\ge0,\ |f|_\infty>0\right\}$$
with the property
$$\int_{C^*}|f|_\infty\,dQ(f)<\infty\,.$$
Then, for x ≥ 1,
$$P\left(\sup_{0\le s\le1}\eta(s)\le x\right)=\exp\left(-\frac1x\int_{C^*}|f|_\infty\,dQ(f)\right)\ge\exp\left(-\int_{C^*}|f|_\infty\,dQ(f)\right)>0\,.$$
The proof of Corollary 9.4.9 is left to the reader (cf. Giné, Hahn, and Vatan (1990)).
Combining the results of Theorems 9.2.1 and 9.4.1, we get the following characterization of max-stable processes in C[0, 1].
Theorem 9.4.10 For each limit process {Y(s)}_{s∈[0,1]} in (9.2.4) that satisfies
$$P(Y(s)\le x)=\exp\left(-\left(1+\gamma(s)x\right)^{-1/\gamma(s)}\right)$$
for s ∈ [0, 1], there exist a continuous function γ and a finite measure ρ on C₁⁺[0, 1], satisfying (9.4.4) of Theorem 9.4.1, such that with η from Theorem 9.4.1,
$$\{Y(s)\}_{s\in[0,1]}\stackrel{d}{=}\left\{\frac{\eta^{\gamma(s)}(s)-1}{\gamma(s)}\right\}_{s\in[0,1]}\,. \tag{9.4.10}$$
Conversely, any pair (γ, ρ), with γ a continuous function and ρ a finite measure on C₁⁺[0, 1] satisfying (9.4.4) of Theorem 9.4.1, gives rise to a max-stable process via (9.4.10).
$$\left\{\max_{i\le n}\frac{X_i(s)-b_s(n)}{a_s(n)}\right\}_{s\in[0,1]}\xrightarrow{d}\{Y(s)\}_{s\in[0,1]} \tag{9.5.1}$$
in C[0, 1]. Define
$$U_s(x):=F_s^\leftarrow\left(1-\frac1x\right)$$
for x > 1, s ∈ [0, 1]. Theorem 9.2.1 states that if (9.5.1) holds with proper choices of a_s(n) positive and b_s(n) real, then
$$\lim_{n\to\infty}\frac{U_s(nu)-b_s(n)}{a_s(n)}=\frac{u^{\gamma(s)}-1}{\gamma(s)}$$
uniformly for s ∈ [0, 1] and locally uniformly for u ∈ (0, ∞). Moreover, the processes
$$\xi_i(s):=\frac1{1-F_s\left(X_i(s)\right)}\,,$$
s ∈ [0, 1], satisfy
$$\left\{\frac1n\bigvee_{i=1}^n\xi_i(s)\right\}_{s\in[0,1]}\xrightarrow{d}\left\{\left(1+\gamma(s)Y(s)\right)^{1/\gamma(s)}\right\}_{s\in[0,1]}=:\{\eta(s)\}_{s\in[0,1]}$$
in C⁺[0, 1].
Define U_s(x) := F_s^←(1 − 1/x) for x > 1, s ∈ [0, 1]. The following statements are equivalent.
(1)
$$\left\{\max_{i\le n}\frac{X_i(s)-b_s(n)}{a_s(n)}\right\}_{s\in[0,1]}\xrightarrow{d}\{Y(s)\}_{s\in[0,1]}$$
in C[0, 1], where a_s(n) positive and b_s(n) real are chosen such that −log P(Y(s) ≤ x) = (1 + γ(s)x)^{−1/γ(s)} for all x with 1 + γ(s)x > 0;
(2a)
$$\lim_{n\to\infty}\frac{U_s(nu)-b_s(n)}{a_s(n)}=\frac{u^{\gamma(s)}-1}{\gamma(s)} \tag{9.5.2}$$
uniformly for s ∈ [0, 1] and locally uniformly for u ∈ (0, ∞), and
$$\left\{\frac1n\bigvee_{i=1}^n\xi_i(s)\right\}_{s\in[0,1]}\xrightarrow{d}\{\eta(s)\}_{s\in[0,1]} \tag{9.5.3}$$
in C⁺[0, 1];
(2b) relation (9.5.2) holds and
$$\nu_n\xrightarrow{w\#}\nu \tag{9.5.4}$$
on C_ρ[0, 1] (cf. Remark 9.3.3);
(2c) relation (9.5.2) holds and, for each r > 0 and each Borel set B ⊂ C₁⁺[0, 1] (defined in (9.3.2)) with ρ(∂B) = 0,
$$\lim_{t\to\infty}tP\left(|\xi|_\infty>tr\right)=r^{-1}\rho\left(C_1^+[0,1]\right) \tag{9.5.5}$$
and
$$\lim_{t\to\infty}P\left(\frac\xi{|\xi|_\infty}\in B\ \Big|\ |\xi|_\infty>t\right)=\frac{\rho(B)}{\rho\left(C_1^+[0,1]\right)}\,. \tag{9.5.6}$$
Proof. We have already proved in Theorem 9.2.1 that (1) is equivalent to (9.5.2) and (2a). It remains to prove that (2a), (2b), and (2c) are equivalent.
We start with the equivalence of (2a) and (2b). The direct statement has been proved in Theorem 9.3.1 and Corollary 9.3.2. The proof that (2b) implies (2a) consists in a rearrangement of the equalities in the proof of Theorem 9.3.1. For (9.5.4) note that with r := |f|_∞ and g := f/|f|_∞,
$$\nu(A)=\nu\left\{f : |f|_\infty\cdot\frac f{|f|_\infty}\in A\right\}=\nu\left\{f : rg\in A\right\},$$
where
$$\nu\left\{f : r>r_0\ \text{and}\ g\in B\right\}=r_0^{-1}\rho(B) \tag{9.5.7}$$
for B ⊂ C₁⁺[0, 1].
(9.5.7)
where r > 0 and B is a Borel set in Cx [0,1]. By taking B = Cx [0,1] we get for
r >0,
lim fP(|$|oo > " 0 = r ~ V (cj"[0,1]) ,
(9.5.8)
which is (9.5.5). For general B and r = l w e have
lim tPmoo
r-*oo
(9.5.9)
where s\, S2 . . . , sm+\ are rationals such that 0 = s\ <S2< - < Sm+i = 1, and /?,
q, a\,..., am, b\,..., bm are positive rational numbers.
Take $i, $2 , ^m+i such that for all i = 1, 2 , . . . , m,
sup
f0(s) -
Si<S<Si+l
inf
f0(s) < 2e .
Si<5<Si + l
sup5
i<S<Sj+\
fo(s) 3s
"l/loo
^si<s<si+i fo(s) + 3e
< a t < b i
'
a,<
mz<b"
and /?, ^ with |/oloo - s < p < q < |/ 0 |oo + e such that p < \f\oo < q.
Remark 9.5.2 It is easy to see that (9.5.2) can be extended to
$$\lim_{t\to\infty}\frac{U_s(tu)-b_s([t])}{a_s([t])}=\frac{u^{\gamma(s)}-1}{\gamma(s)}$$
uniformly, where t runs through the reals.
In this section we take a closer look at the representation
$$\eta\stackrel{d}{=}\bigvee_{i=1}^\infty Z_iV_i\,, \tag{9.6.1}$$
where {Z_i} is an enumeration of the points of a Poisson point process on (0, ∞] with mean measure dr/r², and V, V₁, V₂, ... are independent and identically distributed nonnegative stochastic processes in C⁺[0, 1] with EV(s) = 1 for all s ∈ [0, 1] and sup_{0≤s≤1} V(s) = c a.s., where c is a positive constant. The point process and V, V₁, V₂, ... are independent.
Let Q be the probability distribution of the process V.
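Representation (9.6.1) translates directly into a simulation recipe: realize the Poisson points on (0, ∞] with mean measure dr/r² as Z_i = 1/Γ_i, where Γ_i are partial sums of standard exponentials, and take the pointwise maximum of Z_iV_i over finitely many atoms. A sketch with our own interface (sample_v should return one path of V on a fixed grid):

```python
import numpy as np

def simple_max_stable(sample_v, n_atoms=100, rng=None):
    """Simulate one path of eta = sup_i Z_i V_i on a grid, following (9.6.1).
    Z_i = 1/Gamma_i realizes the Poisson process with mean measure dr/r^2
    (E #{Z_i > u} = 1/u); the series is truncated at n_atoms terms."""
    rng = np.random.default_rng() if rng is None else rng
    z = 1.0 / np.cumsum(rng.exponential(size=n_atoms))   # decreasing atoms
    paths = np.stack([sample_v(rng) for _ in range(n_atoms)])
    return (z[:, None] * paths).max(axis=0)
```

With V ≡ 1 (so EV(s) = 1 and c = 1), η(s) = Z₁ and each marginal is standard Fréchet: P(η(s) ≤ x) = e^{−1/x}.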
9.6.1 Spectral Representation
In order to make this representation more analytical, we use Theorem 3.2 of Billingsley (1971), which says that for each probability measure on a metric space S with its Borel sets, there is a random element of S, defined on the unit interval (that is, the unit interval with its Borel sets and Lebesgue measure λ as the probability measure), with the same probability distribution. Let
$$C_c^+[0,1]:=\left\{f\in C[0,1] : f\ge0,\ |f|_\infty=c\right\}$$
for some c > 0. It follows that there is a measurable mapping h : [0, 1] → C_c⁺[0, 1] such that for each Borel set A of C_c⁺[0, 1],
$$Q(A)=\lambda\left(\{t\in[0,1] : h(t)\in A\}\right). \tag{9.6.2}$$
We are going to use the mapping h to build an alternative version of (9.6.1). Note
that with dv\ \ (dr/r2) x dk (A Lebesgue measure on [0,1]), A\ a Borel set of
(0, 00], and A a Borel set of C^"[0, 1],
vi ({(z, 0 (0, 00] x [0, 1] : (z, h(t)) Ai x A}) = Q(A) f
% .
Hence if {(Z/, 7})}?^ is a realization of a Poisson point process on (0, 00] x [0, 1]
with mean measure dv\ := (dr/r2) x dk, then
{(Z,-,Aai))}i
(9.6.3)
is a realization of a Poisson point process on (0, 00] x Cc [0, 1] with mean measure
dv := (dr/r2) x dQ, where g is the probability measure of h(T) on Cc [0, 1]. It
follows that
00
rj^yZihiTi).
!=1
(9.6.4)
315
Now note that h is a mapping from [0,1] into Cc [0,1]. Hence for each t [0,1]
the mapping provides us with a continuous function, fs(t) say, with fs(t) e [0, oo),
Jo / ^ r ) df = 1 for 0 < s < 1, and s u p ^ ^ fs(u) = c for all t e [0,1].
This leads to the following result.
Theorem 9.6.1 (Resnick and Roy (1991)) Let $\{(Z_i, T_i)\}_{i=1}^\infty$ be a realization of a Poisson point process on $(0,\infty] \times [0,1]$ with mean measure $(dr/r^2) \times d\lambda$ ($\lambda$ Lebesgue measure). If the process $\eta$ is simple max-stable in $C^+[0,1]$, then there is a family of functions $f_s(t)$ with
1. for each $t \in [0,1]$ we have a nonnegative continuous function $s \mapsto f_s(t)$ from $[0,1]$ to $[0,\infty)$,
2. for each $s \in [0,1]$,
$$\int_0^1 f_s(t)\,dt = 1\,, \qquad (9.6.5)$$
3.
$$\int_0^1 \sup_{0\le s\le 1} f_s(t)\,dt < \infty\,,$$
such that
$$\{\eta(s)\}_{s\in[0,1]} \stackrel{d}{=} \left\{\bigvee_{i=1}^\infty Z_i\, f_s(T_i)\right\}_{s\in[0,1]}. \qquad (9.6.6)$$
Conversely, every process of the form exhibited at the right-hand side of (9.6.6) with the stated conditions is a simple max-stable process in $C^+[0,1]$.
Remark 9.6.2 The family of functions {fs} is called a family of spectral functions
of the simple max-stable process. Note that the spectral functions are by no means
unique.
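To see the converse direction of Theorem 9.6.1 at work, one can generate approximate sample paths of $\eta(s) = \bigvee_i Z_i f_s(T_i)$ from an explicit spectral family. The sketch below uses our own illustrative choice of $f_s$ (a Gaussian kernel in $t$, renormalized for each $s$ so that condition (9.6.5) holds on the grid); it is not a construction from the text:

```python
import numpy as np

rng = np.random.default_rng(1)
s_grid = np.linspace(0.0, 1.0, 101)
t_grid = np.linspace(0.0, 1.0, 1001)
dt = t_grid[1] - t_grid[0]
h = 0.1  # kernel bandwidth (illustrative choice)

def f(s, t):
    """Spectral function f_s(t), renormalized so that int_0^1 f_s(t) dt = 1."""
    norm = np.sum(np.exp(-0.5 * ((t_grid - s) / h) ** 2)) * dt
    return np.exp(-0.5 * ((t - s) / h) ** 2) / norm

def sample_path(n_points=500):
    gamma = np.cumsum(rng.exponential(size=n_points))
    Z = 1.0 / gamma                    # Poisson points, mean measure dr/r^2
    T = rng.uniform(size=n_points)     # T_i: intensity d(lambda) on [0,1]
    F = np.array([f(s, T) for s in s_grid])  # F[j, i] = f_{s_j}(T_i)
    return (Z[None, :] * F).max(axis=1)      # eta(s_j) = max_i Z_i f_{s_j}(T_i)

eta = sample_path()
```

Condition (9.6.5) forces standard Fréchet margins, $P(\eta(s)\le x)=e^{-1/x}$; the normalization can be checked directly on the grid.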
Remark 9.6.3 By defining $f_s^*(u) := H'(u)\,f_s(H(u))$ for $u \in \mathbb{R}$, where $H$ is a probability distribution function and $H'$ its density, one can take the spectral functions in $L_1(\mathbb{R})$ rather than $L_1([0,1])$.
Remark 9.6.4 There is also a weaker form of this theorem, where for the process $\eta$ a.s. continuity is replaced by continuity in probability and for the functions $f_s(t)$ continuity is replaced by continuity in measure: $\lambda\{t : |f_{s_n}(t) - f_s(t)| > \varepsilon\} \to 0$ as $n \to \infty$ for each $\varepsilon > 0$ when $s_n \to s$ (de Haan (1984)).
Remark 9.6.5 It is not difficult to see that it is not essential that the max-stable
process be defined on [0,1]. One can take any compact set in a Euclidean space.
9.6.2 Stationarity
In this subsection we consider stochastic processes defined on the whole real line
rather than on the unit interval as in the previous sections. We do this mainly in view
of applications and of some examples to be considered in Sections 9.7 and 9.8. Since
the proofs of the results in this section are quite lengthy, we refer to the original
papers for some key points.
2. for each $s \in \mathbb{R}$,
$$\int_{-\infty}^{\infty} f_s(t)\,dt = 1\,, \qquad (9.6.7)$$
3. for each compact interval $I$ in $\mathbb{R}$,
$$\int_{-\infty}^{\infty} \sup_{s\in I} f_s(t)\,dt < \infty\,,$$
such that
$$\{\eta(s)\}_{s\in\mathbb{R}} \stackrel{d}{=} \left\{\bigvee_{i=1}^\infty Z_i\, f_s(T_i)\right\}_{s\in\mathbb{R}}. \qquad (9.6.8)$$
Conversely, every process of the form exhibited at the right-hand side of (9.6.8) with the stated conditions is a simple max-stable process in $C^+(\mathbb{R})$.
Remark 9.6.8 The family of functions $\{f_s\}$ is called a family of spectral functions of the simple max-stable process.
Proof (of Theorem 9.6.7). The proof is semi-constructive. First consider an infinite sequence of positive random variables $W := (Y_1, Y_2, \ldots)$. We assume that this sequence is simple max-stable, i.e., for $W_1, W_2, \ldots$ independent and identically distributed copies of the sequence $W$ and all $k$,
$$\frac{1}{k}\bigvee_{i=1}^k W_i \stackrel{d}{=} W\,.$$
Moreover we assume that $P(Y_i \le 1) = e^{-1}$, $i \ge 1$.
We extend the line of reasoning of Chapter 6 (finite-dimensional extremes) to this situation. The process $W$ introduces a probability measure on the infinite product $S := \mathbb{R}_+ \times \mathbb{R}_+ \times \cdots$. Since for any $n \ge 1$, $Y_1, Y_2, \ldots, Y_n > 0$, one can choose positive constants $a_1, a_2, \ldots$ such that
$$E\left(\sup_{i\ge 1} \frac{Y_i}{a_i}\right)^{1/2} \le \sum_{i=1}^\infty E\left(\frac{Y_i}{a_i}\right)^{1/2} < \infty\,. \qquad (9.6.9)$$
It follows that $\sup_{i\ge 1} Y_i/a_i < \infty$ almost surely. For any $n = 1, 2, \ldots$ the random variable $\sup_{1\le i\le n} Y_i/a_i$ has a Fréchet distribution by the results of Chapter 6, i.e., there exist positive constants $b_1, b_2, \ldots$ such that $P\left(\sup_{1\le i\le n} Y_i/a_i \le x\right) = \exp(-b_n/x)$ for $x > 0$. Hence $b := \lim_{n\to\infty} b_n$ exists in $(0,\infty)$ and
$$P\left(\sup_{i\ge 1} \frac{Y_i}{a_i} \le x\right) = e^{-b/x}\,, \qquad x > 0\,.$$
The exponent measure $\nu$ of $W$ satisfies $\nu(kA) = k^{-1}\nu(A)$ for $k = 1, 2, \ldots$ (9.6.10); as in the finite-dimensional case $k$ may in fact be any positive number. Moreover for $\delta > 0$,
$$\nu\left\{\left((x_1, x_2, \ldots) : x_i \le \delta a_i \text{ for } i = 1, 2, \ldots\right)^c\right\} < \infty \qquad (9.6.11)$$
(i.e., the measure $\nu$ is finite outside a neighbourhood of the origin).
Next, as in Section 6.1.4, we move towards a spectral measure, using the transformation $L$, with the $a_i$'s as before:
$$w := \sup_{i\ge 1} \frac{x_i}{a_i}\,, \qquad z_k := \begin{cases} x_k/w\,, & w > 0\,, \\ 0\,, & w = 0\,, \end{cases} \qquad k = 1, 2, \ldots\,;$$
the transformed measure is $d(\nu \circ L^{\leftarrow}) = (dr/r^2) \times d\mu$ on $[0,\infty) \times \tilde S$ for some measure $\mu$ on the image space $\tilde S$. Note that
$$\mu(\tilde S) = \nu\left\{(x_1, x_2, \ldots) : \sup_{i\ge 1} \frac{x_i}{a_i} > 1\right\} = \nu\left\{\left((x_1, x_2, \ldots) : x_i \le a_i \text{ for } i = 1, 2, \ldots\right)^c\right\} < \infty\,.$$
For $y_1, \ldots, y_n > 0$,
$$P\{Y_1 \le y_1, \ldots, Y_n \le y_n\} = \exp\left(-\nu\left\{\left((x_1, x_2, \ldots) : x_i \le y_i \text{ for } i = 1, \ldots, n\right)^c\right\}\right)$$
$$= \exp\left(-\nu\left\{\left((w, z_1, z_2, \ldots) : z_i w \le y_i \text{ for } i = 1, \ldots, n\right)^c\right\}\right) = \exp\left(-\nu\left\{(w, z_1, z_2, \ldots) : \min_{1\le i\le n} \frac{y_i}{z_i} < w\right\}\right)$$
$$= \exp\left(-\int \int_{\min_{1\le i\le n}(y_i/z_i)}^\infty \frac{dr}{r^2}\,\mu(d(z_1, z_2, \ldots))\right) = \exp\left(-\int \max_{1\le i\le n} \frac{z_i}{y_i}\,\mu(d(z_1, z_2, \ldots))\right) = \exp\left(-\int_0^1 \max_{1\le i\le n} \frac{f_i(t)}{y_i}\,dt\right),$$
where in the last step the measure $\mu$ has been represented, as in (9.6.2), by a measurable mapping $t \mapsto (f_1(t), f_2(t), \ldots)$ on $[0,1]$.
Next we give a representation of the process $(Y_1, Y_2, \ldots)$ using the Poisson point process of the statement of the theorem: we have
$$(Y_1, Y_2, \ldots) \stackrel{d}{=} \left(\bigvee_{i=1}^\infty Z_i f_1(T_i),\ \bigvee_{i=1}^\infty Z_i f_2(T_i),\ \ldots\right),$$
since
$$-\log P\left(\bigvee_i Z_i f_k(T_i) \le y_k \text{ for } k = 1, \ldots, n\right) = -\log P\left(Z_i \le \min_{1\le k\le n} \frac{y_k}{f_k(T_i)} \text{ for } i = 1, 2, \ldots\right)$$
$$= \int\!\!\int_{r > \min_{1\le k\le n} y_k/f_k(t)} \frac{dr}{r^2}\,dt = \int_0^1 \max_{1\le k\le n} \frac{f_k(t)}{y_k}\,dt\,.$$
Now let us consider the process $\eta$. By the results obtained so far we have a spectral representation for the process $\{\eta(r_n)\}_{n=1}^\infty$, where $r_1, r_2, \ldots$ is an enumeration of the rationals of $\mathbb{R}$: with some abuse of notation we can write
$$\{\eta(r_n)\}_{n=1}^\infty \stackrel{d}{=} \left\{\bigvee_{i=1}^\infty Z_i f_{r_n}(T_i)\right\}_{n=1}^\infty.$$
The next step, finding a similar representation of the process $\eta(s)$ for real $s$, is done by using continuity. The process $\eta$ has continuous sample paths, hence in particular it is continuous in probability.
We use without proof the auxiliary result: any sequence of random variables $\left\{\bigvee_{i=1}^\infty Z_i f_n^*(T_i)\right\}_{n=1}^\infty$ with $f_n^*$ spectral functions converges in probability as $n \to \infty$ if and only if the sequence $f_n^*$ converges in Lebesgue measure. This gives representation (9.6.8) of the process $\eta(s)$ for real $s$.
The final step, proving the continuity of $f_s(t)$ for almost each $t$ and the convergence of the integrals in 3., is provided by Theorem 3.2 of Resnick and Roy (1991): for a compact interval $I$ in $\mathbb{R}$, the process $\{\eta(s)\}_{s\in I}$ has continuous sample paths if and only if the family $f_s(t)$ of spectral functions is continuous in $s$ for almost all $t$ and if moreover
$$\int \sup_{s\in I} f_s(t)\,dt < \infty\,.$$
The proof of the converse statement of the theorem is easy.
Next we turn to the issue of stationarity.
Definition 9.6.9 A mapping $\Phi$ from $L_1^+$ (the nonnegative integrable functions on $[0,1]$) to $L_1^+$ is called a piston if for $h \in L_1^+$,
$$\Phi(h(t)) = r(t)\,h(H(t))$$
with $H$ a one-to-one measurable mapping from $[0,1]$ to $[0,1]$ and $r$ a positive measurable function, such that for every $h \in L_1^+$,
$$\int_0^1 \Phi(h(t))\,dt = \int_0^1 h(t)\,dt\,.$$
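A concrete piston (our own example, not from the text): take $H(t) = t^2$, which is one-to-one on $[0,1]$, and $r(t) = H'(t) = 2t$; the substitution $u = t^2$ then gives $\int_0^1 r(t)\,h(H(t))\,dt = \int_0^1 h(u)\,du$ for every $h \in L_1^+$. A quick numerical check:

```python
import numpy as np

t = np.linspace(0.0, 1.0, 200001)
dt = t[1] - t[0]

def piston(h):
    """Phi(h)(t) = r(t) h(H(t)) with H(t) = t^2 and r(t) = 2t."""
    return 2.0 * t * h(t ** 2)

h = lambda u: 1.0 + np.cos(2.0 * np.pi * u) ** 2
lhs = float(np.sum(piston(h)) * dt)   # int_0^1 Phi(h)(t) dt
rhs = float(np.sum(h(t)) * dt)        # int_0^1 h(t) dt
print(lhs, rhs)  # the two Riemann sums agree (both near 1.5)
```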
Theorem 9.6.10 Let $\{(Z_i, T_i)\}_{i=1}^\infty$ be a realization of a Poisson process on $(0,\infty] \times [0,1]$ with mean measure $(dr/r^2) \times d\lambda$ ($\lambda$ Lebesgue measure).
If the stochastic process $\{\eta(s)\}_{s\in\mathbb{R}}$ is simple max-stable, strictly stationary, and continuous a.s., then there is a function $h$ in $L_1^+$ with $\int_0^1 h(t)\,dt = 1$ and a continuous group of pistons $\{\Phi_s\}_{s\in\mathbb{R}}$ (continuous, i.e., $\Phi_{s_n}(h(t)) \to \Phi_s(h(t))$ as $s_n \to s$ for almost all $t \in [0,1]$) with
$$\{\eta(s)\}_{s\in\mathbb{R}} \stackrel{d}{=} \left\{\bigvee_{i=1}^\infty Z_i\,\Phi_s(h(T_i))\right\}_{s\in\mathbb{R}}. \qquad (9.6.12)$$
Conversely, every stochastic process of the form exhibited at the right-hand side of (9.6.12) with the stated conditions is simple max-stable, strictly stationary, and a.s. continuous.
Proof. We apply de Haan and Pickands (1986). It says that if $\eta$ is a simple max-stable process on $\mathbb{R}$, the representation of this theorem holds with "$\eta$ has continuous sample paths" replaced by "$\eta$ is continuous in probability" and with the statement "$\Phi_{s_n}(f(u)) \to \Phi_s(f(u))$ for almost all $u \in \mathbb{R}$" replaced by "$\int_0^1 |\Phi_{s_n}(f(u)) - \Phi_s(f(u))|\,du \to 0$."
Next, replacing convergence in probability with a.s. convergence on the one hand and replacing convergence in $L_1$-norm by a.s. convergence on the other hand can be done locally, i.e., for each compact interval. Theorem 3.2 of Resnick and Roy (1991) again justifies such replacement (cf. the proof of Theorem 9.6.7).
Conversely consider a process with the given representation. For $s_1, s_2 \in \mathbb{R}$ we have, with $f_s(t) := \Phi_s(h(t))$,
$$-\log P\left(\eta(s_1) \le x_1,\ \eta(s_2) \le x_2\right) = -\log P\left(Z_i f_{s_1}(T_i) \le x_1 \text{ and } Z_i f_{s_2}(T_i) \le x_2 \text{ for all } i\right)$$
$$= -\log P\left(\max_i Z_i \max\left(\frac{f_{s_1}(T_i)}{x_1}, \frac{f_{s_2}(T_i)}{x_2}\right) \le 1\right) = \int\!\!\int_{r \max(f_{s_1}(t)/x_1,\ f_{s_2}(t)/x_2) > 1} \frac{dr}{r^2}\,dt$$
$$= \int_0^1 \max\left(\frac{f_{s_1}(t)}{x_1}, \frac{f_{s_2}(t)}{x_2}\right) dt\,. \qquad (9.6.13)$$
Now $f_s(t) = \Phi_s(h(t))$ for $s \in \mathbb{R}$. Hence, since the $\Phi_s$ form a group and each $\Phi_s$ acts pointwise by $\Phi_s(h(t)) = r_s(t)\,h(H_s(t))$, the left-hand side of (9.6.13) equals
$$\int_0^1 \max\left(\frac{\Phi_{s_1}(h(t))}{x_1}, \frac{\Phi_{s_1}\left(\Phi_{s_2-s_1}(h(t))\right)}{x_2}\right) dt = \int_0^1 \Phi_{s_1}\left(\max\left(\frac{h(t)}{x_1}, \frac{\Phi_{s_2-s_1}(h(t))}{x_2}\right)\right) dt = \int_0^1 \max\left(\frac{h(t)}{x_1}, \frac{\Phi_{s_2-s_1}(h(t))}{x_2}\right) dt\,,$$
by the defining property of a piston. This depends on $s_1$ and $s_2$ only through $s_2 - s_1$, which gives stationarity of the two-dimensional distributions; higher-dimensional distributions are treated in the same way.
Note that all conditions, in particular condition (3) of Theorem 9.6.1, are fulfilled
since the densities are continuous and unimodal.
In all three cases the parameter $\beta$ has been introduced in order to control the amount of spatial dependence: note that when $\beta$ increases the amount of dependence between values at two fixed sites decreases.
The two-dimensional marginal distributions can be calculated explicitly in all three cases (and also for their two-dimensional analogues where $s$ is a vector in $\mathbb{R}^2$; see de Haan and Pereira (2006)).
1. For the exponential model:
$$-\log P\left(\eta(0) \le x,\ \eta(s) \le y\right) = \begin{cases} \dfrac{1}{y}\,, & 0 < y \le x e^{-\beta|s|}\,, \\[1ex] \dfrac{1}{x} + \dfrac{1}{y} - \dfrac{e^{-\beta|s|/2}}{\sqrt{xy}}\,, & x e^{-\beta|s|} < y < x e^{\beta|s|}\,, \\[1ex] \dfrac{1}{x}\,, & y \ge x e^{\beta|s|}\,. \end{cases}$$
2. For the normal model:
$$-\log P\left(\eta(0) \le x,\ \eta(s) \le y\right) = \frac{1}{x}\,\Phi\left(\frac{|s|\beta}{2} + \frac{1}{|s|\beta}\log\frac{y}{x}\right) + \frac{1}{y}\,\Phi\left(\frac{|s|\beta}{2} + \frac{1}{|s|\beta}\log\frac{x}{y}\right).$$
3. For the t model the expression is piecewise in $y$, with break points $x L_2^{-(\nu+1)/2}$, $x$, and $x L_1^{(\nu+1)/2}$, where
$$L_{1,2} := 1 + \frac{B^2 s^2}{2} \mp B|s|\sqrt{1 + \frac{B^2 s^2}{4}}\,,$$
and is expressed through probabilities $P_1(B,s,z)$ and $P_2(B,s,z)$ of events for a random variable $T_{\nu,1}$ with a Student-$t$ distribution with $\nu$ degrees of freedom and scale parameter one, where $z = (x/y)^{2/(\nu+1)}$; the explicit expressions, involving the quantities $Bsz/(1-z)$ and $s^2 z/(1-z)^2$, can be found in de Haan and Pereira (2006). These results can be used for constructing an estimator for the dependence parameter $\beta$.
Consider the stochastic process
$$\{\eta(s)\}_{s\in\mathbb{R}} := \left\{\bigvee_{i=1}^\infty Z_i\, e^{W_i(s) - |s|/2}\right\}_{s\in\mathbb{R}}, \qquad (9.8.1)$$
where $\{Z_i\}_{i=1}^\infty$ is a realization of a Poisson point process on $(0,\infty]$ with mean measure $dr/r^2$ and, independently, $\{W_i\}_{i=1}^\infty$ is a sequence of independent Brownian motions.
The process $\eta$ is stationary on $\mathbb{R}$. For the proof it is sufficient to prove that all marginal distributions are stationary. We shall show this for the two-dimensional distributions. Let $0 \le s_1 < s_2$ and write $u := s_2 - s_1$. Then for $x, y \in \mathbb{R}$,
$$-\log P\left(\eta(s_1) \le e^x,\ \eta(s_2) \le e^y\right) = E \max\left(e^{W(s_1) - s_1/2 - x},\ e^{W(s_2) - s_2/2 - y}\right)$$
$$= E\, e^{W(s_1) - s_1/2} \max\left(e^{-x},\ e^{W(s_2) - W(s_1) - (s_2 - s_1)/2 - y}\right) = E \max\left(e^{-x},\ e^{W(s_2) - W(s_1) - (s_2 - s_1)/2 - y}\right),$$
since $e^{W(s_1) - s_1/2}$ has mean one and is independent of the increment $W(s_2) - W(s_1)$. A direct computation with the $N(0,u)$ density then gives
$$-\log P\left(\eta(s_1) \le e^x,\ \eta(s_2) \le e^y\right) = e^{-x}\,\Phi\left(\frac{\sqrt{u}}{2} + \frac{y - x}{\sqrt{u}}\right) + e^{-y}\,\Phi\left(\frac{\sqrt{u}}{2} + \frac{x - y}{\sqrt{u}}\right),$$
with $\Phi$ the standard normal distribution function. Clearly the distribution depends on $s_1$ and $s_2$ only through $u = s_2 - s_1$. The reasoning is similar when $s_1 < s_2 \le 0$.
Finally consider the case $s_1 < 0 < s_2$:
$$-\log P\left(\eta(s_1) \le e^x,\ \eta(s_2) \le e^y\right) = E \max\left(e^{W(s_1) + s_1/2 - x},\ e^{W(s_2) - s_2/2 - y}\right).$$
Here $W(s_1)$ and $W(s_2)$ are independent, with variances $|s_1|$ and $s_2$. Denote the distribution function of $e^{W(s_1) + s_1/2 - x}$ by $F_1$ and the distribution function of $e^{W(s_2) - s_2/2 - y}$ by $F_2$. Then the expectation is
$$\int_0^\infty t\,d\left(F_1(t)F_2(t)\right) = \int_0^\infty t\,F_1'(t) \int_0^t F_2'(w)\,dw\,dt + \int_0^\infty t\,F_2'(t) \int_0^t F_1'(w)\,dw\,dt\,.$$
Note that
$$F_1'(t) = \frac{1}{t\sqrt{|s_1|}}\,\varphi\left(\frac{\sqrt{|s_1|}}{2} + \frac{\log t + x}{\sqrt{|s_1|}}\right)$$
with $\varphi = \Phi'$, and similarly for $F_2'$. Evaluating the two double integrals (each reduces to an expression of the form $e^{-x}P(\cdot)$ or $e^{-y}P(\cdot)$ for normal random variables with variances $|s_1|$ and $s_2$) yields
$$-\log P\left(\eta(s_1) \le e^x,\ \eta(s_2) \le e^y\right) = e^{-x}\,\Phi\left(\frac{\sqrt{v}}{2} + \frac{y - x}{\sqrt{v}}\right) + e^{-y}\,\Phi\left(\frac{\sqrt{v}}{2} + \frac{x - y}{\sqrt{v}}\right)$$
with $v := |s_1| + s_2 = s_2 - s_1$. Again the distribution depends on $s_1$ and $s_2$ only through $s_2 - s_1$, and stationarity follows.
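The stationary bivariate formula above can be checked numerically: the exponent equals $E\max(e^{-x}, e^{N - u/2 - y})$ with $N \sim N(0,u)$, which we evaluate by direct quadrature (function names are ours):

```python
import math
import numpy as np

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def neg_log_biv(x, y, u):
    """Closed form: e^{-x} Phi(sqrt(u)/2 + (y-x)/sqrt(u))
                  + e^{-y} Phi(sqrt(u)/2 + (x-y)/sqrt(u))."""
    r = math.sqrt(u)
    return math.exp(-x) * Phi(r / 2 + (y - x) / r) + math.exp(-y) * Phi(r / 2 + (x - y) / r)

def neg_log_biv_quad(x, y, u, m=200001):
    """E max(e^{-x}, e^{N - u/2 - y}), N ~ N(0, u), by quadrature."""
    t = np.linspace(-12.0 * math.sqrt(u), 12.0 * math.sqrt(u), m)
    dens = np.exp(-t ** 2 / (2.0 * u)) / math.sqrt(2.0 * math.pi * u)
    vals = np.maximum(math.exp(-x), np.exp(t - u / 2.0 - y))
    return float(np.sum(vals * dens) * (t[1] - t[0]))

print(neg_log_biv(0.0, 0.5, 1.0), neg_log_biv_quad(0.0, 0.5, 1.0))
```

The exponent also satisfies the obvious bounds $\max(e^{-x}, e^{-y}) \le -\log P \le e^{-x} + e^{-y}$ (complete dependence versus independence).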
Example 9.8.1 Let $Y$ be a random variable with distribution function $1 - 1/x$, $x \ge 1$. Let $W$ be Brownian motion independent of $Y$. Consider the process
$$\{\xi(s)\}_{s\in\mathbb{R}} := \left\{Y e^{W(s) - |s|/2}\right\}_{s\in\mathbb{R}}\,. \qquad (9.8.2)$$
We claim that this process is in the domain of attraction of the process in (9.8.1). Consider independent and identically distributed copies of the process: $\{Y_i e^{W_i(s) - |s|/2}\}_{s\in\mathbb{R}}$ for $i = 1, 2, \ldots$. Now consider the point process consisting of the points
$$\left\{\left(\frac{Y_i}{n},\ e^{W_i(s) - |s|/2}\right)\right\}_{i=1}^n\,. \qquad (9.8.3)$$
These are elements of $(0,\infty] \times C^+(\mathbb{R})$. We already know from Theorem 2.1.2 that the point process constructed from the points $\{Y_i/n\}_{i=1}^n$ converges in distribution to the point process constructed from the points $\{Z_i\}_{i=1}^\infty$ of (9.8.1). Since the second component is independent and does not change, the point process (9.8.3) converges in distribution to the point process constructed from the points $\{(Z_i, e^{W_i(s) - |s|/2})\}_{i=1}^\infty$. Then the point process constructed from the points $\{n^{-1} Y_i e^{W_i(s) - |s|/2}\}_{i=1}^n$ converges to the Poisson point process constructed from the points $\{Z_i e^{W_i(s) - |s|/2}\}_{i=1}^\infty$. The points are continuous functions. Since $\{\sup_{i\le n} n^{-1} Y_i e^{W_i(s) - |s|/2}\}_{s\in\mathbb{R}}$ is a continuous functional of the point process, we have indeed
$$\left\{\sup_{i\le n} n^{-1} Y_i e^{W_i(s) - |s|/2}\right\}_{s\in\mathbb{R}} \xrightarrow{d} \{\eta(s)\}_{s\in\mathbb{R}}$$
in $C^+(\mathbb{R})$.
The process (9.8.2) has an interesting property. For $a > 1$ the distribution of $\{\xi(s)/a\}_{s\in\mathbb{R}}$ given $\xi(0) > a$ is the same as the distribution of $\{\xi(s)\}_{s\in\mathbb{R}}$. This property, which we call excursion stability, is analogous to the defining property for the (generalized) Pareto distribution; see Exercise 3.1.
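The convergence in Example 9.8.1 suggests a simple way to generate approximate sample paths of the limit process (9.8.1): take the pointwise maximum of $n$ independent copies of $n^{-1} Y e^{W(s)-|s|/2}$. A sketch on a finite grid (all names and grid choices are ours):

```python
import numpy as np

rng = np.random.default_rng(2)
s = np.linspace(-2.0, 2.0, 81)
ds = s[1] - s[0]
i0 = len(s) // 2          # index of s = 0

def two_sided_bm(rng):
    """Brownian motion on the grid s with W(0) = 0: independent increments
    in both directions of time."""
    inc = rng.normal(scale=np.sqrt(ds), size=len(s) - 1)
    w = np.zeros(len(s))
    w[i0 + 1:] = np.cumsum(inc[i0:])
    w[:i0] = -np.cumsum(inc[:i0][::-1])[::-1]
    return w

def approx_eta(n, rng):
    """max_{i<=n} n^{-1} Y_i exp(W_i(s) - |s|/2), Y_i standard Pareto."""
    Y = 1.0 / rng.uniform(size=n)       # P(Y > x) = 1/x for x >= 1
    W = np.array([two_sided_bm(rng) for _ in range(n)])
    return (Y[:, None] / n * np.exp(W - np.abs(s) / 2)).max(axis=0)

eta = approx_eta(2000, rng)
```

At $s = 0$ the approximation is easy to analyze: $P(\max_i Y_i/n \le x) = (1 - 1/(nx))^n \to e^{-1/x}$, the standard Fréchet law.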
Next we consider maxima of independent and identically distributed Ornstein-Uhlenbeck processes. We show that if we apply a suitable time transformation, the limit process is max-stable.
Example 9.8.2 Let $\{X(s)\}_{s\in\mathbb{R}}$ be an Ornstein-Uhlenbeck process, i.e.,
$$X(s) = \int_{-\infty}^s e^{-(s-u)/2}\,dW(u)$$
for all $s \in \mathbb{R}$, with $W$ Brownian motion on $(-\infty,\infty)$, i.e., two independent Brownian motions starting at $0$ and going off in two directions of time. Since for $s \ne t$ the random vector $(X(s), X(t))$ is multivariate normal with correlation coefficient less than one, Example 6.2.6 tells us that relation (9.5.1) cannot hold for any max-stable process in $C[0,1]$: since such a limit $Y$ has continuous sample paths, $Y(s)$ and $Y(t)$ cannot be independent for all $s \ne t$. However, after a contraction of time we do get convergence:
$$\left\{\bigvee_{i=1}^n b_n\left(X_i\left(\frac{s}{b_n^2}\right) - b_n\right)\right\}_{|s|\le s_0} \xrightarrow{d} \left\{\log \eta(s)\right\}_{|s|\le s_0} \qquad (9.8.4)$$
in $C[-s_0, s_0]$ for arbitrary $s_0 > 0$, where $X_1, X_2, \ldots$ are independent and identically distributed copies of $X$, $\eta$ is the process (9.8.1), and the $b_n$ are the correct normalizing constants for the standard one-dimensional normal distribution, e.g., $b_n = (2\log n - \log\log n - \log(4\pi))^{1/2}$ (cf. Example 1.1.7). In order to show convergence we write
$$X(s) = e^{-s/2}\left(X(0) + \int_0^s e^{u/2}\,dW(u)\right);$$
hence
$$b_n \int_0^{s/b_n^2} e^{u/2}\,dW(u) = \left(1 + o\left(\frac{1}{b_n^2}\right)\right) W^*(s)$$
for a Brownian motion $W^*$. Finally, for $|s| \le s_0$,
$$b_n^2\left(1 - e^{-s/(2b_n^2)}\right) = \frac{s}{2} + o\left(\frac{1}{b_n^2}\right).$$
It follows that
$$\left\{\bigvee_{i=1}^n \left(b_n\left(X_i(0) - b_n\right) + W_i^*(s) - \frac{|s|}{2}\right)\right\}_{|s|\le s_0} \xrightarrow{d} \left\{\bigvee_{i=1}^\infty \left(\log Z_i + W_i(s) - \frac{|s|}{2}\right)\right\}_{|s|\le s_0} = \left\{\log \eta(s)\right\}_{|s|\le s_0}\,, \qquad (9.8.5)$$
from which (9.8.4) follows.
Exercises
9.1. Show that the constant $c$ in Lemma 9.3.4 is $\rho\left(\bar{C}_1^+[0,1]\right)$, where $\rho$ is the spectral measure of Section 9.4. Argue that this constant is an analogue of $L(1,\ldots,1)$, where $L$ is the dependence function defined in Section 6.1.5.
9.2. In Section 7.4 (finite-dimensional extremes) a quantity $\kappa$ has been introduced that quantifies the strength of dependence. It was shown that for $d$ dimensions $\kappa$ is $d/(-\log P(Z_1 \le 1, \ldots, Z_d \le 1)) = d/L(1,\ldots,1)$, where $(Z_1, \ldots, Z_d)$ is a random vector with distribution function $G_0$, where $G_0$ is from Theorem 6.1.1. Argue that $1/c$ could serve as an infinite-dimensional analogue of this coefficient, where $c$ is the constant from Lemma 9.3.4. How would one estimate this quantity $c$?
9.3. Show that all marginal distributions of the process of Example 9.4.3 are indeed $\exp(-1/x)$, $x > 0$.
9.4. Check that the regular variation condition of Theorem 9.5.1 (2c) implies the regular variation condition of Theorem 6.2.1(1) for all marginal distributions.
9.5. Consider the stochastic process defined by $\xi(s) := Y V(s)$ for $s \in \mathbb{R}$, where $Y$ has distribution function $1 - 1/x$, $x \ge 1$, and $V$ is a continuous stochastic process independent of $Y$ satisfying $EV(s) = 1$ for all $s$ and $E \sup_{a\le s\le b} V(s) < \infty$ for $a < b$. Show that $\xi$ is in the domain of attraction of a simple max-stable process that has the representation of Corollary 9.4.5 with the same auxiliary process $V$. Moreover, for $a > 1$ and $V(0) = 1$, $\{\xi(s)/a\}_{s\ge 0}$ given $\xi(0) > a$ has the same distribution as $\xi$. This property resembles a corresponding property for a generalized Pareto distribution in finite-dimensional space. Find that the one-dimensional marginal distributions equal, for each $s > 0$ with $V(s) > 0$ a.s.,
$$P\left(\xi(s) > x\right) = \frac{1}{x}\int_0^x P\left(V(s) > u\right) du\,, \qquad x \ge 1\,.$$
Note that they depend on $s$ and do not follow a generalized Pareto distribution (cf. Section 3.1).
9.6. Consider independent and identically distributed random vectors $(R, \Phi), (R_1, \Phi_1), (R_2, \Phi_2), \ldots$, where $R$ and $\Phi$ are independent, $P(R > r) = \exp(-r^2/2)$, and $\Phi$ has a uniform distribution over $[0, 2\pi]$. This means that $(R\cos\Phi, R\sin\Phi)$ has a standard normal distribution. Prove that $\left\{\bigvee_{i=1}^n b_n\left(R_i \cos(\theta/b_n - \Phi_i) - b_n\right)\right\}_\theta$ converges to $\left\{\bigvee_{i=1}^\infty \left(T_i + \theta Z_i - \theta^2/2\right)\right\}_\theta$, where $\{T_i\}$ is an enumeration of the points of a point process on $\mathbb{R}$ with mean measure $e^{-x}\,dx$ and the $Z_i$ are independent and identically distributed standard normal random variables (Eddy and Gale (1981)).
Hint: Expand $\cos(\theta/b_n - \Phi_i)$ and proceed as in Example 9.8.2. Note that by Corollary 5.4.2, $\bigvee_{i=1}^n R_i/b_n \xrightarrow{P} 1$.
10
Estimation in C[0,1]
We write
$$p_n = P\left(X(s) > f_n(s) \text{ for some } s \in [0,1]\right) = P\left(\xi(s) > c_n h(s) \text{ for some } s \in [0,1]\right).$$
Suppose that
$$\left\{\frac{1}{n}\bigvee_{i\le n}\xi_i(s)\right\}_{s\in[0,1]} \to \{\eta(s)\}_{s\in[0,1]}$$
in $C^+[0,1]$ with $P(\eta(s) \le 1) = e^{-1}$ for $0 \le s \le 1$ (i.e., standard Fréchet). Then $\eta$ is a simple max-stable process. Obviously in this case the exponent measure $\nu$ is the only unknown feature characterizing the process. We are going to develop an estimator for $\nu$. As in the finite-dimensional case the estimator is based on a small fraction of higher observations only. Define for $k \le n$ the estimator $\hat\nu_{n,k}$ as
$$\hat\nu_{n,k}(A) := \frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\,\xi_i \in A\right\}}\,.$$
We claim that $\hat\nu_{n,k}$ is a consistent estimator for $\nu$ if $k = k(n) \to \infty$, $k/n \to 0$, $n \to \infty$.
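In the "simple" case the estimator $\hat\nu_{n,k}$ is one line of code. As a sketch (our own toy model, not from the text: completely dependent processes $\xi_i(s) = R_i g(s)$ with $R_i$ standard Fréchet and $\sup_s g(s) = 1$, so that $\nu\{f : |f|_\infty > c\} = 1/c$):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 100000, 1000
s = np.linspace(0.0, 1.0, 21)
g = 0.5 + 0.5 * np.sin(np.pi * s)        # deterministic profile, max = 1

R = -1.0 / np.log(rng.uniform(size=n))   # standard Frechet variables
xi = R[:, None] * g[None, :]             # xi_i(s) = R_i g(s)

def nu_hat(indicator):
    """nu_hat_{n,k}(A) = k^{-1} sum_i 1{(k/n) xi_i in A}."""
    scaled = (k / n) * xi
    return float(np.sum(indicator(scaled))) / k

c = 2.0
est = nu_hat(lambda f: f.max(axis=1) > c)  # A = S_c = {f : |f|_inf > c}
print(est)  # should be close to nu(S_c) = 1/c = 0.5
```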
Theorem 10.2.1 Let $\xi, \xi_1, \xi_2, \xi_3, \ldots$ be i.i.d. stochastic processes in $C^+[0,1]$. If
$$\left\{\frac{1}{n}\bigvee_{i\le n}\xi_i(s)\right\}_{s\in[0,1]} \xrightarrow{d} \{\eta(s)\}_{s\in[0,1]} \qquad (10.2.1)$$
in $C^+[0,1]$, then for any $c > 0$, as $k = k(n) \to \infty$, $k(n)/n \to 0$, $n \to \infty$,
$$\hat\nu_{n,k}\big|_{S_c} \xrightarrow{P} \nu\big|_{S_c}\,, \qquad (10.2.2)$$
where at both sides we consider the restrictions of the measures to the set
$$S_c := \left\{f \in C^+[0,1] : |f|_\infty > c\right\}$$
and convergence is in the space of finite measures on $C^+[0,1]$. The measure $\nu$ is the exponent measure of the process $\eta$ (cf. Section 9.3).
Proof. According to Daley and Vere-Jones (1988), Theorem 9.1.VI, we need only to prove that the finite-dimensional marginal distributions converge, i.e., for any Borel $\nu$-continuous sets $E_1, E_2, \ldots, E_m \subset S_c$,
$$\left(\hat\nu_{n,k}(E_1),\ \hat\nu_{n,k}(E_2),\ \ldots,\ \hat\nu_{n,k}(E_m)\right) \xrightarrow{P} \left(\nu(E_1),\ \nu(E_2),\ \ldots,\ \nu(E_m)\right).$$
Since the limit is not random, this is equivalent to the following: for any Borel $\nu$-continuous set $E \subset S_c$,
$$\hat\nu_{n,k}(E) \xrightarrow{P} \nu(E)\,,$$
which has been proved in Corollary 9.3.2.
A corollary of this theorem is the uniform convergence of the marginal tail empirical distribution functions as well as the tail quantile functions. This will be useful later on.
Corollary 10.2.2 For each $s$ let $\xi_{1,n}(s) \le \xi_{2,n}(s) \le \cdots \le \xi_{n,n}(s)$ be the order statistics of $\xi_1(s), \xi_2(s), \ldots, \xi_n(s)$ and define
$$1 - G_{n,s}(x) := \frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\,\xi_i(s) > x\right\}}\,.$$
Suppose the domain of attraction condition (10.2.1) holds. Then for any $c > 0$,
$$\sup_{0\le s\le 1,\ x\ge c}\left|1 - G_{n,s}(x) - \frac{1}{x}\right| \xrightarrow{P} 0 \qquad (10.2.3)$$
and
$$\sup_{0\le s\le 1,\ x\ge c}\left|\frac{k}{n}\,\xi_{n-[kx],n}(s) - \frac{1}{x}\right| \xrightarrow{P} 0\,. \qquad (10.2.4)$$
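Statement (10.2.3) is easy to check by simulation at a fixed site $s$: with standard Fréchet observations, the tail empirical distribution function $1 - G_{n,s}$ is close to $1/x$ on $x \ge c$ (a sketch; names are ours):

```python
import numpy as np

rng = np.random.default_rng(4)
n, k = 200000, 2000
xi_s = -1.0 / np.log(rng.uniform(size=n))   # xi_i(s): standard Frechet

def tail_G(x):
    """1 - G_{n,s}(x) = k^{-1} sum_i 1{(k/n) xi_i(s) > x}."""
    return float(np.sum((k / n) * xi_s > x)) / k

xs = [0.5, 1.0, 2.0, 4.0]
print([round(tail_G(x), 3) for x in xs])  # each entry close to 1/x
```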
Later on we shall also need
$$\sup_{0\le s\le 1,\ x\le c}\left|\frac{1}{1 - G_{n,s}(x)} - x\right| \xrightarrow{P} 0 \qquad (10.2.5)$$
and
$$\sup_{0\le s\le 1,\ x\le c}\left|\frac{k}{n}\,\xi_{n-[k/x],n}(s) - x\right| \xrightarrow{P} 0\,. \qquad (10.2.6)$$
Proof. Fix $c > 0$. By changing the probability space and using a Skorohod construction, we can pretend that the result of Theorem 10.2.1 holds a.s., i.e.,
$$\hat\nu_{n,k}\big|_{S_c} \to \nu\big|_{S_c} \qquad \text{a.s.} \qquad (10.2.7)$$
This means convergence of finite random measures. A metric characterizing this type of convergence is given in Daley and Vere-Jones (1988), A.2.5:
$$d(\nu, \mu) := \inf\left\{\varepsilon > 0 : \nu(F) \le \mu(F^\varepsilon) + \varepsilon \text{ and } \mu(F) \le \nu(F^\varepsilon) + \varepsilon \text{ for all closed sets } F \subset C^+[0,1]\right\}, \qquad (10.2.8)$$
with $F^\varepsilon$ the $\varepsilon$-neighbourhood of $F$. Hence for $n$ sufficiently large,
$$d\left(\hat\nu_{n,k}\big|_{S_c},\ \nu\big|_{S_c}\right) < \varepsilon \qquad \text{a.s.} \qquad (10.2.9)$$
Write $E_{x,s} := \{f \in C^+[0,1] : f(s) \ge x\}$; note that $E_{x,s}$ is closed and $\nu(E_{x,s}) = 1/x$. It follows from (10.2.9) that for $x \ge c$, $0 \le s \le 1$,
$$1 - G_{n,s}(x) = \hat\nu_{n,k}\left\{f \in C^+[0,1] : f(s) > x\right\} \le \hat\nu_{n,k}\left(E_{x,s}\right) \le \nu\left(E_{x-\varepsilon,s}\right) + \varepsilon = \frac{1}{x-\varepsilon} + \varepsilon$$
and
$$1 - G_{n,s}(x) \ge \hat\nu_{n,k}\left(E_{x+\varepsilon,s}\right) \ge \nu\left(E_{x+2\varepsilon,s}\right) - \varepsilon = \frac{1}{x+2\varepsilon} - \varepsilon\,.$$
Hence
$$\sup_{0\le s\le 1,\ x\ge c}\left|1 - G_{n,s}(x) - \frac{1}{x}\right| \to 0 \qquad \text{a.s.} \qquad (10.2.10)$$
Since $\frac{k}{n}\,\xi_{n-[kx],n}(s)$ is the inverse function of $1 - G_{n,s}(x)$, it follows that for $0 < c < b < \infty$,
$$\sup_{0\le s\le 1,\ c\le x\le b}\left|\frac{k}{n}\,\xi_{n-[kx],n}(s) - \frac{1}{x}\right| \to 0 \qquad \text{a.s.,} \qquad (10.2.11)$$
and since for $b$ large $\frac{k}{n}\,\xi_{n-[kx],n}(s)$ is eventually small for all $x \ge b$, the supremum may be extended to all $x \ge c$:
$$\sup_{0\le s\le 1,\ x\ge c}\left|\frac{k}{n}\,\xi_{n-[kx],n}(s) - \frac{1}{x}\right| \to 0 \qquad \text{a.s.}$$
Statements (10.2.5) and (10.2.6) follow in the same way from
$$\sup_{0\le s\le 1,\ c\le x\le b}\left|\frac{1}{1 - G_{n,s}(x)} - x\right| \to 0 \qquad \text{a.s.}$$
Let $X, X_1, X_2, \ldots$ be i.i.d. stochastic processes in $C[0,1]$ with
$$\left\{\frac{\bigvee_{i\le n} X_i(s) - b_s(n)}{a_s(n)}\right\}_{s\in[0,1]} \xrightarrow{d} \{Y(s)\}_{s\in[0,1]} \qquad (10.3.1)$$
and define
$$\{\eta(s)\}_{s\in[0,1]} := \left\{\left(1 + \gamma(s)Y(s)\right)^{1/\gamma(s)}\right\}_{s\in[0,1]}\,, \qquad (10.3.2)$$
a simple max-stable process. With $\xi_i(s) := 1/(1 - F_s(X_i(s)))$, the measure
$$\frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\,\xi_i \in A\right\}}\,, \qquad A \subset C^+[0,1] \text{ Borel,}$$
is consistent for $\nu$. However, this is not a statistic since $1 - F_s$ is unknown. Hence we replace $1 - F_s$ by its empirical counterpart
$$1 - F_{n,s}(x) := \frac{1}{n}\sum_{j=1}^n 1_{\left\{X_j(s) > x\right\}}$$
and define
$$\hat\nu_{n,k}(A) := \frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\,\hat\xi_i \in A\right\}} \qquad (10.3.3)$$
with
$$\hat\xi_i(s) := \frac{1}{1 - F_{n,s}(X_i(s))}\,. \qquad (10.3.4)$$
We know by Theorem 9.2.1 that for the processes $\xi_i$ relation (10.2.1) holds with $\eta(s) := (1 + \gamma(s)Y(s))^{1/\gamma(s)}$, $s \in [0,1]$. This leads to a simpler way of writing (10.3.4):
$$\frac{k}{n}\,\hat\xi_i(s) = \frac{1}{1 - G_{n,s}\left(\frac{k}{n}\,\xi_i(s)\right)} \qquad (10.3.5)$$
with $G_{n,s}$ as in Corollary 10.2.2. Hence we can analyze this estimator using the results of Section 10.2.
Theorem 10.3.1 Let $X, X_1, X_2, \ldots$ be i.i.d. stochastic processes in $C[0,1]$ and assume that their distribution is in the domain of attraction of a max-stable process in $C[0,1]$, i.e., (10.3.1) holds. Let
$$1 - F_{n,s}(x) := \frac{1}{n}\sum_{j=1}^n 1_{\left\{X_j(s) > x\right\}}\,.$$
Then, as $k \to \infty$, $k/n \to 0$, $n \to \infty$, for all $c > 0$,
$$\hat\nu_{n,k}\big|_{S_c} \xrightarrow{P} \nu\big|_{S_c} \qquad (10.3.7)$$
in the space of finite measures on $C^+[0,1]$, where on both sides we consider the restriction of the measure to the set
$$S_c := \left\{f \in C^+[0,1] : |f|_\infty > c\right\}.$$
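The rank transform (10.3.4) makes the estimator of Theorem 10.3.1 fully data-driven: $1 - F_{n,s}(X_i(s))$ is just a rank count, so no knowledge of the margins is needed. A sketch (to avoid division by zero at the sample maximum we use the common $n + 1 - R$ smoothing, which is our own choice, not from the text):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k = 50000, 500
sites = np.arange(5)
U = rng.uniform(size=n)
# toy data: completely dependent across sites, arbitrary continuous margins
X = (-np.log(1.0 - U))[:, None] * (1.0 + 0.2 * sites)[None, :]

R = np.argsort(np.argsort(X, axis=0), axis=0) + 1   # R[i,s]: rank, 1 = smallest
xi_hat = n / (n + 1.0 - R)                           # ~ 1/(1 - F_{n,s}(X_i(s)))

scaled = (k / n) * xi_hat
c = 2.0
nu_est = float(np.sum(scaled.max(axis=1) > c)) / k
print(nu_est)  # complete dependence: nu(S_c) = 1/c = 0.5
```

Note that the rank transform is invariant under monotone transformations of the margins, which is exactly why $\hat\nu_{n,k}$ is a statistic.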
Proof. We have only to prove that for a $\nu$-continuous Borel set $E \subset S_c$,
$$\hat\nu_{n,k}(E) \to \nu(E)$$
in probability (cf. the proof of Theorem 10.2.1). Write
$$E = \left(E \cap S_{[c,b]}\right) \cup \left(E \cap S_b\right) =: E_1 \cup E_2$$
with
$$S_{[c,b]} := \left\{f \in C^+[0,1] : c < |f|_\infty \le b\right\}.$$
Let $(k/n)\hat\xi_i \in E_1$. Then $(k/n)\hat\xi_i(s) \le b$ for $0 \le s \le 1$; hence, by (10.3.5) and (10.2.5) of Corollary 10.2.2, for sufficiently large $n$ the process $(k/n)\hat\xi_i$ is uniformly close to $(k/n)\xi_i$, and since $E_1$ is $\nu$-continuous it follows as in the proof of Theorem 10.2.1 that
$$\frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\hat\xi_i \in E_1\right\}} \xrightarrow{P} \nu(E_1)\,. \qquad (10.3.8)$$
Next, for $\varepsilon > 0$,
$$\hat\nu_{n,k}(E_2) \le \hat\nu_{n,k}\left(\left\{f : |f|_\infty > b\right\}\right) \le \frac{1}{k}\sum_{i=1}^n 1_{\left\{\left|(k/n)\xi_i\right|_\infty > b - \varepsilon\right\}}$$
eventually, and by (10.2.2) the right-hand side tends to $\nu(S_{b-\varepsilon})$, which by Theorem 9.3.7 equals $C/(b-\varepsilon)$ for some positive constant $C$. By choosing $b$ large enough this can be made smaller than $\varepsilon$. Then as $n \to \infty$,
$$P\left(\hat\nu_{n,k}(E \cap S_b) > \varepsilon\right) \to 0\,. \qquad (10.3.9)$$
The proof is completed by combining (10.3.8) and (10.3.9).
Assume
$$\left\{\frac{\bigvee_{i\le n} X_i(s) - b_s(n)}{a_s(n)}\right\}_{s\in[0,1]} \xrightarrow{d} \{Y(s)\}_{s\in[0,1]} \qquad (10.4.1)$$
in $C[0,1]$, where the marginal limit distribution functions are
$$\exp\left(-(1 + \gamma(s)x)^{-1/\gamma(s)}\right) \qquad (10.4.2)$$
for each $s \in [0,1]$, where $\gamma \in C[0,1]$. The function $\gamma$ is called the index function. In this section we develop estimators for $\gamma$, the scale, and the location. The estimators will be based on the moment estimator of Section 3.5, but similar results should hold, for example, for the maximum likelihood estimator.
Now, since in applications one does not need estimators of scale and location for extreme order statistics, but rather for intermediate ones such as those of Section 2.4, we shall specify what we want to estimate.
339
These are
$$a_s(t) := a_s([t])\,, \qquad b_s(t) := b_s([t])\,, \qquad (10.4.3)$$
where
$$U_s(t) := F_s^{\leftarrow}\left(1 - \frac{1}{t}\right). \qquad (10.4.4)$$
We take $b_s(n/k) = U_s(n/k)$,
which can be achieved by a shift. Next we introduce the estimators. They are simple extensions of the ones used in the finite-dimensional case. Define the sample functions
$$M_n^{(j)}(s) := \frac{1}{k}\sum_{i=0}^{k-1}\left(\log X_{n-i,n}(s) - \log X_{n-k,n}(s)\right)^j\,, \qquad j = 1, 2\,. \qquad (10.4.6)$$
Next define
$$\hat\gamma_+(s) := M_n^{(1)}(s)\,, \qquad (10.4.7)$$
$$\hat\gamma_-(s) := 1 - \frac{1}{2}\left(1 - \frac{\left(M_n^{(1)}(s)\right)^2}{M_n^{(2)}(s)}\right)^{-1}, \qquad (10.4.8)$$
$$\hat\gamma(s) := \hat\gamma_+(s) + \hat\gamma_-(s)\,, \qquad (10.4.9)$$
$$\hat a_s(n/k) := X_{n-k,n}(s)\,\hat\gamma_+(s)\left(1 - \hat\gamma_-(s)\right), \qquad (10.4.10)$$
$$\hat b_s(n/k) := X_{n-k,n}(s)\,. \qquad (10.4.11)$$
10.4.1 Consistency
We have the following consistency result.
Theorem 10.4.1 Let $X_1, X_2, \ldots$ be i.i.d. stochastic processes in $C[0,1]$ and assume that their distribution is in the domain of attraction of a max-stable process in $C[0,1]$, i.e., (10.4.1) holds. If $k = k(n) \to \infty$, $k/n \to 0$, $n \to \infty$, then
$$\sup_{0\le s\le 1}\left|\hat\gamma_+(s) - \gamma_+(s)\right| \xrightarrow{P} 0 \quad \text{with } \gamma_+(s) := \gamma(s) \vee 0\,, \qquad (10.4.12)$$
$$\sup_{0\le s\le 1}\left|\hat\gamma_-(s) - \gamma_-(s)\right| \xrightarrow{P} 0 \quad \text{with } \gamma_-(s) := \gamma(s) \wedge 0\,, \qquad (10.4.13)$$
$$\sup_{0\le s\le 1}\left|\hat\gamma(s) - \gamma(s)\right| \xrightarrow{P} 0\,, \qquad (10.4.14)$$
$$\sup_{0\le s\le 1}\left|\frac{\hat a_s(n/k)}{a_s(n/k)} - 1\right| \xrightarrow{P} 0\,, \qquad (10.4.15)$$
$$\sup_{0\le s\le 1}\left|\frac{\hat b_s(n/k) - U_s(n/k)}{a_s(n/k)}\right| \xrightarrow{P} 0\,. \qquad (10.4.16)$$
For the proof of Theorem 10.4.1 we need two technical lemmas. The first one has been taken from Appendix B.
Lemma 10.4.2 Suppose that the functions $\log g_s(t)$ and $a_s(t) > 0$ are locally bounded in $0 \le s \le 1$, $t_0 \le t < \infty$, and for some $\gamma \in C[0,1]$ and all $x > 0$,
$$\lim_{t\to\infty} \frac{g_s(tx) - g_s(t)}{a_s(t)} = \frac{x^{\gamma(s)} - 1}{\gamma(s)} \qquad (10.4.17)$$
uniformly for $0 \le s \le 1$. Then
$$\lim_{t\to\infty} \frac{a_s(tx)}{a_s(t)} = x^{\gamma(s)} \qquad (10.4.18)$$
uniformly for $0 \le s \le 1$, and for any $\varepsilon > 0$ there exists $t_0 > 0$ such that for $t \ge t_0$, $tx \ge t_0$,
$$\left|\frac{a_s(tx)}{a_s(t)} - x^{\gamma(s)}\right| \le \varepsilon\,x^{\gamma(s)}\max\left(x^\varepsilon, x^{-\varepsilon}\right), \qquad (10.4.19)$$
or alternatively,
$$(1-\varepsilon)\,x^{\gamma(s)}\min\left(x^\varepsilon, x^{-\varepsilon}\right) < \frac{a_s(tx)}{a_s(t)} < (1+\varepsilon)\,x^{\gamma(s)}\max\left(x^\varepsilon, x^{-\varepsilon}\right) \qquad (10.4.20)$$
and
$$\left|\frac{g_s(tx) - g_s(t)}{a_s(t)} - \frac{x^{\gamma(s)} - 1}{\gamma(s)}\right| \le \varepsilon\,x^{\gamma(s)}\max\left(x^\varepsilon, x^{-\varepsilon}\right). \qquad (10.4.21)$$
Further,
$$\lim_{t\to\infty} \frac{a_s(t)}{g_s(t)} = \gamma_+(s) \qquad (10.4.22)$$
uniformly for $0 \le s \le 1$ with $\gamma_+(s) := \max(\gamma(s), 0)$, and for any $\varepsilon > 0$ there exists $t_0$ such that for $t, tx \ge t_0$,
$$\left|\frac{\log g_s(tx) - \log g_s(t)}{a_s(t)/g_s(t)} - \frac{x^{\gamma_-(s)} - 1}{\gamma_-(s)}\right| \le \varepsilon\,x^{\gamma_-(s)}\max\left(x^\varepsilon, x^{-\varepsilon}\right). \qquad (10.4.23)$$
The second lemma is a uniform law of large numbers for the top order statistics.
Lemma 10.4.3 Let $\xi_1, \xi_2, \ldots$ be i.i.d. stochastic processes with
$$\left\{\frac{1}{n}\bigvee_{i=1}^n \xi_i(s)\right\}_{s\in[0,1]} \xrightarrow{d} \{\eta(s)\}_{s\in[0,1]} \qquad (10.4.24)$$
in $C^+[0,1]$. Let $\xi_{1,n}(s) \le \xi_{2,n}(s) \le \cdots \le \xi_{n,n}(s)$ be the order statistics of $\xi_1(s), \xi_2(s), \ldots, \xi_n(s)$. Also let $\mu$ and $\lambda$ be continuous functions defined on $[0,1]$ with $\mu < 1$, $\lambda < 1$, $\mu + \lambda < 1$. Then
$$\sup_{0\le s\le 1}\left|\frac{1}{k}\sum_{i=0}^{k-1}\left(\frac{\xi_{n-i,n}(s)}{\xi_{n-k,n}(s)}\right)^{\mu(s)} - \frac{1}{1 - \mu(s)}\right| \xrightarrow{P} 0 \qquad (10.4.25)$$
and
$$\sup_{0\le s\le 1}\left|\frac{1}{k}\sum_{i=0}^{k-1}\left(\left(\frac{\xi_{n-i,n}(s)}{\xi_{n-k,n}(s)}\right)^{\mu(s)} - 1\right)\left(\left(\frac{\xi_{n-i,n}(s)}{\xi_{n-k,n}(s)}\right)^{\lambda(s)} - 1\right) - \frac{\mu(s)\lambda(s)\left(2 - \mu(s) - \lambda(s)\right)}{\left(1 - \mu(s) - \lambda(s)\right)\left(1 - \mu(s)\right)\left(1 - \lambda(s)\right)}\right| \xrightarrow{P} 0\,. \qquad (10.4.26)$$
Proof. We shall prove (10.4.25); the proof of (10.4.26) is similar. Observe that, with $T_{n,s} := (k/n)\,\xi_{n-k,n}(s)$ and partial summation,
$$\frac{1}{k}\sum_{i=0}^{k-1}\left(\frac{\xi_{n-i,n}(s)}{\xi_{n-k,n}(s)}\right)^{\mu(s)} - 1 = \mu(s)\,T_{n,s}^{-\mu(s)}\int_{T_{n,s}}^\infty \left(1 - G_{n,s}(x)\right) x^{\mu(s)-1}\,dx\,,$$
since for $x \ge T_{n,s}$ the number of indices $i \le k-1$ with $(k/n)\,\xi_{n-i,n}(s) > x$ equals $k\left(1 - G_{n,s}(x)\right)$. By (10.2.3) and (10.2.4), uniformly in $s$, $T_{n,s} \to 1$ and $1 - G_{n,s}(x) \to 1/x$ uniformly on $x \ge c$; this settles the convergence of the integral over any bounded interval. For the tail, the integrand is dominated by an integrable sequence whose integrals converge; hence by Pratt's lemma (summarizing: if $g_n \to g$ pointwise, $|g_n| \le f_n$ for all $n$, and $\int f_n \to \int f$ with $f_n \to f$, for some functions $f, g$, then $\int g_n \to \int g$; Pratt (1960)),
$$\mu(s)\,T_{n,s}^{-\mu(s)}\int_{T_{n,s}}^\infty \left(1 - G_{n,s}(x)\right) x^{\mu(s)-1}\,dx \xrightarrow{P} \mu(s)\int_1^\infty x^{\mu(s)-2}\,dx = \frac{\mu(s)}{1 - \mu(s)}$$
uniformly in $s$, and (10.4.25) follows.
Proof (of Theorem 10.4.1). Let
$$\xi_i(s) := \frac{1}{1 - F_s(X_i(s))}$$
as in Section 10.3. Then, according to Theorem 9.2.1, for $\xi_1(s), \xi_2(s), \ldots$ the results of Section 10.2 hold (the "simple" case).
We first prove (10.4.16). Note that $X_{n-k,n}(s) = U_s(\xi_{n-k,n}(s))$; hence
$$\frac{\hat b_s(n/k) - U_s(n/k)}{a_s(n/k)} = \frac{U_s\left(\xi_{n-k,n}(s)\right) - U_s(n/k)}{a_s(n/k)}\,,$$
which by Corollary 10.2.2 (10.2.4) and Theorem 9.2.1 (note that as in Theorem 1.1.2 one sees that (9.2.6) holds with $n$ replaced by $t$ running through the reals) converges to zero, in probability and uniformly in $s$.
For the proof of the other statements of the theorem we start with the following:
$$M_n^{(1)}(s) = \frac{1}{k}\sum_{i=0}^{k-1}\left(\log X_{n-i,n}(s) - \log X_{n-k,n}(s)\right) = \frac{1}{k}\sum_{i=0}^{k-1}\left(\log U_s(\xi_{n-i,n}(s)) - \log U_s(\xi_{n-k,n}(s))\right).$$
By (10.4.23) of Lemma 10.4.2, applied with $g_s = U_s$, $t = \xi_{n-k,n}(s)$, and $x = \xi_{n-i,n}(s)/\xi_{n-k,n}(s)$,
$$\left|\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))} - \frac{1}{k}\sum_{i=0}^{k-1}\frac{\left(\xi_{n-i,n}(s)/\xi_{n-k,n}(s)\right)^{\gamma_-(s)} - 1}{\gamma_-(s)}\right| \le \frac{\varepsilon}{k}\sum_{i=0}^{k-1}\left(\frac{\xi_{n-i,n}(s)}{\xi_{n-k,n}(s)}\right)^{\gamma_-(s)+\varepsilon}.$$
Upon applying Lemma 10.4.3 we then get
$$\sup_{0\le s\le 1}\left|\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))} - \frac{1}{1 - \gamma_-(s)}\right| \xrightarrow{P} 0\,. \qquad (10.4.29)$$
Similarly we get
$$\sup_{0\le s\le 1}\left|\frac{M_n^{(2)}(s)}{\left(a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))\right)^2} - \frac{2}{(1 - \gamma_-(s))(1 - 2\gamma_-(s))}\right| \xrightarrow{P} 0 \qquad (10.4.30)$$
and, by (10.4.22),
$$\sup_{0\le s\le 1}\left|\frac{a_s(\xi_{n-k,n}(s))}{U_s(\xi_{n-k,n}(s))} - \gamma_+(s)\right| \xrightarrow{P} 0\,. \qquad (10.4.31)$$
Since $\gamma_+(s)\gamma_-(s) = 0$ for every $s$, (10.4.29) and (10.4.31) give (10.4.12). Moreover (10.4.29) and (10.4.30) give
$$1 - \frac{\left(M_n^{(1)}(s)\right)^2}{M_n^{(2)}(s)} \xrightarrow{P} \frac{1}{2(1 - \gamma_-(s))}$$
uniformly in $s$, whence (10.4.13) and then (10.4.14). Finally, since $\hat a_s(n/k) = U_s(\xi_{n-k,n}(s))\,\hat\gamma_+(s)(1 - \hat\gamma_-(s))$, statement (10.4.15) follows from (10.4.29)-(10.4.31) combined with the uniform regular variation (10.4.18) of $a_s$ and (10.2.4).
in $C^+[0,1]$. The next result (Theorem 10.4.4) provides a Gaussian approximation of the tail empirical process. Assume also that the following smoothness condition holds: there exist $K > 0$ and $\delta_0 > 0$ such that for all $\delta \in (0, \delta_0]$,
$$\sup_{0\le s\le 1} P\left(E_{s,\delta}\right) \le K\left(-\log\delta\right)^{-3}, \qquad (10.4.32)$$
where $E_{s,\delta}$ is the event that the ratio $\xi(u)/\xi(s)$ leaves a fixed neighbourhood of $1$ for some $u \in [s, s+\delta]$ (10.4.33). Under these conditions there exists, on a suitable probability space, a mean-zero Gaussian process $W$ indexed by the sets $C_{s,x} := \{f \in C^+[0,1] : f(s) \ge x\}$ with covariance structure
$$E\,W(C_{u,x})\,W(C_{s,y}) = \nu\left(C_{u,x} \cap C_{s,y}\right);$$
in particular $E\,W(C_{s,1/y_1})\,W(C_{s,1/y_2}) = \nu\left(C_{s,1/y_1} \cap C_{s,1/y_2}\right) = y_1 \wedge y_2$, so that for fixed $s$ the process $y \mapsto W(C_{s,1/y})$ is a Brownian motion.
We also need the following result on the inverse empirical distribution function.
Corollary 10.4.6 Under the conditions of Theorem 10.4.4, for any function $\alpha \in C[0,1]$, with a special construction,
$$\sup_{0\le s\le 1}\left|\alpha(s)\left\{\sqrt{k}\left(\frac{k}{n}\,\xi_{n-k,n}(s) - 1\right) - W(C_{s,1})\right\}\right| \xrightarrow{P} 0$$
as $n \to \infty$.
The asymptotic normality for our estimators is as follows.
Theorem 10.4.7 Let $X_1, X_2, X_3, \ldots$ be i.i.d. stochastic processes in $C[0,1]$. Assume (10.4.1)-(10.4.11). Moreover, adopt assumption (10.4.32) of Theorem 10.4.4 with $\xi(s) := \{1 - F_s(X(s))\}^{-1}$. Finally, we need a uniform second-order condition: for some positive or negative function $A_s(t)$, defined for $0 \le s \le 1$ and $t > 0$ and satisfying $\sup_{0\le s\le 1} |A_s(t)| \to 0$ as $t \to \infty$,
$$\frac{\dfrac{\log U_s(tx) - \log U_s(t)}{a_s(t)/U_s(t)} - \dfrac{x^{\gamma_-(s)} - 1}{\gamma_-(s)}}{A_s(t)} \to H_{\gamma_-(s),\rho(s)}(x) \qquad (10.4.34)$$
uniformly for $s \in [0,1]$ with $\rho \in C[0,1]$, $\rho(s) \le 0$, $0 \le s \le 1$, and
$$H_{\gamma_-(s),\rho(s)}(x) := \int_1^x y^{\gamma_-(s)-1}\int_1^y u^{\rho(s)-1}\,du\,dy\,. \qquad (10.4.35)$$
If
$$\sup_{0\le s\le 1}\sqrt{k}\left|A_s\left(\frac{n}{k}\right)\right| \to 0 \quad \text{and} \quad \sup_{0\le s\le 1}\sqrt{k}\left|\frac{a_s(n/k)}{U_s(n/k)} - \gamma_+(s)\right| \to 0\,, \qquad (10.4.36)$$
then we have, with a special construction and $W$ as in Theorem 10.4.4,
$$\sup_{0\le s\le 1}\left|\sqrt{k}\left(\hat\gamma_+(s) - \gamma_+(s)\right) - A_+(s)\right| \xrightarrow{P} 0\,, \qquad (10.4.37)$$
$$\sup_{0\le s\le 1}\left|\sqrt{k}\left(\hat\gamma(s) - \gamma(s)\right) - A(s)\right| \xrightarrow{P} 0\,, \qquad (10.4.38)$$
$$\sup_{0\le s\le 1}\left|\sqrt{k}\,\frac{\hat b_s(n/k) - U_s(n/k)}{a_s(n/k)} - W(C_{s,1})\right| \xrightarrow{P} 0\,, \qquad (10.4.39)$$
$$\sup_{0\le s\le 1}\left|\sqrt{k}\left(\frac{\hat a_s(n/k)}{a_s(n/k)} - 1\right) - B(s)\right| \xrightarrow{P} 0\,, \qquad (10.4.40)$$
where $A_+(s)$, $A(s)$, and $B(s)$ are explicit mean-zero linear functionals of $W$, built from $W(C_{s,1})$ and integrals of the form $\int_1^\infty W(C_{s,x})\,x^{\gamma_-(s)-2}\,dx$; for instance, the functional $A(s)$ for $\hat\gamma$ contains the terms $\gamma(s)W(C_{s,1})$, $(3 - 4\gamma_-(s))(1 - \gamma_-(s))P(s)$, and $-\frac{1}{2}(1 - \gamma_-(s))(1 - 2\gamma_-(s))^2 Q(s)$, with $P(s)$ and $Q(s)$ such integral functionals of $W$, for $s \in [0,1]$.
Proof. The proof is somewhat similar to that of Theorem 10.4.1; we sketch the line of reasoning. Condition (10.4.34) implies that for any $\varepsilon > 0$ there exists $t_0$ such that for $t \ge t_0$, $x \ge 1$, and $0 \le s \le 1$,
$$\left|\frac{\dfrac{\log U_s(tx) - \log U_s(t)}{a_s(t)/U_s(t)} - \dfrac{x^{\gamma_-(s)} - 1}{\gamma_-(s)}}{A_s(t)} - H_{\gamma_-(s),\rho(s)}(x)\right| \le \varepsilon\left(1 + x^{\gamma_-(s)+\varepsilon}\right). \qquad (10.4.41)$$
For (10.4.39) write
$$\sqrt{k}\,\frac{\log U_s\left(\xi_{n-k,n}(s)\right) - \log U_s(n/k)}{a_s(n/k)/U_s(n/k)} \qquad (10.4.42)$$
as the corresponding expression with $a_s(n/k)/U_s(n/k)$ replaced by $a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))$, times a factor that tends to one in probability, uniformly in $s$. Indeed, Corollary 10.4.6 implies
$$\sup_{0\le s\le 1}\left|\frac{k}{n}\,\xi_{n-k,n}(s) - 1\right| \xrightarrow{P} 0\,. \qquad (10.4.43)$$
By (10.4.41) and (10.4.43) the first factor of (10.4.42) equals, up to uniformly negligible terms,
$$\sqrt{k}\ \frac{\left(\frac{k}{n}\,\xi_{n-k,n}(s)\right)^{\gamma_-(s)} - 1}{\gamma_-(s)} + O_P(1)\,\sqrt{k}\,A_s\left(\frac{n}{k}\right)\left(1 + \left(\frac{k}{n}\,\xi_{n-k,n}(s)\right)^{\gamma_-(s)+\varepsilon}\right).$$
For the first term apply Corollary 10.4.6 (the derivative of $(x^{\gamma_-(s)} - 1)/\gamma_-(s)$ at $x = 1$ is one); by the boundedness of $H_{\gamma_-(s),\rho(s)}\left(\frac{k}{n}\,\xi_{n-k,n}(s)\right)$ and (10.4.36) the remaining terms converge to zero in probability, uniformly in $s$. Relation (10.4.39) follows.
For the other relations we use again (10.4.41), applied with $x := \xi_{n-i,n}(s)/\xi_{n-k,n}(s)$ and $t := \xi_{n-k,n}(s)$. Writing, as before,
$$1 - G_{n,s}(x) := \frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\,\xi_i(s) > x\right\}}\,,$$
partial integration expresses $M_n^{(1)}(s)$ and $M_n^{(2)}(s)$ as integrals of the tail empirical process $1 - G_{n,s}$, and on the special probability space $\sqrt{k}\left(1 - G_{n,s}(x) - 1/x\right)$ can be replaced by $W(C_{s,x})$, uniformly in $s$ and in $x$ bounded away from zero. This leads to uniform expansions of
$$\sqrt{k}\left(\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))} - \frac{1}{1 - \gamma_-(s)}\right)$$
and of the analogous expression for $M_n^{(2)}(s)$, in terms of $W(C_{s,1})$ and $\int_1^\infty W(C_{s,x})\,x^{\gamma_-(s)-2}\,dx$. The relations (10.4.37), (10.4.38), and (10.4.40) then follow by the delta method, as in the finite-dimensional case.
In this section we estimate the probability of a failure set, as in Chapter 8 but now in infinite-dimensional space. The exponent measure $\nu$ satisfies
$$\nu(A) = \lim_{n\to\infty} \frac{n}{k}\,P\left(\left\{\left(1 + \gamma(s)\,\frac{X(s) - b_s(n/k)}{a_s(n/k)}\right)^{1/\gamma(s)}\right\}_{s\in[0,1]} \in \frac{n}{k}\,A\right).$$
The functions $a_s(n/k) > 0$ and $b_s(n/k)$ are suitable continuous normalizing functions.
We assume that we have estimators $\hat\gamma(s)$, $\hat a_s(n/k)$, $\hat b_s(n/k)$ such that with some sequence $k = k(n) \to \infty$, $k(n) = o(n)$, $n \to \infty$,
$$\sqrt{k}\,\sup_{0\le s\le 1}\left|\hat\gamma(s) - \gamma(s)\right| = O_P(1)\,, \quad \sqrt{k}\,\sup_{0\le s\le 1}\left|\frac{\hat a_s(n/k)}{a_s(n/k)} - 1\right| = O_P(1)\,, \quad \sqrt{k}\,\sup_{0\le s\le 1}\left|\frac{\hat b_s(n/k) - b_s(n/k)}{a_s(n/k)}\right| = O_P(1)\,.$$
The failure set is $C_n = \{f \in C[0,1] : f(s) > h_n(s) \text{ for some } s \in [0,1]\}$ with $h_n \in C[0,1]$. Define the transformation
$$R_n f(s) := \left(1 + \gamma(s)\,\frac{f(s) - b_s(n/k)}{a_s(n/k)}\right)^{1/\gamma(s)}, \qquad 0 \le s \le 1\,,$$
its empirical analogue
$$\hat R_n f(s) := \left(1 + \hat\gamma(s)\,\frac{f(s) - \hat b_s(n/k)}{\hat a_s(n/k)}\right)^{1/\hat\gamma(s)}, \qquad 0 \le s \le 1\,,$$
the normalizing constants
$$c_n := \sup_{0\le s\le 1}\left(1 + \gamma(s)\,\frac{h_n(s) - b_s(n/k)}{a_s(n/k)}\right)^{1/\gamma(s)}, \qquad \hat c_n := \sup_{0\le s\le 1}\left(1 + \hat\gamma(s)\,\frac{h_n(s) - \hat b_s(n/k)}{\hat a_s(n/k)}\right)^{1/\hat\gamma(s)},$$
and the normalized sets $S_n := R_n(C_n)/c_n$ and $\hat S_n := \hat R_n(C_n)/\hat c_n$. We also need a sharpening of the domain-of-attraction condition:
5. Sharpening of (1):
$$\frac{P\left(R_n(X) \in c_n S\right)}{\nu(c_n S)} \to 1\,, \qquad n \to \infty\,.$$
The estimator of $p_n := P(X \in C_n)$ is then
$$\hat p_n := \frac{1}{\hat c_n}\cdot\frac{1}{k}\sum_{i=1}^n 1_{\left\{(k/n)\,\hat R_n X_i \in \hat S_n\right\}}\,.$$
It can happen that $1 + \hat\gamma(s)\left(h_n(s) - \hat b_s(n/k)\right)/\hat a_s(n/k)$ is negative for some $s \in [0,1]$. However, when checking the proof, one sees that as $n \to \infty$, the probability that this happens tends to zero.
Theorem 10.5.2 Under our conditions,
$$\frac{\hat p_n}{p_n} \xrightarrow{P} 1\,, \qquad n \to \infty\,,$$
provided $\nu(S) > 0$.
The proof of Theorem 10.5.2 follows from three lemmas and four propositions. The proofs are very similar to those in Chapter 8 and will be mostly omitted.
Lemma 10.5.3 Let $G_n$ be an increasing and invertible mapping: $C[0,1] \to C[0,1]$. Suppose that $\lim_{n\to\infty} G_n f = f$ in $C[0,1]$ for all $f \in C[0,1]$. For an open set $O$ let
$$O_n := \{G_n f : f \in O\}\,.$$
Then for all $f \in O$,
$$1_{O_n}(f) := 1_{\{f\in O_n\}} \to 1_O(f) := 1_{\{f\in O\}}\,.$$
Lemma 10.5.4 For all $x > 0$, uniformly in $0 \le s \le 1$,
$$\lim_{n\to\infty}\left(1 + \gamma(s)\left\{(1 + o_1(1))\,\frac{x^{\gamma(s)+o_2(1)} - 1}{\gamma(s) + o_2(1)} + o_3(1)\right\}\right)^{1/\gamma(s)} = x$$
and
$$\lim_{n\to\infty}\frac{\left\{(c_n x)^{\gamma(s)}\right\}^{1/\gamma_n(s)}}{c_n} = x\,,$$
provided $\gamma_n(s) - \gamma(s) = o(1/\log c_n)$ uniformly in $s$.
Lemma 10.5.5 As $n\to\infty$,
$$\sup_{0<s<1}\bigl|\hat\gamma(s)-\gamma(s)\bigr|\;\overset{P}{\longrightarrow}\;0,\qquad
\sup_{0<s<1}\left|\frac{\hat a_s\left(\frac nk\right)}{a_s\left(\frac nk\right)}-1\right|\;\overset{P}{\longrightarrow}\;0,\qquad
\sup_{0<s<1}\left|\frac{\hat b_s\left(\frac nk\right)-b_s\left(\frac nk\right)}{a_s\left(\frac nk\right)}\right|\;\overset{P}{\longrightarrow}\;0.$$
Define
$$\nu_n(S) := \frac1k\sum_{i=1}^{n} 1_{\{X_i\in c_nS\}}.$$
Then, as $n\to\infty$,
$$\nu_n(S)\to\nu(S).$$
Proof. Invoke a Skorohod construction so that we may assume that, by virtue of Lemma 10.5.4,
$$R_nR^{\leftarrow}f\to\text{Identity}\quad\text{a.s.}$$
Write
$$c := \inf\bigl\{\|f\|_\infty : f\in S\bigr\} > 0.$$
For all $0<c_0<c$ and $f\in S$ there exists $s\in[0,1]$ such that $f(s)>c_0$. Take $n_0$ such that for $n\ge n_0$,
$$\bigl\{R_nR^{\leftarrow}f : f\in S\bigr\} \subset \Bigl\{f : \sup_{0\le s\le1}f(s)>\frac{c_0}{2}\Bigr\} =: D,$$
i.e., $S_n\subset D$ for $n\ge n_0$. Now
$$\nu_n(D)\to\nu(D)$$
by Proposition 10.5.6. Hence, as in the proof of Proposition 8.2.8,
$$\nu_n\bigl(R_nR^{\leftarrow}(S)\bigr)\;\overset{P}{\longrightarrow}\;\nu(S),\qquad n\to\infty.$$
Write
$$\hat c_n := \sup_{0<s<1}\left(1+\hat\gamma(s)\,\frac{h_n(s)-\hat b_s\left(\frac nk\right)}{\hat a_s\left(\frac nk\right)}\right)^{1/\hat\gamma(s)} = \sup_{0<s<1}r_n(s),$$
where $r_n(s)$ differs from $\bigl(1+\gamma(s)\bigl(h_n(s)-b_s\left(\frac nk\right)\bigr)/a_s\left(\frac nk\right)\bigr)^{1/\gamma(s)}$ only by factors involving $\hat\gamma(s)-\gamma(s)$, $\hat a_s\left(\frac nk\right)/a_s\left(\frac nk\right)-1$, and $\bigl(\hat b_s\left(\frac nk\right)-b_s\left(\frac nk\right)\bigr)/a_s\left(\frac nk\right)$. Hence, since the expression inside the curly brackets tends to one in probability uniformly in $s$ by Lemma 10.5.5, we have
$$\frac{\hat c_n}{c_n}\;\overset{P}{\longrightarrow}\;1,\qquad n\to\infty,$$
and consequently $\hat\nu_n\;\overset{P}{\longrightarrow}\;\nu(S)$. □
Part IV
Appendix
A
Skorohod Theorem and Vervaat's Lemma
Suppose that, locally uniformly,
$$\lim_{n\to\infty}\frac{x_n(s)-g(s)}{\delta_n} = y(s). \tag{A.0.1}$$
Then
$$\lim_{n\to\infty}\frac{x_n^{\leftarrow}(s)-g^{\leftarrow}(s)}{\delta_n} = -\bigl(g^{\leftarrow}\bigr)'(s)\,y\bigl(g^{\leftarrow}(s)\bigr) \tag{A.0.2}$$
uniformly on $[g(a),g(b)]$, where $g^{\leftarrow}$, $x_n^{\leftarrow}$, \ldots are the inverse functions (right- or left-continuous or defined in any way consistent with monotonicity).
Proof. We first prove the result for $g(s)=s$ for all $s$. Note that (A.0.1) implies that $x_n(s)$ converges uniformly to $s$ and hence $x_n^{\leftarrow}(s)$ converges uniformly to $s$. Take any sequence $s_n\to s_0\in\bigl(g^{\leftarrow}(a),g^{\leftarrow}(b)\bigr)$. Using the sandwich $x_n\bigl(x_n^{\leftarrow}(s_n)-\varepsilon_n\bigr)\le s_n\le x_n\bigl(x_n^{\leftarrow}(s_n)+\varepsilon_n\bigr)$ for arbitrary $\varepsilon_n\downarrow0$, the local uniformity in (A.0.1) gives
$$\limsup_{n\to\infty}\left|\frac{x_n^{\leftarrow}(s_n)-s_n}{\delta_n}+y(s_0)\right|
\le \max_{\pm}\,\limsup_{n\to\infty}\left|\frac{x_n\bigl(x_n^{\leftarrow}(s_n)\pm\varepsilon_n\bigr)-x_n^{\leftarrow}(s_n)\mp\varepsilon_n}{\delta_n}-y(s_0)\right| = 0.$$
Next note that from (A.0.1),
$$\frac{x_n\bigl(g^{\leftarrow}(s)\bigr)-s}{\delta_n}\to y\bigl(g^{\leftarrow}(s)\bigr)$$
uniformly for $g(a)\le s\le g(b)$, which implies, as we just proved (note that $s\mapsto x_n\bigl(g^{\leftarrow}(s)\bigr)$ has inverse $g\circ x_n^{\leftarrow}$),
$$\frac{g\bigl(x_n^{\leftarrow}(s)\bigr)-s}{\delta_n}\to-y\bigl(g^{\leftarrow}(s)\bigr).$$
Now
$$x_n^{\leftarrow}(s)-g^{\leftarrow}(s) = g^{\leftarrow}\Bigl(g\bigl(x_n^{\leftarrow}(s)\bigr)\Bigr)-g^{\leftarrow}(s)
= \bigl(g^{\leftarrow}\bigr)'(s)\Bigl(g\bigl(x_n^{\leftarrow}(s)\bigr)-s\Bigr)\bigl(1+o(1)\bigr),$$
and (A.0.2) follows.
As an application, with
$$U_n(x) := \frac1n\sum_{i=1}^{n}1_{\{U_i\le x\}}$$
we have
$$\bigl\{\sqrt n\,\bigl(U_n(x)-x\bigr)\bigr\}_{0\le x\le1}\to\bigl\{B_0(x)\bigr\}_{0\le x\le1},$$
hence
$$\sup_{0\le x\le1}\bigl|\sqrt n\,\bigl(U_{[nx],n}-U_n^{\leftarrow}(x)\bigr)\bigr|\to0.$$
Since, by Vervaat's lemma,
$$\lim_{n\to\infty}\sup_{0\le x\le1}\bigl|\sqrt n\,\bigl(U_n^{\leftarrow}(x)-x\bigr)+B_0(x)\bigr| = 0,$$
it follows that
$$\bigl\{\sqrt n\,\bigl(U_{[nx],n}-x\bigr)\bigr\}_{0\le x\le1}\to\bigl\{-B_0(x)\bigr\}_{0\le x\le1} \tag{A.0.4}$$
in $D[0,1]$-space.
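The inverse expansion in Vervaat's lemma can be illustrated numerically. The following sketch is not from the book; the choices $g(s)=s^2$, $y(s)=1+\tfrac12\sin s$, and the bisection routine are arbitrary illustrative assumptions. It computes the inverse of $x_n = g+\delta_n y$ and compares $(x_n^{\leftarrow}(s)-g^{\leftarrow}(s))/\delta_n$ with $-(g^{\leftarrow})'(s)\,y(g^{\leftarrow}(s))$:

```python
import math

def inverse(f, target, lo=0.0, hi=2.0, iters=80):
    """Generalized inverse of a nondecreasing f on [lo, hi] by bisection."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

g = lambda s: s * s                      # strictly increasing on [0, 2]
ginv = lambda s: math.sqrt(s)            # g^{<-}
y = lambda s: 1.0 + 0.5 * math.sin(s)    # perturbation in (A.0.1)

n = 10_000
delta_n = 1.0 / n
x_n = lambda s: g(s) + delta_n * y(s)

s0 = 0.81                                # interior point of [g(a), g(b)]
lhs = (inverse(x_n, s0) - ginv(s0)) / delta_n
rhs = -y(ginv(s0)) / (2.0 * ginv(s0))    # (g^{<-})'(s) = 1/(2 sqrt(s))
print(lhs, rhs)
```

The two printed numbers agree up to a term that vanishes as $\delta_n\downarrow0$, matching (A.0.2).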
B
Regular Variation and Extensions
Consider a function $g$ for which
$$\lim_{y\to\infty}\bigl(g(x+y)-g(y)\bigr) \tag{B.1.2}$$
exists. If the limit in (B.1.2) as $y\to\infty$ exists, the function $f:\mathbb R^+\to\mathbb R^+$ defined by $f(t) := \exp g(\log t)$ satisfies
$$\lim_{t\to\infty}\frac{f(tx)}{f(t)} = x^\alpha \tag{B.1.3}$$
for all $x\in\mathbb R^+$, for some $\alpha\in\mathbb R$. Then $f$ is called a regularly varying function.
In this appendix these functions are studied thoroughly. Moreover, we study the more general class of functions $f:\mathbb R^+\to\mathbb R$ for which
$$\lim_{t\to\infty}\frac{f(tx)-b(t)}{a(t)} \tag{B.1.4}$$
exists for all $x\in\mathbb R^+$, where $a>0$ and $b$ are suitably chosen auxiliary functions. The results for functions satisfying (B.1.4) are surprisingly similar to those for functions satisfying (B.1.3).
Definition B.1.1 A Lebesgue measurable function $f:\mathbb R^+\to\mathbb R$ that is eventually positive is regularly varying (at infinity) if for some $\alpha\in\mathbb R$,
$$\lim_{t\to\infty}\frac{f(tx)}{f(t)} = x^\alpha,\qquad x>0. \tag{B.1.5}$$
Notation: $f\in RV_\alpha$.
The number $\alpha$ in the above definition is called the index of regular variation. A function satisfying (B.1.5) with $\alpha=0$ is called slowly varying.
Example B.1.2 For $\alpha,\beta\in\mathbb R$ the functions $x^\alpha$, $x^\alpha(\log x)^\beta$, $x^\alpha(\log\log x)^\beta$ are $RV_\alpha$. The functions $2+\sin(\log\log x)$, $\exp\bigl((\log x)^\alpha\bigr)$ with $0<\alpha<1$, $x^{-1}\log\Gamma(x)$, and $\sum_{k\le x}k^{-1}(\log k)^{\sin(\log\log k)}$ are slowly varying. The functions $2+\sin x$, $e^{[\log x]}$, $2+\sin\log x$, $x\exp(\sin\log x)$ are not regularly varying.
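Definition B.1.1 can be watched numerically. The sketch below (not part of the text; the particular $f$ is one of the examples above) tracks $f(tx)/f(t)$ as $t\to\infty$ for $f(t)=t^2\log t\in RV_2$; the slowly varying factor $\log t$ contributes a correction of order $1/\log t$, so the convergence to $x^\alpha$ is slow but visible:

```python
import math

def ratio_limit(f, x, ts):
    """Empirical values of f(t*x)/f(t) along t -> infinity."""
    return [f(t * x) / f(t) for t in ts]

f = lambda t: t**2 * math.log(t)      # regularly varying with index alpha = 2
ts = [10.0**k for k in (2, 4, 6, 8)]
vals = ratio_limit(f, 3.0, ts)
print(vals)   # decreases toward 3^2 = 9 as t grows
```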
Our next result shows that it is possible to weaken the conditions in Definition B.1.1.
Theorem B.1.3 Suppose $f:\mathbb R^+\to\mathbb R$ is measurable, eventually positive, and
$$\lim_{t\to\infty}\frac{f(tx)}{f(t)} \tag{B.1.6}$$
exists, and is finite and positive, for all $x$ in a set of positive Lebesgue measure. Then $f\in RV_\alpha$ for some $\alpha\in\mathbb R$.
Proof. Define $F(t) := \log f(e^t)$. Then $\lim_{t\to\infty}\bigl(F(t+x)-F(t)\bigr)$ exists for all $x$ in a set $K$ of positive Lebesgue measure. Define $\varphi:K\to\mathbb R$ by $\varphi(x) := \lim_{t\to\infty}\bigl(F(t+x)-F(t)\bigr)$. By Steinhaus's theorem (cf. Hewitt and Stromberg (1969), p. 143) the set $K-K := \{x-y : x,y\in K\}$ contains a neighborhood of zero. Since $K$ is an additive subgroup of $\mathbb R$, we have $K=\mathbb R$, and thus $\varphi(x)$ is defined for all $x\in\mathbb R$ and
$$\varphi(x+y) = \varphi(x)+\varphi(y) \tag{B.1.7}$$
for all $x,y\in\mathbb R$.
It remains to solve equation (B.1.7) for measurable $\varphi$: Consider the restriction of $\varphi$ to an interval $L\subset\mathbb R$. By Lusin's theorem (cf. Halmos (1950), p. 242) there exists a compact set $M\subset L$ with positive Lebesgue measure $\lambda M$ such that the restriction of $\varphi$ to $M$ is continuous. Let $\varepsilon>0$ be arbitrary. Then there exists $\delta>0$ such that $\varphi(y)-\varphi(x)\in(-\varepsilon,\varepsilon)$ whenever $x,y\in M$ and $|x-y|<\delta$ (since the restriction of $\varphi$ to $M$ is uniformly continuous) and also such that $M-M$ contains the interval $(-\delta,\delta)$ (by Steinhaus's theorem). For each $s\in(-\delta,\delta)\subset M-M$ there exists $x_0\in M$ such that also $x_0+s\in M$. Then $\varphi(x+s)-\varphi(x) = \varphi(s) = \varphi(x_0+s)-\varphi(x_0)\in(-\varepsilon,\varepsilon)$ for all $x\in\mathbb R$; hence $\varphi$ is uniformly continuous on $\mathbb R$. Since $\varphi(n/m) = n\varphi(1/m) = n\varphi(1)/m$ for $n,m\in\mathbb Z$, $m\ne0$, we have by the continuity of $\varphi$ that $\varphi(x) = \varphi(1)x$ for $x\in\mathbb R$. Now (B.1.5) follows. □
Theorem B.1.4 (Uniform convergence theorem) If $f\in RV_\alpha$, then relation (B.1.5) holds uniformly for $x\in[a,b]$ with $0<a<b<\infty$.
Proof. Without loss of generality we may suppose $\alpha=0$ (if not, replace $f(t)$ by $f(t)/t^\alpha$). We define the function $F$ by $F(x) := \log f(e^x)$. It is sufficient to deduce a contradiction from the following assumption: Suppose there exist $\varepsilon>0$ and sequences $t_n\to\infty$, $x_n\to0$ as $n\to\infty$ such that
$$|F(t_n+x_n)-F(t_n)| \ge \varepsilon\qquad\text{for}\quad n=1,2,\ldots .$$
Let $J$ be a finite interval and define
$$Y_{1,n} := \Bigl\{y\in J : |F(t_n+y)-F(t_n)| \ge \frac{\varepsilon}{2}\Bigr\},
\qquad
Y_{2,n} := \Bigl\{y\in J : |F(t_n+x_n)-F(t_n+y)| \ge \frac{\varepsilon}{2}\Bigr\}.$$
The above sets are measurable for each $n$ and, by the triangle inequality, $Y_{1,n}\cup Y_{2,n} = J$; hence either $\lambda(Y_{1,n})\ge\lambda(J)/2$ or $\lambda(Y_{2,n})\ge\lambda(J)/2$ (or both), where $\lambda$ denotes Lebesgue measure.
Now we define
$$Z_n := \Bigl\{z : |F(t_n+x_n)-F(t_n+x_n-z)| \ge \frac{\varepsilon}{2}\Bigr\} = \bigl\{z : x_n-z\in Y_{2,n}\bigr\}.$$
Then $\lambda(Z_n) = \lambda(Y_{2,n})$, and thus we have either $\lambda(Y_{1,n})\ge\lambda(J)/2$ infinitely often or $\lambda(Z_n)\ge\lambda(J)/2$ infinitely often (or both).
Since all the $Y_{1,n}$'s are subsets of a fixed finite interval we have $\lambda\bigl(\limsup_{n\to\infty}Y_{1,n}\bigr) = \lim_{k\to\infty}\lambda\bigl(\bigcup_{n=k}^{\infty}Y_{1,n}\bigr) \ge \lambda(J)/2$, or a similar statement for the $Z_n$'s (or both). This implies the existence of a real number $x_0$ contained in infinitely many $Y_{1,n}$ or infinitely many $Z_n$, which contradicts the assumption $\lim_{t\to\infty}\bigl(F(t+x_0)-F(t)\bigr) = 0$. □
Theorem B.1.5 (Karamata's theorem) Suppose $f\in RV_\alpha$. There exists $t_0>0$ such that $f(t)$ is positive and locally bounded for $t\ge t_0$. If $\alpha\ge-1$, then
$$\lim_{t\to\infty}\frac{t\,f(t)}{\int_{t_0}^{t}f(s)\,ds} = \alpha+1. \tag{B.1.8}$$
If $\alpha<-1$ (or if $\alpha=-1$ and $\int_{t}^{\infty}f(s)\,ds<\infty$), then
$$\lim_{t\to\infty}\frac{t\,f(t)}{\int_{t}^{\infty}f(s)\,ds} = -\alpha-1. \tag{B.1.9}$$
Conversely, if (B.1.8) holds with $-1<\alpha<\infty$, then $f\in RV_\alpha$; if (B.1.9) holds with $-\infty<\alpha<-1$, then $f\in RV_\alpha$.
Proof. Suppose $f\in RV_\alpha$. By Theorem B.1.4, there exist $t_0$, $c$ such that $f(tx)/f(t)\le c$ for $t\ge t_0$, $x\in[1,2]$. Then for $t\in[2^nt_0,2^{n+1}t_0]$ we have
$$\frac{f(t)}{f(t_0)} = \frac{f(t)}{f(2^{-1}t)}\cdot\frac{f(2^{-1}t)}{f(2^{-2}t)}\cdots\frac{f(2^{-n}t)}{f(t_0)} \le c^{n+1}.$$
Hence $f(t)$ is locally bounded for $t\ge t_0$ and $\int_{t_0}^{t}f(s)\,ds<\infty$ for $t\ge t_0$.
In order to prove (B.1.8), we first show that $\int_{t_0}^{\infty}f(s)\,ds = \infty$ for $\alpha>-1$. Since $f(2s)\ge2^{-1}f(s)$ for $s$ sufficiently large, we have for $n\ge n_0$,
$$\int_{2^n}^{2^{n+1}}f(s)\,ds = 2\int_{2^{n-1}}^{2^n}f(2s)\,ds \ge \int_{2^{n-1}}^{2^n}f(s)\,ds.$$
Hence
$$\int_{2^{n_0}}^{\infty}f(s)\,ds = \sum_{n=n_0}^{\infty}\int_{2^n}^{2^{n+1}}f(s)\,ds \ge \sum_{n=n_0}^{\infty}\int_{2^{n_0}}^{2^{n_0+1}}f(s)\,ds = \infty.$$
Next we prove $F(t) := \int_{t_0}^{t}f(s)\,ds\in RV_{\alpha+1}$ for $\alpha\ge-1$. Fix $x>0$. For arbitrary $\varepsilon>0$ there exists $t_1 = t_1(\varepsilon)$ such that $f(tx)\le(1+\varepsilon)x^{\alpha}f(t)$ for $t\ge t_1$. Since $\lim_{t\to\infty}F(t) = \infty$,
$$F(tx) \le (1+\varepsilon)\,x^{\alpha+1}F(t)\bigl(1+o(1)\bigr) \tag{B.1.10}$$
for $t$ sufficiently large. A similar lower inequality is easily derived, and we obtain $F\in RV_{\alpha+1}$ for $\alpha>-1$.
In case $\alpha=-1$ and $F(t)\to\infty$ the same proof applies. If $\alpha=-1$ and $F(t)$ has a finite limit, obviously $F\in RV_0$.
Now for all $\alpha$,
$$\frac{F(tx)-F(t)}{t\,f(t)} = \int_{1}^{x}\frac{f(tu)}{f(t)}\,du \to \frac{x^{\alpha+1}-1}{\alpha+1},\qquad t\to\infty, \tag{B.1.11}$$
by Theorem B.1.4. Since $F\in RV_{\alpha+1}$, also $\bigl(F(tx)-F(t)\bigr)/F(t)\to x^{\alpha+1}-1$, and combining the two limit relations yields (B.1.8).
Since in the case $\alpha<-1$ there exists $\delta>0$ such that $f(2s)\le2^{-1-\delta}f(s)$ for $s$ sufficiently large, we have, for $n_1$ sufficiently large,
$$\int_{2^n}^{2^{n+1}}f(s)\,ds \le 2^{-\delta(n-n_1)}\int_{2^{n_1}}^{2^{n_1+1}}f(s)\,ds,$$
hence $\int_{2^{n_1}}^{\infty}f(s)\,ds<\infty$, and (B.1.9) follows in the same way as (B.1.8).
Conversely, suppose that (B.1.8) holds with $-1<\alpha<\infty$. Define $b(t) := t\,f(t)/\int_{t_0}^{t}f(s)\,ds$, so that $b(t)\to\alpha+1$. Then
$$\int_{x_0}^{x}\frac{b(t)}{t}\,dt = \log F(x)+c_1 \tag{B.1.13}$$
(since the derivatives of the two parts exist and are equal almost everywhere). Using the definition of $b$ again we obtain from (B.1.13)
$$f(x) = c\,b(x)\exp\left(\int_{x_0}^{x}\frac{b(t)-1}{t}\,dt\right) \tag{B.1.14}$$
for all $x>x_0$, with $c = e^{-c_1}>0$; hence for all $x,t>0$,
$$\frac{f(tx)}{f(t)} = \frac{b(tx)}{b(t)}\exp\left(\int_{1}^{x}\frac{b(ts)-1}{s}\,ds\right).$$
Now for arbitrary $\varepsilon>0$ there is a $t_0$ such that $|b(ts)-\alpha-1|\le\varepsilon$ for $t\ge t_0$ and $s\ge\min(1,x)$. Hence the function $f$ satisfies (B.1.5).
The last statement of the theorem ((B.1.9) implies that $f\in RV_\alpha$) can be proved in a similar way. □
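Karamata's limit (B.1.8) can be checked numerically. The following sketch (not from the book; the choice $f(s)=s^{3/2}(1+\log s)\in RV_{3/2}$ and the trapezoidal quadrature are illustrative assumptions) approximates $t f(t)/\int_{t_0}^{t}f(s)\,ds$ and watches it approach $\alpha+1$:

```python
import math

def karamata_ratio(f, t, t0=1.0, n=200_000):
    """Approximate t*f(t) / integral_{t0}^{t} f(s) ds by the trapezoidal rule."""
    h = (t - t0) / n
    integral = 0.5 * h * (f(t0) + f(t))
    integral += h * sum(f(t0 + i * h) for i in range(1, n))
    return t * f(t) / integral

alpha = 1.5
f = lambda s: s**alpha * (1.0 + math.log(s))     # regularly varying, index 3/2

ratios = [karamata_ratio(f, t) for t in (1e2, 1e4, 1e6)]
print([round(r, 3) for r in ratios])   # decreases toward alpha + 1 = 2.5
```

The slowly varying factor $1+\log s$ again slows the convergence down to a rate of order $1/\log t$, consistent with the theorem.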
Theorem B.1.6 (Representation theorem) $f\in RV_\alpha$ if and only if there exist measurable functions $a$ and $c$ with
$$\lim_{t\to\infty}a(t) = \alpha,\qquad \lim_{t\to\infty}c(t) = c_0\in(0,\infty), \tag{B.1.15}$$
and $t_0>0$ such that for $t>t_0$,
$$f(t) = c(t)\exp\left(\int_{t_0}^{t}\frac{a(s)}{s}\,ds\right). \tag{B.1.16}$$
Proof. Suppose $f\in RV_\alpha$. The function $t^{-\alpha}f(t)$ is slowly varying and hence has a representation as in (B.1.16) by (B.1.14). Then $f$ has such a representation with $a(s)$ replaced by $a(s)+\alpha$ and $c(t)$ changed by a constant factor. Now the result follows. Conversely, one verifies directly that (B.1.5) follows from (B.1.16) when $a$ and $c$ satisfy (B.1.15). □
Remark B.1.7 1. In formula (B.1.16) we may take $t_0\in[0,\infty)$ arbitrarily by changing the functions $c(t)$ and $a(t)$ suitably on the interval $[0,t_0]$.
2. The functions $a(t)$ and $c(t)$ (given in (B.1.16)) are not uniquely determined. It can easily be seen that it is possible to choose $a(t)$ continuous: define
$$f_0(t) := \exp\left(\int_{t_0}^{t}\frac{a(v)}{v}\,dv\right)
\qquad\text{and}\qquad
b_0(t) := \frac{t\,f_0(t)}{\int_{t_0}^{t}f_0(s)\,ds};$$
then $b_0$ is continuous and, by (B.1.13)–(B.1.14), $f_0$ admits a representation with the continuous function $b_0(t)-1\to\alpha$ in the exponent.
Proposition B.1.9 1. If $f\in RV_\alpha$, then
$$\lim_{t\to\infty}f(t) = \begin{cases}0, & \alpha<0,\\ \infty, & \alpha>0.\end{cases}$$
2. If $f_1\in RV_{\alpha_1}$, $f_2\in RV_{\alpha_2}$, then $f_1+f_2\in RV_{\max(\alpha_1,\alpha_2)}$. If moreover $\lim_{t\to\infty}f_2(t) = \infty$, then the composition $f_1\circ f_2\in RV_{\alpha_1\alpha_2}$.
3. If $f\in RV_\alpha$ with $\alpha>0$ ($\alpha<0$), then $f$ is asymptotically equivalent to a strictly increasing (decreasing) differentiable function $g$ with derivative $g'\in RV_{\alpha-1}$ if $\alpha>0$ and $-g'\in RV_{\alpha-1}$ if $\alpha<0$.
As a consequence of this: if $f\in RV_\alpha$ ($\alpha>0$) is bounded on finite intervals of $\mathbb R^+$, then
$$\sup_{0<x\le t}f(x)\sim f(t)\qquad(t\to\infty). \tag{B.1.17}$$
If $f\in RV_\alpha$ ($\alpha<0$), then
$$\inf_{0<x\le t}f(x)\sim f(t)\qquad(t\to\infty). \tag{B.1.18}$$
4. If $p\in RV_0$, then $\int_{t_0}^{t}p(s)\,\frac{ds}{s}\in RV_0$ and, provided the integral is finite, $\int_{t}^{\infty}p(s)\,\frac{ds}{s}\in RV_0$.
5. If $f\in RV_\alpha$, then for arbitrary $\varepsilon,\delta>0$ there exists $t_0$ such that for $t\ge t_0$, $x\ge1$,
$$(1-\varepsilon)x^{\alpha-\delta} < \frac{f(tx)}{f(t)} < (1+\varepsilon)x^{\alpha+\delta}, \tag{B.1.19}$$
and there exists $c>0$ such that for $tx\ge t_0$, $0<x\le1$,
$$\frac{f(tx)}{f(t)} \le c\,x^{\alpha-\delta}. \tag{B.1.20}$$
6. If $f\in RV_\alpha$, then $\log f(t)/\log t\to\alpha$, $t\to\infty$.
7. If $f\in RV_\alpha$, $\alpha<0$, is bounded on finite intervals of $\mathbb R^+$ and $\delta,\varepsilon>0$ are arbitrary, there exist $c>0$ and $t_0$ such that for $t\ge t_0$ and $0<x\le1$,
$$\frac{f(tx)}{f(t)} \le c\,x^{\alpha-\delta}. \tag{B.1.21}$$
8. If
$$f(t) = \exp\left(\int_{t_0}^{t}\frac{a(s)}{s}\,ds\right) \tag{B.1.22}$$
with a continuous function $a(s)\to\alpha>0$, $s\to\infty$, then $f^{\leftarrow}\in RV_{1/\alpha}$, where $f^{\leftarrow}$ is the inverse function of $f$.
9. Suppose $f\in RV_\alpha$, $\alpha>0$, is bounded on finite intervals of $\mathbb R^+$. Then $f^{\leftarrow}\in RV_{1/\alpha}$. (Formally, $f^{\leftarrow}$ is defined only on a neighborhood of infinity; we can extend its domain of definition by taking $f^{\leftarrow}$ zero elsewhere.) In particular, if $f\in RV_\alpha$, $\alpha>0$, and $f$ is increasing, the inverse function $f^{\leftarrow}$ is in $RV_{1/\alpha}$.
10. If $f\in RV_\alpha$, $\alpha>0$, there exists an asymptotically unique function $h$ such that $f(h(x))\sim h(f(x))\sim x$, $x\to\infty$. Moreover, $h\sim f^{\leftarrow}$ if $f$ is bounded on finite intervals of $\mathbb R^+$.
11. If $f\in RV_\alpha$, $\alpha>0$, and $f(t) = f(t_0)+\int_{t_0}^{t}\psi(s)\,ds$ for $t\ge t_0$ with $\psi$ monotone, then
$$\lim_{t\to\infty}\frac{t\,\psi(t)}{f(t)} = \alpha.$$
Proof. We only indicate some of the proofs. For part 5, Theorem B.1.4 provides $t_0$ such that
$$\frac{f(tx)}{f(t)} \le 2\,x^{\alpha+1}\quad\text{for}\quad x\ge1,
\qquad
\frac{f(tx)}{f(t)} \le \max\bigl(2,2^{\alpha+1}\bigr)\quad\text{for}\quad\tfrac12\le x\le1,$$
and iterating these bounds gives (B.1.19)–(B.1.21). For part 8, write $g := f^{\leftarrow}$, so that
$$g\bigl(f(x)\bigr) = x\qquad\text{for}\quad x\ge x_0. \tag{B.1.23}$$
Differentiating and using $x\,f'(x)/f(x) = a(x)$ gives
$$\frac{f(x)\,g'\bigl(f(x)\bigr)}{g\bigl(f(x)\bigr)} = \frac{1}{a(x)}. \tag{B.1.24}$$
Since $f$ is continuous and $f(x)\to\infty$, $x\to\infty$, (B.1.24) implies
$$\frac{t\,g'(t)}{g(t)}\to\frac{1}{\alpha},\qquad t\to\infty, \tag{B.1.25}$$
hence $g = f^{\leftarrow}\in RV_{1/\alpha}$.
For part 10, one checks that any two functions $h_1$, $h_2$ with
$$f\bigl(h_i(x)\bigr)\sim h_i\bigl(f(x)\bigr)\sim x\qquad(x\to\infty)\qquad\text{for}\quad i=1,2 \tag{B.1.26}$$
satisfy $h_1\sim h_2$. For part 11, suppose $\psi$ is nondecreasing and $\alpha>0$. For $x>1$,
$$\frac{f(tx)-f(t)}{f(t)} = \frac{\int_{t}^{tx}\psi(s)\,ds}{f(t)} \ge (x-1)\,\frac{t\,\psi(t)}{f(t)},$$
and since the left-hand side tends to $x^\alpha-1$,
$$\limsup_{t\to\infty}\frac{t\,\psi(t)}{f(t)} \le \frac{x^\alpha-1}{x-1}\;\underset{x\downarrow1}{\longrightarrow}\;\alpha.$$
Similar inequalities for $0<x<1$ lead to $\liminf_{t\to\infty}t\,\psi(t)/f(t)\ge\alpha$. The cases $\psi$ nonincreasing and $\alpha<0$ can be proved similarly. □
Proposition B.1.10 Suppose $f\in RV_\alpha$. For arbitrary $\varepsilon,\delta>0$ there exists $t_0 = t_0(\varepsilon,\delta)$ such that for $t,tx\ge t_0$,
$$\left|\frac{f(tx)}{f(t)}-x^\alpha\right| \le \varepsilon\,x^\alpha\max\bigl(x^{\delta},x^{-\delta}\bigr).$$
Proof. By the representation theorem, the inequality to be proved reduces to controlling expressions of the form
$$e^{-\delta|\log x|}\,\Bigl((1-\delta_1)e^{-\delta_2|\log x|}-1\Bigr).$$
We prove that the left-hand side of the inequality tends to zero when $\delta_1,\delta_2\to0$, uniformly for $x>0$. The proof for the right-hand side is similar. Let, as the $\delta$'s go to zero, $x$ be such that $\delta_2|\log x|\to0$. Then the second factor goes to zero (note that the first factor is bounded). If $\delta_2|\log x|\to c\in(0,\infty]$, then $|\log x|\to\infty$ and hence the first factor goes to zero (and the second is bounded). □
Example. Define
$$\varepsilon(s) := \begin{cases} a_n, & (2n)!\le s<(2n+1)!,\\ -a_n/2, & (2n+1)!\le s<(2n+2)!,\end{cases}$$
where the sequence $(a_n)$ is such that $a_n\to0$, $n\to\infty$, and $a_n\log n\to\infty$, $n\to\infty$. Then
$$f(x) := \exp\left(\int_{1}^{x}\frac{\varepsilon(s)}{s}\,ds\right)$$
is slowly varying, but
$$\frac{f\bigl((2n+1)!\bigr)}{f\bigl((2n)!\bigr)} = \exp\left(\int_{(2n)!}^{(2n+1)!}\frac{\varepsilon(s)}{s}\,ds\right) = \exp\bigl(a_n\log(2n+1)\bigr)\to\infty,\qquad n\to\infty.$$
Theorem B.1.12 Suppose $f\in RV_\alpha$ and let $k:\mathbb R^+\to\mathbb R$ be measurable.
1. If $k$ is bounded on $(0,1]$ and $\alpha>-1$, then
$$\lim_{t\to\infty}\int_{0}^{1}k(s)\,\frac{f(ts)}{f(t)}\,ds = \int_{0}^{1}k(s)\,s^{\alpha}\,ds. \tag{B.1.27}$$
2. If $t^{\alpha+\varepsilon}k(t)$ is integrable on $(1,\infty)$ for some $\varepsilon>0$, then $\int_{1}^{\infty}k(s)f(ts)\,ds<\infty$ for $t>0$, and
$$\lim_{t\to\infty}\int_{1}^{\infty}k(s)\,\frac{f(ts)}{f(t)}\,ds = \int_{1}^{\infty}k(s)\,s^{\alpha}\,ds. \tag{B.1.28}$$
Proof. (1) Note that for $0<\varepsilon<\alpha+1$ the function $t^{\alpha-\varepsilon}k(t)$ is integrable on (0,1). Since there exist $c>1$ and $\delta>0$ such that $f(tx)/f(t)\le c\,x^{\alpha-\delta}$ for $tx\ge t_0$, $0<x\le1$, by Proposition B.1.9(5), we can apply Lebesgue's dominated convergence theorem to obtain
$$\int_{t_0/t}^{1}k(s)\,\frac{f(ts)}{f(t)}\,ds \to \int_{0}^{1}k(s)\,s^{\alpha}\,ds,\qquad t\to\infty.$$
Furthermore,
$$\int_{0}^{t_0/t}k(s)\,\frac{f(ts)}{f(t)}\,ds = \frac{\int_{0}^{t_0}k\left(\frac st\right)f(s)\,ds}{t\,f(t)} \to 0,\qquad t\to\infty,$$
since $k$ is bounded and $tf(t)\in RV_{\alpha+1}$ with $\alpha+1>0$. (2) The second statement is proved in a similar way, now using Potter's bound for $x\ge1$. □
Remark B.1.13 It is easy to see that the conditions in Theorem B.1.12(1) can be replaced by: $f$ bounded on (0,1) and $t^{\alpha-\varepsilon}k(t)$ integrable on (0,1) for some $\varepsilon>0$.
Remark B.1.14 De Bruijn (1959) noted that for any slowly varying function $L$ there exists an asymptotically unique slowly varying function $L^*$, called the conjugate slowly varying function, satisfying $L(x)L^*\bigl(xL(x)\bigr)\to1$ and $L^*(x)L\bigl(xL^*(x)\bigr)\to1$, $x\to\infty$. Note that one can obtain $L^*$ as follows: define $h(x) := xL(x)$; then $L^*(x)\sim h^{\leftarrow}(x)/x$, $x\to\infty$. In special cases one has $L^*(x)\sim1/L(x)$, $x\to\infty$. Example: $L(x)\sim(\log x)^{\alpha}(\log\log x)^{\beta}$, $x\to\infty$, $\alpha>0$, $\beta\in\mathbb R$; i.e., if $h(x)\sim x(\log x)^{\alpha}(\log\log x)^{\beta}$, $x\to\infty$, then $h^{\leftarrow}(x)\sim x(\log x)^{-\alpha}(\log\log x)^{-\beta}$, $x\to\infty$. If we replace $x$ by $x^{\gamma}$ and take $\beta=0$, we obtain: $f(x)\sim x^{\gamma}(\log x)^{\delta}$, $\gamma>0$, $\delta\in\mathbb R$, implies $f^{\leftarrow}(x)\sim\gamma^{\delta/\gamma}x^{1/\gamma}(\log x)^{-\delta/\gamma}$, $x\to\infty$.
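De Bruijn's defining relation can be watched numerically. The sketch below (not from the book; the choice $L(x)=\log x$, for which $L^*(x)\sim1/\log x$, is an illustrative assumption) evaluates $L(x)L^*\bigl(xL(x)\bigr)$ along $x\to\infty$:

```python
import math

# L(x) = log x has de Bruijn conjugate L*(x) ~ 1/log x:
# h(x) = x log x, h^{<-}(x) ~ x/log x, so L*(x) ~ h^{<-}(x)/x ~ 1/log x.
L = lambda x: math.log(x)
Lstar = lambda x: 1.0 / math.log(x)

vals = [L(x) * Lstar(x * L(x)) for x in (1e4, 1e8, 1e16)]
print(vals)   # increases toward 1 (slowly: the error is of order loglog x / log x)
```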
B.2 Extended Regular Variation and the Class Π

Suppose $f:\mathbb R^+\to\mathbb R$ is measurable and $a$ is a positive function such that
$$\lim_{t\to\infty}\frac{f(tx)-f(t)}{a(t)} \tag{B.2.1}$$
exists for all $x>0$ and the limit function is not constant (this is to avoid trivialities). First note that (B.2.1) is equivalent to the existence of
$$\tilde f(x) := \lim_{t\to\infty}\frac{f(tx)-f(t)}{a(t)} \tag{B.2.2}$$
for all $x>0$.
Theorem B.2.1 Suppose (B.2.1) holds with a limit that is not constant. Then there exist $\gamma\in\mathbb R$ and $c\ne0$ such that $a\in RV_\gamma$ and
$$\tilde f(x) = c\,\frac{x^\gamma-1}{\gamma},\qquad x>0 \tag{B.2.3}$$
(read $(x^\gamma-1)/\gamma$ as $\log x$ for $\gamma=0$).
Proof. For $x_0,y>0$,
$$\frac{f(tx_0y)-f(t)}{a(t)} = \frac{f(ty)-f(t)}{a(t)}+\frac{f(tx_0y)-f(ty)}{a(ty)}\cdot\frac{a(ty)}{a(t)},$$
so that, letting $t\to\infty$ with $\tilde f(x_0)\ne0$, the limit $A(y) := \lim_{t\to\infty}a(ty)/a(t)$ exists. From
$$\frac{a(txy)}{a(t)} = \frac{a(txy)}{a(tx)}\cdot\frac{a(tx)}{a(t)}$$
we have
$$A(xy) = A(x)A(y), \tag{B.2.4}$$
hence $A(x) = x^\gamma$ for some $\gamma\in\mathbb R$ by Theorem B.1.3, i.e., $a\in RV_\gamma$. Moreover,
$$\tilde f(xy)-\tilde f(y) = \tilde f(x)\,y^{\gamma}. \tag{B.2.5}$$
If $\gamma=0$ we have Cauchy's functional equation again, and $\tilde f(x) = c\log x$ for some $c\ne0$ and all $x>0$.
Next suppose $\gamma\ne0$. Interchanging $x$ and $y$ in (B.2.5) and subtracting the resulting relations, we get
$$\tilde f(x)\bigl(1-y^{\gamma}\bigr) = \tilde f(y)\bigl(1-x^{\gamma}\bigr)\qquad\text{for}\quad x,y>0.$$
Hence $\tilde f(x)/\bigl(1-x^{\gamma}\bigr)$ is constant, i.e., $\tilde f(x) = c\,(x^{\gamma}-1)/\gamma$ for some $c\ne0$. □
The following theorem states that for $\gamma\ne0$ relation (B.2.1) defines classes of functions we have met before. Note that it is sufficient to consider (B.2.3) with $c>0$, since replacing $f$ by $-f$ in (B.2.1) changes the sign of $c$.
Theorem B.2.2 Suppose the assumptions of Theorem B.2.1 are satisfied with $\gamma\ne0$ and $c>0$, i.e.,
$$\lim_{t\to\infty}\frac{f(tx)-f(t)}{a(t)} = c\,\frac{x^\gamma-1}{\gamma}.$$
1. If $\gamma>0$ then $\lim_{t\to\infty}f(t)/a(t) = c/\gamma$, and hence $f\in RV_\gamma$.
2. If $\gamma<0$ then $f(\infty) := \lim_{x\to\infty}f(x)$ exists, $\lim_{t\to\infty}\bigl(f(\infty)-f(t)\bigr)/a(t) = -c/\gamma$, and hence $f(\infty)-f(x)\in RV_\gamma$.
Proof. The proofs of Theorem B.2.9 and Corollary B.2.11 below can easily be adapted to show that if $\gamma>0$ ($\gamma<0$) there is a nondecreasing (nonincreasing) function $g$ such that
$$f(t)-g(t) = o\bigl(a(t)\bigr),\qquad t\to\infty. \tag{B.2.6}$$
Since we may assume $a\in RV_\gamma$ (Theorem B.2.1), it follows that we also have
$$\lim_{t\to\infty}\frac{g(tx)-g(t)}{a(t)} = c\,\frac{x^\gamma-1}{\gamma}. \tag{B.2.7}$$
It will become apparent that it is sufficient to prove the theorem for $g$. Take $y>1$ arbitrarily and define $t_1=1$ and $t_{n+1}=t_ny$ for $n=1,2,\ldots$. We have, by (B.2.7),
$$\lim_{n\to\infty}\frac{g(t_{n+2})-g(t_{n+1})}{g(t_{n+1})-g(t_n)} = y^{\gamma}. \tag{B.2.8}$$
Suppose $\gamma>0$. Then (B.2.8) immediately implies $g(t_n)\to\infty$, $n\to\infty$. Further, for any $\varepsilon>0$ there exists $n_0$ such that for any $n\ge n_0$,
$$g(t_{n+1})-g(t_{n_0}) \le \sum_{k=n_0}^{n}y^{(k-n_0)\gamma}(1+\varepsilon)\bigl(g(t_{n_0+1})-g(t_{n_0})\bigr),$$
whence
$$\lim_{n\to\infty}\frac{g(t_{n+1})}{g(t_n)} = y^{\gamma}, \tag{B.2.9}$$
and, by (B.2.7),
$$a(t_n) \sim \frac{\gamma}{c}\cdot\frac{g(t_{n+1})-g(t_n)}{y^{\gamma}-1},\qquad n\to\infty. \tag{B.2.10}$$
Combining (B.2.7), (B.2.9), and (B.2.10) gives
$$\frac{g(t_nx)-g(t_n)}{g(t_n)} \to x^{\gamma}-1,\qquad n\to\infty. \tag{B.2.11}$$
For any $s>0$ choose $n(s)\in\mathbb N$ such that $t_{n(s)}\le s<t_{n(s)+1}$. Then by (B.2.9) and (B.2.11),
$$\limsup_{s\to\infty}\frac{g(sx)}{g(s)} \le \lim_{n\to\infty}\frac{g(t_{n(s)+1}x)}{g(t_{n(s)+1})}\cdot\frac{g(t_{n(s)+1})}{g(t_{n(s)})} = x^{\gamma}y^{\gamma}.$$
Similarly
$$\liminf_{s\to\infty}\frac{g(sx)}{g(s)} \ge \lim_{n\to\infty}\frac{g(t_{n(s)}x)}{g(t_{n(s)})}\cdot\frac{g(t_{n(s)})}{g(t_{n(s)+1})} = x^{\gamma}y^{-\gamma}.$$
Since $y>1$ is arbitrary, we have proved $g\in RV_\gamma$. Combining with (B.2.7) gives $a(t)/g(t)\to\gamma/c$, $t\to\infty$. With (B.2.6) this implies $f(t)\sim c\,a(t)/\gamma$, $t\to\infty$; hence $f\in RV_\gamma$.
Suppose next $\gamma<0$. Then (B.2.8) immediately implies $\lim_{n\to\infty}g(t_n)<\infty$. Write $h(x) := \lim_{t\to\infty}g(t)-g(x)$. We have
$$a(t_n) \sim \frac{\gamma}{c}\cdot\frac{h(t_{n+1})-h(t_n)}{y^{\gamma}-1},\qquad n\to\infty.$$
Choose $\varepsilon>0$ and $y>(1+\varepsilon)^{-1/\gamma}$. Note that since $a\in RV_\gamma$, the quantity $h(t_n)/a(t_n)$ is bounded above for $n\ge n_0$ by a convergent geometric series with ratio $y^{\gamma}(1+\varepsilon)<1$, and the proof can be finished as in the case $\gamma>0$, now with $h$ in place of $g$. □
A measurable function $f:\mathbb R^+\to\mathbb R$ is said to belong to the class $\Pi$ if there exists a positive function $a$ such that
$$\lim_{t\to\infty}\frac{f(tx)-f(t)}{a(t)} = \log x,\qquad x>0. \tag{B.2.12}$$
Notation: $f\in\Pi$ or $f\in\Pi(a)$; $a$ is called an auxiliary function for $f$. If $f\in\Pi(a)$ and $g:\mathbb R^+\to\mathbb R$ satisfies
$$\lim_{t\to\infty}\frac{f(t)-g(t)}{a(t)} = 0, \tag{B.2.13}$$
then also $g\in\Pi(a)$.
With $F(t) := f(e^t)$ and $A(t) := a(e^t)$, define, as in the proof of Theorem B.1.4, sets $Y_{1,n},Y_{2,n}\subset J$ and
$$Z_{1,n} := \bigl\{z : x_n-z\in Y_{2,n}\bigr\}.$$
Then $\lambda(Z_{1,n}) = \lambda(Y_{2,n})$.
Since $a\in RV_0$ (Theorem B.2.1) we have the inequality $A(t_n)\ge A(t_n+x_n-z)/2$ for $z\in Z_{1,n}$ and $n\ge n_0$ by Proposition B.1.9(5). As a consequence, $Z_{1,n}\subset Z_{2,n}$ for $n\ge n_0$, where $Z_{2,n}$ is defined by
$$Z_{2,n} := \left\{z : \frac{|F(t_n+x_n)-F(t_n+x_n-z)|}{A(t_n+x_n-z)}\ge\frac{\varepsilon}{4}\right\} \subset J',$$
$J'$ a fixed finite interval, for $n$ sufficiently large since $x_n\to0$. Hence we find that $\lambda\bigl(\limsup_{n\to\infty}Z_{2,n}\bigr)\ge\lambda\bigl(\limsup_{n\to\infty}Z_{1,n}\bigr)\ge\lambda(J)/2$ or $\lambda\bigl(\limsup_{n\to\infty}Y_{1,n}\bigr)\ge\lambda(J)/2$. This implies the existence of a real number $x_0$ contained in infinitely many $Y_{1,n}$ or infinitely many $Z_{2,n}$, which contradicts the assumption $\lim_{t\to\infty}\bigl(F(t+x_0)-F(t)\bigr)/A(t) = x_0$. □
Corollary B.2.10 If $f\in\Pi(a)$, for any $\varepsilon>0$ there exist $t_0,c>0$ such that for $t\ge t_0$, $x\ge1$,
$$\frac{|f(tx)-f(t)|}{a(t)} \le c\,x^{\varepsilon}. \tag{B.2.14}$$
In particular, $f(t) = o(t^{\varepsilon})$, $t\to\infty$, for every $\varepsilon>0$.
Proof. With $n := [\log x]$, write
$$\frac{f(tx)-f(t)}{a(t)} = \sum_{k=0}^{n-1}\frac{f(e^{k+1}t)-f(e^kt)}{a(e^kt)}\cdot\frac{a(e^kt)}{a(t)}
+\frac{f(tx)-f(e^nt)}{a(e^nt)}\cdot\frac{a(e^nt)}{a(t)}. \tag{B.2.15}$$
Using (B.2.15), the uniform convergence theorem, and the inequality $a(tx)/a(t)\le c_1x^{\varepsilon}$ for some $c_1>0$, $t\ge t_1$ (Proposition B.1.9(5)), we find that for $t\ge t_0 := \max(t_1,t_2)$,
$$\frac{|f(tx)-f(t)|}{a(t)} \le 2c_1\sum_{k=0}^{n}e^{\varepsilon k} \le c\,x^{\varepsilon}.$$
For the last statement, take $t=t_0$ in (B.2.14). □
Corollary B.2.11 If $f\in\Pi(a)$, there exists a nondecreasing function $g\in\Pi(a)$ with $g(t)-f(t) = o\bigl(a(t)\bigr)$, $t\to\infty$.
Proof. By the uniform convergence theorem (Theorem B.2.9),
$$\lim_{t\to\infty}\frac{\int_{1}^{e}f(tx)\,\frac{dx}{x}-f(t)}{a(t)} = \int_{1}^{e}\log x\,\frac{dx}{x} = \frac12. \tag{B.2.16}$$
Now choose $t_1\ge t_0$ such that $f(ex)-f(x)>0$ for $x\ge t_1$. Then
$$\int_{1}^{e}f(tx)\,\frac{dx}{x} = \int_{t}^{te}f(x)\,\frac{dx}{x} = \int_{t_1}^{te}f(x)\,\frac{dx}{x}-\int_{t_1}^{t}f(x)\,\frac{dx}{x} =: g_0(t).$$
Note that $g_0$ is nondecreasing and by (B.2.16),
$$\lim_{t\to\infty}\frac{g_0(t)-f(t)}{a(t)} = \frac12.$$
Now $g_0\in\Pi(a)$ by Theorem B.2.8. Define $g(t) := g_0\bigl(te^{-1/2}\bigr)$. Then $g\in\Pi(a)$ and $g(t)-f(t) = o\bigl(a(t)\bigr)$, $t\to\infty$. □
Theorem B.2.12 Let $f$ be measurable and locally integrable on $[t_0,\infty)$ for some $t_0>0$, and define
$$\varphi(t) := f(t)-\frac1t\int_{t_0}^{t}f(s)\,ds. \tag{B.2.17}$$
The following statements are equivalent:
1. $f\in\Pi(a)$ for some positive function $a$, i.e.,
$$\lim_{t\to\infty}\frac{f(tx)-f(t)}{a(t)} = \log x,\qquad x>0. \tag{B.2.18}$$
2. The function $\varphi:(t_0,\infty)\to\mathbb R$ is well defined for some $t_0>0$ and eventually positive, and
$$\lim_{t\to\infty}\frac{f(tx)-f(t)}{\varphi(t)} = \log x,\qquad x>0. \tag{B.2.19}$$
3. The function $\varphi:(t_0,\infty)\to\mathbb R$ is well defined for $t>t_0$ and slowly varying at infinity.
4. There exists $p\in RV_0$ such that
$$f(t) = p(t)+\int_{t_0}^{t}p(s)\,\frac{ds}{s}. \tag{B.2.20}$$
5. There exist $c_1,c_2\in\mathbb R$ and $a_1,a_2\in RV_0$ with $a_1(t)\sim a_2(t)$, $t\to\infty$, such that
$$f(t) = c_1+c_2\,a_1(t)+\int_{t_0}^{t}a_2(s)\,\frac{ds}{s}. \tag{B.2.21}$$
Proof. Suppose (1) holds. Write
$$\varphi(t) = \frac{t_0}{t}\,f(t)+\int_{t_0/t}^{1}\bigl(f(t)-f(tu)\bigr)\,du. \tag{B.2.22}$$
From Corollary B.2.10 it follows that $f(t) = o(t^{\varepsilon})$, $t\to\infty$, for any $\varepsilon>0$ (take $t=t_0$ in (B.2.14)). Since $ta(t)\in RV_1$ (Theorem B.2.7), we have $f(t) = o\bigl(ta(t)\bigr)$, $t\to\infty$, so the first term in (B.2.22) is $o\bigl(a(t)\bigr)$.
We can apply Lebesgue's theorem on dominated convergence to the second term on the right-hand side in (B.2.22), since by Corollary B.2.10 for $tu\ge t_0$, $0<u\le1$,
$$\frac{|f(t)-f(tu)|}{a(tu)} \le c\,u^{-\varepsilon},$$
and by Proposition B.1.9(5) for $tu\ge t_1$, $0<u\le1$,
$$0 \le \frac{a(tu)}{a(t)} \le c_1\,u^{-\varepsilon}.$$
Hence $\lim_{t\to\infty}\varphi(t)/a(t) = -\int_{0}^{1}\log u\,du = 1$, which proves that (1) implies (2).
For proving that (2) implies (3) see Theorem B.2.7.
Next we prove that (3) implies (4). By Fubini's theorem we have
$$\int_{t_0}^{t}\varphi(s)\,\frac{ds}{s} = \int_{t_0}^{t}f(s)\,\frac{ds}{s}-\int_{t_0}^{t}\frac{1}{s^2}\int_{t_0}^{s}f(u)\,du\,ds
= \frac1t\int_{t_0}^{t}f(u)\,du = f(t)-\varphi(t),$$
so that $f(t) = \varphi(t)+\int_{t_0}^{t}\varphi(s)\,\frac{ds}{s}$, which is (B.2.20) with $p=\varphi$. That (4) implies (5) is immediate.
Finally, suppose (5) holds. Then
$$\frac{f(tx)-f(t)}{a_1(t)} = c_2\left(\frac{a_1(tx)}{a_1(t)}-1\right)+\int_{1}^{x}\frac{a_2(tu)}{a_1(t)}\,\frac{du}{u} \to \log x$$
as $t\to\infty$, for all $x>0$, by the uniform convergence theorem; hence $f\in\Pi(a_1)$. □
Corollary B.2.13 Suppose $f\in\Pi(a)$. If $\lim_{t\to\infty}f(t) = \infty$, then $f\in RV_0$ and
$$a(t) = o\bigl(f(t)\bigr),\qquad t\to\infty; \tag{B.2.23}$$
if $f(\infty) := \lim_{t\to\infty}f(t)$ is finite, then $f(\infty)-f(t)\in RV_0$ and $a(t) = o\bigl(f(\infty)-f(t)\bigr)$.
Proof. Consider the representation (B.2.20). Theorem B.1.5 implies that $p(t) = o\bigl(\int_{t_0}^{t}p(s)/s\,ds\bigr)$, $t\to\infty$. Hence, if $\int^{\infty}p(s)/s\,ds<\infty$, then $p(t)\to0$, $t\to\infty$, and $\lim_{t\to\infty}f(t) = c+\int_{t_0}^{\infty}p(s)/s\,ds$. Then $f(\infty)-f(t)\sim\int_{t}^{\infty}p(s)/s\,ds\in RV_0$ (Proposition B.1.9(4)). If $\int^{\infty}p(s)/s\,ds = \infty$, then $f(t)\sim\int_{t_0}^{t}p(s)/s\,ds\in RV_0$ (Proposition B.1.9(4)).
When comparing (B.2.12) and (B.2.19) one sees that $\varphi(t)\sim a(t)$, as $t\to\infty$. Now Theorem B.1.5 implies $\varphi(t) = o\bigl(f(t)\bigr)$. Relation (B.2.23) follows. The second relation follows in a similar way. □
Remark B.2.14 1. Note that from the proof of Corollary B.2.13 it follows, using (B.2.17), that $\varphi(t)\sim a(t)$, $t\to\infty$. As a consequence of Corollary B.2.13, the limit relation (B.2.13) above is strictly stronger than $f(t)\sim g(t)$, $t\to\infty$.
2. Theorem B.2.12 is also true (and the proof not much different) with $\varphi$ replaced by the function
$$\int_{t}^{et}f(u)\,\frac{du}{u}-f(t).$$
3. The result of Corollary B.2.11 is obtained again from Theorem B.2.12 by taking $g(t) = \int_{t_0}^{t}p(s)/s\,ds$ with $p$ as in (B.2.21).
4. Suppose $f$ is locally integrable on $\mathbb R^+$ and $a\in RV_0$. Then
$$\frac{f(tx)-f(t)}{a(t)}\to0,\qquad t\to\infty, \tag{B.2.24}$$
for $x>0$, and
$$\frac{\varphi(t)}{a(t)}\to0,\qquad t\to\infty, \tag{B.2.25}$$
are equivalent. The proof follows closely the proof of Theorem B.2.12.
5. From Theorem B.2.12(5) it is clear that for any $a\in RV_0$, there exists a function $f$ such that $f\in\Pi(a)$.
6. Let $t_1>0$ be such that $f$ is locally integrable on $(t_1,\infty)$. Then Theorem B.2.12 holds for any $t_0>t_1$.
We mention some properties of functions that belong to the class $\Pi$.
Proposition B.2.15 1. If $f,g\in\Pi$ then $f+g\in\Pi$. If $f\in\Pi$ and $h\in RV_\alpha$, $\alpha>0$, then $f\circ h\in\Pi$, where $f\circ h$ denotes the composition of the two functions. If $f\in\Pi$, $\lim_{t\to\infty}f(t) = \infty$, and $h$ is differentiable with $h'\in RV_\alpha$, $\alpha>-1$, then $h\circ f\in\Pi$.
2. If $f\in\Pi(a)$ is integrable on finite intervals of $\mathbb R^+$ and the function $f_1$ is defined by
$$f_1(t) := t^{-1}\int_{0}^{t}f(s)\,ds,\qquad t>0, \tag{B.2.26}$$
then $f_1\in\Pi(a)$ and
$$\lim_{t\to\infty}\frac{f(t)-f_1(t)}{a(t)} = 1.$$
3. If $f\in\Pi(a)$ is twice differentiable with eventually monotone second derivative, then
$$f''\in RV_{-2}\qquad\text{and}\qquad\lim_{t\to\infty}\frac{-t^2f''(t)}{a(t)} = 1. \tag{B.2.27}$$
4. If $f\in\Pi(a)$, then for all $\delta_1,\delta_2,\delta_3>0$ there exists $t_0$ such that for $t\ge t_0$, $x\ge1$,
$$-\delta_2 \le \frac{f(tx)-f(t)}{a(t)} \le (1+\delta_3)\,\frac{x^{\delta_1}-1}{\delta_1}+\delta_3, \tag{B.2.28}$$
and $f$ admits a representation
$$f(t) = \int_{t_0}^{t}a_0(s)\,\frac{ds}{s}+b(t),\qquad t\ge t_0, \tag{B.2.29}$$
with $a_0(t)\sim a(t)$ and $b(t) = o\bigl(a(t)\bigr)$, $t\to\infty$.
5. If $f\in\Pi(a)$ and $f(t) = f(t_0)+\int_{t_0}^{t}g(s)\,ds$ with $g$ monotone, then $t\,g(t)\sim a(t)$, $t\to\infty$; conversely, if $g\in RV_{-1}$, then $f\in\Pi$ with auxiliary function $t\,g(t)$.
Proof. (1) For the second statement, since $h(tx)/h(t)\to x^{\alpha}$ we have, by the uniform convergence theorem (Theorem B.2.9),
$$\lim_{t\to\infty}\frac{f\bigl(h(tx)\bigr)-f\bigl(h(t)\bigr)}{\alpha\,a\bigl(h(t)\bigr)} = \log x,$$
so $f\circ h\in\Pi$. For the last statement we expand the function $h$: by the mean value theorem, with some $\theta = \theta(t,x)\in(0,1)$,
$$\frac{h\bigl(f(tx)\bigr)-h\bigl(f(t)\bigr)}{a(t)\,h'\bigl(f(t)\bigr)}
= \frac{f(tx)-f(t)}{a(t)}\cdot\frac{h'\Bigl(f(t)+\theta\bigl(f(tx)-f(t)\bigr)\Bigr)}{h'\bigl(f(t)\bigr)}\to\log x,$$
since $f(tx)-f(t) = o\bigl(f(t)\bigr)$ (Corollary B.2.13) and $h'\in RV_\alpha$.
(2) Since $\frac{d}{du}f_1(u) = \bigl(f(u)-f_1(u)\bigr)/u =: \varphi(u)/u$, we have
$$\frac{f_1(tx)-f_1(t)}{a(t)} = \int_{1}^{x}\frac{\varphi(ts)}{a(t)}\,\frac{ds}{s}.$$
Now fix $x>1$. Since $f_1\in\Pi(a)$ the above expression tends to $\log x$ as $t\to\infty$. Since $f$ is nondecreasing, $t\varphi(t)$ is nondecreasing. This implies
$$\int_{1}^{x}\frac{\varphi(ts)}{a(t)}\,\frac{ds}{s} \ge \bigl(1-x^{-1}\bigr)\frac{\varphi(t)}{a(t)},$$
hence
$$\limsup_{t\to\infty}\frac{\varphi(t)}{a(t)} \le \frac{\log x}{1-x^{-1}}\qquad\text{for}\quad x>1;$$
letting $x\downarrow1$ and combining with the analogous lower estimate for $0<x<1$ gives $\varphi(t)/a(t)\to1$, which is the stated limit.
(3) Define
$$f_k(t) := t^{-1}\int_{0}^{t}f_{k-1}(s)\,ds$$
for $t>0$, where $f_0 = f$. Repeated application of Theorems B.2.8 and B.2.12 gives $f_1,f_2\in\Pi(a)$, $t\to\infty$. Then the statement for $f''$ follows by applying part (2) twice, using Theorem B.2.12.
(4) From Remark B.2.14(2) it follows that there exist functions $a_0$, $b$ such that $a_0(t)\sim a(t)$, $b(t) = o\bigl(a(t)\bigr)$, $t\to\infty$, and
$$f(t) = \int_{t'}^{t}a_0(s)\,\frac{ds}{s}+b(t)\qquad\text{for}\quad t\ge t'. \tag{B.2.30}$$
Then for all $\varepsilon,\delta_1,\delta_3,\delta_4>0$ there exists $t_0 = t_0(\varepsilon,\delta_1,\delta_3,\delta_4)$ such that for all $t\ge t_0$, $x\ge1$, we have
$$f(tx)-f(t) = \int_{1}^{x}\frac{a_0(ts)}{s}\,ds+\frac{b(tx)}{a(tx)}\,a(tx)-b(t)
\le \left\{(1+\delta_3)\int_{1}^{x}s^{\delta_1-1}\,ds+\varepsilon(1+\delta_4)x^{\delta_1}+\varepsilon\right\}a(t)$$
$$= \left\{\bigl(1+\delta_3+\varepsilon(1+\delta_4)\delta_1\bigr)\frac{x^{\delta_1}-1}{\delta_1}+\varepsilon(2+\delta_4)\right\}a(t),$$
using $a_0(t)\sim a(t)$, $b(t) = o\bigl(a(t)\bigr)$, and Proposition B.1.9(5). Hence $f$ satisfies the stated upper inequality if we take $\varepsilon$, $\delta_3$, and $\delta_4$ such that $\max\bigl(\delta_3+\varepsilon(1+\delta_4)\delta_1,\ \varepsilon(2+\delta_4)\bigr) = \delta_2$. The proof of the lower inequality is similar.
(5) We give the proof of the first statement. The proof of the other statement is similar. We have
$$\frac{f(tx)-f(t)}{t\,g(t)} = \int_{1}^{x}\frac{g(ts)}{g(t)}\,ds. \tag{B.2.31}$$
If $g\in RV_{-1}$, then the right-hand side in (B.2.31) tends to $\log x$, as $t\to\infty$, by the uniform convergence theorem for regularly varying functions (Theorem B.1.4). Next suppose $f\in\Pi(a)$. We have
$$\frac{f(tx)-f(t)}{a(t)} = \frac{t\,g(t)}{a(t)}\int_{1}^{x}\frac{g(ts)}{g(t)}\,ds,$$
and the integral is at most $x-1$ when $x>1$. Hence for $x>1$, since $f\in\Pi$, we get
$$\liminf_{t\to\infty}\frac{t\,g(t)}{a(t)} \ge \frac{\ln x}{x-1}.$$
Similarly we find that $\limsup_{t\to\infty}t\,g(t)/a(t)\le(\ln x)/(x-1)$ for $0<x<1$. Let $x\to1$ to obtain $t\,g(t)\sim a(t)$, $t\to\infty$, and the last function is slowly varying by Theorem B.2.7. □
Remark B.2.16 A special case of the current section is obtained when the auxiliary function $a$ satisfies $a(t)\to\rho>0$, $t\to\infty$.
Note that the specialization of Theorem B.2.12 then gives the following statement: Suppose $g:\mathbb R^+\to\mathbb R^+$ is measurable. Then $g\in RV_\rho$ if and only if $\log g$ is locally integrable on $(t_0,\infty)$ for some $t_0>0$ and
$$\lim_{t\to\infty}\int_{1}^{e}\log\frac{g(ts)}{g(t)}\,ds = \int_{1}^{e}\log s^{\rho}\,ds = \rho.$$
This can be seen by applying Theorem B.2.12 for $f(t) = \log g(t)$.
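The averaged criterion of Remark B.2.16 lends itself to a quick numerical check. The sketch below is not from the book; the particular $g\in RV_{0.7}$ (with the slowly varying factor $2+\sin(\log\log x)$ from Example B.1.2) and the midpoint quadrature are illustrative assumptions:

```python
import math

def criterion(g, t, n=100_000):
    """Midpoint-rule approximation of int_1^e log(g(t*s)/g(t)) ds."""
    h = (math.e - 1.0) / n
    s_vals = [1.0 + (i + 0.5) * h for i in range(n)]
    return h * sum(math.log(g(t * s) / g(t)) for s in s_vals)

rho = 0.7
g = lambda x: x**rho * (2.0 + math.sin(math.log(math.log(x))))   # RV_{0.7}
vals = [criterion(g, t) for t in (1e3, 1e6, 1e12)]
print(vals)   # approaches rho = 0.7
```

The slowly varying factor only perturbs the integral by a term of order $1/\log t$, so the printed values settle near $\rho$.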
A uniform inequality in the spirit of Proposition B. 1.10 is as follows.
Proposition B.2.17 If f Tl(a), f/iere emto a positive function ao with ao(t) ~
a(t) -> 00 swc/i that for all s, 8 > 0 f/*ere 15 0 fo = *o(> 5) MC/I that for t, tx > to,
f(tx) tfo(0
f(t)
5
a
- l o g * < emaxOc ,.*; )
log x =
a0(tx)
a0(t)
1+
C Upjtu)
J\ \a0(t)
383
_ \ dw
) u
max(jc,l)
)+ /
max(M5 l,u
8 l
) du
/min(jc,l)
-H)<
max(jc5,^~5) .
=
t-*oo
a(t)
y
for all x > 0, where y is a real parameter, i.e., f ERVy. Then for all 6,8 > 0
there is a to = to(s,8) such that for t,tx > to,
fitx) - f(t)
xy - I
< 8xy max(jc , x ),
ao(t)
where
Yf(t) ,
a0(t) :=
y > 0,
-y(/(oo)-/(*)),
l
fit) - t~ SI f{s)
Y <0,
ds,y=0.
fc:E+->Ew
a(t)
2. Ift k(t)
t -+ oo .
i;
and
k(s)logs ds,
Jo
Jo
i;
k(s)
k(s)f(ts)ds
fits) - f(t)
ait)
< oo , for
*~f
t > 0
k(s) log s ds ,
f oo .
(B.2.32)
384
Proof. (1) Note that for 0 < e < 1 the function t~sk(t) is integrable on (0,1). We
proceed as in the first part of the proof of Theorem B .2.12. Applying Corollary B .2.10
we have
f k(s) / ( ^ ) ~ / ( ' ) ds_+ f k(s)\ogs ds
Jto/t
a(t)
Jo
by Lebesgue's theorem on dominated convergence. Since k is bounded, ta(t) e RV\,
and f(t) = o{tlt2), t -> oo, we have
ft0/tk(s)
Jo
/("WO
a(t)
ds
fok(s7)f(s)ds-f(t)fk(*7)ds
ta(t)
as t -> oo.
(2) The second statement is proved in a similar way.
Remark B.2.20 Theorem B.2.19(1) also holds under the alternative conditions /
bounded on (0, 1) and f0 s~ek(s) < oo for some s > 0.
Theorem B.2.21 Suppose that f is nondecreasing andcj) is its left-continuous inverse
function. Let y be a real parameter. Equivalent are:
1. There exists a positive function a such that for x > 0,
r
lim
'-o
f(tx) - f(t)
=
a(t)
XY-\
2. There exists a positive function g such that for all x for which 1 + yx > 0,
1
= ( 1 + Yx)
F i T7T\
'
+ e)
It follows that
(1 - ey - i ^ u ((i _
Y
*~
g)0(O)
u m )
a(d>(t))
^uqi + eMty-umt))
a (0(0)
= 0.
a (0(0)
fl(0(O)
^d + ^ - i
""*
as t t x* and consequently
lim
*t**
^ t - U(<l>(t))
385
lim
ft**
u{x<t)(t))-t
xy-\
a((/)(t))
yxyIY,
= (1 +
f { t X )
= ^
(B.3.1)
a{t)
for all x > 0. Since one often needs to control the speed of convergence in (B.3.1),
it is useful to build a theory in which the convergence rate in (B.3.1) is the same for
all x > 0, that is, there exists a positive function A with lim^oo A{t) = 0 such that
f(tx)-f(t)
a(f)
H(x) := lim
t-^oo
xY-1
Y
(B.3.2)
A(t)
exists for all x > 0. First we want to exclude a more or less trivial case. If H(x) =
c{xY \)/y for some c R, the limit relation can be reformulated as
f(tx)-f(t)
lim
:A(Q)
^OOICACO)
_ *y-l
^ _
= a
A(f)
lim
t-*oo
_ xZ^l
a{t)
A(t)
= d f sy~l
f up~l du ds + c2 f
sy+p~l
ds .
(B.3.3)
386
=clXy-
(B.3.4)
p
and
lim *W
*P .
(B.3.5)
/ view 0/ f/ie restrictions discussed before the theorem, the constant c\ cannot be
zero ifp = 0.
Remark B.3.2 Relation (B.3.3) holds with a replaced by a\ and A replaced by A\
if and only if Ai(0 ~ A(t) mda\(t)/a(t) - 1 = o(A(t))91 -> oo.
Remark B.3.3 Alternatively, one can write (B.3.2) as
r
lim
t-+oo
f{tx)-f(t)-a{t){xy-\)/y
ai(t)
f .
= H(x)
By assumption, there exist y\, j2 <= R such that (#(;yi), {y\ l)/y)and(H(y2), O^
l)/y) are linearly independent; hence with A = (y\ l J / O ^ 1) w e obtain that
#Cyi) ^H(y2) 7^ 0. Now we subtract X times (B.3.6) at argument y = y2 from
(B.3.6) with argument y = y\ and we obtain
lim l(H(yi)
f^<x> I
- kH(y2))(l
+ 0 (1))
(,
*)_>'ai^>
t-Va\(t)
x>0
387
From this, we conclude that \\mt-^oo{tx)~y a\{tx) / {t~y a\{t)) exists for all x > 0.
Since the limit should be finite for all x > 0, it must be positive for all x > 0. Hence
t~Ya\(t) is regularly varying with index p, say. The existence of this limit, together
with relation (B.3.6), implies that limr-*oo((fjiO~ya(fjc) - t~ya(t))/(t~yai(t))
also
exists for all x > 0. Hence we obtain (B.3.4) for some c\ e R by Theorem B.2.1.
From Theorem B.2.2 we must have p < 0.
As a result, we obtain the following functional equation for H:
H(xy) = H(y)xp+y+
H(x) + cixy?^-
for
x,y>0.
(B.3.7)
I up~l
duds
is a solution of (B.3.7). Obviously, the function G(x) = H(x) H\ (x) satisfies the
homogeneous equation
G(xy) = G(x) + G(y)xp+y
f up~l du ds + c2 I sp+y~l
ds .
Y ) '
'
(B.3.8)
H(x) =
2
li(logx) ,
p = y=0,
388
Next we prove that in some cases we can relate 2ERV functions to classes of
functions we have met before.
Theorem B.3.6 (de Haan and Stadtmiiller (1996)) Suppose that thefunction f satisfies the conditions of Theorem B.3.1. Then:
1. in case p = y = 0,
f(t) = h(t) +\
/" his)
-U-ds
Jo s
with h e Yl;
2. in case p = 0, y ^ 0,
t~yfit)
n,
where fit)
( fit) ,
y > 0,
I /(oo) - fit),
y <0;
:=
is in ERVK+/0 .
Conversely, any of the properties (1), (2), (3) implies that f is in 2ERV.
Remark B.3.7 More precisely: In case p = 0 and y > 0,
r
hm
^oo
(tx)-yfjtx)-t-yfjt)
= logjc
*
Clt~Yfit)Ait)
(B.3.9)
or, equivalently,
lim - ^
?^oo
c\Ait)
= xY logx .
(B.3.10)
In case p = 0 and y < 0, (B.3.9) and (B.3.10) hold with / replaced by /(oo)
In case p < 0, lim^oo f yait) = c exists in (0, oo), and
lim
a(0A(0/(^+c2)
V+P
Remark B.3,8 1. It follows from the representations of Theorem B.3.6 that (B.3.3)
holds locally uniformly in (0, oo) by the properties of the function classes to
which the / ' s belong in the different cases. The representations also lead to
Potter bounds for relation (B.3.3).
2. Note that in most cases the existence of a second-order relation makes the firstorder relation simpler; for example, in case p < 0 and p + y = 0 one has
fit) ~ C3ty', as t - oo.
389
lim
fitxy)
'-oo
- f(ty) - / ( ; * ) + f(t)
= (logx)(logy)
CL\(t)
for JC, y > 0 with a\ := aA. Hence gx e H(a\ log*) for x > 0, where
gxit):=f(tx)-f(t).
Now Theorem B.2.12 implies that
lim
where
- ^ r ~ = l>
(B311)
1 ff
hx(t) := gx(t) - - / gx(s) ds = h(tx) - h(t)
with
t-*oo
a\(t)
= log*
(B.3.12)
Jo
Conversely, if (B.3.12) holds and h e T\ia\) then for x > 0,
/(fjO-/(0-(*(0+ai(0)log*
aiit)
hitx) - hit)
fx (hist) - hit)
= lim
h /
f-+oo
aiit)
J i V i(0
fx
ds
= logx + / (logs - 1)
J\
s
lim
*->oo
= \i\ogx)2
\
1
/
ds
(2) We can assume (if necessary change the function a somewhat) that ci = 0 in
(B.3.3). Write (B.3.4) as
lim
*-+*>
= x y log*
y
390
xY-\\
/
y_ = Q
( JT log*
A{t)
y\
lim-J*)
t-+oo
}.
We get
lim {f^-y~la^}-{fV-y~laV}
'-+00
a\(t)
Zl*LZ
y
(B.3 13)
=
= .
t-+oo
a\{t)
yl
. 1 . .
(B.3.14)
f(tx) - y la(tx) _ _ i y
=
-jt .
(B.3.15)
(tx)-yf(tx)-t-yf(t) _ i (/jc)-yq(/jc) - f ^ o
The result follows in view of (B.3.4). The proof for y < 0 is similar. Conversely, if,
for example, t~Y f{t) e 11(a), then
r
hm
'-oo
- \)/y
= xy log*
.
B
hm
f(t))
e 11(a),
yi
= xy logjc .
Hence / 2ERV.
(3) In view of Theorem B.2.2, relation (B.3.4) implies for p < 0 and c\ ^ 0 in
(B.3.3) that
c := lim t~Ya{t)
f->>oo
c - t-ya(t) - c i
hm
=
,
*-< t~ya\(t)
p
i.e.,
fl(0
= cfy + i(O(l+0(l)).
391
lim
*->oo
a\(t)
CK+P-1
PV
JC^-IX
J C ^ - 1
K+P
X+P
i.e.,
. {/('*>-c^}-{/(*)-c*ri} /Cl W . + P .
lim
lim
i-i
^ - = \
'-oo i
a\(t)
o + c2)) v 4- p
-oo
a\(t)
\P
/ y + i (B.3.16)
Note that in order not to have a trivial limit we need to have c\/p+C2 ^0. Conversely,
if a function / satisfies (B.3.16), then
f(tx)-f(t)-cty^
t-+oo
_XY+P-I
(c\/p + C2)a\(t)
y+p
t -> o o ,
t -> oo .
r (Anx)
r(x)
(Ak~ly)
r (y)
A \q (Akx) - q (Ak'lx)\
^
r (A*" 1 *)
(Ak~lx)
r (x)
392
r(y)
hm sup
= 1.
*-*y>jrr(*)
< ^,
- < 2 ,
r(x)
r(x)
< e,
JC > JCO,
3s
r(x)
(B.3.17)
as n -> oo. Since l i m * . ^ r(*) = 0, for > 0 we can find jq such that
\q(y)-q(x)\
<s
for y > x > x\. Hence C := tim^-^oo^C*) exists by Cauchy's criterion. Taking
y - oo in (B.3.17) we conclude that
C - 9 ( x ) = *(r(Jc)),
oo .
We finish the main theorems by establishing uniform inequalities for the class
2ERV.
Theorem B.3.10 (Drees (1998); cf. Cheng and Jiang (2001)) Let f be a measurablefunction. Supposefor some positivefunction a, some positive or negativefunction
A with Hindoo A(t) = 0, and parameters y e K, p < 0, that
/(**)-/(*) _
xy-\
ait)
lim
Y_
A(t)
f->00
- /
>
'
du ds
(B.3.18)
for all x > 0, i.e., f e 2ERV. Then for all 8,8 > 0 there exists to = to(e,8) such
that for all t,tx > %
f(tx)-f(t)
aoit)
XY-\
-*Y,pW
Ao(0
<sm?K{xy+p+\xY+f>-8),
(B.3.19)
where
Xy+P-1
^y,pM
-jw-
(B.3.20)
= 1
2
i(logx) , y = p = 0,
cf'' ,
a0(t) :=
p < 0,
- y ( / ( o o ) - / ( / ) ) ,Y<P
yf(t),
1/(0 + 7 ( 0 ,
= 0,
y > /> = o.
y = P = 0,
(B.3.21)
393
Ao(0 :=
-(Y + P)7i%Ji0
y + P<0,
(y + ^ J S -
y+
fit)
aoit) '
y + p = 0, p < 0,
m
m.
P<0,
p>0,p<0,
(B.3.22)
Y * P = 0,
fit)'
y=p=0
aoit) '
-xo-U
h(s) ds
(B.3.23)
and
f(t) - c ^ 1 ,
p < 0,
t-y(fioo)-f(t)),y<p
fit) :=
t~ fit)
fit) ,
= 0,
y>p
= 0,
(B.3.24)
Y =P=0
Moreover,
apitx) _
aoit)
ry
Aoit)
<emax(xr+p+s,xY+<,-s).
(B.3.25)
Proof. From Theorem B.3.6 it is clear that for all cases except $\gamma=\rho=0$ the limit relation (B.3.1) can be reformulated as a relation of extended regular variation (first order) for a simple transform of $f$. Hence those cases are covered by Theorem B.2.18. It remains to consider the case $\gamma=\rho=0$. We use Theorem B.3.6:
$$f(t)=h(t)+\int_0^t h(s)\,ds$$
with $h\in\Pi$. Now Theorem B.2.18 implies
$$\left|\frac{h(tx)-h(t)}{k(t)}-\log x\right|\le\varepsilon\max\left(x^{\delta},x^{-\delta}\right)$$
for some positive function $k$. Integrating this bound over the increments $\left(h(tu)-h(t)\right)/k(t)$, $1\le u\le x$, therefore controls
$$\frac{\frac{f(tx)-f(t)}{a_0(t)}-\log x}{A_0(t)}-\frac12\left(\log x\right)^2$$
by a term of the form $\varepsilon\left(1+|\log x|+\max\left(x^{\delta},x^{-\delta}\right)\right)$, and (B.3.19) follows in this case as well. $\Box$
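The second-order relation (B.3.18) can be illustrated numerically. The following Python sketch is our illustration, not part of the text: for the hand-picked function $f(t)=(t^{\gamma}-1)/\gamma+t^{\gamma+\rho}$ with $\gamma=1/2$, $\rho=-1$, one may take $a(t)=t^{\gamma}(1+(\gamma+\rho)t^{\rho})$ and $A(t)=\rho(\gamma+\rho)t^{\rho}$; the normalized difference then converges to $H_{\gamma,\rho}(x)=\frac1\rho\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\right)$, the closed form of the double integral in (B.3.18).

```python
GAMMA, RHO = 0.5, -1.0  # hand-picked test parameters (our choice)

def f(t):
    # f(t) = (t^g - 1)/g + t^(g+r): a function in 2ERV
    return (t**GAMMA - 1.0) / GAMMA + t**(GAMMA + RHO)

def a(t):
    # first-order scale function, adjusted so the limit is exactly H_{g,r}
    return t**GAMMA * (1.0 + (GAMMA + RHO) * t**RHO)

def A(t):
    # second-order rate function; A(t) -> 0 as t -> infinity
    return RHO * (GAMMA + RHO) * t**RHO

def H(x, g=GAMMA, r=RHO):
    # closed form of int_1^x s^(g-1) int_1^s u^(r-1) du ds
    return ((x**(g + r) - 1.0) / (g + r) - (x**g - 1.0) / g) / r

def normalized_difference(t, x):
    # the quantity whose limit (B.3.18) asserts is H_{g,r}(x)
    first_order = (f(t * x) - f(t)) / a(t) - (x**GAMMA - 1.0) / GAMMA
    return first_order / A(t)

for t in (1e3, 1e6):
    print(t, normalized_difference(t, 4.0), H(4.0))
```

The discrepancy shrinks at rate roughly $t^{\rho}$ under these choices.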
Suppose
$$\lim_{t\to\infty}\frac{\frac{f(tx)-f(t)}{a_1(t)}-\frac{x^{\gamma}-1}{\gamma}}{A(t)}=H(x) \tag{B.3.26}$$
with $H$ not a multiple of $(x^{\gamma}-1)/\gamma$. Then there exists a twice differentiable function $f_1$ with
$$\lim_{t\to\infty}\frac{f(t)-f_1(t)}{a_1(t)}=0\,, \tag{B.3.27}$$
$$\lim_{t\to\infty}\frac{t f_1'(t)}{a_1(t)}=1\,, \tag{B.3.28}$$
$$\lim_{t\to\infty}\frac{t f_1''(t)}{f_1'(t)}=\gamma-1\,, \tag{B.3.29}$$
and such that
$$\lim_{t\to\infty}\frac{\frac{f_1(tx)-f_1(t)}{t f_1'(t)}-\frac{x^{\gamma}-1}{\gamma}}{A_1(t)}=H(x)\,, \tag{B.3.30}$$
where
$$A_1(t):=\frac{t f_1''(t)}{f_1'(t)}-\gamma+1\,; \tag{B.3.31}$$
we have
$$\lim_{t\to\infty}A_1(t)=0 \tag{B.3.32}$$
and
$$|A_1|\in RV_{\rho}. \tag{B.3.33}$$
Proof. For the case $\gamma=\rho=0$ the proof is given in Lemma B.3.14 below. For other values of $\gamma$ and $\rho$ separate proofs apply. As an example we give the proof for $\rho=0$, $\gamma>0$.

Assume that the function $a_1$ is positive (for negative $a_1$ a similar proof applies). Then (B.3.26) implies by Theorem B.3.6 that $g(t):=t^{-\gamma}f(t)$ satisfies
$$\lim_{t\to\infty}\frac{g(tx)-g(t)}{t^{-\gamma}a_1(t)A(t)/\gamma}=\log x \tag{B.3.34}$$
and
$$\lim_{t\to\infty}t^{-\gamma}a_1(t)A(t)=0. \tag{B.3.35}$$
Choose a twice differentiable function $g_1$ with $g(t)-g_1(t)=o\left(t^{-\gamma}a_1(t)A(t)\right)$,
$$\lim_{t\to\infty}\frac{t g_1''(t)}{g_1'(t)}=-1\,,$$
and
$$\lim_{t\to\infty}\frac{g_1(tx)-g_1(t)}{t g_1'(t)}=\log x. \tag{B.3.36}$$
Combining (B.3.34), (B.3.35), and (B.3.36) we obtain
$$\lim_{t\to\infty}\frac{\gamma\,t^{\gamma+1}g_1'(t)}{a_1(t)A(t)}=1. \tag{B.3.37}$$
Now define $f_1(t):=t^{\gamma}g_1(t)$. Then
$$\frac{t f_1'(t)}{a_1(t)}=\frac{\gamma\,t^{\gamma}g_1(t)}{a_1(t)}+\frac{t^{\gamma+1}g_1'(t)}{a_1(t)}\to 1\,,\qquad t\to\infty,$$
by (B.3.35) and (B.3.37). Hence (B.3.28) holds. Finally, by (B.3.28), (B.3.36), and (B.3.37),
$$a_1(t)A_1(t)\sim t f_1'(t)A_1(t)=t^2 f_1''(t)-(\gamma-1)\,t f_1'(t)=(\gamma+1)\,t^{\gamma+1}g_1'(t)+t^{\gamma+2}g_1''(t)\sim\gamma\,t^{\gamma+1}g_1'(t)\,,\qquad t\to\infty.$$
Hence (B.3.31), (B.3.32), (B.3.33), and (B.3.29) hold. $\Box$
Lemma B.3.14 (A.A. Balkema) Let $\phi$ be measurable and suppose for all $x$,
$$\phi(t+x)=\phi(t)+x\,a_1(t)+\frac{x^2}{2}\,a_2(t)+o\left(a_2(t)\right)\,,\qquad t\to\infty. \tag{B.3.38}$$
Since this relation is essentially 2ERV, we know that it holds locally uniformly. Let $y$ be a $C^2$ function with compact support that satisfies
$$\int x^{k}y(x)\,dx=0\quad\text{for }k=1,2\qquad\text{and}\qquad\int x^{k}y(x)\,dx=1\quad\text{for }k=0,$$
and define
$$\psi(t):=\int\phi(t+s)\,y(s)\,ds\,,$$
with derivatives $\psi^{(m)}$, $m=1,2,\ldots$, and $\psi^{(0)}:=\psi$. Then for $k=0,1,2$,
$$a_k(t)-\psi^{(k)}(t)=o\left(a_2(t)\right)\,,\qquad t\to\infty \tag{B.3.39}$$
(where $a_0:=\phi$), and in particular
$$\phi(t)-\psi(t)=o\left(a_2(t)\right)\,,\qquad t\to\infty.$$
Proof. Writing $\psi(t)=\int\phi(u)\,y(u-t)\,du$ and differentiating under the integral sign,
$$\psi^{(k)}(t)=(-1)^{k}\int\phi(t+s)\,y^{(k)}(s)\,ds\,.$$
Substituting the expansion (B.3.38), which holds uniformly on the compact support of $y$, and integrating by parts,
$$(-1)^{k}\int\frac{s^{j}}{j!}\,y^{(k)}(s)\,ds=\begin{cases}0, & j<k,\\ 1, & j=k,\\ 0, & k<j\le 2,\end{cases}$$
(the first case since $(d/ds)^{k}s^{j}=0$ for $j<k$, the last by the moment conditions on $y$), so that $\psi^{(k)}(t)=a_k(t)+o\left(a_2(t)\right)$, which proves the lemma. $\Box$
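The smoothing device behind Lemma B.3.14 can be made concrete. The sketch below is our construction (the bump $w(s)=(1-s^2)^3$ and all numerical choices are ours, not from the text): it builds a $C^2$ kernel $y$ supported on $[-1,1]$ with $\int y=1$ and $\int sy=\int s^2y=0$; convolving any quadratic $\phi$ with such a kernel reproduces $\phi$ exactly, which is the mechanism giving $\psi^{(k)}(t)=a_k(t)+o(a_2(t))$.

```python
import numpy as np

# grid on the support [-1, 1] and trapezoidal quadrature weights
n = 4001
s = np.linspace(-1.0, 1.0, n)
w_quad = np.full(n, s[1] - s[0])
w_quad[0] = w_quad[-1] = 0.5 * (s[1] - s[0])

def integrate(values):
    return float(np.dot(values, w_quad))

# C^2 bump (vanishes to second order at +-1); kernel y = (a + b s^2) * bump
bump = (1.0 - s**2) ** 3
m = [integrate(s**k * bump) for k in range(6)]  # moments of the bump
# solve a*m0 + b*m2 = 1 (zeroth moment) and a*m2 + b*m4 = 0 (second moment)
coef = np.linalg.solve([[m[0], m[2]], [m[2], m[4]]], [1.0, 0.0])
y = (coef[0] + coef[1] * s**2) * bump  # first moment vanishes by symmetry

def smooth(phi, t):
    """psi(t) = integral of phi(t + u) y(u) du over the support."""
    return integrate(phi(t + s) * y)

phi = lambda t: 1.0 + 2.0 * t + 3.0 * t**2  # any quadratic is reproduced
print(smooth(phi, 0.5), phi(0.5))
```

Because the kernel kills the first and second moments, $\psi$ inherits the expansion (B.3.38) of $\phi$ while being as smooth as $y$.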
Remark B.3.15 A simpler case of second-order behavior is related to regular variation rather than extended regular variation. Suppose $f\in RV_\gamma$ for some $\gamma\in\mathbb{R}$ and there is some positive or negative function $A$ such that
$$\lim_{t\to\infty}\frac{\frac{f(tx)}{f(t)}-x^{\gamma}}{A(t)}=:H(x) \tag{B.3.40}$$
exists for $x>0$ with $H$ not constant. If (B.3.40) holds, we say that the function $f$ is of second-order regular variation. Then
$$\lim_{t\to\infty}\frac{(tx)^{-\gamma}f(tx)-t^{-\gamma}f(t)}{t^{-\gamma}f(t)A(t)}=x^{-\gamma}H(x)\,;$$
hence the function $t^{-\gamma}f(t)$ is extended regularly varying, $x^{-\gamma}H(x)=(x^{\rho}-1)/\rho$ for some $\rho\le 0$, and the theory of Section B.2 applies. Using the inequalities of Theorem B.2.18 one then gets, if $\rho<0$, that for a suitable function $a$,
$$\lim_{t\to\infty}\frac{\frac{f(tx)-f(t)}{a(t)}-\frac{x^{\gamma}-1}{\gamma}}{A(t)}=H_{\gamma,\rho}(x)\,,$$
where
$$H_{\gamma,\rho}(x):=\frac{1}{\rho}\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\right). \tag{B.3.41}$$
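A minimal numerical illustration of (B.3.40) follows; the example function is our own choice, not from the text. For $f(t)=t^{\gamma}(1+Ct^{\rho})$ one has second-order regular variation with rate $A(t)=C\rho t^{\rho}$ and limit $H(x)=x^{\gamma}(x^{\rho}-1)/\rho$, so that $x^{-\gamma}H(x)=(x^{\rho}-1)/\rho$ as in the remark.

```python
GAMMA, RHO, C = 0.5, -1.0, 2.0  # test parameters, our choice

def f(t):
    # f in RV_gamma with a second-order term: f(t) = t^g * (1 + C t^r)
    return t**GAMMA * (1.0 + C * t**RHO)

def A(t):
    # the rate function in (B.3.40): A(t) = C * r * t^r -> 0
    return C * RHO * t**RHO

def H(x):
    # limit in (B.3.40): H(x) = x^g (x^r - 1)/r
    return x**GAMMA * (x**RHO - 1.0) / RHO

def second_order_ratio(t, x):
    return (f(t * x) / f(t) - x**GAMMA) / A(t)

for t in (1e4, 1e8):
    print(t, second_order_ratio(t, 4.0), H(4.0))
```

The convergence is at speed $t^{\rho}$, i.e., governed by $|A|\in RV_{\rho}$.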
Suppose $\gamma\ne\rho$. Then
$$\lim_{t\to\infty}\frac{\frac{a(t)}{f(t)}-\gamma_+}{A(t)}=c_{\gamma,\rho}\,, \tag{B.3.42}$$
where $\gamma_+:=\max(0,\gamma)$ and the constant $c_{\gamma,\rho}$ depends on the relative position of $\gamma$ and $\rho$ (in particular it vanishes for $\gamma<\rho<0$), and where for $\gamma>0$ we define $\ell:=\lim_{t\to\infty}f(t)-a(t)/\gamma$. Furthermore, in case $\gamma>0$ assume $\rho<0$; then
$$\lim_{t\to\infty}\frac{\frac{f(tx)-f(t)}{a(t)}-\frac{x^{\gamma}-1}{\gamma}}{Q(t)}=H_{\gamma,\rho'}(x)\,, \tag{B.3.43}$$
where $Q$ is a suitable function with $|Q(t)|\in RV_{\rho'}$, equal to $A(t)$ in some of the cases below and to $a(t)/f(t)-\gamma_+$ in the others, and
$$\rho'=\begin{cases}\rho, & \gamma<\rho<0,\\ \gamma, & \rho<\gamma<0,\\ -\gamma, & 0<\gamma<-\rho\ \text{and}\ \ell\ne 0,\\ \rho, & (0<\gamma<-\rho\ \text{and}\ \ell=0)\ \text{or}\ \gamma>-\rho>0.\end{cases}$$
If $\gamma>0$ and $\rho=0$, the limit in (B.3.43) equals zero for any $Q(t)$ satisfying $A(t)=o\left(Q(t)\right)$, or equivalently $a(t)/f(t)-\gamma=o\left(Q(t)\right)$.
Proof. We start with the proof of (B.3.42) and separately analyze the cases $\gamma<0$, $\gamma=0$, $0<\gamma<-\rho$, $\gamma=-\rho$, and $\gamma>-\rho$.

Start with $\gamma<0$. Then $a(t)/A(t)\in RV_{\gamma-\rho}$ and by assumption $f(\infty)>0$. Hence
$$\lim_{t\to\infty}\frac{a(t)}{f(t)A(t)}=\frac{1}{f(\infty)}\,\lim_{t\to\infty}\frac{a(t)}{A(t)}=\begin{cases}0, & \gamma-\rho<0,\\ \infty, & \gamma-\rho>0.\end{cases}$$
Next consider $\gamma=0$. Then $\rho<0$, since we assume $\gamma\ne\rho$. Then $f(t)\in RV_0$ and, from (B.3.4) and Theorem B.1.6, $\lim_{t\to\infty}a(t)$ exists and is positive. Hence $\lim_{t\to\infty}a(t)/\left(f(t)A(t)\right)=\infty$.

Next we consider the various possibilities when $\gamma$ is positive. Note that from (B.3.4) the function $f(t)-a(t)/\gamma$, normalized by $a(t)A(t)$, is extended regularly varying with limit function $\left(x^{\gamma+\rho}-1\right)/(\gamma+\rho)$ ((B.3.44) and (B.3.45)). Then if $\gamma+\rho>0$,
$$\lim_{t\to\infty}\frac{f(t)-\frac{a(t)}{\gamma}}{a(t)A(t)}=\frac{1}{\gamma+\rho}\,;$$
if $\gamma+\rho<0$, the limit $\ell:=\lim_{t\to\infty}f(t)-a(t)/\gamma$ exists and the same quotient, with $\ell$ subtracted in the numerator, converges to $-1/(\gamma+\rho)$. In either case the behavior of $a(t)/f(t)-\gamma$ relative to $A(t)$ follows.

For (B.3.43) write, using (B.3.41),
$$x^{-\gamma}\,\frac{f(tx)}{f(t)}=1+\left(x^{-\gamma}-1\right)\left(1-\frac{\gamma f(t)}{a(t)}\right)+x^{-\gamma}\left\{A(t)H_{\gamma,\rho}(x)+o\left(A(t)\right)\right\};$$
hence $\log f(tx)-\log f(t)-\gamma\log x$ can be expanded in terms of $A(t)$ and $a(t)/f(t)-\gamma$, and a case-by-case comparison of these two quantities (the cases being $\gamma<\rho<0$; $\ell=0$ and $0<\gamma<-\rho$, or $\gamma>-\rho>0$; and $\ell\ne 0$) identifies the function $Q$ and the indices $\rho'$ stated in (B.3.43). $\Box$
inverse of a probability distribution) and one wants to have conditions in terms of the
distribution function itself.
Theorem B.3.19 Suppose that $f$ is nondecreasing and $\phi$ is its left-continuous inverse function. Then (B.3.3) is equivalent to
$$\lim_{t\uparrow f(\infty)}\frac{\frac{\phi\left(t+x\,a(\phi(t))\right)}{\phi(t)}-(1+\gamma x)^{1/\gamma}}{A(\phi(t))}=-(1+\gamma x)^{1/\gamma-1}\,H\!\left((1+\gamma x)^{1/\gamma}\right) \tag{B.3.47}$$
locally uniformly for $x\in\left(-1/\max(0,\gamma),\,1/\max(-\gamma,0)\right)$, where $H$ is the limit function in (B.3.3).

Remark B.3.20 1. The result is also true for the right-continuous inverse of $f$.
2. In case $\gamma=0$ we define $(1+\gamma x)^{1/\gamma}=e^{x}$.
3. For specific parameters we can give more specific statements, such as, for example:
(a) if $\rho=0$, $\gamma>0$, then $t^{-1/\gamma}\phi(t)\in\Pi$;
(b) if $\rho=0$, $\gamma<0$, then $t^{1/\gamma}\,\phi\!\left(f(\infty)-t^{-1}\right)\in\Pi$;
(c) if $\rho<0$, $\gamma=0$, then $\lim_{t\to\infty}\left(e^{-c(t+x)}\phi(t+x)-C\right)/\left(e^{-ct}\phi(t)-C\right)=e^{ax}$, $x\in\mathbb{R}$, for suitable constants $c$, $C$, and $a$.
Proof. Since by Remark B.3.8(1) relation (B.3.3) holds locally uniformly, we can replace $x$ by $x(t)=1+sA(t)$ in (B.3.3) and get
$$\lim_{t\to\infty}\left(\frac{f\left(t(1+sA(t))\right)-f(t)}{a(t)A(t)}-s\right)=0\,, \tag{B.3.48}$$
hence
$$\lim_{t\to\infty}\frac{f\left(t(1+sA(t))\right)-f(t)}{a(t)A(t)}=s.$$
Applying this for $s>0$ and $s<0$ and using $f\left((\phi(t))^-\right)\le t\le f\left((\phi(t))^+\right)$ we obtain $\lim_{t\uparrow f(\infty)}\{f(\phi(t))-t\}/\{a(\phi(t))A(\phi(t))\}=0$. This and (B.3.3) imply that
$$\lim_{t\uparrow f(\infty)}\frac{\frac{f(\phi(t)x)-t}{a(\phi(t))}-\frac{x^{\gamma}-1}{\gamma}}{A(\phi(t))}=H(x)\,.$$
Now $F_t(x):=\left(f(\phi(t)x)-t\right)/a(\phi(t))$, $x>0$, $t<f(\infty)$, is a family (with respect to $t$) of nondecreasing functions; furthermore, $(x^{\gamma}-1)/\gamma$ has a positive continuous derivative, the function $H(\cdot)$ is continuous, and the function $A$ satisfies $A(\phi(t))\to 0$, $t\uparrow f(\infty)$. Therefore we can apply an obvious generalization of Vervaat's lemma (see Lemma A.0.2) to deduce (B.3.47). The converse implication is similar. $\Box$
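The first-order content of (B.3.47), namely $\phi\left(t+x\,a(\phi(t))\right)/\phi(t)\to(1+\gamma x)^{1/\gamma}$, can be checked numerically. The sketch below is ours (the test function $f(t)=t-1+\sqrt{t}$, for which $\gamma=1$ and $a(t)=t$, is a hand-picked example): $f$ is inverted by bisection and the ratio approaches $1+x$.

```python
def f(t):
    # increasing test function with gamma = 1, a(t) = t,
    # plus a lower-order disturbance sqrt(t)  (our choice)
    return t - 1.0 + t**0.5

def phi(v, lo=0.0, hi=None):
    """Left-continuous inverse of f, computed by bisection."""
    if hi is None:
        hi = v + 2.0  # f(v + 2) = v + 1 + sqrt(v + 2) > v, so a valid bracket
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if f(mid) < v:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def inverse_ratio(v, x):
    # phi(v + x * a(phi(v))) / phi(v), here with a(t) = t
    t = phi(v)
    return phi(v + x * t) / t

for v in (1e3, 1e6):
    print(v, inverse_ratio(v, 3.0), 1.0 + 3.0)  # limit is (1 + gamma*x)^(1/gamma)
```

The deviation from $1+x$ decays like $A(\phi(t))$, consistent with the second-order statement (B.3.47).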
$$\lim_{t\to\infty}\sup_{0\le s\le 1}\left|\frac{f_s(tx)}{f_s(t)}-x^{\gamma(s)}\right|=0\,, \tag{B.4.1}$$
and $f$ is called jointly extended regularly varying if
$$\lim_{t\to\infty}\sup_{0\le s\le 1}\left|\frac{f_s(tx)-f_s(t)}{a_s(t)}-\frac{x^{\gamma(s)}-1}{\gamma(s)}\right|=0\,, \tag{B.4.2}$$
where $a_s(t)$ is positive. The definition of joint (extended) regular variation of second order is analogous. The function $\gamma$ is called the index function.

This concept of joint (extended) regular variation is used in Chapter 9, and we develop some properties that are needed in that chapter.
Theorem B.4.2 Suppose $f$ is jointly regularly varying. For any $\varepsilon,\delta>0$ there exists $t_0=t_0(\varepsilon,\delta)$ such that for $t,tx\ge t_0$, $0\le s\le 1$,
$$\left|\frac{f_s(tx)}{f_s(t)}-x^{\gamma(s)}\right|\le\varepsilon\,x^{\gamma(s)}\max\left(x^{\delta},x^{-\delta}\right) \tag{B.4.3}$$
and
$$(1-\varepsilon)\,x^{\gamma(s)}\min\left(x^{\delta},x^{-\delta}\right)\le\frac{f_s(tx)}{f_s(t)}\le(1+\varepsilon)\,x^{\gamma(s)}\max\left(x^{\delta},x^{-\delta}\right). \tag{B.4.4}$$

Proof. Clearly it is sufficient to prove the statements for $\gamma(s)\equiv 0$. The first step is to prove that (B.4.1) holds locally uniformly for $x\in(0,\infty)$. It is sufficient to deduce a contradiction from the following assumption: suppose there exist $\delta>0$ and sequences $s_n\to s_0$, $t_n\to\infty$, $x_n\to 0$, as $n\to\infty$, such that
$$\left|F_{s_n}(t_n+x_n)-F_{s_n}(t_n)\right|\ge\delta$$
for $n=1,2,\ldots$, where $F_s(x):=\log f_s(e^x)$. The rest of the proof of the local uniformity is exactly like the proof of Theorem B.2.9. Now we know that for any $\varepsilon\in(0,1)$ there exists $t_0$ such that if $t\ge t_0$, $x\in[1,e]$, $s\in[0,1]$,
$$\left|\log f_s(tx)-\log f_s(t)\right|\le\varepsilon.$$
Take any $x>1$. We write $x=e^{n}y$, $y\in[1,e)$, for some nonnegative integer $n$. Then
$$\left|\log f_s(tx)-\log f_s(t)\right|\le\sum_{i=1}^{n}\left|\log f_s(te^{i})-\log f_s(te^{i-1})\right|+\left|\log f_s(te^{n}y)-\log f_s(te^{n})\right|\le(n+1)\,\varepsilon\le\varepsilon\left(1+\log x\right),$$
which gives (B.4.3) and (B.4.4) for $\gamma(s)\equiv 0$. $\Box$
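The uniform Potter-type bounds (B.4.3)-(B.4.4) can be probed numerically for a concrete family. In the sketch below, $f_s(t)=t^{\gamma(s)}\log t$ with index function $\gamma(s)=s$ and a common slowly varying factor; the family, the thresholds, and $\varepsilon=\delta=0.1$ are all our choices, not from the text.

```python
import math

def gamma(s):           # index function of the family (our choice)
    return s

def f(s, t):            # f_s(t) = t^gamma(s) * log t : jointly regularly varying
    return t**gamma(s) * math.log(t)

EPS = DELTA = 0.1
T0 = 1.0e6              # threshold; (B.4.4) is then checked for t, t*x >= T0

def potter_bounds_hold(s, t, x):
    ratio = f(s, t * x) / f(s, t)
    lower = (1 - EPS) * x**gamma(s) * min(x**DELTA, x**-DELTA)
    upper = (1 + EPS) * x**gamma(s) * max(x**DELTA, x**-DELTA)
    return lower <= ratio <= upper

checks = [potter_bounds_hold(s / 10.0, t, x)
          for s in range(11)
          for t in (1e6, 1e8)
          for x in (0.5, 1.0, 2.0, 50.0)
          if t * x >= T0]
print(all(checks), len(checks))
```

Note that the slowly varying factor $\log(tx)/\log t$ is the only source of deviation from $x^{\gamma(s)}$ here, and it is uniform in $s$, which is what joint regular variation requires.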
Theorem B.4.3 Suppose $f$ is jointly extended regularly varying, i.e., (B.4.2) holds. For any $\varepsilon,\delta>0$ there exists $t_0=t_0(\varepsilon,\delta)$ such that for $t,tx\ge t_0$, $0\le s\le 1$,
$$\left|\frac{f_s(tx)-f_s(t)}{a_s(t)}-\frac{x^{\gamma(s)}-1}{\gamma(s)}\right|\le\varepsilon\left(1+x^{\gamma(s)}\max\left(x^{\delta},x^{-\delta}\right)\right). \tag{B.4.5}$$
Moreover, the function $a_s(t)$ is jointly regularly varying with index function $\gamma$.

Proof. The proof of the last statement is analogous to that of Theorem B.2.1 and is left to the reader.

The defining relation (B.4.2) holds locally uniformly for $x\in(0,\infty)$. This can be proved in the same way as for the corresponding result in the previous theorem, now following the proof of Theorem B.2.9.

By Theorem B.4.2, for any $\varepsilon,\delta>0$ we can find $t_0$ such that for $t,tx\ge t_0$ we have
$$\left|\frac{a_s(tx)}{a_s(t)}-x^{\gamma(s)}\right|\le\varepsilon\,x^{\gamma(s)}\max\left(x^{\delta},x^{-\delta}\right), \tag{B.4.6}$$
$$(1-\varepsilon)\,x^{\gamma(s)}\min\left(x^{\delta},x^{-\delta}\right)\le\frac{a_s(tx)}{a_s(t)}\le(1+\varepsilon)\,x^{\gamma(s)}\max\left(x^{\delta},x^{-\delta}\right), \tag{B.4.7}$$
and
$$\left|\frac{f_s(ty)-f_s(t)}{a_s(t)}-\frac{y^{\gamma(s)}-1}{\gamma(s)}\right|\le\varepsilon \tag{B.4.8}$$
for $y\in[1,e]$. Take any $x>1$. We write $x=e^{n}y$, where $y\in[1,e]$ with some nonnegative integer $n$. Note that
$$\frac{f_s(te^{n}y)-f_s(t)}{a_s(t)}-\frac{(e^{n}y)^{\gamma(s)}-1}{\gamma(s)}=\left(\frac{f_s(te^{n}y)-f_s(te^{n})}{a_s(te^{n})}-\frac{y^{\gamma(s)}-1}{\gamma(s)}\right)\frac{a_s(te^{n})}{a_s(t)}+\left(\frac{a_s(te^{n})}{a_s(t)}-e^{n\gamma(s)}\right)\frac{y^{\gamma(s)}-1}{\gamma(s)}+\frac{f_s(te^{n})-f_s(t)}{a_s(t)}-\frac{e^{n\gamma(s)}-1}{\gamma(s)}\,,$$
and iterate the same decomposition over the intervals $[te^{i-1},te^{i}]$, $i=1,\ldots,n$. Applying (B.4.6)–(B.4.8) to each term, the total error is bounded by a constant multiple of
$$\varepsilon\sum_{i=0}^{n}e^{(i+1)\left(\gamma(s)+\delta\right)}\le\varepsilon\,C\max\left(x^{\gamma(s)+\delta},x^{\gamma(s)-\delta}\right)$$
(consider $\gamma(s)+\delta<0$ and $\gamma(s)+\delta>0$ separately and use that $(1-e^{-x})/x\le 1$ for $x>0$ and that $(e^{x}-1)/x$ is increasing for $x>0$). Hence (with $y=xe^{-n}$ as before)
$$\left|\frac{f_s(te^{n}y)-f_s(t)}{a_s(t)}-\frac{(e^{n}y)^{\gamma(s)}-1}{\gamma(s)}\right|\le\varepsilon\,C'\left(1+x^{\gamma(s)}\max\left(x^{\delta},x^{-\delta}\right)\right),$$
which is (B.4.5) up to a renaming of $\varepsilon$. $\Box$
Furthermore,
$$\lim_{t\to\infty}\sup_{0\le s\le 1}\left|\frac{a_s(t)}{f_s(t)}-\gamma_+(s)\right|=0\,, \tag{B.4.9}$$
where $\gamma_+(s):=\max\left(0,\gamma(s)\right)$, and for any $\varepsilon,\delta>0$ there exists $t_0=t_0(\varepsilon,\delta)$ such that if $t,tx\ge t_0$, $0\le s\le 1$,
$$\left|\frac{\log f_s(tx)-\log f_s(t)}{a_s(t)/f_s(t)}-\frac{x^{\gamma_-(s)}-1}{\gamma_-(s)}\right|\le\varepsilon\left(1+x^{\gamma_-(s)}\max\left(x^{\delta},x^{-\delta}\right)\right), \tag{B.4.10}$$
where $\gamma_-(s):=\min\left(0,\gamma(s)\right)$.
The proof leads to sequences $s_n\to s_0$ and $t_n\to\infty$ along which the behavior of $f_{s_n}(t_n)/a_{s_n}(t_n)$ must be controlled; the cases $\gamma(s_0)>0$ and $\gamma(s_0)<0$ are treated separately.

Suppose $\gamma(s_0)>0$. By Theorem B.4.3, for any $\varepsilon\in(0,\gamma(s_0))$ there is a $t_0$ such that for $t_n\ge t_0$ the increments $\left(f_{s_n}(t_n)-f_{s_n}(t_0)\right)/a_{s_n}(t_0)$ are bounded below by a positive multiple of $(t_n/t_0)^{\gamma(s_n)-\varepsilon}$. It follows that
$$\lim_{n\to\infty}\frac{f_{s_n}(t_n)}{f_{s_n}(t_0)}=\infty. \tag{B.4.11}$$
Now Theorem B.4.3 implies (take $t=t_0$ and $x\to\infty$ in the statement of the theorem) that
$$\lim_{t\to\infty}a_s(t)=\infty\,, \tag{B.4.12}$$
and the case $\gamma(s_0)>0$ follows.

Next suppose $\gamma(s_0)<0$. By Theorem B.4.3 the function $\psi_s(t)$ defined by
$$\psi_s(t):=t\int_t^{\infty}f_s(x)\,\frac{dx}{x^2}$$
is well defined for those $s$ for which $\gamma(s)<1$. Differentiating the definition gives $t\,\psi_s'(t)=\psi_s(t)-f_s(t)$, and hence, by partial integration,
$$\int_{t_0}^{t}f_s(u)\,\frac{du}{u}=\int_{t_0}^{t}\psi_s(u)\,\frac{du}{u}-\psi_s(t)+\psi_s(t_0)\,.$$
Next we shall prove
lim sup
1^(0
as(f)
1
l-y(s)
= 0
(B.4.13)
(BAH)
406
for 0 < c < 1, where Ec := {s e [0, 1] : y(s) < c}. This implies that x/r is jointly
regularly varying and also that it is sufficient to prove (B.4.9) where a is replaced
with ty.
Now,
ro
fsitx) - MO dx
as(t) Jx
as(t)
x2'
For any e e (0, c), by Theorem B.4.3 there exists to such that for t, tx > to,
Jl
y(s) ) x 2 | - V i
as(t)
dx
) x*
V+
Hence
xy(s)
_ j
d x
y(s)
x2 1 - y(s)
t^oo as(t) Jx
uniformly for s e Ec. Hence it is sufficient to prove
fsn{tn)
r
hm = oo
for y{so) < 0, where V" is jointly regularly varying with index function y. Now by
(B.4.13),
liminfAM> l i m i n f / 1
M^<^_x
t.1
du - 1
O Nl ~ CO/**)3*
= liminf (1 - 2s)
n-+oo
_ 1 - 2e
1
3s
Hm
n
^-(J0) - 1
Y-(so)
We write
$$\lim_{n\to\infty}\frac{\log f_{s_n}(t_n x_n)-\log f_{s_n}(t_n)}{a_{s_n}(t_n)/f_{s_n}(t_n)}=\lim_{n\to\infty}\frac{\log\left(\dfrac{f_{s_n}(t_n x_n)-f_{s_n}(t_n)}{a_{s_n}(t_n)}\cdot\dfrac{a_{s_n}(t_n)}{f_{s_n}(t_n)}+1\right)}{a_{s_n}(t_n)/f_{s_n}(t_n)}\,.$$
For $\gamma(s_0)<0$, $\lim_{n\to\infty}a_{s_n}(t_n)/f_{s_n}(t_n)=0$; hence the limit equals
$$\lim_{n\to\infty}\frac{f_{s_n}(t_n x_n)-f_{s_n}(t_n)}{a_{s_n}(t_n)}=\frac{x_0^{\gamma(s_0)}-1}{\gamma(s_0)}\,.$$
References
1. K. Aarssen and L. de Haan: On the maximal life span of humans. Mathematical Population Studies 4, 259-281 (1994).
2. M. Ancona-Navarrete and J. Tawn: A comparison of methods for estimating the extremal
index. Extremes 3, 5-38 (2000).
3. B.C. Arnold, N. Balakrishnan, and H.N. Nagaraja: A First Course in Order Statistics.
Wiley, New York (1992).
4. J.M. Ash, P. Erdős and L.A. Rubel: Very slowly varying functions. Aeq. Math. 10, 1-9 (1974).
5. A.A. Balkema and L. de Haan: A.s. continuity of stable moving average processes with
index < 1. Ann. Appl. Probab. 16, 333-343 (1988).
6. A.A. Balkema, L. de Haan, and R.L. Karandikar: Asymptotic distributions of the maximum of n independent stochastic processes. J. Appl. Prob. 30, 66-81 (1993).
7. O. Barndorff-Nielsen: On the limit behaviour of extreme order statistics. Ann. Math. Statist. 34, 992-1002 (1963).
8. J. Beirlant and J.L. Teugels: Asymptotics of Hill's estimator. Theory Probab. Appl. 31, 463-469 (1986).
9. J. Beirlant, P. Vynckier, and J.L. Teugels: Tail index estimation, Pareto quantile plots and regression diagnostics. J. Amer. Statist. Association 91, 1659-1667 (1996).
10. J. Beirlant, J.L. Teugels, and P. Vynckier: Practical Analysis of Extreme Values. Leuven
University Press, Leuven, Belgium (1996).
11. P. Billingsley: Convergence of Probability Measures. Wiley, New York (1968).
12. P. Billingsley: Weak Convergence of Measures: Applications in Probability. SIAM, Philadelphia (1971).
13. P. Billingsley: Probability and Measure. Wiley, New York (1979).
14. L. Breiman: Probability. Addison-Wesley (1968); Republished by SIAM, Philadelphia
(1992).
15. B. Brown and S. Resnick: Extreme values of independent stochastic processes. J. Appl.
Probab. 14,732-739 (1977).
16. N.G. de Bruijn: Pairs of slowly oscillating functions occurring in asymptotic problems concerning the Laplace transform. Nieuw Arch. Wisk. 7, 20-26 (1959).
17. S. Cheng and C. Jiang: The Edgeworth expansion for distributions of extreme values.
Science in China 44,427-437 (2001).
18. K.L. Chung: A Course in Probability Theory. 2nd Edition, Academic Press, New York-London (1974).
19. M. Csörgő and L. Horváth: Weighted Approximations in Probability and Statistics. John Wiley & Sons, Chichester, England (1993).
20. D.J. Daley and D. Vere-Jones: An Introduction to the Theory of Point Processes. Springer, Berlin (1988).
21. J. Danielsson, L. de Haan, L. Peng, and C.G. de Vries: Using a bootstrap method to choose the sample fraction in tail index estimation. J. Multivariate Analysis 76, 226-248 (2001).
22. A.L.M. Dekkers and L. de Haan: Optimal choice of sample fraction in extreme-value estimation. J. Multivariate Analysis 47, 173-195 (1993).
23. A.L.M. Dekkers, J.H.J. Einmahl, and L. de Haan: A moment estimator for the index of
an extreme-value distribution. Ann. Statist. 17, 1833-1855 (1989).
24. D. Dietrich, L. de Haan, and J. Hüsler: Testing extreme value conditions. Extremes 5, 71-85 (2002).
25. G. Draisma, H. Drees, A. Ferreira, and L. de Haan: Bivariate tail estimation: dependence
in asymptotic independence. Bernoulli 10, 251-280 (2004).
26. G. Draisma, L. de Haan, L. Peng, and T.T. Pereira: A bootstrap based method to achieve optimality in estimating the extreme value index. Extremes 2, 367-404 (1999).
27. H. Drees: On smooth statistical tail functionals. Scand. J. Statist. 25, 187-210 (1998).
28. H. Drees: Weighted approximations of tail processes for β-mixing random variables. Ann. Appl. Probab. 10, 1274-1301 (2000).
29. H. Drees: Tail empirical processes under mixing conditions. In: H.G. Dehling, T. Mikosch, and M. Sørensen (eds.) Empirical Process Techniques for Dependent Data. Birkhäuser, Boston, 325-342 (2002).
30. H. Drees: Extreme quantile estimation for dependent data with applications to finance.
Bernoulli 9, 617-657 (2003).
31. H. Drees, A. Ferreira, and L. de Haan: On maximum likelihood estimation of the extreme value index. Ann. Appl. Probab. 14, 1179-1201 (2003).
32. H. Drees, L. de Haan, and D. Li: On large deviations for extremes. Stat. Prob. Letters 64, 51-62 (2003).
33. H. Drees, L. de Haan, and D. Li: Approximations to the tail empirical distribution function with application to testing extreme value conditions. To appear in J. Statist. Plann. Inference (2006).
34. H. Drees and E. Kaufmann: Selecting the optimal sample fraction in univariate extreme value estimation. Stoch. Proc. Appl. 75, 149-172 (1998).
35. W.F. Eddy and J.D. Gale: The convex hull of a spherically symmetric sample. Adv. Appl. Prob. 13, 751-763 (1981).
36. J.H.J. Einmahl: Multivariate empirical processes. PhD thesis, CWI Tract 32, Amsterdam
(1987).
37. J.H.J. Einmahl: The empirical distribution function as a tail estimator. Statistica Neerlandica 44, 79-82 (1990).
38. J.H.J. Einmahl: A Bahadur-Kiefer theorem beyond the largest observation. J. Multivariate Anal. 55, 29-38 (1995).
39. J.H.J. Einmahl: Poisson and Gaussian approximation of weighted local empirical processes. Stoch. Proc. Appl. 70, 31-58 (1997).
40. J.H.J. Einmahl, L. de Haan, and V. Piterbarg: Non-parametric estimation of the spectral
measure of an extreme value distribution. Ann. Statist. 29, 1401-1423 (2001).
41. J.H.J. Einmahl and T. Lin: Asymptotic normality of extreme value estimators on C[0, 1].
Ann. Statist. 34,469-492 (2006).
42. P. Embrechts, C. Klüppelberg, and T. Mikosch: Modelling Extremal Events for Insurance and Finance. Springer-Verlag, Berlin Heidelberg (1997).
43. P. Embrechts, L. de Haan, and X. Huang: Modelling Multivariate Extremes. In: P. Embrechts (ed.) Extremes and Integrated Risk Measures. Risk Waters Group, 59-67 (2000).
44. M. Falk: Some best estimators for distributions with finite endpoint. Statistics 27,115-125
(1995).
45. M. Falk, J. Hüsler, and R.-D. Reiss: Laws of Small Numbers: Extremes and Rare Events. Birkhäuser, Basel (1994).
46. W. Feller: An Introduction to Probability Theory and Its Applications. Vol. 1, 3rd edition,
John Wiley & Sons, New York (1968).
47. A. Ferreira and C. de Vries: Optimal confidence intervals for the tail index and high
quantiles. Discussion paper, Tinbergen Institute, the Netherlands (2004).
48. R.A. Fisher and L.H.C. Tippett: Limiting forms of the frequency distribution of the largest
or smallest member of a sample. Proc. Cambridge Philos. Soc. 24, 180-190 (1928).
49. M.I. Fraga Alves, M.I. Gomes, and L. de Haan: A new class of semi-parametric estimators of the second order parameter. Portugaliae Mathematica 60, 193-213 (2003).
50. M.I. Fraga Alves, L. de Haan and Tao Lin: Estimation of the parameter controlling the speed of convergence in extreme value theory. Math. Methods Statist. 12, 155-176 (2003).
51. M. Fréchet: Sur la loi de probabilité de l'écart maximum. Ann. Soc. Math. Polon. 6, 93-116 (1927).
52. J. Geffroy: Contributions à la théorie des valeurs extrêmes. Publ. Inst. Statist. Univ. Paris 7-8, 37-185 (1958).
53. J.L. Geluk and L. de Haan: Regular variation, Extensions and Tauberian Theorems. CWI
Tract 40, Amsterdam (1987).
54. E. Giné, M.G. Hahn, and P. Vatan: Max-infinitely divisible and max-stable sample continuous processes. Probab. Th. Rel. Fields 87, 139-165 (1990).
55. B.V. Gnedenko: Sur la distribution limite du terme maximum d'une série aléatoire. Ann. Math. 44, 423-453 (1943).
56. L. de Haan: A spectral representation for max-stable processes. Ann. Prob. 12, 1194-1204 (1984).
(1984).
57. L. de Haan and A. Hordijk: The rate of growth of sample maxima. Ann. Math. Stat. 43,
1185-1196(1972).
58. L. de Haan and T.T. Pereira: Spatial extremes: the stationary case. Ann. Statist. To appear
(2006).
59. L. de Haan and J. Pickands: Stationary min-stable stochastic processes. Probab. Th. Rel.
Fields 72, 477-492 (1986).
60. L. de Haan and S.I. Resnick: Estimating the limit distribution of multivariate extremes.
Commun. Statist. - Stochastic Models 9, 275-309 (1993).
61. L. de Haan and S.I. Resnick: Second order regular variation and rates of convergence in
extreme value theory. Ann. Prob. 24, 119-124 (1996).
62. L. de Haan, S.I. Resnick, H. Rootzén, and C. de Vries: Extremal behaviour of solutions to a stochastic difference equation with applications to ARCH-processes. Stoch. Proc. Appl. 32, 213-224 (1989).
63. L. de Haan and A.K. Sinha: Estimating the probability of a rare event. Ann. Statist. 27, 732-759 (1999).
64. L. de Haan and U. Stadtmüller: Generalized regular variation of second order. J. Australian Math. Soc. (Series A) 61, 381-395 (1996).
65. W.J. Hall and J.A. Wellner: The rate of convergence in law of the maximum of an
exponential sample. Statist. Neerlandica 33, 151-154 (1979).
66. P.R. Halmos: Measure Theory. Springer (1950).
67. E. Hewitt and K. Stromberg: Real and Abstract Analysis. Springer (1969).
68. B.M. Hill: A simple general approach to inference about the tail of a distribution. Ann. Statist. 3, 1163-1174 (1975).
69. J.R.M. Hosking and J.R. Wallis: Parameter and quantile estimation for the generalized Pareto distribution. Technometrics 29, 339-349 (1987).
70. J. Hüsler and D. Li: On testing extreme value conditions. Accepted for publication in Extremes (2005).
71. J. Hüsler and R.-D. Reiss: Maxima of normal random vectors: between independence and complete dependence. Stat. Prob. Letters 7, 283-286 (1989).
72. P. Jagers: Aspects of random measures and point processes. In: P. Ney and S. Port (eds.)
Advances in Probability and Related Topics. Marcel Dekker, New York (1974).
73. D.W. Jansen and C.G. de Vries: On the frequency of large stock returns: Putting booms and busts into perspective. Review of Economics and Statistics 73, 18-24 (1991).
74. A.F. Jenkinson: The frequency distribution of annual maximum (or minimum) values of meteorological elements. Quart. J. Roy. Meteorol. Soc. 81, 158-171 (1955).
75. H. Joe: Multivariate Models and Dependence Concepts. Chapman & Hall, London
(1997).
76. O. Kallenberg: Random Measures. 3rd edition, Akademie-Verlag, Berlin (1983).
77. M.J. Klass: The Robbins-Siegmund criterion for partial maxima. Ann. Prob. 13, 1369-1370 (1985).
78. M.R. Leadbetter, G. Lindgren, and H. Rootzén: Extremes and Related Properties of Random Sequences and Processes. Springer, Berlin (1983).
79. A. Ledford and J. A. Tawn: Statistics for near independence in multivariate extreme values.
Biometrika 83, 169-187 (1996).
80. A. Ledford and J.A. Tawn: Modelling dependence within joint tail regions. J. Royal
Statist. Soc. Ser. B 59,475-499 (1997).
81. A. Ledford and J.A. Tawn: Concomitant tail behaviour for extremes. Adv. Appl. Prob. 30,
197-215(1998).
82. M.J. Martins: Heavy tails estimation variants to the Hill estimator. PhD thesis (in
Portuguese), University of Lisbon, Portugal (2000).
83. D.M. Mason: Laws of large numbers for sums of extreme values. Ann. Prob. 10,754-764
(1982).
84. D.G. Mejzler: On the problem of the limit distribution for the maximal term of a variational series. L'vov Politechn. Inst. Naucn. Zap. (Fiz.-Mat.) (in Russian) 38, 90-109 (1956).
85. R. von Mises: La distribution de la plus grande de n valeurs. Rev. Math. Union Interbalcanique 1, 141-160 (1936). Reproduced in: Selected Papers of Richard von Mises, Amer. Math. Soc., Vol. 2, 271-294 (1964).
86. R.B. Nelsen: An Introduction to Copulas. Springer-Verlag, New York (1998).
87. J. Pickands III: Maxima of stationary Gaussian processes. Z. Wahrsch. verw. Gebiete 7, 190-233 (1967).
88. J. Pickands III: Sample sequences of maxima. Ann. Math. Stat. 38, 1570-1574 (1967).
89. J. Pickands III: Statistical inference using extreme order statistics. Ann. Statist. 3, 119-131 (1975).
90. J. Pickands III: Multivariate Extreme Value Distributions. Proceedings: 43rd Session of
the International Statistical Institute. Book 2, Buenos Aires, Argentina, 859-878 (1981).
91. H.S.A. Potter: The mean value of a Dirichlet series II. Proc. London Math. Soc. 47, 1-19 (1942).
92. J.W. Pratt: On interchanging limits and integrals. Ann. Math. Statist. 31, 74-77 (1960).
93. A. Rényi: On the theory of order statistics. Acta Mathematica Scient. Hungar. tomus IV, 191-227 (1953).
94. S.I. Resnick: Extreme Values, Regular Variation and Point Processes. Springer-Verlag,
New York (1987).
95. S.I. Resnick and R. Roy: Random usc functions, max-stable processes and continuous choice. Ann. Appl. Probab. 1, 267-292 (1991).
96. H. Rootzén: Attainable rates of convergence for maxima. Statist. Prob. Letters 2, 219-221 (1984).
97. H. Rootzén: The tail empirical process for stationary sequences. Report 1995:9, Mathematical Statistics, Chalmers University of Technology (1995).
98. H.L. Royden: Real Analysis. 2nd edition, Macmillan, New York (1968).
99. G. Shorack and J. Wellner: Empirical Processes with Applications to Statistics. Wiley, New York (1986).
100. M. Sibuya: Bivariate extreme statistics. Ann. Inst. Statist. Math. Tokyo, 11, 195-210
(1960).
101. B. Smid and A. J. Stam: Convergence in distribution of quotients of order statistics. Stoch.
Proc. Appl. 3, 287-292 (1975).
102. N.V. Smirnov: Limit distributions for the terms of a variational series. In Russian: Trudy
Mat. Inst. Steklov. 25 (1949). Translation: Transl. Amer. Math. Soc. 11, 82-143 (1952).
103. R.L. Smith: Uniform rates of convergence in extreme value theory. Adv. in Appl. Probab.
14, 543-565 (1982).
104. R.L. Smith: Estimating tails of probability distributions. Ann. Statist. 15, 1174-1207
(1987).
105. W. Vervaat: Functional limit theorems for processes with positive drift and their inverses. Z. Wahrsch. verw. Gebiete 23, 245-253 (1971).
Index
Conditions
D and D', 197
β-mixing, 199
Estimation, 235
#,259
L, 235, 236, 247,252, 260, 268
Extremal index, 199
Level sets, 235,244, 245
Residual index, 265
Sibuya's coefficient, 269
Spatial, 323
Functions, 221
L, 222, 232, 258, 262, 265, 328
R, 225, 232
Copula, 221
Pickands', 225, 226
Sibuya's, 225
Level sets, 223
Spatial, 195, 322
Spectral measure, 220
Temporal, 195
Diagram of estimates, 120, 121, 123, 124,
148,150,152, 288, 289
Distributions
Beta, 18, 34
Cauchy, 18,34,61,76,120
Double-exponential, 10
Exponential, 18, 34, 60, 154, 174, 179,
194, 196, 322
Extreme value, see Max-stable
Fréchet class of, 10
Gamma, 18, 34, 60, 61
Generalized Pareto, 34, 65, 89, 110, 124,
163, 326, 328
Geometric, 35
Gumbel, 10
Max-infinitely divisible, 231
Max-stable
Estimation, 252
Multivariate, 208, 217, 221, 226, 231,
235
Simple, 217, 230-232, 235, 247
Univariate, 4, 6, 9
Normal, 11, 18, 61, 120, 179, 194, 197,
221,230,231,322
Poisson, 35
Reverse-Weibull class of, 10
Student-t, 62, 322
Uniform, 94,120,196
Von Mises', 9, 294
Domain of attraction
Infinite-dimensional, 311, 325, 328
Multivariate, 226
Speed of convergence, 179
Testing, 163
Univariate, 4, 10, 14, 19,44
Drees, H.
Mixing Conditions, 199
Tail (empirical) quantile process, 51, 114
Uniform inequalities, 369, 383, 392
Edgeworth expansion, 185
Empirical
Distribution function, 62, 63, 66, 72, 127,
128, 159, 249
Exponent measure, 280, 336
Mean excess function, 112
Quantile, 13,127
Spectral measure, 251
Tail distribution function, 52, 76, 155,
159, 163, 333, 336
Left-continuous, 236,249, 266
Tail quantile process, 50, 51, 62, 76, 88,
114,155,161,163,200,236,333
Endpoint estimation, 145
Maximum likelihood, 147
Moment, 147
Excursion stability, 326
Exponent measure
Finite-dimensional, 211, 213, 214, 222,
229, 231,235, 248, 259, 272, 276, 286
Estimation, 235, 273, 280
Extreme, 37,40,65
Nondegenerate behavior, 38, 60
Poisson point process, 37,199
Intermediate, 40,49, 65,129,163, 338
Asymptotic normality, 41
Piston, 320
Poisson point process, 39, 198, 204, 214,
272, 302, 304, 314, 323, 329
Extreme order statistics, 37, 60,61,199
Potter's inequalities, 367
Quantile estimation, 82,134
y positive, 138
Maximum likelihood, 139
Moment, 140
Probability weighted moment, 154
Rényi's representation, 37, 60, 71, 83
Rank, 236,249,266
Regular variation, 23, 361,362
Class Π, 371, 375
Properties, 379
Uniform inequalities, 382
Conjugate slowly, 371
Extended, 295, 371,374
Jointly, 402
Second-order, 385,386
Theorems on inverse function, 384,401
Uniform convergence theorem, 375
Uniform inequalities, 383,392
Index of, 23,362
Jointly, 401
Index function, 295,402, see also Index
function
Uniform inequalities, 402, 403
Karamata's theorem, 363
Properties, 366
Representation theorem, 365
Second-order, 397
Uniform convergence theorem, 363
Uniform inequalities, 369
Scale parameter
Finite-dimensional, 9,128
Estimation, 91, 111, 130, 148, 149, 152,
153,278
Infinite-dimensional, 294
Estimation, 338
Second-order, see also Regular variation
Comparison, 117