
Springer Series in Operations Research

and Financial Engineering


Series Editors:
Thomas V. Mikosch
Sidney I. Resnick
Stephen M. Robinson

Springer Series in Operations Research


and Financial Engineering
Altiok: Performance Analysis of Manufacturing Systems
Birge and Louveaux: Introduction to Stochastic Programming
Bonnans and Shapiro: Perturbation Analysis of Optimization Problems
Bramel, Chen, and Simchi-Levi: The Logic of Logistics: Theory,
Algorithms, and Applications for Logistics and Supply Chain
Management (second edition)
Dantzig and Thapa: Linear Programming 1: Introduction
Dantzig and Thapa: Linear Programming 2: Theory and Extensions
de Haan and Ferreira: Extreme Value Theory: An Introduction
Drezner (Editor): Facility Location: A Survey of Applications and
Methods
Facchinei and Pang: Finite-Dimensional Variational Inequalities and
Complementarity Problems, Volume I
Facchinei and Pang: Finite-Dimensional Variational Inequalities and
Complementarity Problems, Volume II
Fishman: Discrete-Event Simulation: Modeling, Programming, and
Analysis
Fishman: Monte Carlo: Concepts, Algorithms, and Applications
Haas: Stochastic Petri Nets: Modeling, Stability, Simulation
Klamroth: Single-Facility Location Problems with Barriers
Muckstadt: Analysis and Algorithms for Service Parts Supply Chains
Nocedal and Wright: Numerical Optimization
Olson: Decision Aids for Selection Problems
Pinedo: Planning and Scheduling in Manufacturing and Services
Pochet and Wolsey: Production Planning by Mixed Integer Programming
Whitt: Stochastic-Process Limits: An Introduction to Stochastic-Process
Limits and Their Application to Queues
Yao (Editor): Stochastic Modeling and Analysis of Manufacturing Systems
Yao and Zheng: Dynamic Control of Quality in Production-Inventory
Systems: Coordination and Optimization
Yeung and Petrosyan: Cooperative Stochastic Differential Games

Forthcoming
Resnick: Heavy Tail Phenomena: Probabilistic and Statistical Modeling
Muckstadt and Sapra: Models and Solutions in Inventory Management

Laurens de Haan
Ana Ferreira

Extreme Value Theory


An Introduction

Springer

Laurens de Haan
Erasmus University
School of Economics
P.O. Box 1738
3000 DR Rotterdam
The Netherlands
ldehaan@few.eur.nl

Ana Ferreira
Instituto Superior de Agronomia
Departamento de Matemática
Tapada da Ajuda
1349-017 Lisboa
Portugal
anafh@isa.utl.pt

Series Editors:
Thomas V. Mikosch
University of Copenhagen
Laboratory of Actuarial Mathematics
DK-1017 Copenhagen
Denmark
mikosh@act.ku.dk

Stephen M. Robinson
University of Wisconsin-Madison
Department of Industrial
Engineering
Madison, WI 53706
U.S.A.
smrobins@facstaff.wisc.edu

Sidney I. Resnick
Cornell University
School of Operations Research and Industrial Engineering
Ithaca, NY 14853
U.S.A.
sirl@cornell.edu

Mathematics Subject Classification (2000): 60G70, 60G99, 60A99


Library of Congress Control Number: 2006925909
ISBN-10: 0-387-23946-4

e-ISBN: 0-387-34471-3

ISBN-13: 978-0-387-23946-0
Printed on acid-free paper.
© 2006 Springer Science+Business Media, LLC
All rights reserved. This work may not be translated or copied in whole or in part without the
written permission of the publisher (Springer Science+Business Media LLC, 233 Spring Street,
New York, NY 10013, U.S.A.), except for brief excerpts in connection with reviews or scholarly
analysis. Use in connection with any form of information storage and retrieval, electronic
adaptation, computer software, or by similar or dissimilar methodology now known or hereafter
developed is forbidden.
The use in this publication of trade names, trademarks, service marks and similar terms, even if
they are not identified as such, is not to be taken as an expression of opinion as to whether or
not they are subject to proprietary rights.
Printed in the United States of America.
springer.com

In cauda venenum

Preface

Approximately 40% of the Netherlands is below sea level. Much of it has to be


protected against the sea by dikes. These dikes have to withstand storm surges that
drive the seawater level up along the coast. The government, balancing considerations
of cost and safety, has determined that the dikes should be so high that the probability
of a flood (i.e., the seawater level exceeding the top of the dike) in a given year is 10⁻⁴.
The question is then how high the dikes should be built to meet this requirement. Storm
data have been collected for more than 100 years. In this period, at the town of Delfzijl,
in the northeast of the Netherlands, 1877 severe storm surges have been identified.
The collection of high-tide water levels during those storms forms approximately a
set of independent observations, taken under similar conditions (i.e., we may assume
that they are independent and identically distributed). No flood has occurred during
these 100 years.
At first it looks as if this is an impossible problem: in order to estimate the probability of a once-in-10000 years event one needs more than observations over just
100 years. The empirical distribution function carries all the information acquired,
and going beyond its range is impossible.
Yet it is easy to see that some information can be gained. For example, one
can check whether the spacings (i.e., the difference between consecutive ordered
observations) increase or decrease in size as one moves to the extreme observations.
A decrease would point at a short tail and an increase at a long tail of the distribution.
Alternatively, one could try to estimate the first and second derivatives of the
empirical distribution function near the boundary of the sample and extrapolate using
these estimates.
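The spacings diagnostic described above takes only a few lines to try. The sketch below is ours, not part of the book's development, and uses simulated stand-ins rather than the Delfzijl data: a heavy-tailed Pareto sample, whose upper spacings grow toward the maximum, against a uniform sample with a finite endpoint, whose upper spacings shrink.

```python
import random

random.seed(1)

def top_spacings(sample, k):
    # Differences between the k+1 largest order statistics.
    tail = sorted(sample)[-(k + 1):]
    return [tail[i + 1] - tail[i] for i in range(k)]

# Heavy-tailed sample (Pareto with tail index 1): spacings grow toward the maximum.
pareto = [random.paretovariate(1.0) for _ in range(10000)]
# Short-tailed sample (uniform on [0, 1], finite endpoint): spacings stay tiny.
uniform = [random.uniform(0.0, 1.0) for _ in range(10000)]

print(top_spacings(pareto, 5))
print(top_spacings(uniform, 5))
```

Increasing spacings point at a long tail and decreasing spacings at a short tail, exactly as in the text.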
The second option is where extreme value theory leads us. But instead of proceeding in a heuristic way, extreme value theory provides a solid theoretical basis
and framework for extrapolation. It leads to natural estimators for the relevant quantities, e.g., those for extreme quantiles as in our example, and allows us to assess the
accuracy of these estimators.
Extreme value theory restricts the behavior of the distribution function in the tail
basically to resemble a limited class of functions that can be fitted to the tail of the


distribution function. The two parameters that play a role, scale and shape, are based
roughly on derivatives of the distribution function.
In order to be able to apply this theory some conditions have to be imposed. They
are quite broad and natural and basically of a qualitative nature. It will become clear
that the so-called extreme value condition is on the one hand quite general (it is not
easy to find distribution functions that do not satisfy it) but on the other hand is
sufficiently precise to serve as a basis for extrapolation.
Since we do not know the tail, the conditions cannot be checked (however, see
Section 5.2). But this is a common feature in more traditional branches of statistics.
For example, when estimating the median one has to assume that it is uniquely defined.
And for assessing the accuracy one needs a positive density. Also, for estimating a
mean one has to assume that it exists and for assessing the accuracy one usually
assumes the existence of a second moment.
In these two cases it is easy to see what the natural conditions should be. This is not
the case in our extrapolation problem. Nevertheless, some reflection shows that the
"extreme value condition" is the natural one. For example (cf. Section 1.1.4), one way
of expressing this condition is that it requires that a high quantile (beyond the scope
of the available data) be asymptotically related in a linear way to an intermediate
quantile (which can be estimated using the empirical distribution function).
The theory described in this book is quite recent: only in the 1980s did the contours of the statistical theory take shape. One-dimensional probabilistic extreme value
theory was developed by M. Fréchet (1927), R. Fisher and L. Tippett (1928), and
R. von Mises (1936), and culminated in the work of B. Gnedenko (1943). The statistical theory was initiated by J. Pickands III (1975).
The aim of this book is to give a thorough account of the basic theory of extreme
values, probabilistic and statistical, theoretical and applied. It leads up to the current
state of affairs. However, the account is by no means exhaustive, for this field has
become too vast. For these two reasons, the book is called an introduction.
The outline of the book is as follows. Chapters 1 and 2 discuss the extreme
value condition. They are of a mathematical and probabilistic nature. Section 2.4 is
important in itself and essential for understanding Sections 3.4, 3.6 and Chapter 5,
but not for understanding the rest of the book. Chapter 3 discusses how to estimate
the main (shape) parameter involved in the extrapolation and Chapter 4 explains the
extrapolation itself. Examples are given.
In Chapter 5 some interesting but more advanced topics are discussed in a one-dimensional setting.
The higher-dimensional version of extreme value theory offers challenges of a
new type. The model is explained in Chapter 6, the estimation of the main parameters
(which are infinite-dimensional in this case) in Chapter 7, and the extrapolation in
Chapter 8.
Chapter 9 (probabilistic) and Chapter 10 (statistical) treat the infinite-dimensional
case.
Appendix B offers an introduction to the theory of regularly varying functions,
which is basic for our approach. This text is partly based on the book Regular Variation, Extensions and Tauberian Theorems, by J.L. Geluk and L. de Haan, which is out of print. The authors wish to thank Jaap Geluk for his permission to use the text.
In a book of this extent it is possible that some errors may have escaped our attention. We are very grateful for feedback on any corrections, suggestions or comments
(ldhaan@few.eur.nl, anafh@isa.utl.pt). We intend to publish possible corrections at
Ana's webpage, http://www.isa.utl.pt/matemati/~anafh/anafh.html.
We wish to thank the statistical research unit of the University of Lisbon (CEAUL)
for offering an environment conducive to writing this book. We acknowledge the
support of FCT/POCTI/FEDER as well as the Gulbenkian foundation. We thank
Holger Drees and the editors, Thomas Mikosch and Sidney Resnick, for their efforts
to go through substantial parts of the book, which resulted in constructive criticism.
We thank John Einmahl for sharing his notes on the material of Sections 7.3 and 10.4.2.
The first author thanks the Université de Saint-Louis (Sénégal) for the opportunity
to present some of the material in a course. We are very grateful to Maria de Fátima
Correia de Haan, who learned LaTeX for the purpose of typing a substantial part of
the text. Laurens de Haan also thanks Maria de Fátima for her unconditional support
during these years. Ana Ferreira is greatly indebted to those who propitiated and
encouraged her learning on the subject, especially to Laurens de Haan. Ana also
thanks the long-enduring and unconditional support of her parents as well as her
husband, Bernardo, and son, Pedro.

Lisbon,
2006

Laurens de Haan
Ana Ferreira

Contents

Preface  vii

List of Abbreviations and Symbols  xv

Part I One-Dimensional Observations

1 Limit Distributions and Domains of Attraction  3
  1.1 Extreme Value Theory: Basics  3
    1.1.1 Introduction  3
    1.1.2 Alternative Formulations of the Limit Relation  4
    1.1.3 Extreme Value Distributions  6
    1.1.4 Interpretation of the Alternative Conditions; Case Studies  12
    1.1.5 Domains of Attraction: A First Approach  14
  1.2 Domains of Attraction  19
  Exercises  34

2 Extreme and Intermediate Order Statistics  37
  2.1 Extreme Order Statistics and Poisson Point Processes  37
  2.2 Intermediate Order Statistics  40
  2.3 Second-Order Condition  43
  2.4 Intermediate Order Statistics and Brownian Motion  49
  Exercises  60

3 Estimation of the Extreme Value Index and Testing  65
  3.1 Introduction  65
  3.2 A Simple Estimator for the Tail Index (γ > 0): The Hill Estimator  69
  3.3 General Case γ ∈ R: The Pickands Estimator  83
  3.4 The Maximum Likelihood Estimator (γ > −1/2)  89
  3.5 A Moment Estimator (γ ∈ R)  100
  3.6 Other Estimators  110
    3.6.1 The Probability-Weighted Moment Estimator (γ < 1)  110
    3.6.2 The Negative Hill Estimator (γ < −1/2)  113
  3.7 Simulations and Applications  116
    3.7.1 Asymptotic Properties  116
    3.7.2 Simulations  120
    3.7.3 Case Studies  121
  Exercises  124

4 Extreme Quantile and Tail Estimation  127
  4.1 Introduction  127
  4.2 Scale Estimation  130
  4.3 Quantile Estimation  133
    4.3.1 Maximum Likelihood Estimators  139
    4.3.2 Moment Estimators  140
  4.4 Tail Probability Estimation  141
    4.4.1 Maximum Likelihood Estimators  145
    4.4.2 Moment Estimators  145
  4.5 Endpoint Estimation  145
    4.5.1 Maximum Likelihood Estimators  147
    4.5.2 Moment Estimators  147
  4.6 Simulations and Applications  148
    4.6.1 Simulations  148
    4.6.2 Case Studies  149
  Exercises  153

5 Advanced Topics  155
  5.1 Expansion of the Tail Distribution Function and Tail Empirical Process  155
  5.2 Checking the Extreme Value Condition  163
  5.3 Convergence of Moments, Speed of Convergence, and Large Deviations  176
    5.3.1 Convergence of Moments  176
    5.3.2 Speed of Convergence; Large Deviations  179
  5.4 Weak and Strong Laws of Large Numbers and Law of the Iterated Logarithm  188
  5.5 Weak "Temporal" Dependence  195
  5.6 Mejzler's Theorem  201
  Exercises  204

Part II Finite-Dimensional Observations

6 Basic Theory  207
  6.1 Limit Laws  207
    6.1.1 Introduction: An Example  207
    6.1.2 The Limit Distribution; Standardization  208
    6.1.3 The Exponent Measure  211
    6.1.4 The Spectral Measure  214
    6.1.5 The Sets Q_c and the Functions L, l, and Λ  221
  6.2 Domains of Attraction; Asymptotic Independence  226
  Exercises  230

7 Estimation of the Dependence Structure  235
  7.1 Introduction  235
  7.2 Estimation of the Function L and the Sets Q_c  235
  7.3 Estimation of the Spectral Measure (and L)  247
  7.4 A Dependence Coefficient  258
  7.5 Tail Probability Estimation and Asymptotic Independence: A Simple Case  261
  7.6 Estimation of the Residual Dependence Index η  265
  Exercises  268

8 Estimation of the Probability of a Failure Set  271
  8.1 Introduction  271
  8.2 Failure Set with Positive Exponent Measure  276
    8.2.1 First Approach: c_n Known  276
    8.2.2 Alternative Approach: Estimate c_n  278
    8.2.3 Proofs  279
  8.3 Failure Set Contained in an Upper Quadrant; Asymptotically Independent Components  285
  8.4 Sea Level Case Study  288
  Exercises  289

Part III Observations That Are Stochastic Processes

9 Basic Theory in C[0,1]  293
  9.1 Introduction: An Example  293
  9.2 The Limit Distribution; Standardization  294
  9.3 The Exponent Measure  296
  9.4 The Spectral Measure  302
  9.5 Domain of Attraction  311
  9.6 Spectral Representation and Stationarity  314
    9.6.1 Spectral Representation  314
    9.6.2 Stationarity  315
  9.7 Special Cases  321
  9.8 Two Examples  323
  Exercises  328

10 Estimation in C[0,1]  331
  10.1 Introduction: An Example  331
  10.2 Estimation of the Exponent Measure: A Simple Case  332
  10.3 Estimation of the Exponent Measure  335
  10.4 Estimation of the Index Function, Scale and Location  338
    10.4.1 Consistency  339
    10.4.2 Asymptotic Normality  344
  10.5 Estimation of the Probability of a Failure Set  349

Part IV Appendix

A Skorohod Theorem and Vervaat's Lemma  357

B Regular Variation and Extensions  361
  B.1 Regularly Varying (RV) Functions  361
  B.2 Extended Regular Variation (ERV); The Class Π  371
  B.3 Second-Order Extended Regular Variation (2ERV)  385
  B.4 ERV with an Extra Parameter  401

References  409

Index  415

List of Abbreviations and Symbols

Notation that is largely confined to sections or chapters is mostly excluded from the
list below.
=_d    equality in distribution
→_d    convergence in distribution
→_p    convergence in probability
a(t) ~ b(t)    lim_{t→∞} a(t)/b(t) = 1
α    tail index
η    residual dependence index
γ    extreme value index
Γ    gamma function
ν    exponent measure
ρ    metric |1/x − 1/y|
1_{p}    indicator function: equals 1 if p is true and 0 otherwise
F_n^-    left-continuous empirical distribution function
2ERV    second-order extended regular variation
a^+    max(a, 0)
a^−    min(a, 0)
a ∨ b    max(a, b)
a ∧ b    min(a, b)
[a]    largest integer less than or equal to a
⌈a⌉    smallest integer greater than or equal to a
a.s.    almost surely
C[0, 1]    space of continuous functions on [0, 1] equipped with the supremum norm
C^+[0, 1]    {f ∈ C[0, 1] : f ≥ 0}
C_1^+[0, 1]    {f ∈ C[0, 1] : f ≥ 0, |f|_∞ = 1}
C̄_1^+[0, 1]    {f ∈ C[0, 1] : f > 0, |f|_∞ = 1}
C_ρ^+[0, 1]    (0, ∞] × C_1^+[0, 1], with the lower index ρ meaning that the space (0, ∞] is equipped with the metric ρ
CSMS    complete separable metric space
D and D′    dependence conditions
D[0, T]    space of functions on [0, T] that are right-continuous and have left-hand limits
D(G_γ)    domain of attraction of G_γ
ERV    extended regular variation
f_−    left-continuous version of the function f
f_+    right-continuous version of the function f
f^−    generalized inverse function of f
f^←    (usually left-continuous) inverse function of f
|f|_∞    sup_s |f(s)|
F    distribution function
F_n    right-continuous empirical distribution function
G_γ    extreme value distribution function
GP    generalized Pareto
i.i.d.    independent and identically distributed
L    dependence function
R_+    [0, ∞)
R_+²    [0, ∞)² \ {(0, 0)}
R(X_i)    rank of X_i among (X_1, X_2, ..., X_n)
RV_α    regularly varying with index α
U    (usually left-continuous) inverse of 1/(1 − F)
x*    sup{x : F(x) < 1} = U(∞)
x_*    inf{x : F(x) > 0}

Extreme Value Theory

Part I

One-Dimensional Observations

1
Limit Distributions and Domains of Attraction

1.1 Extreme Value Theory: Basics


1.1.1 Introduction
Partial Sums and Partial Maxima
The asymptotic theory of sample extremes has been developed in parallel with the
central limit theory, and in fact the two theories bear some resemblance.
Let X_1, X_2, X_3, ... be independent and identically distributed random variables.
The central limit theory is concerned with the limit behavior of the partial sums X_1 + X_2 + ··· + X_n as n → ∞, whereas the theory of sample extremes is concerned with the limit behavior of the sample extremes max(X_1, X_2, ..., X_n) or min(X_1, ..., X_n) as n → ∞.

One may think of the two theories as concerned with failure. A tire of a car can
fail in two ways. Every day of driving will wear out the tire a little, and after a long
time the accumulated decay will result in failure (i.e., the partial sums exceed some
threshold). But also when driving one may hit a pothole or one may accidentally hit
the sidewalk. Such incidents have either no effect or the tire will be punctured. In the
latter case it is just one big observation that causes failure, which means that partial
maxima exceed some threshold.
In fact, in its early stages, the development of the theory of extremes was mainly
motivated by intellectual curiosity.
Outline of This Chapter
Our interest is in finding possible limit distributions for (say) sample maxima of
independent and identically distributed random variables. Let F be the underlying
distribution function and x* its right endpoint, i.e., x* := sup{x : F(x) < 1}, which may be infinite. Then

max(X_1, X_2, ..., X_n) →_p x* ,   n → ∞ ,


where →_p means convergence in probability, since

P(max(X_1, ..., X_n) ≤ x) = P(X_1 ≤ x, X_2 ≤ x, ..., X_n ≤ x) = F^n(x) ,

which converges to zero for x < x* and to 1 for x > x*. Hence, in order to obtain a
nondegenerate limit distribution, a normalization is necessary.
Suppose there exists a sequence of constants a_n > 0 and b_n real (n = 1, 2, ...), such that

(max(X_1, X_2, ..., X_n) − b_n) / a_n

has a nondegenerate limit distribution as n → ∞, i.e.,

lim_{n→∞} F^n(a_n x + b_n) = G(x) ,   (1.1.1)

say, for every continuity point x of G, and G a nondegenerate distribution function.


In this chapter we shall find all distribution functions G that can occur as a limit in
(1.1.1). These distributions are called extreme value distributions.
Next, for each of those limit distributions, we shall find necessary and sufficient
conditions on the initial distribution F such that (1.1.1) holds. The class of distributions F satisfying (1.1.1) is called the maximum domain of attraction or simply
domain of attraction of G. We are going to identify all extreme value distributions
and their domains of attraction. But before doing so it is useful and illuminating to
reformulate relation (1.1.1) in two other ways.
In (1.1.1) we have used a linear normalization. One could consider a wider class
of normalizations. However, this linear normalization already leads to a sufficiently
rich theory.
We shall always be concerned with sample maxima. Since
min(X_1, X_2, ..., X_n) = −max(−X_1, −X_2, ..., −X_n) ,

the results can easily be reformulated for sample minima.


1.1.2 Alternative Formulations of the Limit Relation
We are going to play a bit with condition (1.1.1). By taking logarithms left and right we get from (1.1.1) the equivalent relation that for each continuity point x for which 0 < G(x) < 1,

lim_{n→∞} n log F(a_n x + b_n) = log G(x) .   (1.1.2)

Clearly it follows that F(a_n x + b_n) → 1 for each such x. Hence

lim_{n→∞} (−log F(a_n x + b_n)) / (1 − F(a_n x + b_n)) = 1 ,

and in fact (1.1.2) is equivalent to

lim_{n→∞} n (1 − F(a_n x + b_n)) = −log G(x) ,

or

lim_{n→∞} 1 / (n (1 − F(a_n x + b_n))) = −1 / log G(x) .   (1.1.3)


Next we are going to reformulate this condition in terms of the inverse functions.
For any nondecreasing function f, let f^← be its left-continuous inverse, i.e.,

f^←(x) := inf{y : f(y) ≥ x} .
For properties of the inverse function that will be used throughout, see Exercise 1.1.
Lemma 1.1.1 Suppose f_n is a sequence of nondecreasing functions and g is a nondecreasing function. Suppose that for each x in some open interval (a, b) that is a continuity point of g,

lim_{n→∞} f_n(x) = g(x) .   (1.1.4)

Let f_n^←, g^← be the left-continuous inverses of f_n and g. Then, for each x in the interval (g(a), g(b)) that is a continuity point of g^←, we have

lim_{n→∞} f_n^←(x) = g^←(x) .   (1.1.5)

Proof. Let x be a continuity point of g^←. Fix ε > 0. We have to prove that there exists n_0 ∈ N such that for n ≥ n_0,

g^←(x) − ε ≤ f_n^←(x) ≤ g^←(x) + ε .

We prove the first inequality; the proof of the second one is similar.
Choose 0 < ε_1 < ε such that g^←(x) − ε_1 is a continuity point of g. This is possible since the continuity points of g form a dense set. Since g^← is continuous at x, g^←(x) is a point of increase for g; hence g(g^←(x) − ε_1) < x. Choose δ < x − g(g^←(x) − ε_1). Since g^←(x) − ε_1 is a continuity point of g, there exists n_0 such that f_n(g^←(x) − ε_1) < g(g^←(x) − ε_1) + δ < x for n ≥ n_0. The definition of the function f_n^← then implies g^←(x) − ε_1 ≤ f_n^←(x).

We are going to apply Lemma 1.1.1 to relation (1.1.3). Let the function U be the left-continuous inverse of 1/(1 − F). Note that U(t) is defined for t > 1. It follows that (1.1.3) is equivalent to

lim_{n→∞} (U(nx) − b_n) / a_n = G^←(e^{−1/x}) =: D(x) ,   (1.1.6)

for each positive x. This is encouraging, since relation (1.1.6) looks simpler than (1.1.3). We are now going to make (1.1.6) more flexible in the following way:
Theorem 1.1.2 Let a_n > 0 and b_n be sequences of real constants and G a nondegenerate distribution function. The following statements are equivalent:
1.
lim_{n→∞} F^n(a_n x + b_n) = G(x)

for each continuity point x of G.
2.
lim_{t→∞} t (1 − F(a(t) x + b(t))) = −log G(x) ,   (1.1.7)

for each continuity point x of G for which 0 < G(x) < 1, with a(t) := a_[t] and b(t) := b_[t] (where [t] is the integer part of t).
3.
lim_{t→∞} (U(tx) − b(t)) / a(t) = D(x) ,   (1.1.8)

for each x > 0 that is a continuity point of D(x) = G^←(e^{−1/x}), with a(t) := a_[t] and b(t) := b_[t].

Proof. The equivalence of (2) and (3) follows from Lemma 1.1.1. We have already checked that (1) is equivalent to (1.1.6). So it is sufficient to prove that (1.1.6) implies (3). Let x be a continuity point of D. For t > 1,

(U([t]x) − b_[t]) / a_[t] ≤ (U(tx) − b_[t]) / a_[t] ≤ (U([t]x (1 + 1/[t])) − b_[t]) / a_[t] .

The right-hand side is eventually less than D(x′) for any continuity point x′ > x with D(x′) > D(x). Since D is continuous at x, we obtain

lim_{t→∞} (U(tx) − b_[t]) / a_[t] = D(x) .

This is (3).

We shall see shortly (Section 1.1.4) the usefulness of these two alternative conditions for statistical applications.
1.1.3 Extreme Value Distributions
Now we are in a position to identify the class of nondegenerate distributions that can
occur as a limit in the basic relation (1.1.1). This class of distributions is called the class of extreme value distributions.
Theorem 1.1.3 (Fisher and Tippett (1928), Gnedenko (1943)) The class of extreme value distributions is G_γ(ax + b) with a > 0, b real, where

G_γ(x) = exp(−(1 + γx)^{−1/γ}) ,   1 + γx > 0 ,   (1.1.9)

with γ real and where for γ = 0 the right-hand side is interpreted as exp(−e^{−x}).

Definition 1.1.4 The parameter γ in (1.1.9) is called the extreme value index.


Proof (of Theorem 1.1.3). Let us consider the class of limit functions D in (1.1.8). First suppose that 1 is a continuity point of D. Then note that for continuity points x > 0,

lim_{t→∞} (U(tx) − U(t)) / a(t) = D(x) − D(1) =: E(x) .   (1.1.10)

Take y > 0 and write

(U(txy) − U(t)) / a(t) = ((U(txy) − U(ty)) / a(ty)) (a(ty) / a(t)) + (U(ty) − U(t)) / a(t) .   (1.1.11)

We claim that lim_{t→∞} (U(ty) − U(t))/a(t) and lim_{t→∞} a(ty)/a(t) exist. Suppose not. Then there are A_1, A_2, B_1, B_2 with A_1 ≠ A_2 or B_1 ≠ B_2, where the B_i are limit points of (U(ty) − U(t))/a(t) and the A_i are limit points of a(ty)/a(t), i = 1, 2, as t → ∞. We find from (1.1.11) that

E(xy) = E(x) A_i + B_i ,   (1.1.12)

i = 1, 2, for all x that are continuity points of E(·) and E(· y). For an arbitrary x take a sequence of continuity points x_n with x_n ↑ x (n → ∞). Then E(x_n y) → E(xy) and E(x_n) → E(x), since E is left-continuous. Hence (1.1.12) holds for all x and y positive. Subtracting the expressions for i = 1, 2 from each other, one obtains

E(x) (A_1 − A_2) = B_2 − B_1

for all x > 0. Since E cannot be constant (remember that G is nondegenerate) we must have A_1 = A_2 and hence also B_1 = B_2. Conclusion:

A(y) := lim_{t→∞} a(ty) / a(t)

exists for y > 0, and for x, y > 0,

E(xy) = E(x) A(y) + E(y) .

Hence for s := log x, t := log y (x, y ≠ 1), and H(x) := E(e^x), we have

H(t + s) = H(s) A(e^t) + H(t) ,   (1.1.13)

which we can write as (since H(0) = 0)

(H(t + s) − H(t)) / s = A(e^t) (H(s) − H(0)) / s .   (1.1.14)

There is certainly one t at which H is differentiable (since H is monotone); hence by (1.1.14) H is differentiable everywhere, and

H′(t) = H′(0) A(e^t) .   (1.1.15)

Write Q(t) := H(t)/H′(0). Note that H′(0) cannot be zero: H cannot be constant since G is nondegenerate. Then Q(0) = 0, Q′(0) = 1. By (1.1.13),

Q(t + s) − Q(t) = Q(s) A(e^t) ,

and by (1.1.15),

Q(t + s) − Q(t) = Q(s) Q′(t) .   (1.1.16)

Subtracting the same expression with t and s interchanged, we get

(Q′(t) − 1) Q(s)/s = Q(t) (Q′(s) − 1)/s ;

hence (let s → 0)

Q(t) Q″(0) = Q′(t) − 1 .

It follows that Q is twice differentiable, and by differentiation,

Q″(0) Q′(t) = Q″(t) .

Hence

(log Q′)′(t) = Q″(0) =: γ

for all t. It follows that (note that Q′(0) = 1)

Q′(t) = e^{γt} ,

and (since Q(0) = 0)

Q(t) = ∫_0^t e^{γs} ds = (e^{γt} − 1) / γ .

This means that

H(t) = H′(0) (e^{γt} − 1) / γ

and

D(t) = D(1) + H′(0) (t^γ − 1) / γ .   (1.1.17)

Now D(x) = G^←(e^{−1/x}), and hence, inverting (1.1.17),

−1 / log G(x) = (1 + γ (x − D(1)) / H′(0))^{1/γ} .   (1.1.18)

Combining (1.1.17) and (1.1.18) we obtain the statement of the theorem.
If 1 is not a continuity point of D, follow the proof with the function U(t x_0), with x_0 a continuity point of D.

Fig. 1.1. Family of extreme value distributions G_γ.
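For computation, (1.1.9) is conveniently treated as a single one-parameter family, reading the γ = 0 case as the limit exp(−e^{−x}). A minimal sketch (the helper function is ours, not from the text):

```python
import math

def G(gamma, x):
    # Extreme value distribution G_gamma(x) = exp(-(1 + gamma*x)^(-1/gamma)),
    # interpreted as exp(-exp(-x)) for gamma = 0; constant off the support.
    if gamma == 0.0:
        return math.exp(-math.exp(-x))
    t = 1.0 + gamma * x
    if t <= 0.0:
        return 0.0 if gamma > 0 else 1.0  # left resp. right of the support
    return math.exp(-t ** (-1.0 / gamma))

# gamma = 0 really is the continuous limit of the family:
print(G(0.0, 1.0), G(1e-8, 1.0))
```

Evaluating G at γ = 10⁻⁸ already agrees with the Gumbel case to many digits, which is why a single parametrization covers all three regimes of Figure 1.1.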


Remark 1.1.5 Let X_1, X_2, ... be independent and identically distributed random variables with distribution function F. The distribution function F is called max-stable if for some choice of constants a_n > 0 and b_n real,

P((max(X_1, ..., X_n) − b_n) / a_n ≤ x) = P(X_1 ≤ x)

for all x and n = 1, 2, ... . Note that the class of max-stable distributions is the same as the class of extreme value distributions (cf. Exercise 1.2).
The parametrization in Theorem 1.1.3 is due to von Mises (1936) and Jenkinson (1955). This theorem is an important result in many ways. It shows that the limit distribution functions form a simple explicit one-parameter family apart from the scale and location parameters. Figure 1.1 illustrates this family for some values of γ. Moreover, it shows that the class contains distributions with quite different features. Let us consider the subclasses γ > 0, γ = 0, and γ < 0 separately:
(a) For γ > 0 clearly G_γ(x) < 1 for all x, i.e., the right endpoint of the distribution is infinity. Moreover, as x → ∞, 1 − G_γ(x) ∼ γ^{−1/γ} x^{−1/γ}, i.e., the distribution has a rather heavy right tail; for example, moments of order greater than or equal to 1/γ do not exist (cf. Exercise 1.16).
(b) For γ = 0 the right endpoint of the distribution equals infinity. The distribution, however, is rather light-tailed: 1 − G_0(x) ∼ e^{−x} as x → ∞, and all moments exist.
(c) For γ < 0 the right endpoint of the distribution is −1/γ, so it has a short tail, verifying 1 − G_γ(−γ^{−1} − x) ∼ (−γx)^{−1/γ} as x ↓ 0.
An alternative parametrization is as follows:
(a) For γ > 0 use G_γ((x − 1)/γ) and get, with α = 1/γ > 0,

Φ_α(x) := 0 for x ≤ 0, and Φ_α(x) := exp(−x^{−α}) for x > 0.



This class is often called the Fréchet class of distributions (Fréchet (1927)).

(b) The distribution function with γ = 0,

G_0(x) = exp(−e^{−x}) ,

for all real x, is called the double-exponential or Gumbel distribution.


(c) For γ < 0 use G_γ(−(1 + x)/γ) and get, with α = −1/γ > 0,

Ψ_α(x) := exp(−(−x)^α) for x < 0, and Ψ_α(x) := 1 for x ≥ 0.

This class is sometimes called the reverse-Weibull class of distributions.
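Both reparametrizations are direct substitutions into (1.1.9) and can be verified numerically; the sketch below (function names are ours) checks Φ_α(x) = G_γ((x − 1)/γ) for γ = 1/α and Ψ_α(x) = G_γ(−(1 + x)/γ) for γ = −1/α.

```python
import math

def G(gamma, x):
    # G_gamma(x) = exp(-(1 + gamma*x)^(-1/gamma)) for gamma != 0,
    # constant off the support.
    t = 1.0 + gamma * x
    if t <= 0.0:
        return 0.0 if gamma > 0 else 1.0
    return math.exp(-t ** (-1.0 / gamma))

def frechet(alpha, x):       # Phi_alpha
    return 0.0 if x <= 0.0 else math.exp(-x ** (-alpha))

def rev_weibull(alpha, x):   # Psi_alpha
    return 1.0 if x >= 0.0 else math.exp(-((-x) ** alpha))

alpha = 2.0
# (x - 1)/gamma = alpha*(x - 1)  and  -(1 + x)/gamma = alpha*(1 + x):
print(frechet(alpha, 1.5), G(1.0 / alpha, (1.5 - 1.0) * alpha))
print(rev_weibull(alpha, -0.5), G(-1.0 / alpha, (1.0 - 0.5) * alpha))
```

Each pair of printed values agrees, confirming that the Fréchet and reverse-Weibull families are just the γ > 0 and γ < 0 slices of G_γ after a linear change of variable.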


Recall that if relation (1.1.1) holds with G = G_γ for some γ ∈ R, we say that the distribution function F is in the domain of attraction of G_γ. Notation: F ∈ D(G_γ). The result of Theorem 1.1.3 leads to the following reformulation of Theorem 1.1.2.
Theorem 1.1.6 For γ ∈ R the following statements are equivalent:
1. There exist constants a_n > 0 and b_n real such that

lim_{n→∞} F^n(a_n x + b_n) = G_γ(x) = exp(−(1 + γx)^{−1/γ})   (1.1.19)

for all x with 1 + γx > 0.
2. There is a positive function a such that for x > 0,

lim_{t→∞} (U(tx) − U(t)) / a(t) = (x^γ − 1) / γ ,   (1.1.20)

where for γ = 0 the right-hand side is interpreted as log x.
3. There is a positive function a such that

lim_{t→∞} t (1 − F(a(t)x + U(t))) = (1 + γx)^{−1/γ} ,   (1.1.21)

for all x with 1 + γx > 0.
4. There exists a positive function f such that

lim_{t↑x*} (1 − F(t + x f(t))) / (1 − F(t)) = (1 + γx)^{−1/γ} ,   (1.1.22)

for all x for which 1 + γx > 0, where x* = sup{x : F(x) < 1}.
Moreover, (1.1.19) holds with b_n := U(n) and a_n := a(n). Also, (1.1.22) holds with f(t) = a(1/(1 − F(t))).
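Statement 4 is the form used in peaks-over-threshold reasoning: exceedances over a high threshold t, scaled by f(t), have an asymptotically generalized Pareto tail. The sketch below checks it for the standard Cauchy distribution, where γ = 1; taking f(t) = γt is one admissible choice of the auxiliary function for this particular example (our choice for illustration, not a general recipe).

```python
import math

def cauchy_sf(x):
    # 1 - F(x) for the standard Cauchy distribution.
    return 0.5 - math.atan(x) / math.pi

def pot_ratio(t, x, gamma=1.0):
    # (1 - F(t + x*f(t))) / (1 - F(t)) with the choice f(t) = gamma*t.
    return cauchy_sf(t + x * gamma * t) / cauchy_sf(t)

x = 1.0
limit = (1.0 + x) ** (-1.0)  # (1 + gamma*x)^(-1/gamma) with gamma = 1
for t in (10.0, 100.0, 1000.0):
    print(t, pot_ratio(t, x), limit)
```

As t grows, the exceedance ratio settles onto the limit (1 + γx)^{−1/γ} = 1/2, exactly as (1.1.22) predicts.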
Proof. The equivalence of (1), (2), and (3) has been established in Theorem 1.1.2. Next we prove that (2) implies (4).
It is easy to see that for ε > 0,

g(h^←(t) − ε) ≤ t ≤ g(h^←(t) + ε) ,

where g is a nondecreasing function and h^← its right-continuous inverse (cf. Exercise 1.1). It follows that

(U((1 − ε)/(1 − F(t))) − U(1/(1 − F(t)))) / a(1/(1 − F(t)))
  ≤ (t − U(1/(1 − F(t)))) / a(1/(1 − F(t)))
  ≤ (U((1 + ε)/(1 − F(t))) − U(1/(1 − F(t)))) / a(1/(1 − F(t))) ,

where, by (2), the left-hand side converges to ((1 − ε)^γ − 1)/γ and the right-hand side to ((1 + ε)^γ − 1)/γ, as t ↑ x*. Since ε > 0 is arbitrary and both bounds tend to 0 as ε → 0, consequently

(t − U(1/(1 − F(t)))) / a(1/(1 − F(t))) → 0

as t ↑ x*. Hence by (2), for all x > 0,

lim_{t↑x*} (U(x/(1 − F(t))) − t) / a(1/(1 − F(t))) = (x^γ − 1) / γ ,

and by Lemma 1.1.1,

lim_{t↑x*} (1 − F(t)) / (1 − F(t + x f(t))) = (1 + γx)^{1/γ} ,

i.e., (4) holds, with f(t) = a(1/(1 − F(t))).
The converse (i.e., (4) implies (2)) is similar.

Example 1.1.7 Let $F$ be the standard normal distribution. We are going to prove that (1.1.3) holds: for all $x \in \mathbb{R}$,
\[
\lim_{n\to\infty} n\left(1 - F(a_n x + b_n)\right) = e^{-x} \tag{1.1.23}
\]
with
\[
b_n := \left(2\log n - \log\log n - \log(4\pi)\right)^{1/2} \tag{1.1.24}
\]
and
\[
a_n := 1/b_n . \tag{1.1.25}
\]
Note first that $b_n/(2\log n)^{1/2} \to 1$, $n \to \infty$; hence $\log b_n - 2^{-1}\log\log n - 2^{-1}\log 2 \to 0$, and hence
\[
\frac{b_n^2}{2} + \log b_n - \log n + \frac{1}{2}\log(2\pi) \to 0 , \tag{1.1.26}
\]
as $n \to \infty$. Now by (1.1.25),
\[
\frac{d}{dx}\left\{ -n\left(1 - F(a_n x + b_n)\right) \right\}
= \frac{n}{b_n\sqrt{2\pi}} \exp\left(-\frac{1}{2}\left(\frac{x}{b_n} + b_n\right)^2\right)
= \exp\left\{ -\left(\frac{b_n^2}{2} + \log b_n - \log n + \frac{1}{2}\log(2\pi)\right) \right\} e^{-x^2/(2b_n^2)}\, e^{-x} \to e^{-x} ,
\]
for $x \in \mathbb{R}$. Hence
\[
n\left(1 - F(a_n x + b_n)\right)
= \frac{n}{b_n\sqrt{2\pi}} \int_x^\infty \exp\left(-\frac{1}{2}\left(\frac{u}{b_n} + b_n\right)^2\right) du
= \exp\left\{ -\left(\frac{b_n^2}{2} + \log b_n - \log n + \frac{1}{2}\log(2\pi)\right) \right\} \int_x^\infty e^{-u^2/(2b_n^2)}\, e^{-u}\, du \to \int_x^\infty e^{-u}\, du = e^{-x}
\]
by Lebesgue's theorem on dominated convergence. Hence (1.1.23) holds.

Since in the limit relation (1.1.23) we can replace $a_n$ by $a_n'$ and $b_n$ by $b_n'$ provided $a_n/a_n' \to 1$ and $(b_n' - b_n)/a_n \to 0$, we can replace $b_n$, $a_n$ from (1.1.24) and (1.1.25) by, e.g.,
\[
b_n' = (2\log n)^{1/2} - \frac{\log\log n + \log(4\pi)}{2\,(2\log n)^{1/2}}
\]
and
\[
a_n' = (2\log n)^{-1/2} .
\]
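The slow convergence in (1.1.23) — the error decays only logarithmically in $n$ — can be observed numerically. The following is an illustrative sketch, not part of the text's argument; it uses only the Python standard library, with the normal tail computed from the complementary error function:

```python
import math

def normal_tail(z):
    # 1 - Phi(z) for the standard normal distribution
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def b_n(n):
    # the centering constants of (1.1.24)
    return math.sqrt(2.0 * math.log(n) - math.log(math.log(n)) - math.log(4.0 * math.pi))

def scaled_exceedance(n, x):
    # n * (1 - F(a_n x + b_n)) with a_n = 1/b_n, which should approach e^{-x}
    b = b_n(n)
    return n * normal_tail(x / b + b)

approx = scaled_exceedance(10**6, 1.0)   # already close to e^{-1} = 0.3679...
```

Even at $n = 10^6$ the agreement is reasonable, although increasing $n$ further improves it only very slowly, in line with the logarithmic rate.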
1.1.4 Interpretation of the Alternative Conditions; Case Studies
We want to introduce three problems in which extreme value theory can be fruitfully used, the first with $\gamma$ near zero, the second with $\gamma$ positive, and the third with $\gamma$ negative. It will become clear that the alternative conditions of Theorem 1.1.6 are the important ones. The present account is introductory; a more complete account will follow later on in the applications, Sections 3.7.3, 4.6.2, 7.2, 7.3, and 8.4.
Sea Level
As we related in the preface to this book, approximately 40% of the Netherlands is below sea level. Much of it has to be protected against the sea by dikes. These dikes have to withstand storm surges that drive the seawater level up along the coast. The government, balancing considerations of cost and safety, has determined that the dikes should be so high that the probability of a flood (i.e., the seawater level exceeding the top of the dike) in a given year is $10^{-4}$. The question is then how high the dikes should be built to meet this requirement. Storm data have been collected for more than 100 years. In this period, at the town of Delfzijl, in the northeast of the Netherlands, 1877 severe wind storms have been identified. The collection of high-tide water levels at Delfzijl during those storms forms approximately a set of independent observations, taken under similar conditions (i.e., we may assume that they are independent and identically distributed).

First we convert the $10^{-4}$ probability to a probability per storm (since that is what the data give us). Since there are 1877 storms in 111 years, we look for the level that is exceeded by one such storm with probability $(111/1877)\times 10^{-4}$. Let $F$ be the distribution function of the high-tide water level during one such storm. Then we are looking for the $1 - (111/1877)\times 10^{-4}$ quantile, i.e., $F^\leftarrow\left(1 - (111/1877)\times 10^{-4}\right) = U\left((1877/111)\times 10^4\right) \approx U(17\times 10^4)$.

Of course this is a simplification of what really goes on: if $X$ is the maximum in a year and $Z_i$ the maximum during the $i$th storm, we have $X = \max_{1\le i\le N} Z_i$ with $N$ the (random) number of storms in a year. We ignore the randomness of $N$.

Normally one would estimate a quantile by the empirical quantile, that is, one of the order statistics. But the highest order statistic corresponds in this case to $F^\leftarrow(1 - 1/1878) \approx U(19\times 10^2)$. So we need to extrapolate beyond the range of the available data.

At this point Theorem 1.1.2(3) can help. In view of Theorem 1.1.3 the condition can be written as
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma}
\]
for $x > 0$, some real parameter $\gamma$, and an appropriate positive function $a$. Let us use the approximation with $t \le 19\times 10^2$ (so that we can estimate $U(t)$ using the empirical quantile function) and $tx := 17\times 10^4$. Then the requested quantile is
\[
U(17\times 10^4) = U(tx) \approx U(t) + a(t)\,\frac{x^\gamma - 1}{\gamma} . \tag{1.1.27}
\]
For the moment we just remark that Theorem 1.1.2(3) seems to provide a possibility to estimate an extreme quantile by fitting the function $(x^\gamma - 1)/\gamma$ (in the present case with $\gamma$ close to zero) to the quantile-type function $U$.
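The approximation (1.1.27) is straightforward to turn into code once $U(t)$, $a(t)$, and $\gamma$ have been estimated (how to estimate them is the subject of Chapter 3). The following minimal sketch is illustrative only; the function name and the closing sanity check on a strict Pareto distribution, where everything is available in closed form, are our own assumptions:

```python
import math

def extrapolate_quantile(U_t, a_t, gamma, x):
    # (1.1.27): U(tx) ~ U(t) + a(t) * (x^gamma - 1)/gamma,
    # with (x^gamma - 1)/gamma read as log x when gamma = 0
    if gamma == 0.0:
        return U_t + a_t * math.log(x)
    return U_t + a_t * (x ** gamma - 1.0) / gamma

# sanity check on F(s) = 1 - 1/s (s > 1): here U(t) = t, gamma = 1, a(t) = t,
# so extrapolating from t = 1900 to tx = 170000 is exact
t = 1900.0
x = 170000.0 / t
quantile = extrapolate_quantile(t, t, 1.0, x)
```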
S&P500
Daily price quotes of the S&P 500 total return index over the period from 01/01/1980 to 14/05/2002 are available, corresponding to 5835 observations. The daily price quotes, $p_t$ say, are used to compute daily "continuously compounded" returns $r_t$ by taking the logarithmic first differences of the price series, $r_t = \log(p_t/p_{t-1})$. Stock returns generally exhibit a positive mean due to positive growth of the economy. Therefore we shall focus only on the loss returns. We shall assume here that the observations are independent and identically distributed (cf. Jansen and de Vries (1991)).

Now consider the situation in which one has to decide on a big risky investment while one cannot afford to have a loss larger than a certain amount. Then it is of interest to know the probability of the occurrence of such a loss.

If $F$ is the distribution function of the log-loss returns and $x$ is the critical (large) amount, the posed problem is the estimation of $1 - F(x)$. Then Theorem 1.1.6(3) suggests that for some positive function $a$ and large $x$,
\[
1 - F(x) \approx \frac{1}{t}\left(1 + \gamma\,\frac{x - U(t)}{a(t)}\right)^{-1/\gamma}
\]
for some large $t$. This again motivates a tail probability estimator under the extreme value theory approach.
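The tail approximation suggested by Theorem 1.1.6(3) can likewise be sketched in code. Again this is only an illustration under assumed inputs $U(t)$, $a(t)$, $\gamma$ (their estimation comes later); the closed-form checks on Pareto and exponential distributions are our own additions:

```python
import math

def tail_probability(x, t, U_t, a_t, gamma):
    # 1 - F(x) ~ (1/t) * (1 + gamma*(x - U(t))/a(t))^(-1/gamma);
    # for gamma = 0 the right-hand side is read as exp(-(x - U(t))/a(t))/t
    if gamma == 0.0:
        return math.exp(-(x - U_t) / a_t) / t
    return (1.0 + gamma * (x - U_t) / a_t) ** (-1.0 / gamma) / t

# check on F(s) = 1 - 1/s: with U(t) = t, a(t) = t, gamma = 1
# the approximation reproduces 1 - F(x) = 1/x exactly
p_pareto = tail_probability(5000.0, 100.0, 100.0, 100.0, 1.0)
# check on F(s) = 1 - exp(-s): with U(t) = log t, a(t) = 1, gamma = 0
# the approximation reproduces 1 - F(x) = exp(-x) exactly
p_exp = tail_probability(7.0, 50.0, math.log(50.0), 1.0, 0.0)
```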
Life Span
There is some discussion among demographers and physicians about whether there is a limited life span for humans; that is, if we consider the life span of a human as random, does its probability distribution have a finite endpoint? The problem can be considered from the point of view of extreme value theory.

A data set consists of the total life span (in days) of all people born in the Netherlands in the years 1877–1881, still alive on January 1, 1971, and who died as a resident of the Netherlands. This concerns about 10 000 people.

Now we want to decide whether the right endpoint of the distribution is finite. The endpoint, finite or not, is $\lim_{t\to\infty} U(t)$, which we denote by $U(\infty)$. It will be verified later on (Section 3.7) that for the given data set the hypothesis $\gamma < 0$ is not rejected. Moreover, it is known from Section 1.2 below that for $\gamma$ negative we must have $U(\infty) < \infty$. Hence we shall believe that there is an age that cannot be exceeded by this cohort.

Next we estimate this maximal age $U(\infty)$. For this we use the limit relation (1.1.20) again:
\[
\frac{U(tx) - U(t)}{a(t)} \to \frac{x^\gamma - 1}{\gamma} \qquad (t\to\infty) ,
\]
but as we shall show later, if $\gamma < 0$, this relation is also valid for $x = \infty$, i.e.,
\[
\frac{U(\infty) - U(t)}{a(t)} \to -\frac{1}{\gamma} \qquad (t\to\infty) ,
\]
or
\[
U(\infty) \approx U(t) - \frac{a(t)}{\gamma} \qquad (t\to\infty) .
\]
This relation will be the basis for estimating $U(\infty)$.
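The endpoint approximation $U(\infty) \approx U(t) - a(t)/\gamma$ translates directly into code. A sketch under the same caveat (estimation of the ingredients is deferred to Chapter 3); the closing check uses the uniform distribution on $(0,1)$, for which $U(t) = 1 - 1/t$, $a(t) = 1/t$, $\gamma = -1$, and the formula returns the true endpoint $1$ for every $t$:

```python
def endpoint_estimate(U_t, a_t, gamma):
    # U(infinity) ~ U(t) - a(t)/gamma; only meaningful for gamma < 0
    if gamma >= 0.0:
        raise ValueError("a finite right endpoint requires gamma < 0")
    return U_t - a_t / gamma

t = 50.0
endpoint = endpoint_estimate(1.0 - 1.0 / t, 1.0 / t, -1.0)
```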


1.1.5 Domains of Attraction: A First Approach
In this section we shall derive sufficient conditions on the distribution function $F$ that ensure that there are sequences of constants $a_n > 0$ and $b_n$ such that
\[
\lim_{n\to\infty} F^n(a_n x + b_n) = G_\gamma(x) \tag{1.1.28}
\]
for some given real $\gamma$ and all $x$. These conditions, basically due to von Mises (1936), require the existence of one or two derivatives of $F$.

It is easy to see, using relation (1.1.6) from Section 1.1.2, that $F$ cannot be in the domain of attraction of $G_{\gamma_1}$ and $G_{\gamma_2}$ with $\gamma_1 \ne \gamma_2$.

The following theorem states a sufficient condition for belonging to a domain of attraction. The condition is called von Mises' condition.

Theorem 1.1.8 Let $F$ be a distribution function and $x^*$ its right endpoint. Suppose $F''(x)$ exists and $F'(x)$ is positive for all $x$ in some left neighborhood of $x^*$. If
\[
\lim_{t\uparrow x^*} \left(\frac{1-F}{F'}\right)'(t) = \gamma , \tag{1.1.29}
\]
or equivalently
\[
\lim_{t\uparrow x^*} \frac{(1-F(t))\,F''(t)}{(F'(t))^2} = -\gamma - 1 , \tag{1.1.30}
\]
then $F$ is in the domain of attraction of $G_\gamma$.

Remark 1.1.9 Under (1.1.29) we have (1.1.28) with $b_n = U(n)$ and $a_n = n\,U'(n) = 1/(n\,F'(b_n))$.

Proof (of Theorem 1.1.8). Here, as elsewhere, the proof is much simplified by formulating everything in terms of the inverse function $U$ rather than the distribution function $F$. By differentiating the relation
\[
\frac{1}{1 - F(U(t))} = t
\]
we obtain
\[
U'(t) = \frac{1}{t^2\, F'(U(t))} .
\]
Differentiating once more, we find that
\[
U''(t) = -\frac{2}{t^3\, F'(U(t))} - \frac{F''(U(t))}{t^4\,\left(F'(U(t))\right)^3} ,
\]
so that
\[
\frac{t\,U''(t)}{U'(t)} = -2 - \frac{F''(U(t))\,\left[1 - F(U(t))\right]}{\left(F'(U(t))\right)^2} .
\]
By Theorems 1.1.2 and 1.1.6 the relation to be proved is equivalent to
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} \tag{1.1.31}
\]
for all $x > 0$, with $a(t) := t\,U'(t)$. So we need to prove that
\[
\lim_{t\to\infty} \frac{t\,U''(t)}{U'(t)} = \gamma - 1
\]
implies (1.1.31) for the same $\gamma$.

Since for $1 \le x_0 < x$,
\[
\log U'(x) - \log U'(x_0) = \int_{x_0}^x \frac{U''(s)}{U'(s)}\, ds ,
\]
we have for $x > 0$ and $t, tx \ge 1$,
\[
\log U'(tx) - \log U'(t) = \int_1^x A(ts)\, \frac{ds}{s} \qquad\text{with } A(t) := \frac{t\,U''(t)}{U'(t)} .
\]
It follows that for $0 < a < b < \infty$,
\[
\lim_{t\to\infty}\ \sup_{a\le x\le b} \left| \log \frac{U'(tx)}{U'(t)} - (\gamma - 1)\log x \right| = 0 .
\]
Hence also, since $|e^s - e^t| \le c\,|s - t|$ on a compact interval for some positive constant $c$,
\[
\lim_{t\to\infty}\ \sup_{a\le x\le b} \left| \frac{U'(tx)}{U'(t)} - x^{\gamma-1} \right| = 0 .
\]
This implies that
\[
\frac{U(tx) - U(t)}{t\,U'(t)} - \frac{x^\gamma - 1}{\gamma} = \int_1^x \left( \frac{U'(ts)}{U'(t)} - s^{\gamma-1} \right) ds
\]
converges to zero. □
For later use we next give von Mises' condition formulated in terms of $U$.

Corollary 1.1.10 Condition (1.1.29) is equivalent to
\[
\lim_{t\to\infty} \frac{t\,U''(t)}{U'(t)} = \gamma - 1 , \tag{1.1.32}
\]
which implies
\[
\lim_{t\to\infty} \frac{U'(tx)}{U'(t)} = x^{\gamma-1} \tag{1.1.33}
\]
locally uniformly in $(0, \infty)$ and finally
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{t\,U'(t)} = \frac{x^\gamma - 1}{\gamma} ,
\]
so that by Theorem 1.1.6 we obtain $F \in \mathcal{D}(G_\gamma)$.

Simpler conditions are possible for $\gamma \ne 0$:

Theorem 1.1.11 ($\gamma > 0$) Suppose $x^* = \infty$ and $F'$ exists. If
\[
\lim_{t\to\infty} \frac{t\,F'(t)}{1 - F(t)} = \frac{1}{\gamma} \tag{1.1.34}
\]
for some positive $\gamma$, then $F$ is in the domain of attraction of $G_\gamma$.

Proof. As in the proof of Theorem 1.1.8 we see that condition (1.1.34) is equivalent to
\[
\lim_{t\to\infty} \frac{t\,U'(t)}{U(t)} = \gamma . \tag{1.1.35}
\]
Further,
\[
\log U(tx) - \log U(t) = \int_1^x \frac{ts\,U'(ts)}{U(ts)}\, \frac{ds}{s} ;
\]
hence by (1.1.35) for $x > 0$,
\[
\lim_{t\to\infty} \frac{U(tx)}{U(t)} = x^\gamma , \tag{1.1.36}
\]
or
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{\gamma\,U(t)} = \frac{x^\gamma - 1}{\gamma} ,
\]
which is condition (1.1.20) of Theorem 1.1.6. □

Corollary 1.1.12 ($\gamma > 0$) Condition (1.1.34) is equivalent to
\[
\lim_{t\to\infty} \frac{t\,U'(t)}{U(t)} = \gamma ,
\]
which implies (1.1.33) in view of (1.1.36).

Theorem 1.1.13 ($\gamma < 0$) Suppose $x^* < \infty$ and $F'(x)$ exists for $x < x^*$. If
\[
\lim_{t\uparrow x^*} \frac{(x^* - t)\,F'(t)}{1 - F(t)} = -\frac{1}{\gamma} \tag{1.1.37}
\]
for some negative $\gamma$, then $F$ is in the domain of attraction of $G_\gamma$.

Proof. As before, we can see that condition (1.1.37) is equivalent to
\[
\lim_{t\to\infty} \frac{t\,U'(t)}{U(\infty) - U(t)} = -\gamma . \tag{1.1.38}
\]
Since
\[
\log\left(U(\infty) - U(tx)\right) - \log\left(U(\infty) - U(t)\right) = -\int_1^x \frac{ts\,U'(ts)}{U(\infty) - U(ts)}\, \frac{ds}{s} ,
\]
relation (1.1.38) implies
\[
\lim_{t\to\infty} \frac{U(\infty) - U(tx)}{U(\infty) - U(t)} = x^\gamma ,
\]
i.e., for $x > 0$,
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{-\gamma\left(U(\infty) - U(t)\right)} = \frac{x^\gamma - 1}{\gamma} . \qquad \Box
\]

Corollary 1.1.14 ($\gamma < 0$) Condition (1.1.37) is equivalent to
\[
\lim_{t\to\infty} \frac{t\,U'(t)}{U(\infty) - U(t)} = -\gamma ,
\]
which implies (1.1.33).
To get an idea about the tail behavior of the distribution functions in the various domains of attraction note that for $x^* = \infty$ and $t \ge x_1$,
\[
\log(1 - F(t)) = \log(1 - F(x_1)) - \int_{x_1}^t \frac{ds}{f(s)}
\]
with $f(s) := (1 - F(s))/F'(s)$. Under condition (1.1.29) of Theorem 1.1.8,
\[
\lim_{t\to\infty} f'(t) = \gamma , \qquad\text{hence}\qquad \lim_{t\to\infty} \frac{f(t)}{t} = \gamma ;
\]
hence
\[
\lim_{t\to\infty} \frac{\log(1 - F(t))}{\log t} = -\frac{1}{\gamma} ,
\]
i.e., for $\gamma > 0$ the function $1 - F(t)$ behaves roughly like $t^{-1/\gamma}$, which means a heavy tail. For $\gamma = 0$, however,
\[
\lim_{t\to\infty} t^a\,(1 - F(t)) = \lim_{t\to\infty} x_1^a\,(1 - F(x_1)) \exp\left\{ -\int_{x_1}^t \left( \frac{1}{f(s)} - \frac{a}{s} \right) ds \right\} = 0
\]
for all $a > 0$, and hence the tail is light. Similar reasoning reveals that for $\gamma < 0$ (in which case necessarily $x^* < \infty$, as we shall see later on in Theorem 1.2.1),
\[
\lim_{t\uparrow x^*} \frac{\log(1 - F(t))}{\log(x^* - t)} = -\frac{1}{\gamma} ,
\]
so that the function $1 - F(x^* - t)$ behaves roughly like $t^{-1/\gamma}$ as $t \downarrow 0$.

The reader may want to verify that the Cauchy distribution satisfies Theorem 1.1.11 with $\gamma = 1$ (Exercise 1.6); the normal, exponential, and any gamma distribution satisfy Theorem 1.1.8 with $\gamma = 0$ (Exercise 1.7); and a beta $(\mu, \nu)$ distribution satisfies Theorem 1.1.13 with $\gamma = -\mu^{-1}$ (Exercise 1.8).
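For the Cauchy distribution the quantities in von Mises' condition (1.1.34) are available in closed form, so the convergence of $t F'(t)/(1 - F(t))$ to $1/\gamma = 1$ can be watched directly. An illustrative sketch (our own, not from the text):

```python
import math

def cauchy_cdf(x):
    return 0.5 + math.atan(x) / math.pi

def cauchy_density(x):
    return 1.0 / (math.pi * (1.0 + x * x))

def von_mises_ratio(t):
    # t F'(t) / (1 - F(t)); Theorem 1.1.11 requires this to tend to 1/gamma
    return t * cauchy_density(t) / (1.0 - cauchy_cdf(t))

ratio = von_mises_ratio(1e6)   # very close to 1, i.e. gamma = 1
```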

Remark 1.1.15 Conditions (1.1.29) for $\gamma = 0$ and (1.1.34) for $\gamma > 0$ are due to von Mises. Sometimes a condition, much less general in the case $\gamma = 0$, is referred to as von Mises' condition (cf., e.g., Falk, Hüsler, and Reiss (1994), Theorem 2.1.2).


1.2 Domains of Attraction


In this section we shall establish necessary and sufficient conditions for a distribution function $F$ to belong to the domain of attraction of $G_\gamma$. Also we shall prove that the sufficient conditions from Section 1.1.5 are close in some sense to the necessary conditions.

The reader may have noticed that in the previous sections we were able to prove the results in a relatively elegant way, by reformulating the problem in terms of the function $U$, the inverse function of $1/(1 - F)$. In fact, as we shall see in this section, the function $U$ plays a role in extreme value theory comparable to the role of the characteristic function in the theory of the stable distributions and their domains of attraction. We determine the domain of attraction conditions by starting from condition (1.1.8) of Theorem 1.1.2 with $D(x) = (x^\gamma - 1)/\gamma$ (cf. Theorems 1.1.3 and 1.1.6). That is,
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} \tag{1.2.1}
\]
for all $x > 0$, where $\gamma$ is a real parameter and $a$ a suitable positive function.
We prove the following results.
Theorem 1.2.1 The distribution function $F$ is in the domain of attraction of the extreme value distribution $G_\gamma$ if and only if

1. for $\gamma > 0$: $x^* = \sup\{x : F(x) < 1\}$ is infinite and
\[
\lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = x^{-1/\gamma} \tag{1.2.2}
\]
for all $x > 0$. This means that the function $1 - F$ is regularly varying at infinity with index $-1/\gamma$; see Appendix B;

2. for $\gamma < 0$: $x^*$ is finite and
\[
\lim_{t\downarrow 0} \frac{1 - F(x^* - tx)}{1 - F(x^* - t)} = x^{-1/\gamma} \tag{1.2.3}
\]
for all $x > 0$;

3. for $\gamma = 0$: $x^*$ can be finite or infinite and
\[
\lim_{t\uparrow x^*} \frac{1 - F(t + x f(t))}{1 - F(t)} = e^{-x} \tag{1.2.4}
\]
for all real $x$, where $f$ is a suitable positive function. If (1.2.4) holds for some $f$, then $\int_t^{x^*} (1 - F(s))\, ds < \infty$ for $t < x^*$ and (1.2.4) holds with
\[
f(t) := \frac{\int_t^{x^*} (1 - F(s))\, ds}{1 - F(t)} . \tag{1.2.5}
\]
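For the standard textbook examples the defining ratios in these criteria are exact for every $t$, not only in the limit, so the three cases can be verified by direct computation. A small illustrative sketch (our own, not from the text):

```python
import math

# (1.2.2), gamma = 1: Pareto, 1 - F(s) = 1/s for s > 1
def pareto_ratio(t, x):
    return (1.0 / (t * x)) / (1.0 / t)                 # equals x^{-1/gamma} = 1/x

# (1.2.3), gamma = -1: uniform on (0,1), x* = 1, 1 - F(s) = 1 - s
def uniform_ratio(t, x):
    return (1.0 - (1.0 - t * x)) / (1.0 - (1.0 - t))   # equals x^{-1/gamma} = x

# (1.2.4), gamma = 0: exponential, 1 - F(s) = e^{-s}, with f identically 1
def exponential_ratio(t, x):
    return math.exp(-(t + x)) / math.exp(-t)           # equals e^{-x}
```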
Theorem 1.2.2 The distribution function $F$ is in the domain of attraction of the extreme value distribution $G_\gamma$ if and only if

1. for $\gamma > 0$: $F(x) < 1$ for all $x$, $\int_1^\infty (1 - F(x))/x\; dx < \infty$, and
\[
\lim_{t\to\infty} \frac{\int_t^\infty (1 - F(x))/x\; dx}{1 - F(t)} = \gamma ; \tag{1.2.6}
\]

2. for $\gamma < 0$: there is $x^* < \infty$ such that $\int_{x^*-1}^{x^*} (1 - F(x))/(x^* - x)\; dx < \infty$ and
\[
\lim_{t\downarrow 0} \frac{\int_{x^*-t}^{x^*} (1 - F(x))/(x^* - x)\; dx}{1 - F(x^* - t)} = -\gamma ; \tag{1.2.7}
\]

3. for $\gamma = 0$ (here the right endpoint $x^*$ may be finite or infinite): $\int_t^{x^*} \int_s^{x^*} (1 - F(u))\, du\, ds < \infty$ and the function $h$ defined by
\[
h(t) := \frac{(1 - F(t)) \int_t^{x^*} \int_s^{x^*} (1 - F(u))\, du\, ds}{\left( \int_t^{x^*} (1 - F(s))\, ds \right)^2} \tag{1.2.8}
\]
satisfies
\[
\lim_{t\uparrow x^*} h(t) = 1 . \tag{1.2.9}
\]

Remark 1.2.3 Limit (1.2.6) is equivalent to
\[
\lim_{t\to\infty} E\left(\log X - \log t \mid X > t\right) = \gamma .
\]
In fact,
\[
\frac{\int_t^\infty (1 - F(x))/x\; dx}{1 - F(t)} = E\left(\log X - \log t \mid X > t\right) ,
\]
since
\[
\int_t^\infty (\log x - \log t)\, dF(x) = \int_t^\infty (1 - F(x))\, \frac{dx}{x} .
\]
Relation (1.2.6) will be the basis for the construction of the Hill estimator of $\gamma$ (cf. Section 3.2). Similarly, (1.2.7) can be interpreted as
\[
\lim_{t\downarrow 0} E\left(\log(x^* - X) - \log t \mid X > x^* - t\right) = \gamma ,
\]
which will be the basis for the construction of the negative Hill estimator (Section 3.6.2), and (1.2.9) is equivalent to
\[
\lim_{t\uparrow x^*} \frac{E\left((X - t)^2 \mid X > t\right)}{\left( E\left(X - t \mid X > t\right) \right)^2} = \lim_{t\uparrow x^*} 2\,h(t) = 2 ,
\]
and this relation leads to the moment estimator of $\gamma$ (Section 3.5).
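Remark 1.2.3 is the heuristic behind the Hill estimator: replace the conditional expectation $E(\log X - \log t \mid X > t)$ by an average of log-excesses over the $(k+1)$-th largest observation. The sketch below is only a preview of Section 3.2; the choice of $k$ and the strict-Pareto test sample are illustrative assumptions:

```python
import math
import random

def hill_estimator(sample, k):
    # average of log X_{(n-i)} - log X_{(n-k)} over the k largest order
    # statistics: an empirical version of E(log X - log t | X > t)
    xs = sorted(sample)
    t = xs[-(k + 1)]                     # the (k+1)-th largest observation
    return sum(math.log(v) - math.log(t) for v in xs[-k:]) / k

random.seed(0)
# strict Pareto sample with gamma = 1: X = 1/V, V uniform on (0, 1)
sample = [1.0 / random.random() for _ in range(10000)]
gamma_hat = hill_estimator(sample, 500)   # should be near 1
```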
Next we show how to find the normalizing constants $a_n > 0$ and $b_n$ in the basic limit relation (1.1.1).

Corollary 1.2.4 If $F$ is in the domain of attraction of $G_\gamma$, then

1. for $\gamma > 0$:
\[
\lim_{n\to\infty} F^n(a_n x) = \exp\left(-x^{-1/\gamma}\right)
\]
holds for $x > 0$ with $a_n := U(n)$;

2. for $\gamma < 0$:
\[
\lim_{n\to\infty} F^n(a_n x + x^*) = \exp\left(-(-x)^{-1/\gamma}\right)
\]
holds for $x < 0$ with $a_n := x^* - U(n)$;

3. for $\gamma = 0$:
\[
\lim_{n\to\infty} F^n(a_n x + b_n) = \exp\left(-e^{-x}\right)
\]
holds for all $x$ with $b_n := U(n)$ and $a_n := f(U(n))$, with $f$ as in Theorem 1.2.1(3).
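Corollary 1.2.4(1) can be illustrated numerically. For $F(x) = 1 - x^{-2}$ ($x \ge 1$) we have $\gamma = 1/2$ and $U(n) = n^{1/2}$, so $F^n(\sqrt{n}\,x) = (1 - x^{-2}/n)^n \to \exp(-x^{-2})$. A small sketch (illustrative, not from the text):

```python
import math

def F(s):
    # 1 - s^{-2} for s >= 1: a distribution with gamma = 1/2 and U(t) = sqrt(t)
    return 1.0 - s ** (-2.0) if s >= 1.0 else 0.0

def scaled_max_cdf(n, x):
    # P(max of n iid draws <= a_n x) with a_n = U(n) = sqrt(n)
    return F(math.sqrt(n) * x) ** n

approx = scaled_max_cdf(10**6, 1.5)
exact = math.exp(-(1.5 ** (-2.0)))
```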
We reformulate Theorem 1.2.1 in a seemingly more uniform way.

Theorem 1.2.5 The distribution function $F$ is in the domain of attraction of the extreme value distribution $G_\gamma$ if and only if for some positive function $f$,
\[
\lim_{t\uparrow x^*} \frac{1 - F(t + x f(t))}{1 - F(t)} = (1 + \gamma x)^{-1/\gamma} \tag{1.2.10}
\]
for all $x$ with $1 + \gamma x > 0$. If (1.2.10) holds for some $f > 0$, then it also holds with
\[
f(t) = \begin{cases}
\gamma\,t , & \gamma > 0 , \\
-\gamma\,(x^* - t) , & \gamma < 0 , \\
\int_t^{x^*} (1 - F(x))\, dx \,\big/\, (1 - F(t)) , & \gamma = 0 .
\end{cases}
\]
Furthermore, any $f$ for which (1.2.10) holds satisfies
\[
\begin{cases}
\lim_{t\to\infty} f(t)/t = \gamma , & \gamma > 0 , \\
\lim_{t\uparrow x^*} f(t)/(x^* - t) = -\gamma , & \gamma < 0 , \\
f(t) \sim f_1(t) , \text{ where } f_1 \text{ is some function for which } f_1'(t) \to 0 ,\ t \uparrow x^* , & \gamma = 0 .
\end{cases} \tag{1.2.11}
\]


A useful representation is given next.

Theorem 1.2.6 The distribution function $F$ is in $\mathcal{D}(G_\gamma)$ if and only if there exist positive functions $c$ and $f$, $f$ continuous, such that for all $t \in (t_0, x^*)$, $t_0 < x^*$,
\[
1 - F(t) = c(t) \exp\left\{ -\int_{t_0}^t \frac{ds}{f(s)} \right\}
\]
with $\lim_{t\uparrow x^*} c(t) = c \in (0, \infty)$ and
\[
\begin{cases}
\lim_{t\to\infty} f(t)/t = \gamma , & \gamma > 0 , \\
\lim_{t\uparrow x^*} f(t)/(x^* - t) = -\gamma , & \gamma < 0 , \\
\lim_{t\uparrow x^*} f'(t) = 0 \ \text{ and } \ \lim_{t\uparrow x^*} f(t) = 0 \text{ if } x^* < \infty , & \gamma = 0 .
\end{cases}
\]

Remark 1.2.7 The auxiliary functions $f$ in Theorems 1.2.5 and 1.2.6 are asymptotically the same. If von Mises' condition is satisfied for $\gamma = 0$, then we can take $f(t) = (1 - F(t))/F'(t)$.

Remark 1.2.8 Note that $F_0(t) := \max\left(0,\ 1 - c \exp\left(-\int_{t_0}^t f^{-1}(s)\, ds\right)\right)$ is a probability distribution function and that
\[
1 - F(t) \sim 1 - F_0(t) , \qquad t \uparrow x^* . \tag{1.2.12}
\]
It follows that for any $F \in \mathcal{D}(G_\gamma)$ there exists a distribution function $F_0$ with (1.2.12) such that $F_0$ satisfies von Mises' condition of Theorem 1.1.8 (for $\gamma = 0$), Theorem 1.1.11 (for $\gamma > 0$), or Theorem 1.1.13 (for $\gamma < 0$).
In order to prove the results of Theorems 1.2.1–1.2.6 we are going to study relation (1.2.1) for the inverse function $U$ first.

Lemma 1.2.9 Suppose (1.2.1) holds.

1. If $\gamma > 0$, then $\lim_{t\to\infty} U(t) = \infty$ and
\[
\lim_{t\to\infty} \frac{U(t)}{a(t)} = \frac{1}{\gamma} . \tag{1.2.13}
\]

2. If $\gamma < 0$, then $\lim_{t\to\infty} U(t) < \infty$ and, with $U(\infty) := \lim_{t\to\infty} U(t)$,
\[
\lim_{t\to\infty} \frac{U(\infty) - U(t)}{a(t)} = -\frac{1}{\gamma} . \tag{1.2.14}
\]
In particular this implies that $\lim_{t\to\infty} a(t) = 0$.

3. If $\gamma = 0$, then
\[
\lim_{t\to\infty} \frac{U(tx)}{U(t)} = 1 \tag{1.2.15}
\]
for all $x > 0$ and $\lim_{t\to\infty} a(t)/U(t) = 0$. Moreover, if $U(\infty) < \infty$,
\[
\lim_{t\to\infty} \frac{U(\infty) - U(tx)}{U(\infty) - U(t)} = 1 \tag{1.2.16}
\]
for $x > 0$ and $\lim_{t\to\infty} a(t)/\left(U(\infty) - U(t)\right) = 0$. Further,
\[
\lim_{t\to\infty} \frac{a(tx)}{a(t)} = 1 \tag{1.2.17}
\]
for $x > 0$.
Corollary 1.2.10 1. For $\gamma > 0$ relation (1.2.1) is equivalent to
\[
\lim_{t\to\infty} \frac{U(tx)}{U(t)} = x^\gamma \qquad\text{for } x > 0 . \tag{1.2.18}
\]

2. For $\gamma < 0$ relation (1.2.1) is equivalent to $U(\infty) < \infty$ and
\[
\lim_{t\to\infty} \frac{U(\infty) - U(tx)}{U(\infty) - U(t)} = x^\gamma \qquad\text{for } x > 0 . \tag{1.2.19}
\]

Remark 1.2.11 Relation (1.2.18) means that the function $U$ is regularly varying at infinity with index $\gamma$ (see Appendix B); similarly for the relations (1.2.15)–(1.2.17) and (1.2.19). In (1.2.15)–(1.2.17) the index of regular variation is zero. In that case we say that the function is slowly varying (at infinity).
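Slow variation can be made concrete: for the standard exponential distribution $U(t) = \log t$, so $U(tx)/U(t) = 1 + \log x/\log t \to 1$ for every fixed $x$, however slowly. An illustrative sketch:

```python
import math

def U_exponential(t):
    # for F(s) = 1 - e^{-s}: 1/(1 - F(s)) = e^s, hence U(t) = log t
    return math.log(t)

# the ratio U(100 t)/U(t) creeps toward 1 as t grows
ratios = [U_exponential(100.0 * t) / U_exponential(t) for t in (1e2, 1e6, 1e12)]
```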
Lemma 1.2.12 Let $F_1$ and $F_2$ be two distribution functions with common upper endpoint $x^*$. Let $F_1$ be in the domain of attraction of $G_\gamma$, $\gamma \in \mathbb{R}$, that is,
\[
\lim_{t\to\infty} \frac{U_1(tx) - U_1(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} , \qquad x > 0 , \tag{1.2.20}
\]
where $a$ is a suitable positive function and $U_i := (1/(1 - F_i))^\leftarrow$, $i = 1, 2$. The following statements are equivalent:

1. $\displaystyle \lim_{t\uparrow x^*} \frac{1 - F_2(t)}{1 - F_1(t)} = 1 .$

2. $\displaystyle \lim_{t\to\infty} \frac{U_2(t) - U_1(t)}{a(t)} = 0 .$

Moreover, each statement implies that $F_2$ is in the domain of attraction of $G_\gamma$.

Proof. Assume statement (1). Take $\varepsilon > 0$. The inequalities (valid for sufficiently large $t$)
\[
(1 - \varepsilon)(1 - F_1(t)) \le 1 - F_2(t) \le (1 + \varepsilon)(1 - F_1(t))
\]
are equivalent to the following inequalities for the inverses (for sufficiently large $s$):
\[
U_1\left((1 - \varepsilon)s\right) \le U_2(s) \le U_1\left((1 + \varepsilon)s\right) .
\]
Hence
\[
\frac{U_1((1-\varepsilon)s) - U_1(s)}{a(s)} \le \frac{U_2(s) - U_1(s)}{a(s)} \le \frac{U_1((1+\varepsilon)s) - U_1(s)}{a(s)} .
\]
The left- and right-hand sides converge respectively to $((1-\varepsilon)^\gamma - 1)/\gamma$ and $((1+\varepsilon)^\gamma - 1)/\gamma$, both of which can be made arbitrarily small by taking $\varepsilon$ small. Hence statement (2) has been proved. The converse is similar. □

Proof (of Lemma 1.2.9 and Corollary 1.2.10). We prove the assertions for $\gamma > 0$. The proof of the other assertions is similar.

It is easy to see that if (1.2.1) and (1.2.13) hold, then (1.2.18) is true, and that (1.2.18) implies the other two.

Note that (1.2.1) implies, for any fixed $x > 0$ and $y > 0$ with $y \ne 1$,
\[
\lim_{t\to\infty} \frac{a(tx)}{a(t)}
= \lim_{t\to\infty} \left( \frac{U(txy) - U(t)}{a(t)} - \frac{U(tx) - U(t)}{a(t)} \right) \Big/ \frac{U(txy) - U(tx)}{a(tx)}
= \frac{x^\gamma\,(y^\gamma - 1)/\gamma}{(y^\gamma - 1)/\gamma} = x^\gamma .
\]
Hence for $Z > 1$,
\[
\lim_{k\to\infty} \frac{U(Z^{k+1}) - U(Z^k)}{U(Z^k) - U(Z^{k-1})}
= \lim_{k\to\infty} \left( \frac{U(Z^{k+1}) - U(Z^k)}{a(Z^k)} \Big/ \frac{U(Z^k) - U(Z^{k-1})}{a(Z^{k-1})} \right) \frac{a(Z^k)}{a(Z^{k-1})} = Z^\gamma ,
\]
that is, for $0 < \varepsilon < 1 - Z^{-\gamma}$ and $k \ge n_0(\varepsilon)$,
\[
Z^\gamma (1 - \varepsilon)\left( U(Z^k) - U(Z^{k-1}) \right) \le U(Z^{k+1}) - U(Z^k) \le Z^\gamma (1 + \varepsilon)\left( U(Z^k) - U(Z^{k-1}) \right) . \tag{1.2.21}
\]
From this, we have
\[
\lim_{N\to\infty} \left( U(Z^{N+1}) - U(Z^{n_0}) \right)
= \lim_{N\to\infty} \sum_{n=n_0}^{N} \left( U(Z^{n+1}) - U(Z^n) \right)
\ge \left( U(Z^{n_0}) - U(Z^{n_0-1}) \right) \sum_{n=n_0}^{\infty} \left( (1 - \varepsilon)\,Z^\gamma \right)^{n - n_0 + 1} = \infty ,
\]
since $Z > 1$ and by assumption $\gamma > 0$, so that $(1-\varepsilon)Z^\gamma > 1$. Hence $U(t) \to \infty$, $t \to \infty$.

In order to prove (1.2.13), add the inequalities (1.2.21) for $k = n_0, \ldots, n$. Divide the result by $U(Z^n)$ and take the limit as $n \to \infty$. This gives
\[
\lim_{n\to\infty} \frac{U(Z^{n+1})}{U(Z^n)} = Z^\gamma \tag{1.2.22}
\]
for all $Z > 1$.

Next, for each $x > 1$, define $n(x) \in \mathbb{N}$ such that
\[
Z^{n(x)} \le x < Z^{n(x)+1} . \tag{1.2.23}
\]
We then have for $t, x > 1$,
\[
\frac{U\left(Z^{n(t)} Z^{n(x)}\right)}{U\left(Z^{n(t)+1}\right)} \le \frac{U(tx)}{U(t)} \le \frac{U\left(Z^{n(t)+1} Z^{n(x)+1}\right)}{U\left(Z^{n(t)}\right)} . \tag{1.2.24}
\]
Now write the left-hand side as
\[
\frac{U\left(Z^{n(t)} Z^{n(x)}\right)}{U\left(Z^{n(t)}\right)} \cdot \frac{U\left(Z^{n(t)}\right)}{U\left(Z^{n(t)+1}\right)} .
\]
By (1.2.22) this converges to $Z^{\gamma n(x)}\, Z^{-\gamma}$ as $t \to \infty$, which by (1.2.23) is at least $(x/Z^2)^\gamma$. Similarly, the limsup of the right-hand side of (1.2.24) is at most $(x Z^2)^\gamma$. Now let $Z \downarrow 1$. Then we get (1.2.18) and hence (via (1.2.1)) relation (1.2.13). □

Proof (of Theorem 1.2.1 for $\gamma \ne 0$). We prove the theorem for $\gamma > 0$. The proof for $\gamma < 0$ is similar. From the definition of the inverse function $U$ one sees that for any $\varepsilon > 0$,
\[
U\left(\frac{1-\varepsilon}{1-F(t)}\right) \le t \le U\left(\frac{1+\varepsilon}{1-F(t)}\right) .
\]
Hence
\[
\frac{U\left(\frac{x}{1-F(t)}\right)}{U\left(\frac{1+\varepsilon}{1-F(t)}\right)} \le t^{-1}\, U\left(\frac{x}{1-F(t)}\right) \le \frac{U\left(\frac{x}{1-F(t)}\right)}{U\left(\frac{1-\varepsilon}{1-F(t)}\right)} . \tag{1.2.25}
\]
Suppose (1.2.1) holds, i.e., we have (1.2.18). Then the left- and right-hand sides of (1.2.25) converge to $(x/(1+\varepsilon))^\gamma$ and $(x/(1-\varepsilon))^\gamma$ respectively. Hence, since the relation holds for all $\varepsilon > 0$, it implies
\[
\lim_{t\to\infty} t^{-1}\, U\left(\frac{x}{1-F(t)}\right) = x^\gamma . \tag{1.2.26}
\]
Next we apply Lemma 1.1.1 and get from (1.2.26),
\[
\lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = x^{-1/\gamma} ,
\]
i.e., (1.2.2). The proof of the converse implication is similar and is left to the reader. □

Proof (of Theorem 1.2.2 for $\gamma \ne 0$). We prove the theorem for $\gamma > 0$. The proof for $\gamma < 0$ is similar. First note that by (1.2.2), for any $\varepsilon$ with $0 < \varepsilon < 1/\gamma$ and sufficiently large $t$,
\[
\frac{1 - F(te)}{1 - F(t)} \le e^{\varepsilon - 1/\gamma} .
\]
Hence
\[
\frac{1 - F(te^n)}{1 - F(t)} = \prod_{k=1}^n \frac{1 - F(te^k)}{1 - F(te^{k-1})} \le e^{(\varepsilon - 1/\gamma)n} ,
\]
and for all $x > 1$,
\[
\frac{1 - F(tx)}{1 - F(t)} \le \frac{1 - F\left(te^{[\log x]}\right)}{1 - F(t)} \le e^{(\varepsilon - 1/\gamma)[\log x]} \le e^{(\varepsilon - 1/\gamma)(\log x - 1)} = e^{1/\gamma - \varepsilon}\, x^{-1/\gamma + \varepsilon} ,
\]
a bound that is integrable with respect to $dx/x$ on $(1, \infty)$. The dominated convergence theorem now gives, in combination with (1.2.2),
\[
\lim_{t\to\infty} \int_1^\infty \frac{1 - F(tx)}{1 - F(t)}\, \frac{dx}{x} = \int_1^\infty x^{-1/\gamma}\, \frac{dx}{x} = \gamma ,
\]
which (after the substitution $x \mapsto x/t$ in the integral) is (1.2.6). In particular, we know that $\int_t^\infty (1 - F(x))/x\; dx < \infty$.

For the converse statement assume that
\[
\lim_{t\to\infty} a(t) = \frac{1}{\gamma}
\]
with
\[
a(t) := \frac{1 - F(t)}{\int_t^\infty (1 - F(x))/x\; dx} .
\]
Note that
\[
-\log \int_t^\infty (1 - F(x))\, \frac{dx}{x} + \log \int_{t_0}^\infty (1 - F(x))\, \frac{dx}{x} = \int_{t_0}^t a(x)\, \frac{dx}{x} .
\]
Hence, using the definition of the function $a$ again, we have
\[
1 - F(t) = a(t) \int_t^\infty (1 - F(x))\, \frac{dx}{x}
= a(t) \left( \int_{t_0}^\infty (1 - F(x))\, \frac{dx}{x} \right) \exp\left( -\int_{t_0}^t a(s)\, \frac{ds}{s} \right) , \tag{1.2.27}
\]
and for $x > 0$, as $t \to \infty$,
\[
\frac{1 - F(tx)}{1 - F(t)} = \frac{a(tx)}{a(t)} \exp\left( -\int_1^x a(ts)\, \frac{ds}{s} \right) \to x^{-1/\gamma} . \qquad\Box
\]

Remark 1.2.13 Representation (1.2.27) is sometimes useful. It expresses $F$ in terms of the function $a$, which is of simpler structure.
Proof (of Corollary 1.2.4). For $\gamma > 0$, by Theorem 1.1.6 and Lemma 1.2.9, for $x > 0$,
\[
\lim_{n\to\infty} \frac{U(nx)}{U(n)} = x^\gamma .
\]
Then by Lemma 1.1.1,
\[
\lim_{n\to\infty} n\left\{ 1 - F(x\,U(n)) \right\} = x^{-1/\gamma} ,
\]
and hence by Theorem 1.1.2,
\[
\lim_{n\to\infty} F^n(x\,U(n)) = \exp\left(-x^{-1/\gamma}\right) .
\]
The other cases are similar. □

Proof (of Theorem 1.2.5). Since $U$ is the left-continuous inverse of $1/(1 - F)$, for $\varepsilon > 0$,
\[
\frac{1 - F\left(U(t) + \varepsilon f(U(t))\right)}{1 - F(U(t))} \le \frac{1}{t\left\{1 - F(U(t))\right\}} \le \frac{1 - F\left(U(t) - \varepsilon f(U(t))\right)}{1 - F(U(t))} .
\]
Since by (1.2.10) the left- and right-hand sides converge to $(1 + \gamma\varepsilon)^{-1/\gamma}$ and $(1 - \gamma\varepsilon)^{-1/\gamma}$, and $\varepsilon > 0$ is arbitrary, it follows that
\[
\lim_{t\to\infty} t\left\{ 1 - F(U(t)) \right\} = 1 . \tag{1.2.28}
\]
With (1.2.10) this gives
\[
\lim_{t\to\infty} t\left\{ 1 - F\left(U(t) + x f(U(t))\right) \right\} = (1 + \gamma x)^{-1/\gamma} .
\]
Now Theorem 1.1.6 tells us that $F \in \mathcal{D}(G_\gamma)$. This proves the first part of the theorem. The second statement just rephrases Theorem 1.2.1. Relation (1.2.11) for $\gamma \ne 0$ follows from the easily established fact that if (1.2.10) holds for $f = f_1$ and $f = f_2$, then $\lim_{t\uparrow x^*} f_1(t)/f_2(t) = 1$. For the case $\gamma = 0$ use Theorem 1.2.6. □

Proof (of Theorem 1.2.6 for $\gamma \ne 0$). For the "if" part just check directly that (1.2.2) or (1.2.3) of Theorem 1.2.1 is satisfied. Next suppose $F \in \mathcal{D}(G_\gamma)$ for $\gamma > 0$. Write
\[
a(t) = \frac{1 - F(t)}{\int_t^\infty (1 - F(x))/x\; dx} .
\]
Note that
\[
\frac{d}{dt}\left( -\log \int_t^\infty (1 - F(x))\, \frac{dx}{x} \right) = \frac{a(t)}{t} ;
\]
hence for $t \ge t_0$,
\[
1 - F(t) = a(t) \left( \int_{t_0}^\infty (1 - F(x))\, \frac{dx}{x} \right) \exp\left( -\int_{t_0}^t \frac{a(s)}{s}\, ds \right) .
\]
Theorem 1.2.2 states $\lim_{t\to\infty} a(t) = 1/\gamma$; hence the representation holds with $c(t) := a(t) \int_{t_0}^\infty (1 - F(x))\, dx/x$ and $f(s) := s/a(s)$. The proof for $\gamma < 0$ is similar. □


For the proof of Theorems 1.2.1, 1.2.2, and 1.2.6 with $\gamma = 0$ we need some additional lemmas.

Lemma 1.2.14 Suppose that for all $x > 0$, (1.2.1) holds with $\gamma = 0$, i.e.,
\[
\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = \log x ,
\]
where $a$ is a suitable positive function. Then for all $\varepsilon > 0$ there exist $c > 0$ and $t_0 > 1$ such that for $x \ge 1$, $t \ge t_0$,
\[
\frac{U(tx) - U(t)}{a(t)} \le c\, x^\varepsilon .
\]

Proof. For all $Z > 1$, there exists $t_0 > 1$ such that for $t \ge t_0$,
\[
\frac{U(te) - U(t)}{a(t)} \le Z \qquad\text{and}\qquad \frac{a(te)}{a(t)} \le Z
\]
(use (1.2.17) for the last inequality). For $n = 1, 2, \ldots$ and $t \ge t_0$,
\[
\frac{U(te^n) - U(t)}{a(t)} = \sum_{k=1}^n \frac{U(te^k) - U(te^{k-1})}{a(te^{k-1})} \prod_{r=1}^{k-1} \frac{a(te^r)}{a(te^{r-1})} \le \sum_{k=1}^n Z \cdot Z^{k-1} \le n\,Z^n .
\]
Hence for $x > 1$ and $1 < Z < e$,
\[
\frac{U(tx) - U(t)}{a(t)} \le \frac{U\left(te^{[\log x]+1}\right) - U(t)}{a(t)} \le \left([\log x] + 1\right) Z^{[\log x]+1} \le (\log x + 2)\, Z^{\log x + 2} \le \frac{2\,Z^2}{\log Z}\, x^{2\log Z}
\]
(use for the last inequality $a + 2 \le 2(\log Z)^{-1} Z^a$ for any $a > 0$ and $1 < Z < e$). Since $Z$ may be taken arbitrarily close to $1$, the bound $c\,x^\varepsilon$ follows. □
Corollary 1.2.15 If (1.2.1) holds for $\gamma = 0$, then $\int_1^\infty U(s)/s^2\, ds < \infty$ and
\[
\lim_{t\to\infty} \frac{U_0(t) - U(t)}{a(t)} = 0 \tag{1.2.29}
\]
with
\[
U_0(t) := \frac{t}{e} \int_{t/e}^\infty U(s)\, \frac{ds}{s^2} = \int_1^\infty U(ts/e)\, \frac{ds}{s^2} .
\]
Note that $U_0$ is continuous and strictly increasing.

Proof.
\[
\frac{U_0(t) - U(t)}{a(t)} = \int_1^\infty \frac{U(ts/e) - U(t)}{a(t)}\, \frac{ds}{s^2} .
\]
We can now apply Lebesgue's theorem on dominated convergence: (1.2.1) gives the pointwise convergence (to $\log(s/e) = \log s - 1$, and $\int_1^\infty (\log s - 1)\, ds/s^2 = 0$), and Lemma 1.2.14 the uniform bound. In particular we have $\int_1^\infty U(s)/s^2\, ds < \infty$. □

Corollary 1.2.16 If $F \in \mathcal{D}(G_0)$, there exists a distribution function $F_0$, continuous and strictly increasing, such that
\[
\lim_{t\uparrow x^*} \frac{1 - F_0(t)}{1 - F(t)} = 1
\]
and (1.2.29) holds.

Proof. Apply Corollary 1.2.15 with $U = (1/(1 - F))^\leftarrow$. Next apply Lemma 1.2.12. □

Proof (of Theorem 1.2.1 for $\gamma = 0$). We have proved already in Theorem 1.1.6 that (1.2.4) is necessary and sufficient for the domain of attraction. Since the function $F$ is monotone and $e^{-x}$ is continuous, relation (1.2.4) holds locally uniformly. Hence, in order to prove that (1.2.4) holds with the function $f$ from (1.2.5), it is sufficient to prove that if (1.2.4) holds, then
\[
\lim_{t\uparrow x^*} \frac{\int_t^{x^*} (1 - F(s))\, ds}{f(t)\,(1 - F(t))} = 1 . \tag{1.2.30}
\]
Take $U_0$ and $F_0$ from Corollaries 1.2.15 and 1.2.16. Note that (1.2.4) holds with $F$ replaced by $F_0$, i.e.,
\[
\lim_{t\uparrow x^*} \frac{1 - F_0(t + x f(t))}{1 - F_0(t)} = e^{-x} .
\]
Since also (use l'Hôpital's rule)
\[
\lim_{t\uparrow x^*} \frac{\int_t^{x^*} (1 - F_0(s))\, ds}{\int_t^{x^*} (1 - F(s))\, ds} = \lim_{t\uparrow x^*} \frac{1 - F_0(t)}{1 - F(t)} = 1 ,
\]
it is sufficient to prove the statement for $F_0$ rather than for $F$.

By dominated convergence, using the inequality from Lemma 1.2.14, we have
\[
\lim_{z\to\infty} \int_1^\infty \frac{U_0(zx) - U_0(z)}{a(z)}\, \frac{dx}{x^2} = \int_1^\infty \log x\, \frac{dx}{x^2} = 1 .
\]
If we substitute $s$ for $1/(1 - F_0(u))$, i.e., $u = U_0(s)$ (since $F_0$ is continuous and strictly increasing!), and $z$ for $1/(1 - F_0(t))$ in the left-hand side, we get
\[
\lim_{t\uparrow x^*} \frac{\int_t^{x^*} (1 - F_0(u))\, du}{a\left(1/(1 - F_0(t))\right)\left(1 - F_0(t)\right)} = 1 .
\]
Relation (1.2.30) follows by Theorem 1.1.6, last part, and the proof of Theorem 1.2.1 is complete. □


Proof (of Theorem 1.2.2 for $\gamma = 0$). Clearly relation (1.2.4) implies
\[
\lim_{t\uparrow x^*} \left( t + x f(t) \right) = x^* \qquad\text{for all } x .
\]
Now we replace the running variable $t$ in (1.2.4) by $t' + y f(t')$ ($t' \uparrow x^*$) and get
\[
\lim_{t'\uparrow x^*} \frac{1 - F\left( (t' + y f(t')) + x f(t' + y f(t')) \right)}{1 - F(t' + y f(t'))}\cdot \frac{1 - F(t' + y f(t'))}{1 - F(t')} = e^{-x}\, e^{-y} ,
\]
that is,
\[
\lim_{t'\uparrow x^*} \frac{1 - F\left( t' + y f(t') + x f(t' + y f(t')) \right)}{1 - F(t')} = e^{-x}\, e^{-y} . \tag{1.2.31}
\]
Now by (1.2.4), also
\[
\lim_{t'\uparrow x^*} \frac{1 - F\left( t' + (x + y) f(t') \right)}{1 - F(t')} = e^{-x-y} . \tag{1.2.32}
\]
Keep in mind that the convergence in (1.2.4) is locally uniform. It then follows from (1.2.31) and (1.2.32) that
\[
\lim_{t'\uparrow x^*} \frac{f(t' + y f(t'))}{f(t')} = 1 \tag{1.2.33}
\]
for all $y$ (formally this is proved by contradiction: suppose that for some sequence $t_n' \uparrow x^*$ the limit in (1.2.33) equals $c \in [0, \infty]$, $c \ne 1$; then (1.2.31) cannot be true).

This holds in particular for the function $f$ from (1.2.5), i.e.,
\[
\lim_{t\uparrow x^*} \frac{\int_{t + x f(t)}^{x^*} (1 - F(s))\, ds}{1 - F(t + x f(t))} \cdot \frac{1 - F(t)}{\int_t^{x^*} (1 - F(s))\, ds} = 1
\]
for all $x$, which in combination with (1.2.4) gives
\[
\lim_{t\uparrow x^*} \frac{\int_{t + x f(t)}^{x^*} (1 - F(s))\, ds}{\int_t^{x^*} (1 - F(s))\, ds} = e^{-x} \tag{1.2.34}
\]
for all $x$. Now define the distribution function $F_1$ by
\[
F_1(t) := \max\left( 0,\ 1 - \int_t^{x^*} (1 - F(s))\, ds \right) .
\]
Relation (1.2.34), i.e.,
\[
\lim_{t\uparrow x^*} \frac{1 - F_1(t + x f(t))}{1 - F_1(t)} = e^{-x} , \tag{1.2.35}
\]
tells us by (1.2.4) that $F_1$ is in the domain of attraction of $G_0$. But then again by (1.2.4) and (1.2.5) we must have
\[
\lim_{t\uparrow x^*} \frac{1 - F_1(t + x f_1(t))}{1 - F_1(t)} = e^{-x} \tag{1.2.36}
\]
with
\[
f_1(t) := \frac{\int_t^{x^*} (1 - F_1(s))\, ds}{1 - F_1(t)} . \tag{1.2.37}
\]
Since the convergence in (1.2.35) and (1.2.36) is locally uniform, the functions $f$ and $f_1$ must be asymptotically equivalent:
\[
f_1(t) \sim f(t) \qquad\text{as } t \uparrow x^* . \tag{1.2.38}
\]
This relation is the same as (1.2.9) of Theorem 1.2.2.

Conversely, suppose (1.2.9) holds. Note that for $t$ large enough,
\[
\frac{d}{dt}\left\{ -\log(1 - F_2(t)) \right\} = \frac{2h(t) - 1}{f_1(t)} > 0 , \tag{1.2.39}
\]
where
\[
1 - F_2(t) := \frac{\left(1 - F_1(t)\right)^2}{\int_t^{x^*} (1 - F_1(s))\, ds} .
\]
Moreover, by (1.2.9),
\[
1 - F_2(t) \sim 1 - F(t) \qquad\text{as } t \uparrow x^* . \tag{1.2.40}
\]
Since $F_2$ is eventually monotone by (1.2.39), and $1 - F_2(t) \to 0$, $t \uparrow x^*$, by (1.2.40), there is a distribution function, $F^*$ say, such that $F_2(t) = F^*(t)$ for large $t$. So by Lemma 1.2.12 it is sufficient to prove that $F^*$ is in the domain of attraction of $G_0$.

Note that for $t > t_0$,
\[
f_1(t) - f_1(t_0) = \int_{t_0}^t \left( h(s) - 1 \right) ds ;
\]
hence for all $x$,
\[
\lim_{t\uparrow x^*} \frac{f_1(t + x f_1(t)) - f_1(t)}{f_1(t)} = \lim_{t\uparrow x^*} \int_0^x \left( h(t + u f_1(t)) - 1 \right) du = 0
\]
(by (1.2.9)), or
\[
\lim_{t\uparrow x^*} \frac{f_1(t + x f_1(t))}{f_1(t)} = 1 \tag{1.2.41}
\]
uniformly on bounded $x$ intervals. Now by (1.2.39), for large $t$,
\[
\frac{1 - F^*(t + x f_1(t))}{1 - F^*(t)} = \exp\left( -\int_0^x \left( 2h(t + u f_1(t)) - 1 \right) \frac{f_1(t)}{f_1(t + u f_1(t))}\, du \right) .
\]
Hence by (1.2.9) and (1.2.41),
\[
\lim_{t\uparrow x^*} \frac{1 - F^*(t + x f_1(t))}{1 - F^*(t)} = e^{-x} , \tag{1.2.42}
\]
so that $F^*$, hence $F$, is in the domain of attraction of $G_0$. □

Proof (of Theorem 1.2.6 for $\gamma = 0$). Suppose $F \in \mathcal{D}(G_0)$. Define for $n = 1, 2, \ldots$ recursively
\[
F_n(t) := \max\left( 0,\ 1 - \int_t^{x^*} (1 - F_{n-1}(s))\, ds \right) \tag{1.2.43}
\]
and $F_0(t) := F(t)$. The integrals are finite: the arguments in the previous proof show that $F_n \in \mathcal{D}(G_0)$ for all $n$. Moreover, writing
\[
Q_n(t) := \frac{(1 - F_{n-1}(t))(1 - F_{n+1}(t))}{\left(1 - F_n(t)\right)^2} ,
\]
we have for all $n$,
\[
\lim_{t\uparrow x^*} Q_n(t) = 1 . \tag{1.2.44}
\]
Now note that
\[
1 - F(t) = Q_1(t)\, Q_2^2(t)\, Q_3^3(t)\, \frac{\left(1 - F_3(t)\right)^4}{\left(1 - F_4(t)\right)^3} . \tag{1.2.45}
\]
Define
\[
1 - F^*(t) := \frac{\left(1 - F_3(t)\right)^4}{\left(1 - F_4(t)\right)^3} .
\]
Note that by (1.2.43), for sufficiently large $t$,
\[
\frac{d}{dt}\left\{ -\log(1 - F^*(t)) \right\} = 4\, \frac{1 - F_2(t)}{1 - F_3(t)} - 3\, \frac{1 - F_3(t)}{1 - F_4(t)} =: \frac{1}{f^*(t)} .
\]
Hence, with $f_n(t) := (1 - F_{n+1}(t))/(1 - F_n(t))$ (so that $f_n'(t) = Q_n(t) - 1$),
\[
\frac{1}{f^*(t)} = \frac{1 - F_3(t)}{1 - F_4(t)} \left( 4\,Q_3(t) - 3 \right) > 0
\]
and
\[
(f^*)'(t) = \frac{d}{dt}\left( \frac{4}{f_2(t)} - \frac{3}{f_3(t)} \right)^{-1} = \left( \frac{4}{f_2(t)} - \frac{3}{f_3(t)} \right)^{-2} \left( \frac{4\,f_2'(t)}{f_2^2(t)} - \frac{3\,f_3'(t)}{f_3^2(t)} \right) . \tag{1.2.46}
\]
By (1.2.44) we have $f_2(t)/f_3(t) \to 1$ and $f_2'(t), f_3'(t) \to 0$, $t \uparrow x^*$. Hence $(f^*)'(t) \to 0$, $t \uparrow x^*$.

Note that $F^*$ is monotone for large $t$ by the positivity of $1/f^*$ and that $\lim_{t\uparrow x^*} F^*(t) = 1$ by (1.2.45); hence $F^*$ coincides with a distribution function for large $t$. Finally, $1 - F^*(t) \sim 1 - F(t)$, $t \uparrow x^*$, by (1.2.44) and (1.2.45). Thus for $F$ we have derived the representation with
\[
c(t) = \frac{1 - F(t)}{1 - F^*(t)} \qquad\text{and}\qquad f(s) = f^*(s) .
\]

To prove the converse, first note that
\[
\lim_{t\to\infty} \frac{f(t)}{t} = 0 \ \text{ if } x^* = \infty , \qquad\quad \lim_{t\uparrow x^*} \frac{f(t)}{x^* - t} = 0 \ \text{ if } x^* < \infty ,
\]
since if $x^* = \infty$,
\[
\frac{f(t) - f(t_0)}{t} = \frac{1}{t} \int_{t_0}^t f'(s)\, ds ,
\]
which converges to zero by hypothesis, and if $x^* < \infty$,
\[
\frac{f(t)}{x^* - t} = -\frac{1}{x^* - t} \int_t^{x^*} f'(s)\, ds ,
\]
which again converges to zero by hypothesis. Hence we have that
\[
t + x f(t) < x^* \tag{1.2.47}
\]
for sufficiently large $t$ and all real $x$. Obviously $t + x f(t) \to x^*$, $t \uparrow x^*$. Next note that
\[
\frac{f(t + x f(t)) - f(t)}{f(t)} = \frac{1}{f(t)} \int_t^{t + x f(t)} f'(s)\, ds = \int_0^x f'(t + s f(t))\, ds ,
\]
which converges to zero locally uniformly as $t \uparrow x^*$. Hence
\[
\lim_{t\uparrow x^*} \frac{f(t + x f(t))}{f(t)} = 1 \tag{1.2.48}
\]
locally uniformly. Combining (1.2.47) and (1.2.48) we have
\[
\frac{1 - F(t + x f(t))}{1 - F(t)} = \frac{c(t + x f(t))}{c(t)} \exp\left( -\int_0^x \frac{f(t)}{f(t + s f(t))}\, ds \right) \to e^{-x}
\]
as $t \uparrow x^*$. The result follows from Theorem 1.2.5. □


Exercises
1.1. Let / be any nondecreasing function and /**" its right- or left-continuous inverse,
respectively f*~(y) := inf {s : f(s) > y] or f*~(y) := inf {s : f(s) > y}. Check
that:
(a) (/**")*" = /"" if / * " is the left-continuous inverse, with / " the left-continuous
version of / .
(b) (f*~)*~ = / + if / " " is the right-continuous inverse, with / + the rightcontinuous version of / .
(c) / " (/*"(0) < t < f+ (/*~(0) whether / * " is the right- or left-continuous
inverse.
1.2. Verify that Gny{anx + bn) = GY{x) = exp ( - ( 1 + yx)~l/y)9
for a = n^ and i?n = (ny l ) / y , for all.

with 1 + yx > 0,

1.3. Consider the generalized Pareto distribution \(H_\gamma(x)=1-(1+\gamma x)^{-1/\gamma}\), \(0\le x<(0\vee(-\gamma))^{-1}\) (read \(1-e^{-x}\) if \(\gamma=0\)). Determine \(a(t)\) such that (1.1.20) holds for all \(t\) and \(x\), i.e., \((U(tx)-U(t))/a(t)=(x^{\gamma}-1)/\gamma\).
1.4. For \(F(x)=1-1/x\), \(x\ge1\), determine sequences \(a_n\) positive and \(b_n\) real such that \(F^n(a_nx+b_n)\to\exp(-1/x)\), for \(x>0\).
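One natural candidate is \(a_n=n\), \(b_n=0\), which can be checked numerically since \(F^n(nx)=(1-1/(nx))^n\) (the values of \(n\) and \(x\) below are arbitrary):

```python
import math

def F(x):
    # Pareto distribution function F(x) = 1 - 1/x, x >= 1
    return 1.0 - 1.0 / x

def gap(n, x):
    # |F^n(n*x) - exp(-1/x)|, i.e. trying a_n = n and b_n = 0
    return abs(F(n * x) ** n - math.exp(-1.0 / x))

gaps = [gap(10 ** j, 1.5) for j in (2, 3, 4)]
```

The gap shrinks at rate roughly \(1/n\), consistent with the limit.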
1.5. Verify that the distribution function of \(X\) is in the domain of attraction of \(G_\gamma\) with \(\gamma<0\) if and only if the distribution function of \(1/(x^*-X)\) is in the domain of attraction of \(G_{-\gamma}\).
1.6. Check that the Cauchy distribution \(F(x)=2^{-1}+\pi^{-1}\arctan x\), \(x\in\mathbb R\), is in the Fréchet domain of attraction with \(\gamma=1\).
1.7. Check that the following distributions are in the Gumbel domain of attraction:
(a) Exponential distribution: \(F(x)=1-e^{-x}\), \(x>0\).
(b) Any gamma\((\nu,\alpha)\) distribution, for which \(F'(x)=(\Gamma(\nu))^{-1}\alpha^{\nu}x^{\nu-1}e^{-\alpha x}\), \(\nu>0\), \(\alpha>0\), \(x>0\). Hint: use l'Hôpital's rule to verify that \(\lim_{t\to\infty}(1-F(t))/F'(t)=\alpha^{-1}\).
1.8. Check that the beta\((\mu,\nu)\) distribution, for which
\[
F'(x)=\Gamma(\mu+\nu)\left(\Gamma(\mu)\right)^{-1}\left(\Gamma(\nu)\right)^{-1}x^{\nu-1}(1-x)^{\mu-1},
\]
\(\mu>0\), \(\nu>0\), \(0<x<1\), is in the Weibull domain of attraction with \(\gamma=-\mu^{-1}\).
1.9. Check domain of attraction conditions for \(F(x)=e^{x}\), \(x\le0\), and \(F(x)=1-e^{1/x}\), \(x<0\).
1.10. Show that \(F\in\mathcal D(G_\gamma)\), for some real \(\gamma\), is equivalent to \(\lim_{t\to\infty}(V(tx)-V(t))/a(t)=(x^{\gamma}-1)/\gamma\), \(x>0\), for some positive function \(a\), where \(V:=(1/(-\log F))^{\leftarrow}\).


1.11. Suppose \(F_i\in\mathcal D(G_{\gamma_i})\) for \(i=1,2\) and \(\gamma_1<\gamma_2\). Suppose also that the two distributions have the same right endpoint \(x^*\). Show that \(1-F_1(x)=o\left(1-F_2(x)\right)\), as \(x\uparrow x^*\).
1.12. Let \(F_i\in\mathcal D(G_{\gamma_i})\), \(i=1,2\). Show that for \(0<p<1\) the mixture \(pF_1+(1-p)F_2\in\mathcal D\left(G_{\max(\gamma_1,\gamma_2)}\right)\) if:
(a) \(\gamma_1\ne\gamma_2\),
(b) \(\gamma_1=\gamma_2\ne0\).
Can you say something about the case \(\gamma_1=\gamma_2=0\)?
1.13. Show that if \(F\in\mathcal D(G_\gamma)\), then (for any \(\gamma\))
\[
\lim_{s\uparrow x^*}\frac{1-F(s)}{1-F(s-)}=1 .
\]
Conclude that the geometric distribution \(F(x)=1-e^{-[x]}\), \(x>0\) (with \([x]\) the integer part of \(x\)), and also the Poisson distribution are in no domain of attraction.
1.14. Find a discrete distribution in the domain of attraction of an extreme value
distribution.
1.15. Let \(X_1,X_2,\ldots\) be an i.i.d. sample with distribution function \(F\). Show that if \(F\) is in the domain of attraction of \(G_\gamma\) with \(\gamma\) negative and \(c\in(0,\infty)\), there exist constants \(a_n>0\) such that \(\left(\max(X_1,\ldots,X_n)-F^{\leftarrow}\left(1-(cn)^{-1}\right)\right)/a_n\) converges in distribution to \(-(Y+c^{\gamma})/\gamma\), where \(Y\) has distribution function of the type \(\exp\left(-(-y)^{-1/\gamma}\right)\), \(y<0\) (i.e., Weibull type).
1.16. Prove that if \(F\in\mathcal D(G_\gamma)\) and \(X\) is a random variable with distribution function \(F\), then for all \(-\infty<x<U(\infty)\) (\(U(\infty)\) is the right endpoint of \(F\)),
\[
E|X|^{\alpha}1_{\{X>x\}}<\infty
\]
if \(0<\alpha<1/\gamma_+\), with \(\gamma_+:=\max(0,\gamma)\), and \(E|X|^{\alpha}1_{\{X>x\}}=\infty\) if \(\alpha>1/\gamma_+\). Recall that \(U(\infty)<\infty\) if \(\gamma<0\) and \(U(\infty)=\infty\) if \(\gamma>0\).
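A concrete numerical check (the example is ours, not from the text): take \(\gamma=1/2\), i.e., \(F(x)=1-x^{-2}\), \(x\ge1\), with density \(2x^{-3}\). Then the truncated moment integral converges (to \(2/(2-\alpha)\)) for \(\alpha<1/\gamma=2\) and diverges for \(\alpha>2\):

```python
def truncated_moment(alpha, T):
    # int_1^T x^alpha * 2*x^(-3) dx for F(x) = 1 - x^(-2), i.e. gamma = 1/2
    return 2.0 * (T ** (alpha - 2.0) - 1.0) / (alpha - 2.0)

finite = [truncated_moment(1.5, 10.0 ** j) for j in (2, 4, 6)]     # alpha < 1/gamma
infinite = [truncated_moment(2.5, 10.0 ** j) for j in (2, 4, 6)]   # alpha > 1/gamma
```

The first list approaches \(2/(2-1.5)=4\); the second grows without bound.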
1.17. Let \(F(x):=P(X\le x)\) be in \(\mathcal D(G_\gamma)\) with \(\gamma>0\). Let \(A\) be a positive random variable with \(EA^{1/\gamma+\varepsilon}<\infty\) for some \(\varepsilon>0\). Let \(A\) and \(X\) be independent. Show that
\[
\lim_{x\to\infty}\frac{P(AX>x)}{P(X>x)}=EA^{1/\gamma} ,
\]
so that \(P(AX\le x)\) is also in \(\mathcal D(G_\gamma)\).
Hint: By Appendix B, for \(x>t_0\), \(a\le x/t_0\), we have
\[
(1-\varepsilon)\,a^{1/\gamma-\varepsilon}\le\frac{P(aX>x)}{P(X>x)}\le(1+\varepsilon)\,a^{1/\gamma+\varepsilon} .
\]
Hence
\[
(1-\varepsilon)\int_0^{x/t_0}a^{1/\gamma-\varepsilon}\,dP(A\le a)
\le\frac{P\left(AX>x,\ A\le x/t_0\right)}{P(X>x)}
\le(1+\varepsilon)\int_0^{x/t_0}a^{1/\gamma+\varepsilon}\,dP(A\le a)\ .
\]
Take the limits \(x\to\infty\) and then \(\varepsilon\downarrow0\). Further use \(P\left(AX>x,\ A>x/t_0\right)\le P\left(A>x/t_0\right)\).
1.18. Show that \(F(x)=1-e^{-x-\sin x}\), \(x>0\), is not in any domain of attraction (R. von Mises).
Hints: Show that \(\lim_{k\to\infty}n_k\left(1-F(x+\log n_k)\right)=e^{-x-\sin x}\) for all \(x>0\), with \(n_k=[e^{2\pi k}]\) for \(k=1,2,\ldots\); i.e., \(\lim_{k\to\infty}U(n_kx)-\log n_k=U_1(x)\), where \(U:=(1/(1-F))^{\leftarrow}\) and \(U_1\) is the inverse of \(e^{x+\sin x}\). Now proceed by contradiction.

Extreme and Intermediate Order Statistics

2.1 Extreme Order Statistics and Poisson Point Processes


The extreme value condition
\[
\lim_{n\to\infty}n\left(1-F(a_nx+b_n)\right)=(1+\gamma x)^{-1/\gamma}
\tag{2.1.1}
\]
for each \(x\) with \(1+\gamma x>0\), or equivalently
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma}-1}{\gamma}
\tag{2.1.2}
\]
for each \(x>0\), where \(\gamma\) is a real constant called the extreme value index, is designed to allow convergence in distribution of normalized sample maxima, as in (1.1.1). But the conditions also imply convergence of other high-order statistics.
Let us start by deriving the result for the exponential distribution. Suppose \(E_1,E_2,\ldots\) are independent and identically distributed standard exponential random variables and \(E_{1,n}\le E_{2,n}\le\cdots\le E_{n,n}\) are the \(n\)th order statistics. By Rényi's (1953) representation we have for fixed \(k\le n\),
\[
\left(E_{1,n},E_{2,n},\ldots,E_{k,n}\right)
\stackrel{d}{=}\left(\frac{E_1^*}{n},\ \frac{E_1^*}{n}+\frac{E_2^*}{n-1},\ \ldots,\ \frac{E_1^*}{n}+\frac{E_2^*}{n-1}+\cdots+\frac{E_k^*}{n-k+1}\right)
\]
with \(E_1^*,\ldots,E_k^*\) independent and identically distributed standard exponential. Hence
\[
n\left(E_{1,n},E_{2,n},\ldots,E_{k,n}\right)
\stackrel{d}{\to}\left(E_1^*,\ E_1^*+E_2^*,\ \ldots,\ E_1^*+E_2^*+\cdots+E_k^*\right) .
\tag{2.1.3}
\]
This suggests, and we shall show this later on, that the point process of normalized lower extreme-order statistics converges to a homogeneous Poisson process.
Next we generalize the result (2.1.3) to the entire domain of attraction, and as usual, we formulate it for upper order statistics rather than lower ones.
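Rényi's representation is exact for every \(n\), so its first two consequences, that \(nE_{1,n}\) and \((n-1)(E_{2,n}-E_{1,n})\) are standard exponential, can be checked by simulation. A small Monte Carlo sketch (sample size, replication count, and seed are arbitrary choices):

```python
import random

random.seed(42)

def scaled_spacings(n):
    # returns (n*E_{1,n}, (n-1)*(E_{2,n}-E_{1,n})) for one standard exponential sample;
    # by Renyi's representation both are standard exponential
    e = sorted(random.expovariate(1.0) for _ in range(n))
    return n * e[0], (n - 1) * (e[1] - e[0])

reps = 5000
pairs = [scaled_spacings(50) for _ in range(reps)]
mean1 = sum(p[0] for p in pairs) / reps   # should be close to E E_1^* = 1
mean2 = sum(p[1] for p in pairs) / reps   # should be close to E E_2^* = 1
```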


Theorem 2.1.1 Let \(X_1,X_2,\ldots\) be i.i.d. with distribution function \(F\). Suppose \(F\) is in the domain of attraction of \(G_\gamma\) for some \(\gamma\in\mathbb R\). Let \(X_{1,n}\le X_{2,n}\le\cdots\le X_{n,n}\) be the \(n\)th order statistics. Then with the normalizing constants \(a_n>0\) and \(b_n\) from (1.1.1) and fixed \(k\in\mathbb N\),
\[
\left(\frac{X_{n,n}-b_n}{a_n},\ \frac{X_{n-1,n}-b_n}{a_n},\ \ldots,\ \frac{X_{n-k+1,n}-b_n}{a_n}\right)
\]
converges in distribution to
\[
\left(\frac{(E_1^*)^{-\gamma}-1}{\gamma},\ \frac{(E_1^*+E_2^*)^{-\gamma}-1}{\gamma},\ \ldots,\ \frac{(E_1^*+\cdots+E_k^*)^{-\gamma}-1}{\gamma}\right),
\]
where \(E_1^*,E_2^*,\ldots\) are i.i.d. standard exponential.
From this representation the derivation of the (complicated) joint limit distributions is straightforward.
Proof. Note that if \(E\) is a random variable with standard exponential distribution, then \(U\left(e^{E}\right)\) has distribution function \(F\). Hence
\[
\left(X_{n,n},\ X_{n-1,n},\ \ldots,\ X_{n-k+1,n}\right)
\stackrel{d}{=}\left(U\!\left(e^{E_{n,n}}\right),\ U\!\left(e^{E_{n-1,n}}\right),\ \ldots,\ U\!\left(e^{E_{n-k+1,n}}\right)\right) .
\]
Next note that
\[
\frac{e^{E_{n-i+1,n}}}{n}=\frac{1}{n\,e^{-E_{n-i+1,n}}}\stackrel{d}{=}\frac{1}{n\left(1-e^{-E_{i,n}}\right)}\ .
\]
Hence by (2.1.2), (2.1.3), and the fact that \(n\left(1-e^{-x/n}\right)\to x\), \(n\to\infty\), for \(x>0\), we get the result. \(\square\)
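For the concrete case \(F(x)=1-1/x\) (so \(\gamma=1\), \(U(t)=t\)) and \(k=1\), the theorem says that \(n/X_{n,n}\) converges in distribution to \(E_1^*\); here \(P(n/X_{n,n}>x)=F(n/x)^n=(1-x/n)^n\) is exact, so the convergence to \(e^{-x}\) reduces to elementary arithmetic:

```python
import math

def survival(n, x):
    # P(n / X_{n,n} > x) = F(n/x)^n = (1 - x/n)^n for i.i.d. F(x) = 1 - 1/x
    return (1.0 - x / n) ** n

errs = [abs(survival(10 ** j, 2.0) - math.exp(-2.0)) for j in (1, 2, 3, 4)]
```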

Under the conditions of Theorem 2.1.1 consider the random collection of points
\[
\left\{\left(\frac in,\ \frac{X_i-b_n}{a_n}\right)\right\}_{i=1}^{n}
\]
in \(\mathbb R_+\times\mathbb R\) and define a point process (random measure) \(N_n\) as follows: for each Borel set \(B\subset\mathbb R_+\times\mathbb R\),
\[
N_n(B):=\#\left\{i:\left(\frac in,\ \frac{X_i-b_n}{a_n}\right)\in B\right\} .
\]

Moreover, consider a Poisson point process \(N\) on \(\mathbb R_+\times(x_*,x^*]\), where \(x_*\) and \(x^*\) are the lower and upper endpoints of the distribution function \(G_\gamma\), with mean measure \(\nu\) given by, for \(0\le a<b\) and \(x_*<c<d\le x^*\),
\[
\nu\left([a,b]\times[c,d]\right)=(b-a)\left[(1+\gamma c)^{-1/\gamma}-(1+\gamma d)^{-1/\gamma}\right] .
\]
The following limit relation holds. For information about point processes see, e.g.,
Jagers (1974).
Theorem 2.1.2 The sequence of point processes \(N_n\) converges in distribution to the Poisson point process \(N\), i.e., for any Borel sets \(B_1,\ldots,B_r\subset\mathbb R_+\times(x_*,x^*]\) with \(\nu(\partial B_i)=0\) for \(i=1,2,\ldots,r\),
\[
\left(N_n(B_1),\ldots,N_n(B_r)\right)\stackrel{d}{\to}\left(N(B_1),\ldots,N(B_r)\right) .
\]
Proof. By Theorem 4.7 of Kallenberg (1983), see also Theorem A.1, p. 309, of Leadbetter, Lindgren, and Rootzén (1983), and Proposition 3.22, p. 156, of Resnick (1987), it is sufficient to check that for all half-open rectangles \(I:=(x_1,x_2]\times(y_1,y_2]\),
\[
\lim_{n\to\infty}EN_n(I)=EN(I) ,
\tag{2.1.4}
\]
and that for each \(B=\bigcup_{i=1}^{k}I_i\), a finite union of half-open rectangles parallel to the axes,
\[
\lim_{n\to\infty}P\left(N_n(B)=0\right)=P\left(N(B)=0\right) .
\tag{2.1.5}
\]
Now
\[
EN_n(I)=\sum_{nx_1<i\le nx_2}P\left(y_1<\frac{X_i-b_n}{a_n}\le y_2\right) .
\]
For the proof of (2.1.4) it is sufficient to note that
\[
nP\left(y_1<\frac{X_1-b_n}{a_n}\le y_2\right)\to(1+\gamma y_1)^{-1/\gamma}-(1+\gamma y_2)^{-1/\gamma}
\]
by (1.1.7) of Theorem 1.1.2 and that
\[
\frac{\#\{i:nx_1<i\le nx_2\}}{n}\to x_2-x_1\ ,\quad n\to\infty .
\]
For relation (2.1.5) note that the rectangles \(I_i\) can be taken to be disjoint. In fact, by the independence of the \(X_i\), \(i=1,2,\ldots\), it is sufficient to consider a set \(B\) of disjoint half-open intervals with identical first coordinates, i.e., in a vertical strip. Then



P(Nn(B)

= 0)
xi<i/n<x2l

or y\

_/

<

an

< y\

or-

HUnP(y^<^<y?)

#{i : nxj</<rtJC2)

- exP L(X2 - x!) [(i + yji0))"1/K - 0 + yy2a))


= P(N(B) = 0) .

-i/y"

The result is quite helpful for developing intuition in extreme value theory: the larger order statistics can be thought of as points of a Poisson point process with mean measure determined by the extreme value distribution.
A clear and useful way to see what convergence in distribution of the point process means is the following (cf. Appendix A): there exists a sequence of point processes \(\tilde N,\tilde N_1,\tilde N_2,\ldots\) defined on one sample space such that \(\tilde N\stackrel{d}{=}N\) and \(\tilde N_i\stackrel{d}{=}N_i\) for \(i=1,2,\ldots\), and \(\tilde N_i\to\tilde N\) a.s., as \(i\to\infty\). That is, for every relatively compact set \(B\) whose boundary has zero mass under the limit measure, the number of points in \(B\) under \(\tilde N_i\) converges to the number of points in \(B\) under \(\tilde N\). Moreover (note that the numbers of points will eventually be equal), the positions of all the points in \(B\) under \(\tilde N_i\) will asymptotically coincide with the positions of the points in \(B\) under \(\tilde N\).

2.2 Intermediate Order Statistics


In the previous section we studied the asymptotic behavior of order statistics, that is,
Xn-k,n> when n -> oo and k is fixed, along with an approximation by a Poisson point
process. One can also consider Xn-k,n with k = k(n) -* oo as n -> oo. A commonly
considered case is k(n)/n -> p e (0,1) (the so-called central order statistics, see,
e.g., Arnold, Balakrishnan, and Nagaraja (1992)). The normal distribution is then an
appropriate limit distribution, and in fact, the stochastic process X[rt5])rt, for some
0 < s < 1, properly normalized, can be approximated by a Brownian bridge (see,
e.g., Proposition 2.4.9 below). But there is a case in between these two. Consider
the order statistics Xn-k,n with n -> oo, k = k(n) -> oo, and k(n)/n -> 0.
Those are called intermediate order statistics. Their behavior can be connected with
extreme value theory, and the stochastic process Xn-[ksin, properly normalized, can
be approximated by Brownian motions, as we shall see.
The following result shows that there is a connection between intermediate order
statistics and extreme value theory.


Let \(X_1,X_2,\ldots\) be independent and identically distributed random variables with distribution function \(F\). Recall that \(U=(1/(1-F))^{\leftarrow}\).
Theorem 2.2.1 Suppose von Mises' condition for the domain of attraction of an extreme value distribution \(G_\gamma\) holds (cf. Section 1.1.5). Then, if \(k=k(n)\to\infty\), \(k/n\to0\) as \(n\to\infty\),
\[
\sqrt k\,\frac{X_{n-k,n}-U\!\left(\frac nk\right)}{\frac nk\,U'\!\left(\frac nk\right)}
\]
is asymptotically standard normal.


In view of applications later on we state the following immediate corollary:
Corollary 2.2.2 For FY(y) = 1 - 1/y, y > 1, as n -> 00, k -+ 00, k/n -+ 0,

is asymptotically standard normal.
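Corollary 2.2.2 lends itself to a quick simulation: for the distribution \(1-1/y\) one can generate \(Y_{n-k,n}\) directly as \(1/U_{k+1,n}\), where the uniform order statistic \(U_{k+1,n}\) has a Beta\((k+1,n-k)\) distribution. A Monte Carlo sketch (the values of \(n\), \(k\), the replication count, and the seed are arbitrary):

```python
import math
import random

random.seed(7)

n, k, reps = 20000, 200, 3000

def statistic():
    # For F_Y(y) = 1 - 1/y, Y_{n-k,n} has the same law as 1/U_{k+1,n},
    # where U_{k+1,n} ~ Beta(k+1, n-k) is a uniform order statistic.
    u = random.betavariate(k + 1, n - k)
    return math.sqrt(k) * ((k / n) / u - 1.0)

sample = [statistic() for _ in range(reps)]
mean = sum(sample) / reps                   # should be near 0
var = sum(x * x for x in sample) / reps     # should be near 1
```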


We first give the proof for the uniform distribution. In fact, we shall prove the
following more general result due to Smirnov.
Lemma 2.2.3 (Smirnov (1949)) Let U\,n < Ui,n < < Unyn be the nth order
statistics from a standard uniform distribution. Then, as n > 00, k > 00, n k >
00,
Uk,n On

an
is asymptotically standard normal with
bn:=

k-l

n-Y

an := Jbn(l

-bn)n-\

Proof. The density of \(U_{k,n}\) is
\[
\frac{n!}{(k-1)!\,(n-k)!}\,x^{k-1}(1-x)^{n-k}\ ;
\]
hence the density of \((U_{k,n}-b_n)/a_n\) is
\[
\frac{n!\,a_n\,b_n^{k-1}(1-b_n)^{n-k}}{(k-1)!\,(n-k)!}\left(1+\frac{a_n}{b_n}x\right)^{k-1}\left(1-\frac{a_n}{1-b_n}x\right)^{n-k} .
\]
Using Stirling's formula for \(n!\) one sees easily that the first factor tends to \((2\pi)^{-1/2}\). Next note that
\[
(k-1)\log\left(1+\frac{a_n}{b_n}x\right)+(n-k)\log\left(1-\frac{a_n}{1-b_n}x\right)
=(k-1)\left(\frac{a_n}{b_n}x-\frac12\left(\frac{a_n}{b_n}\right)^{2}x^2+\cdots\right)
+(n-k)\left(-\frac{a_n}{1-b_n}x-\frac12\left(\frac{a_n}{1-b_n}\right)^{2}x^2-\cdots\right),
\]
so the highest-order terms cancel. The coefficient of \(x^2/2\) is
\[
-(k-1)\left(\frac{a_n}{b_n}\right)^{2}-(n-k)\left(\frac{a_n}{1-b_n}\right)^{2}=-\frac{n-1}{n}\to-1 .
\]
The other terms are of smaller order. Since the sequence of densities converges pointwise, we have weak convergence of the probability distributions (Scheffé's theorem). \(\square\)

Proof (of Theorem 2.2.1). Smirnov's lemma implies that
\[
\sqrt k\left(\frac nk\,U_{k+1,n}-1\right)
\]
converges to a standard normal distribution. Hence, since \((n/k)U_{k+1,n}\to_P1\), also
\[
\sqrt k\left(\frac{k}{n\,U_{k+1,n}}-1\right)
\]
converges to a standard normal distribution. Now note that
\[
X_{n-k,n}=F^{\leftarrow}\left(1-U_{k+1,n}\right)=U\!\left(\frac{1}{U_{k+1,n}}\right) ;
\]
hence
\[
\sqrt k\,\frac{X_{n-k,n}-U\!\left(\frac nk\right)}{\frac nk\,U'\!\left(\frac nk\right)}
=\sqrt k\int_{1}^{k/(nU_{k+1,n})}\frac{U'\!\left(\frac nk\,s\right)}{U'\!\left(\frac nk\right)}\,ds\ .
\]
By (1.1.33) and Potter's inequalities (Proposition B.1.9), for \(n\ge n_0\), \(s\ge1\),
\[
(1-\varepsilon)\,s^{\gamma-1-\varepsilon}\le\frac{U'\!\left(\frac nk\,s\right)}{U'\!\left(\frac nk\right)}\le(1+\varepsilon)\,s^{\gamma-1+\varepsilon} .
\]
Hence
\[
\sqrt k\,\frac{\left(\frac{k}{nU_{k+1,n}}\right)^{\gamma+\varepsilon}-1}{\gamma+\varepsilon}
\]
We already know that \(k/(nU_{k+1,n})\to_P1\). Hence the above expression has the same limit distribution as \(\sqrt k\left(\frac{k}{nU_{k+1,n}}-1\right)\). Since \(\varepsilon>0\) is arbitrary we find that
\[
\sqrt k\,\frac{X_{n-k,n}-U\!\left(\frac nk\right)}{\frac nk\,U'\!\left(\frac nk\right)}
\]
has the same limit distribution as \(\sqrt k\left(\frac{k}{nU_{k+1,n}}-1\right)\). \(\square\)

So we see that the normal distribution is a natural limit distribution for intermediate order statistics. As in the case of extreme order statistics, where we made the
connection with point processes, we want to put the present limit result in a wider
framework, which in this case will be convergence toward a Brownian motion. However, for this result we need more than just the domain of attraction condition. One can
consider the domain of attraction condition as a special kind of asymptotic expansion
of U near infinity. For the approximation by Brownian motion, as well as for many
statistical results as we shall see later on, it is very useful to have a higher-order expansion. We call this the second-order condition. This condition will be discussed in the
next section. The extension (or rather analogue) of Theorem 2.2.1 in this framework
will be discussed in Section 2.4.

2.3 Second-Order Condition


Once again we start with the extreme value condition (2.1.1), or equivalently
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma}-1}{\gamma}=:D_\gamma(x) ,
\tag{2.3.1}
\]
for each \(x>0\), where \(U=(1/(1-F))^{\leftarrow}\).


We are going to develop a second-order condition related to (2.3.1). Suppose that
there exists a function A not changing sign eventually and with lim^oo A(f) = 0
such that for all x > 0,
lim
*->oo

a(f

>

A(t)

/ W

(2.3.2)

44

2 Extreme and Intermediate Order Statistics

exists. The function A could be either positive or negative. Write H for the limit
function. Of course the case H(x) = 0 for all x > 0 is not very informative.
Let us rewrite relation (2.3.2) as follows: for all \(x>0\),
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)-a(t)\,\frac{x^{\gamma}-1}{\gamma}}{a_1(t)}=H(x) ,
\tag{2.3.3}
\]
with \(a\) as before and \(a_1=aA\). The first question is, which functions \(H\) are possible limit functions in (2.3.3)?
Note first that when we replace the function \(a\) by \(\tilde a=a+ca_1\), for some constant \(c\), then we still have the limit relation (2.3.3) but with a new limit function \(\tilde H\) satisfying
\[
\tilde H(x)=H(x)-c\,D_\gamma(x) .
\tag{2.3.4}
\]
This means that we can always add a multiple of \(D_\gamma\) to the function \(H\). It follows that if (2.3.3) holds with \(H(x)=cD_\gamma(x)\), the relation is still not very informative. So we require that the function \(H\) in (2.3.3) not be a multiple of \(D_\gamma\). In particular, \(H\) should not be identically zero.
Definition 2.3.1 The function \(U\) (or the probability distribution connected with it) is said to satisfy the second-order condition if for some positive function \(a\) and some positive or negative function \(A\) with \(\lim_{t\to\infty}A(t)=0\),
\[
\lim_{t\to\infty}\frac{\frac{U(tx)-U(t)}{a(t)}-\frac{x^{\gamma}-1}{\gamma}}{A(t)}=:H(x)\ ,\quad x>0 ,
\tag{2.3.5}
\]
where \(H\) is some function that is not a multiple of the function \((x^{\gamma}-1)/\gamma\). In particular, \(H\) should not be identically zero. Occasionally we shall refer to the functions \(a\) and \(A\) as (respectively) first-order and second-order auxiliary functions.
Remark 2.3.2 Note that the second-order condition implies the domain of attraction condition.
We have the following results. Proofs are given in the appendix on regular variation, Section B.3.
Theorem 2.3.3 Suppose the second-order condition (2.3.5) holds. Then there exist constants \(c_1,c_2\in\mathbb R\) and some parameter \(\rho\le0\) such that
\[
H(x)=c_1\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\,du\,ds+c_2\int_1^x s^{\gamma+\rho-1}\,ds .
\tag{2.3.6}
\]
Moreover, for \(x>0\),
\[
\lim_{t\to\infty}\frac{\frac{a(tx)}{a(t)}-x^{\gamma}}{A(t)}=c_1\,x^{\gamma}\,\frac{x^{\rho}-1}{\rho}
\tag{2.3.7}
\]
and
\[
\lim_{t\to\infty}\frac{A(tx)}{A(t)}=x^{\rho} .
\tag{2.3.8}
\]


For \(\rho\ne0\) we can write \(H\) as
\[
H(x)=c_1\,\frac1\rho\left(D_{\gamma+\rho}(x)-D_\gamma(x)\right)+c_2\,D_{\gamma+\rho}(x) ,
\tag{2.3.9}
\]
and for \(\rho=0\) and \(\gamma\ne0\) we can write \(H\) as
\[
H(x)=c_1\,\frac1\gamma\left(x^{\gamma}\log x-D_\gamma(x)\right)+c_2\,D_\gamma(x) .
\tag{2.3.10}
\]
For \(\gamma=\rho=0\) we get directly from (2.3.6)
\[
H(x)=c_1\,\frac12(\log x)^2+c_2\log x .
\tag{2.3.11}
\]

Next we are going to simplify the limit function \(H\) by changing the functions \(a\) and \(a_1\) a little, as in (2.3.4). We work out one of the three cases. Suppose \(\rho\ne0\), so (2.3.9) holds. Replace \(a\) by \(a+c_2a_1\). Then the limit \(H\) changes to
\[
(c_1+\rho c_2)\,\frac1\rho\left(D_{\gamma+\rho}(x)-D_\gamma(x)\right) .
\]
Next replace the function \(a_1\) by \(a_1(c_1+\rho c_2)\). Then the limit \(H\) changes to
\[
\frac1\rho\left(D_{\gamma+\rho}(x)-D_\gamma(x)\right)=\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\,du\,ds ,
\]
which is just the first term in (2.3.6) with \(c_1=1\). Notice that in the process we may have changed the positive function \(a_1\) into a negative one. One can simplify relations (2.3.10) and (2.3.11) as well. We formulate our result.
Corollary 2.3.4 Suppose relation (2.3.3) holds for all \(x>0\) and the function \(H\) is not a multiple of \(D_\gamma\). Then there exist (possibly different) functions \(a\), positive, and \(a_1\), positive or negative, such that
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)-a(t)\,D_\gamma(x)}{a_1(t)}=\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\,du\,ds ,
\tag{2.3.12}
\]

or equivalently, there exist functions \(a\), positive, and \(A\), positive or negative, such that
\[
\lim_{t\to\infty}\frac{\frac{U(tx)-U(t)}{a(t)}-D_\gamma(x)}{A(t)}=\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\,du\,ds=:H_{\gamma,\rho}(x) .
\tag{2.3.13}
\]

Sometimes we shall write the limit function as
\[
H_{\gamma,\rho}(x):=\frac1\rho\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\right) ,
\tag{2.3.14}
\]
which for the cases \(\gamma=0\) and \(\rho=0\) is understood to be equal to the limit of (2.3.14) as \(\gamma\to0\) or \(\rho\to0\), respectively; that is,
\[
H_{\gamma,\rho}(x)=
\begin{cases}
\dfrac1\rho\left(\dfrac{x^{\rho}-1}{\rho}-\log x\right), & \rho\ne0=\gamma ,\\[6pt]
\dfrac1\gamma\left(x^{\gamma}\log x-\dfrac{x^{\gamma}-1}{\gamma}\right), & \rho=0\ne\gamma ,\\[6pt]
\dfrac12(\log x)^2 , & \rho=0=\gamma .
\end{cases}
\]
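The closed form (2.3.14) can be checked numerically against the double-integral representation (the inner integral is evaluated in closed form below; the parameter values γ = 1/2, ρ = −1, x = 3 and the grid size are arbitrary choices):

```python
def H_closed(g, r, x):
    # H_{gamma,rho}(x) from (2.3.14); requires g != 0 and g + r != 0
    return ((x ** (g + r) - 1) / (g + r) - (x ** g - 1) / g) / r

def H_double_integral(g, r, x, m=100000):
    # midpoint rule for  int_1^x s^(g-1) * int_1^s u^(r-1) du ds,
    # with the inner integral int_1^s u^(r-1) du = (s^r - 1)/r in closed form
    h = (x - 1.0) / m
    total = 0.0
    for i in range(m):
        s = 1.0 + (i + 0.5) * h
        total += s ** (g - 1) * (s ** r - 1.0) / r * h
    return total

err = abs(H_closed(0.5, -1.0, 3.0) - H_double_integral(0.5, -1.0, 3.0))
```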

Corollary 2.3.5 Suppose relation (2.3.3) holds for all \(x>0\) and the function \(H\) is not a multiple of \(D_\gamma\). Then there exist functions \(a_*\), positive, and \(A_*\), positive or negative, such that
\[
\lim_{t\to\infty}\frac{\frac{U(tx)-U(t)}{a_*(t)}-D_\gamma(x)}{A_*(t)}=\Psi_{\gamma,\rho}(x) ,
\tag{2.3.15}
\]
where
\[
\Psi_{\gamma,\rho}(x):=
\begin{cases}
\dfrac{x^{\gamma+\rho}-1}{\gamma+\rho}\ , & \gamma+\rho\ne0,\ \rho<0 ,\\[4pt]
\log x\ , & \gamma+\rho=0,\ \rho<0 ,\\[4pt]
\dfrac1\gamma\,x^{\gamma}\log x\ , & \rho=0\ne\gamma ,\\[4pt]
\dfrac12(\log x)^2\ , & \rho=0=\gamma ,
\end{cases}
\tag{2.3.16}
\]
\[
a_*(t):=
\begin{cases}
a(t)\left(1-\dfrac{A(t)}{\rho}\right), & \rho<0 ,\\[4pt]
a(t)\left(1-\dfrac{A(t)}{\gamma}\right), & \rho=0\ne\gamma ,\\[4pt]
a(t)\ , & \rho=0=\gamma ,
\end{cases}
\]
and
\[
A_*(t):=
\begin{cases}
\dfrac{A(t)}{\rho}\ , & \rho<0 ,\\[4pt]
A(t)\ , & \rho=0 ,
\end{cases}
\]
with \(a\) and \(A\) from Corollary 2.3.4.


Next we consider the limit relation forfixed* y , p . The limit relation entails a very
useful set of uniform inequalities.
Theorem 2.3.6 Suppose (2.3.15) holds for some fixed \(\gamma\in\mathbb R\) and \(\rho\le0\). Then there are functions \(a_0\) and \(A_0\) satisfying, as \(t\to\infty\),
\[
A_0(t)\sim A_*(t)\ ,\qquad \frac{a_0(t)}{a_*(t)}-1=o\left(A_*(t)\right),
\]
with the following property: for any \(\varepsilon,\delta>0\) there exists \(t_0=t_0(\varepsilon,\delta)\) such that for all \(t,\,tx\ge t_0\),
\[
\left|\frac{\frac{U(tx)-U(t)}{a_0(t)}-\frac{x^{\gamma}-1}{\gamma}}{A_0(t)}-\Psi_{\gamma,\rho}(x)\right|
\le\varepsilon\,x^{\gamma+\rho}\max\left(x^{\delta},x^{-\delta}\right)
\tag{2.3.17}
\]
and
\[
\left|\frac{\frac{a_0(tx)}{a_0(t)}-x^{\gamma}}{A_0(t)}-x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\right|
\le\varepsilon\,x^{\gamma+\rho}\max\left(x^{\delta},x^{-\delta}\right) .
\tag{2.3.18}
\]

The functions \(a_0\) and \(A_0\) can be chosen as follows:
\[
a_0(t):=
\begin{cases}
c\,t^{\gamma}\ , & \rho<0 ,\\
-\gamma\left(U(\infty)-U(t)\right), & \gamma<\rho=0 ,\\
\gamma\,U(t)\ , & \gamma>\rho=0 ,\\
\widehat U(t)+\widetilde U(t)\ , & \gamma=\rho=0 ,
\end{cases}
\]
with \(c:=\lim_{t\to\infty}t^{-\gamma}a(t)>0\),
\[
\widetilde U(t):=
\begin{cases}
U(t)-c\,\dfrac{t^{\gamma}}{\gamma}\ , & \rho<0 ,\\
t^{\gamma}\left(U(\infty)-U(t)\right), & \gamma<\rho=0 ,\\
t^{-\gamma}\,U(t)\ , & \gamma>\rho=0 ,\\
\widehat U(t)\ , & \gamma=\rho=0 ,
\end{cases}
\]
where, for an integrable function \(g\),
\[
\hat g(t):=g(t)-\frac1t\int_0^t g(s)\,ds
\]
and \(\widehat U:=\hat g\) with \(g=\widetilde U\), and
\[
A_0(t):=
\begin{cases}
(\gamma+\rho)\,\dfrac{\widetilde U(t)}{a_0(t)}\ , & \gamma+\rho\ne0,\ \rho<0 ,\\[4pt]
\dfrac{\widetilde U(t)}{a_0(t)}\ , & \gamma+\rho=0,\ \rho<0 ,\\[4pt]
\dfrac{\widehat U(t)}{\widetilde U(t)}\ , & \gamma\ne\rho=0 ,\\[4pt]
\dfrac{\widehat U(t)}{a_0(t)}\ , & \gamma=\rho=0 .
\end{cases}
\]

The next corollary is an alternative formulation of the result of the last theorem that is sometimes useful.
Corollary 2.3.7 Under the conditions of Theorem 2.3.6, with the same functions \(a_0\) and \(A_0\) satisfying, as \(t\to\infty\), \(A_0(t)\sim A_*(t)\) and \(a_0(t)/a_*(t)-1=o\left(A_*(t)\right)\), for any \(\varepsilon,\delta>0\) there exists \(t_0=t_0(\varepsilon,\delta)\) such that for all \(t,\,tx\ge t_0\),
\[
\left|\frac{\frac{U(tx)-b_0(t)}{a_0(t)}-\frac{x^{\gamma}-1}{\gamma}}{A_0(t)}-\widetilde\Psi_{\gamma,\rho}(x)\right|
\le\varepsilon\,x^{\gamma+\rho}\max\left(x^{\delta},x^{-\delta}\right) ,
\tag{2.3.19}
\]
where



. = f U(t) - ^a0(t)A0(t)
bo(0 :

, y + p ^ 0, p < 0,
otherwise ,

1^(0

and
K + p ^ 0 , p < 0 ,

Y+P '

log* ,
:

Vy,PW

y+p = 0,p<0,

*nog*,p = 0#y,

(2.3.20)

[i(logx)2, p = 0 = y .
Next we formulate the second-order condition in terms of the distribution function, rather than in terms of the function \(U\).
Theorem 2.3.8 Suppose (2.3.5) holds. Then for all \(x\) with \(1+\gamma x>0\),
\[
\lim_{t\uparrow x^*}\frac{\frac{1-F(t+xf(t))}{1-F(t)}-Q_\gamma(x)}{\alpha(t)}
=\left(Q_\gamma(x)\right)^{1+\gamma}H_{\gamma,\rho}\!\left(\frac{1}{Q_\gamma(x)}\right) ,
\tag{2.3.21}
\]
where \(f(t):=a\left(1/(1-F(t))\right)\), \(\alpha(t):=A\left(1/(1-F(t))\right)\), and \(Q_\gamma(x):=(1+\gamma x)^{-1/\gamma}\). Conversely, (2.3.21) implies (2.3.5).

For convenience we state the simpler corresponding results for the case \(\gamma>0\) separately.
Theorem 2.3.9 Suppose that for some positive \(\gamma\) and some positive or negative function \(A\),
\[
\lim_{t\to\infty}\frac{\frac{U(tx)}{U(t)}-x^{\gamma}}{A(t)}=:K(x)
\]
exists for all \(x>0\) and \(K\) is not identically zero. Then for a possibly different function \(A\), positive or negative,
\[
\lim_{t\to\infty}\frac{\frac{U(tx)}{U(t)}-x^{\gamma}}{A(t)}=x^{\gamma}\,\frac{x^{\rho}-1}{\rho}
\tag{2.3.22}
\]
for all \(x>0\), with \(\rho\le0\). Moreover, for any \(\varepsilon,\delta>0\) there exists \(t_0=t_0(\varepsilon,\delta)>1\) such that for all \(t,\,tx\ge t_0\),
\[
\left|\frac{\frac{U(tx)}{U(t)}-x^{\gamma}}{A_0(t)}-x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\right|
\le\varepsilon\,x^{\gamma+\rho}\max\left(x^{\delta},x^{-\delta}\right) ,
\tag{2.3.23}
\]
with \(A_0(t)\sim A(t)\), \(t\to\infty\).
Relation (2.3.22) is equivalent to
\[
\lim_{t\to\infty}\frac{\frac{1-F(tx)}{1-F(t)}-x^{-1/\gamma}}{\alpha(t)}
=x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho}
\tag{2.3.24}
\]
for all \(x>0\), with \(\rho\le0\) and \(\alpha(t):=A\left(1/(1-F(t))\right)\).

Remark 2.3.10 For the equivalence of (2.3.22) and (2.3.24) see Exercise 2.11.


Example 2.3.11 The function \(U(t)=c_0t^{\gamma}+c_1\), with \(c_0\) and \(\gamma\) positive and \(c_1\ne0\), satisfies the second-order condition of Theorem 2.3.9 but not the second-order condition of Definition 2.3.1.
It is interesting to observe that if (2.3.22) holds with \(\rho<0\), then for some positive constant \(c\) the function \(|U(t)-ct^{\gamma}|\) is regularly varying with index \(\gamma+\rho\). In particular, \(U(t)\sim ct^{\gamma}\), \(t\to\infty\). So the second-order condition with \(\rho<0\) makes the first-order relation particularly simple.
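A concrete numerical illustration of (2.3.22) (this example is ours, not from the text): for \(U(t)=t^{\gamma}+c\,t^{\gamma+\rho}\) with \(\gamma>0\) and \(\rho<0\), the choice \(A(t)=\rho c\,t^{\rho}\) works, and the convergence can be watched directly (γ = 1/2, ρ = −1, c = 2 are arbitrary):

```python
g, r, c = 0.5, -1.0, 2.0          # gamma > 0, rho < 0, c: arbitrary choices

def U(t):
    # U(t) = t^gamma + c * t^(gamma+rho)
    return t ** g + c * t ** (g + r)

def A(t):
    # candidate second-order auxiliary function A(t) = rho * c * t^rho
    return r * c * t ** r

def limit(x):
    # the limit in (2.3.22): x^gamma * (x^rho - 1)/rho
    return x ** g * (x ** r - 1.0) / r

def gap(t, x):
    return abs((U(t * x) / U(t) - x ** g) / A(t) - limit(x))

gaps = [gap(10.0 ** j, 2.0) for j in (1, 2, 3)]
```

The gap is exactly \(|{\rm limit}(x)|\,ct^{\rho}/(1+ct^{\rho})\), so it vanishes at rate \(t^{\rho}\).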
Finally, we provide the sufficient second-order condition of von Mises type.
Theorem 2.3.12 Suppose the function \(U=(1/(1-F))^{\leftarrow}\) is twice differentiable. Write
\[
A(t):=\frac{tU''(t)}{U'(t)}-\gamma+1 .
\]
If the function \(A\) has constant sign for large \(t\), \(\lim_{t\to\infty}A(t)=0\), and the function \(|A|\) is regularly varying with index \(\rho\le0\), then for \(x>0\),
\[
\lim_{t\to\infty}\frac{\frac{U(tx)-U(t)}{tU'(t)}-\frac{x^{\gamma}-1}{\gamma}}{A(t)}=H_{\gamma,\rho}(x) .
\]
Proof. First note that
\[
\log U'(tx)-\log U'(t)=\int_1^x\frac{tU''(tu)}{U'(tu)}\,du
=\int_1^x\left(A(tu)+\gamma-1\right)\frac{du}{u}\to(\gamma-1)\log x ,
\tag{2.3.25}
\]
since the function \(A(t)\) vanishes as \(t\to\infty\); i.e., \(U'\in RV_{\gamma-1}\). Now
\[
\frac{\frac{U(tx)-U(t)}{tU'(t)}-\frac{x^{\gamma}-1}{\gamma}}{A(t)}
=\frac{\int_1^x s^{\gamma-1}\int_1^s\left(\frac{tuU''(tu)}{U'(tu)}-\gamma+1\right)u^{-\gamma}\,\frac{U'(tu)}{U'(t)}\,du\,ds}{A(t)}
=\int_1^x s^{\gamma-1}\int_1^s\frac{A(tu)}{A(t)}\,u^{-\gamma}\,\frac{U'(tu)}{U'(t)}\,du\,ds\ .
\]
Since \(A(tu)/A(t)\to u^{\rho}\) and \(U'(tu)/U'(t)\to u^{\gamma-1}\) locally uniformly for \(u>0\) (Theorem B.1.4), the result follows. \(\square\)
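For a function with an explicit second-order term one can compute \(A(t)=tU''(t)/U'(t)-\gamma+1\) in closed form and verify numerically that \(|A|\) is regularly varying with index \(\rho\), i.e., \(A(10t)/A(t)\to10^{\rho}\) (the example \(U(t)=t^{\gamma}+c\,t^{\gamma+\rho}\) with γ = 1/2, ρ = −1, c = 2 is an arbitrary choice, not from the text):

```python
g, r, c = 0.5, -1.0, 2.0          # arbitrary example U(t) = t^g + c*t^(g+r)

def Uprime(t):
    return g * t ** (g - 1) + c * (g + r) * t ** (g + r - 1)

def Usecond(t):
    return g * (g - 1) * t ** (g - 2) + c * (g + r) * (g + r - 1) * t ** (g + r - 2)

def A(t):
    # A(t) = t*U''(t)/U'(t) - gamma + 1
    return t * Usecond(t) / Uprime(t) - g + 1.0

# regular variation with index rho means A(10*t)/A(t) -> 10^rho = 0.1
ratios = [A(10.0 * t) / A(t) for t in (1e2, 1e3, 1e4)]
```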

2.4 Intermediate Order Statistics and Brownian Motion


We continue the discussion on the behavior of intermediate order statistics under
extreme value conditions. We have seen that a sequence of intermediate order statistics

50

2 Extreme and Intermediate Order Statistics

is asymptotically normal (when properly normalized) under von Mises' extreme value
condition. However, we want to consider many intermediate order statistics at the
same time; hence we want to consider the tail (empirical) quantile process.
It is instructive to start by proving the main result of Section 2.2, Theorem 2.2.1, i.e., the asymptotic normality of a sequence of intermediate order statistics, again, now not under von Mises' conditions but under the second-order condition.
Theorem 2.4.1 Let \(X_{1,n}\le X_{2,n}\le\cdots\le X_{n,n}\) be the \(n\)th order statistics from an i.i.d. sample with distribution function \(F\). Suppose that the second-order condition (2.3.21), or equivalently (2.3.5), holds for some \(\gamma\in\mathbb R\), \(\rho\le0\). Then
\[
\sqrt k\,\frac{X_{n-k,n}-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}
\]
is asymptotically standard normal provided the sequence \(k=k(n)\) is such that \(k(n)\to\infty\), \(k/n\to0\), \(n\to\infty\), and
\[
\lim_{n\to\infty}\sqrt k\,A_0\!\left(\frac nk\right)
\]
exists and is finite.

Proof. Take independent and identically distributed random variables \(Y_1,Y_2,\ldots\) with distribution function \(1-1/y\), \(y\ge1\). Let \(Y_{1,n}\le Y_{2,n}\le\cdots\le Y_{n,n}\) be the \(n\)th order statistics. Then on the one hand, by Corollary 2.2.2,
\[
\sqrt k\left(\frac kn\,Y_{n-k,n}-1\right)\stackrel{d}{\to}N
\tag{2.4.1}
\]
with \(N\) standard normal. On the other hand,
\[
X_{n-k,n}\stackrel{d}{=}U\left(Y_{n-k,n}\right) .
\]
Hence by Theorem 2.3.6,
\[
\sqrt k\,\frac{X_{n-k,n}-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}
=\sqrt k\,\frac{\left(\frac knY_{n-k,n}\right)^{\gamma}-1}{\gamma}
+\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(\frac knY_{n-k,n}\right)
+o_P\!\left(\sqrt k\,A_0\!\left(\frac nk\right)\left(\frac knY_{n-k,n}\right)^{\gamma+\rho}\right) .
\]
Now note that by Cramér's delta method,
\[
\sqrt k\,\frac{\left(\frac knY_{n-k,n}\right)^{\gamma}-1}{\gamma}
\]
has the same limit distribution as \(\sqrt k\left(\frac knY_{n-k,n}-1\right)\), which is asymptotically standard normal; moreover, since \(\frac knY_{n-k,n}\to_P1\) by (2.4.1), the other terms go to zero by assumption (note that \(\Psi_{\gamma,\rho}(1)=0\)).

The last result can be vastly generalized and yields the following, relating the tail quantile process to Brownian motion in a strong sense.
Theorem 2.4.2 (Drees (1998), Theorem 2.1) Suppose \(X_1,X_2,\ldots\) are i.i.d. random variables with distribution function \(F\). Suppose that \(F\) satisfies the second-order extreme value condition (2.3.21) for some \(\gamma\in\mathbb R\) and \(\rho\le0\). Let \(X_{1,n}\le X_{2,n}\le\cdots\le X_{n,n}\) be the \(n\)th order statistics. We can define a sequence of Brownian motions \(\{W_n(s)\}_{s\ge0}\) such that for suitably chosen functions \(a_0\) and \(A_0\) and each \(\varepsilon>0\),
\[
\sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{X_{n-[ks],n}-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right|
\stackrel{P}{\to}0 ,
\tag{2.4.2}
\]
\(n\to\infty\), provided \(k=k(n)\to\infty\), \(k/n\to0\), and \(\sqrt k\,A_0(n/k)=O(1)\).

Definition 2.4.3 Let \(X_{1,n}\le X_{2,n}\le\cdots\le X_{n,n}\) be the \(n\)th order statistics and \(k=k(n)\) a sequence satisfying \(k\to\infty\), \(k/n\to0\), as \(n\to\infty\). We define the tail (empirical) quantile process to be the stochastic process \(\{X_{n-[ks],n}\}_{s>0}\).
Remark 2.4.4 It may happen that the convergence of \((U(tx)-U(t))/a(t)\) to \((x^{\gamma}-1)/\gamma\) (cf. Section 2.3, relation (2.3.1)) is faster than any negative power of \(t\), i.e.,
\[
\lim_{t\to\infty}t^{\alpha}\left(\frac{U(tx)-U(t)}{a(t)}-\frac{x^{\gamma}-1}{\gamma}\right)=0
\]
for all \(x>0\) and \(\alpha>0\). In that case the result of Theorem 2.4.2 holds with the bias part \(\sqrt kA_0(n/k)\Psi_{\gamma,\rho}(s^{-1})\) replaced by zero, provided \(k(n)=o\left(n^{1-\varepsilon}\right)\) for some \(\varepsilon>0\). A similar remark can be made in connection with the convergence results for the various estimators of Chapter 3.
One can extend the interval of definition of \(s\) as follows.
Corollary 2.4.5 Define \(X_{n-[ks],n}:=X_{n,n}\) for \(0<s<k^{-1}\). Then, under the conditions of Theorem 2.4.2,
\[
\sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{X_{n-[ks],n}-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right|
\stackrel{P}{\to}0 .
\]


We shall usually apply this result in the following form:
Corollary 2.4.6 Under the conditions of Theorem 2.4.2, for each \(\varepsilon>0\),
\[
\sup_{0<s\le1}\min\left(1,\,s^{\gamma+1/2+\varepsilon}\right)
\left|\sqrt k\,\frac{X_{n-[ks],n}-X_{n-k,n}}{a_0\!\left(\frac nk\right)}-\frac{s^{-\gamma}-1}{\gamma}
-s^{-\gamma-1}W_n(s)+W_n(1)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right|
\stackrel{P}{\to}0 .
\]

Remark 2.4.7 A somewhat similar approximation to the tail empirical distribution function will be proved in Section 5.1.
A simpler version of Theorem 2.4.2 is valid when \(\gamma\) is positive.
Theorem 2.4.8 Suppose \(X_1,X_2,\ldots\) are i.i.d. random variables with distribution function \(F\). Suppose that \(F\) satisfies the second-order extreme value condition (2.3.24) for some \(\gamma>0\) and \(\rho\le0\). Let \(X_{1,n}\le X_{2,n}\le\cdots\le X_{n,n}\) be the \(n\)th order statistics. We can define a sequence of Brownian motions \(\{W_n(s)\}_{s\ge0}\) such that with a suitable function \(A_0\), and for \(\varepsilon>0\) sufficiently small,
\[
\sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{X_{n-[ks],n}}{U\!\left(\frac nk\right)}-s^{-\gamma}\right)
-\gamma s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)s^{-\gamma}\,\frac{s^{-\rho}-1}{\rho}\right|
\stackrel{P}{\to}0 ,
\]
\(n\to\infty\), provided \(k=k(n)\to\infty\), \(k/n\to0\), and \(\sqrt kA_0(n/k)=O(1)\).
Moreover,
\[
\sup_{0<s\le1}s^{1/2+\varepsilon}
\left|\sqrt k\left(\log X_{n-[ks],n}-\log U\!\left(\frac nk\right)+\gamma\log s\right)
-\gamma s^{-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\frac{s^{-\rho}-1}{\rho}\right|
\stackrel{P}{\to}0 .
\]

The remainder of this section is devoted to proving the above results. It is instructive to prove the result of Theorem 2.4.2 first for the special case \(F(x)=1-(1+\gamma x)^{-1/\gamma}\), \(\gamma\in\mathbb R\), and all \(x\) with \(1+\gamma x>0\). The next proposition will be used in its proof and in the proof of Theorem 2.4.2. Let \(Q(t):=F^{\leftarrow}(t)\), let \(x_*\) and \(x^*\) be the left and right endpoints of \(F\) respectively, and let \(\lceil x\rceil\) be the smallest integer greater than or equal to \(x\).
Proposition 2.4.9 (Csörgő and Horváth (1993), Theorem 6.2.1) Let \(X_1,X_2,\ldots\) be i.i.d. random variables with distribution function \(F\) and assume:
1. \(F\) is twice differentiable on \((x_*,x^*)\), \(-\infty\le x_*<x^*\le\infty\);
2. \(F'(x)=f(x)>0\), \(x\in(x_*,x^*)\);
3. for some \(\zeta>0\),
\[
\sup_{0<t<1}\,t(1-t)\,\frac{\left|f'\left(Q(t)\right)\right|}{f^{2}\left(Q(t)\right)}\le\zeta .
\]
Then we can define a sequence of Brownian bridges \(\{B_n(t)\}\) such that
\[
\sup_{1/(n+1)\le t\le n/(n+1)}n^{\varepsilon}\,t^{\varepsilon-1/2}(1-t)^{\varepsilon-1/2}
\left|\sqrt n\,f\left(Q(t)\right)\left(Q(t)-X_{\lceil tn\rceil,n}\right)-B_n(t)\right|
=O_P(1)\ ,\quad n\to\infty .
\tag{2.4.3}
\]

Lemma 2.4.10 Let \(Y_1,Y_2,\ldots\) be i.i.d. random variables with distribution function \(1-1/y\), \(y\ge1\). Consider the \(n\)th order statistics \(Y_{1,n}\le Y_{2,n}\le\cdots\le Y_{n,n}\). For each \(\gamma\in\mathbb R\) we can define a sequence of Brownian motions \(\{W_n(s)\}_{s\ge0}\) such that for each \(\varepsilon>0\),
\[
\sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{\left(\frac knY_{n-[ks],n}\right)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}W_n(s)\right|\stackrel{P}{\to}0 ,
\]
\(n\to\infty\), \(k=k(n)\to\infty\), and \(k/n\to0\).
Proof. The sequence \((Y_1^{\gamma}-1)/\gamma,(Y_2^{\gamma}-1)/\gamma,\ldots\) has distribution function \(1-(1+\gamma x)^{-1/\gamma}\), for which the conditions of Proposition 2.4.9 hold. Hence for \(0<s\le1\) and \(\{B_n(t)\}\) a sequence of Brownian bridges,
\[
\sup_{1/(n+1)\le t\le n/(n+1)}n^{\varepsilon}\,t^{\varepsilon-1/2}(1-t)^{\varepsilon-1/2}
\left|n^{1/2}(1-t)^{\gamma+1}\left(\frac{(1-t)^{-\gamma}-1}{\gamma}-\frac{Y_{\lceil tn\rceil,n}^{\gamma}-1}{\gamma}\right)-B_n(t)\right|=O_P(1) .
\]
Replace \(t\) by \(1-ks/n\) and note that \((1-ks/n)^{\varepsilon-1/2}\to1\) uniformly for \(k^{-1}\le s\le1\). After some rearrangements we get
\[
\sup_{k^{-1}\le s\le1}k^{\varepsilon}\,s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{\left(\frac knY_{n-[ks],n}\right)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}\sqrt{\frac nk}\,B_n\!\left(1-\frac{ks}n\right)\right|=O_P(1) .
\]
Next note that
\[
B_n\!\left(1-\frac{ks}n\right)\stackrel{d}{=}B_n\!\left(\frac{ks}n\right)\stackrel{d}{=}W_n\!\left(\frac{ks}n\right)-\frac{ks}n\,W_n(1) ,
\tag{2.4.4}
\]
with \(\{W_n\}\) Brownian motions, and that \(\widetilde W_n(s):=\sqrt{n/k}\,W_n(ks/n)\) is again a Brownian motion. It follows that
\[
\sup_{k^{-1}\le s\le1}k^{\varepsilon}\,s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{\left(\frac knY_{n-[ks],n}\right)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}\widetilde W_n(s)+s^{-\gamma}\sqrt{\frac kn}\,W_n(1)\right|=O_P(1) .
\]
Hence for \(\varepsilon>0\) the part within the absolute value is \(o_P(1)\) uniformly. Moreover, notice that
\[
\sup_{k^{-1}\le s\le1}s^{\gamma+1/2+\varepsilon}\,s^{-\gamma}\sqrt{\frac kn}\,\left|W_n(1)\right|
\le\sqrt{\frac kn}\,\left|W_n(1)\right|=o_P(1) .
\]
The result follows, with \(\widetilde W_n\) playing the role of \(W_n\) in the statement. \(\square\)


The next step is to prove the result for sufficiently differentiable distribution
functions. That is, we use von Mises' second-order condition.
Lemma 2.4.11 Let X\, Xi,...
be i.i.d. random variables with distribution function F. Suppose U = (1/(1 F))*~ satisfies von Mises' second-order condition
of Theorem 2.3.12, Section 2.3. Then the result of Theorem 2.4.2 holds, provided
k = k(n) oo, k/n -> 0, and \/kAo{n/k)
= 0(1).
Proof. We are going to apply Proposition 2.4.9 but only for the right tail. The conditions of the lemma are that F" exists, F' > 0, and

,
sup til t)0<*<i

\F"(Q(t))\
^ < oo .
(F'(Q(t)))2

Since we are interested only in the right tail, we may without loss of generality change
our distribution near the left endpoint in such a way that
, \F"(Q(t))\
/t
s u p n l t)J
? < oo .
It remains to verify that
supf (1 - 1 )
rtl

\
(F'(Q(t)))2

or equivalently,
sup

G"(0
" ~75^V<00
(1

2.4 Intermediate Order Statistics and Brownian Motion

Since Q(t) = U(l/(l

55

- 0 ) . this is the same as


sU"(s) I

sup

2 + U'(s)

< oo

Now by assumption,
l i m 2 + - ^ = l + y;
^-oo

U (s)

hence the conditions are fulfilled.


Next we apply Proposition 2.4.9. Since we can be sure of the behavior only near
the right endpoint, in (2.4.3) we replace t by 1 ks/n throughout:

^/GPr"<r(^^)-*('-S)l
= Op(l) ,
with a(t) = tU'(t), or after some rearrangement,
sup

ks ,1/2+e

k- <s<\

vk>)
- , - ^

1 / 2

( l - ^ ) | = OHl).

(2.4.5)

Now take e > 0. Then the expression within the absolute value must be op(l)
uniformly in s. Next we look at the Brownian bridge part. Recall (2.4.4) with {Wn]
Brownian motion. Further note that
^-I/2

sup

k-l<s<1

(")

1 /

V ( 1 )= OP(1).

(2.4.6)

\k/

Combining (2.4.4), (2.4.5), and (2.4.6), we get as n -> oo,


sup s

1/2+e

Op(l)

(2.4.7)

It is not difficult to see mat (2.4.7) still holds if we replace the function a with any
function a\ provided a\ (t) ~ a(t), t -> oo. In fact, we shall use the function oo from
Theorem 2.3.6.
Finally, we can handle the expansion in the statement of the theorem:
\[
s^{\gamma+1/2+\varepsilon}\left|\sqrt k\left(\frac{X_{n-[ks],n}-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}W_n(s)-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right|
\]
\[
\le s^{\gamma+1/2+\varepsilon}\left|\sqrt k\left(\frac{U\!\left(\frac n{ks}\right)-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}-\frac{s^{-\gamma}-1}{\gamma}\right)
-\sqrt k\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(s^{-1}\right)\right|
\tag{2.4.8}
\]
\[
+\,s^{\gamma+1/2+\varepsilon}\,\frac{a_0\!\left(\frac n{ks}\right)}{a_0\!\left(\frac nk\right)}
\left|\sqrt k\,\frac{X_{n-[ks],n}-U\!\left(\frac n{ks}\right)}{a_0\!\left(\frac n{ks}\right)}-s^{-1}W_n(s)\right|
\tag{2.4.9}
\]
\[
+\,s^{\gamma+1/2+\varepsilon}\,s^{-1}\left|W_n(s)\right|\left|\frac{a_0\!\left(\frac n{ks}\right)}{a_0\!\left(\frac nk\right)}-s^{-\gamma}\right| ,
\tag{2.4.10}
\]
where we used the identity
\[
\frac{X_{n-[ks],n}-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}
=\frac{X_{n-[ks],n}-U\!\left(\frac n{ks}\right)}{a_0\!\left(\frac n{ks}\right)}\cdot\frac{a_0\!\left(\frac n{ks}\right)}{a_0\!\left(\frac nk\right)}
+\frac{U\!\left(\frac n{ks}\right)-U\!\left(\frac nk\right)}{a_0\!\left(\frac nk\right)}\ .
\]
Expression (2.4.8) tends to zero uniformly in \(s\) by the inequalities of Theorem 2.3.6, Section 2.3.
Since \(s^{\gamma}a_0(n/(ks))/a_0(n/k)\le cs^{-\delta}\) for \(n\ge n_0\), uniformly for \(0<s\le1\), by Potter's inequalities (Proposition B.1.9), relation (2.4.7) implies that (2.4.9) tends to zero in probability uniformly in \(s\). Note that (2.4.7) still holds with the function \(a\) replaced by \(a_0\).
Again by (2.3.7) and (2.3.23) we have
\[
\left|\frac{a_0\!\left(\frac n{ks}\right)}{a_0\!\left(\frac nk\right)}-s^{-\gamma}\right|
=O\!\left(A_0\!\left(\frac nk\right)\right)s^{-\gamma-\rho-\delta}
\]
uniformly for \(0<s\le1\). Also recall that
\[
\sup_{0<s\le1}\frac{\left|W_n(s)\right|}{s^{1/2-\varepsilon/2}}<\infty\quad\text{a.s.}
\]
Hence (2.4.10) tends to zero in probability uniformly in \(s\). \(\square\)

Proof (of Theorem 2.4.2). Since U satisfies the second-order condition, there exists a function $U_1$ satisfying von Mises' second-order condition such that
\[
\lim_{t\to\infty}\frac{U(t)-U_1(t)}{t\,U_1'(t)\,A_1(t)}=0\,, \tag{2.4.11}
\]
with $A_1(t)=tU_1''(t)/U_1'(t)-\gamma+1$ (Theorem B.3.13). With $q(t)=tU_1'(t)A_1(t)$ (note that $|q|\in RV_{\gamma+\rho}$), write
\[
s^{\gamma+1/2+\varepsilon}\,\sqrt{k}\,\frac{U(Y_{n-[ks],n})-U_1(Y_{n-[ks],n})}{\frac nk\,U_1'(\frac nk)}
=\sqrt{k}\,A_1\!\left(\tfrac nk\right)\cdot\frac{U(Y_{n-[ks],n})-U_1(Y_{n-[ks],n})}{q(Y_{n-[ks],n})}\cdot s^{\gamma+1/2+\varepsilon}\,\frac{q(Y_{n-[ks],n})}{q(\frac nk)}\,. \tag{2.4.12}
\]

The first factor is bounded by assumption and the second one tends to zero by (2.4.11). For the last factor recall Potter's inequalities (Proposition B.1.9(5)): for $t\ge t_0$, $x\ge 1$,
\[
(1-\varepsilon)\,x^{\gamma+\rho}\min\left(x^{-\varepsilon'},x^{\varepsilon'}\right)\le\frac{q(tx)}{q(t)}\le(1+\varepsilon)\,x^{\gamma+\rho}\max\left(x^{-\varepsilon'},x^{\varepsilon'}\right). \tag{2.4.13}
\]
We apply this with $t:=n/k$ and $x:=(k/n)Y_{n-[ks],n}\ge(k/n)Y_{n-k,n}$.


Now from Lemma 2.4.10 we have that
\[
\frac kn\,Y_{n-k,n}\to_P 1\,,\qquad n\to\infty. \tag{2.4.14}
\]

Combining (2.4.13) and (2.4.14) and the fact that
\[
\sup_{0<s\le 1}s^{-1/2+\varepsilon}\,|W_n(s)|
\]
is bounded a.s., we find that the third factor of (2.4.12) is bounded as well. Hence
\[
\sup_{k^{-1}\le s\le 1}s^{\gamma+1/2+\varepsilon}\,\sqrt{k}\,\frac{\left|U(Y_{n-[ks],n})-U_1(Y_{n-[ks],n})\right|}{\frac nk\,U_1'(\frac nk)}=o_P(1)\,,\qquad n\to\infty. \tag{2.4.15}
\]
We already know from Lemma 2.4.11 that the result of Theorem 2.4.2 holds with $X_{n-[ks],n}$ replaced by $U_1(Y_{n-[ks],n})$. Relation (2.4.15) then implies that the result of Theorem 2.4.2 also holds with $X_{n-[ks],n}$ replaced by $U(Y_{n-[ks],n})$. This completes the proof. $\Box$

Proof (of Corollary 2.4.5). The range of t values in (2.4.3) is $(n+1)^{-1}\le t\le n(n+1)^{-1}$. For the result of Theorem 2.4.2 we used only the range $n^{-1}\le t\le 1-n^{-1}$. By taking $t=n/(n+1)$ in (2.4.3) and following the lines of the proof of Theorem 2.4.2 with $s=n/(k(n+1))$ we obtain
\[
\left(\frac{n}{k(n+1)}\right)^{\gamma+1/2+\varepsilon}\left|\sqrt{k}\left(\frac{X_{n,n}-U(\frac nk)}{a_0(\frac nk)}-\frac{\left(\frac{n}{k(n+1)}\right)^{-\gamma}-1}{\gamma}\right)-\left(\frac{n}{k(n+1)}\right)^{-\gamma-1}W_n\!\left(\frac{n}{k(n+1)}\right)-\sqrt{k}\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(\frac{k(n+1)}{n}\right)\right|=o_P(1). \tag{2.4.16}
\]
Let us consider the case $\gamma>-\frac12$ first. Since
\[
\sup_{0<s\le k^{-1}}s^{-1/2+\varepsilon}\,|W_n(s)|\to_P 0, \tag{2.4.17}
\]
\[
\sup_{0<s\le k^{-1}}s^{\gamma+1/2+\varepsilon}\left|\Psi_{\gamma,\rho}(s^{-1})\right|\to 0, \tag{2.4.18}
\]
and
\[
\sup_{0<s\le k^{-1}}s^{\gamma+1/2+\varepsilon}\left|\frac{s^{-\gamma}-1}{\gamma}\right|\to 0 \tag{2.4.19}
\]
as $n\to\infty$, (2.4.16) implies that, for $\gamma\ne 0$,
\[
\sqrt{k}\left(\frac{X_{n,n}-U(\frac nk)}{a_0(\frac nk)}-\frac{\left(\frac{k(n+1)}{n}\right)^{\gamma}-1}{\gamma}\right)=o_P\!\left(k^{\gamma+1/2+\varepsilon}\right). \tag{2.4.20}
\]

Since
\[
\sup_{0<s\le k^{-1}}s^{\gamma+1/2+\varepsilon}\le k^{-\gamma-1/2-\varepsilon}
\]
and $X_{n-[ks],n}=X_{n,n}$ for $s<k^{-1}$, we get, using (2.4.17)–(2.4.20),
\[
\sup_{0<s\le k^{-1}}s^{\gamma+1/2+\varepsilon}\left|\sqrt{k}\left(\frac{X_{n-[ks],n}-U(\frac nk)}{a_0(\frac nk)}-\frac{s^{-\gamma}-1}{\gamma}\right)-s^{-\gamma-1}W_n(s)-\sqrt{k}\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}(s^{-1})\right|\to_P 0\,.
\]
For $\gamma=0$ a similar proof applies.


Next we consider the case $\gamma<-\frac12$. We need to prove
\[
\sup_{k^{-1}\le s\le 1}s^{\gamma+1/2+\varepsilon}\left|\sqrt{k}\left(\frac{X_{n-[ks],n}-X_{n,n}}{a_0(\frac nk)}-\frac{s^{-\gamma}-\left(\frac{k(n+1)}{n}\right)^{\gamma}}{\gamma}\right)-s^{-\gamma-1}W_n(s)-\sqrt{k}\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}(s^{-1})\right|\to_P 0 \tag{2.4.21}
\]
and
\[
\sup_{0<s\le k^{-1}}s^{\gamma+1/2+\varepsilon}\left|\sqrt{k}\left(\frac{X_{n-[ks],n}-X_{n,n}}{a_0(\frac nk)}-\frac{s^{-\gamma}-\left(\frac{k(n+1)}{n}\right)^{\gamma}}{\gamma}\right)-s^{-\gamma-1}W_n(s)-\sqrt{k}\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}(s^{-1})\right|\to_P 0\,. \tag{2.4.22}
\]
Now, (2.4.21) is dominated by the sum of two terms: the left-hand side of (2.4.2), which goes to zero, and
\[
\sup_{k^{-1}\le s\le 1}s^{\gamma+1/2+\varepsilon}\,\sqrt{k}\left|\frac{X_{n,n}-U(\frac nk)}{a_0(\frac nk)}-\frac{\left(\frac{k(n+1)}{n}\right)^{\gamma}-1}{\gamma}\right|
\le k^{-\gamma-1/2-\varepsilon}\,\sqrt{k}\left|\frac{X_{n,n}-U(\frac nk)}{a_0(\frac nk)}-\frac{\left(\frac{k(n+1)}{n}\right)^{\gamma}-1}{\gamma}\right|\,,
\]
which in turn is bounded by the sum of
\[
k^{-\gamma-1/2-\varepsilon}\left|\sqrt{k}\left(\frac{X_{n,n}-U(\frac nk)}{a_0(\frac nk)}-\frac{\left(\frac{k(n+1)}{n}\right)^{\gamma}-1}{\gamma}\right)-\left(\frac{n}{k(n+1)}\right)^{-\gamma-1}W_n\!\left(\frac{n}{k(n+1)}\right)-\sqrt{k}\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}\!\left(\frac{k(n+1)}{n}\right)\right|\,,
\]
\[
k^{-\gamma-1/2-\varepsilon}\left(\frac{n}{k(n+1)}\right)^{-\gamma-1}\left|W_n\!\left(\frac{n}{k(n+1)}\right)\right|\,,
\qquad\text{and}\qquad
k^{-\gamma-1/2-\varepsilon}\,\sqrt{k}\left|A_0\!\left(\frac nk\right)\right|\left|\Psi_{\gamma,\rho}\!\left(\frac{k(n+1)}{n}\right)\right|\,.
\]
The first term of this expression tends to zero by (2.4.16). One easily checks that the other terms tend to zero too. Similarly one checks that (2.4.22) tends to zero by considering the three terms separately.

Proof (of Theorem 2.4.8). As before, we prove the result with $X_{n-[ks],n}$ replaced with $U(Y_{n-[ks],n})$, where $\{Y_{i,n}\}$ are the order statistics from the distribution function $1-1/x$, $x\ge 1$. Theorem 2.3.9 tells us that for $s>0$,
\[
\frac{U(Y_{n-[ks],n})}{U(\frac nk)}=\left(\frac kn\,Y_{n-[ks],n}\right)^{\gamma}\left(1+A_0\!\left(\frac nk\right)\frac{\left(\frac kn\,Y_{n-[ks],n}\right)^{\rho}-1}{\rho}\,(1+o_P(1))\right). \tag{2.4.23}
\]

Now Lemma 2.4.10 tells us that, uniformly for $k^{-1}\le s\le 1$,
\[
\left(\frac kn\,Y_{n-[ks],n}\right)^{\gamma}=s^{-\gamma}+\frac{\gamma}{\sqrt{k}}\left(s^{-\gamma-1}W_n(s)+o_P(1)\,s^{-\gamma-1/2-\varepsilon}\right). \tag{2.4.24}
\]
Similarly, we get from Lemma 2.4.10,
\[
\left(\frac kn\,Y_{n-[ks],n}\right)^{\rho}-1=s^{-\rho}-1+\frac{\rho}{\sqrt{k}}\left(s^{-\rho-1}W_n(s)+o_P(1)\,s^{-\rho-1/2-\varepsilon}\right). \tag{2.4.25}
\]

Now note that in fact the product of the right-hand sides of (2.4.24) and (2.4.25) can be written as $s^{-\gamma}(s^{-\rho}-1)+o_P(1)\,s^{-\gamma-1/2-\varepsilon}$. Hence
\[
\left(\frac kn\,Y_{n-[ks],n}\right)^{\gamma}\left(\left(\frac kn\,Y_{n-[ks],n}\right)^{\rho}-1\right)=s^{-\gamma}\left(s^{-\rho}-1\right)+o_P(1)\,s^{-\gamma-1/2-\varepsilon}. \tag{2.4.26}
\]

Similarly one checks that, for $\varepsilon<\rho+\frac12$,
\[
\left(\frac kn\,Y_{n-[ks],n}\right)^{\gamma+\rho+\varepsilon}=O_P\!\left(s^{-\gamma-\rho-\varepsilon}\right) \tag{2.4.27}
\]
uniformly for $k^{-1}\le s\le 1$.

It follows by combining (2.4.23), (2.4.24), (2.4.26), and (2.4.27) that the supremum over $k^{-1}\le s\le 1$ of the expression in the first statement of the theorem is $o_P(1)$. The rest of the proof is like that of Corollary 2.4.5.
For the second statement note that the second-order condition (2.3.22) is equivalent to
\[
\lim_{t\to\infty}\frac{\log U(tx)-\log U(t)-\gamma\log x}{A(t)}=\frac{x^{\rho}-1}{\rho}\,.
\]
Moreover, we have the uniform inequalities (Theorem B.2.18): for each $\varepsilon,\delta>0$ there exists $t_0$ such that for $t\ge t_0$, $x\ge 1$,
\[
\left|\frac{\log U(tx)-\log U(t)-\gamma\log x}{A_0(t)}-\frac{x^{\rho}-1}{\rho}\right|\le\varepsilon\,x^{\rho}\max\left(x^{\delta},x^{-\delta}\right).
\]
The rest of the proof is similar to that of the first statement. $\Box$

Exercises
2.1. Let $X_1,X_2,\dots$ be i.i.d. random variables with distribution function F and $X_{1,n}\le X_{2,n}\le\dots\le X_{n,n}$ the order statistics. Let F be in the domain of attraction of $G_\gamma$ with $\gamma>0$. Prove that $X_{n,n}/X_{n-1,n}\to_d Y^{\gamma}$ as $n\to\infty$, where Y has distribution function $1-1/x$, $x\ge 1$.
Hint: Rényi's representation (Section 2.1) implies that for exponential order statistics, $(E_{n-1,n},\,E_{n,n}-E_{n-1,n})$ are independent and $E_{n,n}-E_{n-1,n}$ has a standard exponential distribution. Hence $X_{n,n}/X_{n-1,n}=_d U(Y^{*}Y_{n-1,n})/U(Y_{n-1,n})$, where $Y^{*}$ and $Y_{n-1,n}$ are independent, $Y^{*}$ has distribution function $1-1/x$, $x\ge 1$, and $Y_{n-1,n}$ is the second maximum of a sample of size n with distribution function $1-1/x$, $x\ge 1$. Finally, use Corollary 1.2.10.
Remark: The converse statement is also true (see Smid and Stam (1975)).
2.2. (Beirlant and Teugels (1986)) Let $X_1,X_2,\dots$ be i.i.d. random variables with distribution function F with $x^{*}>0$. Define $M_{n,k}^{(1)}:=k^{-1}\sum_{i=0}^{k-1}\log X_{n-i,n}-\log X_{n-k,n}$. If F is in the domain of attraction of some extreme value distribution $G_\gamma$ with auxiliary function $a(n)$ and $k<n$ is a fixed integer, then $M_{n,k}^{(1)}/(a(n)/U(n))$ converges in distribution as $n\to\infty$, with a limit that can be written in terms of $f:=(1/(1-F))^{\leftarrow}$ and $Q_k,Z_0,Z_1,\dots,Z_{k-1}$ independent, $Q_k$ gamma distributed with k degrees of freedom, and $Z_i$, $i=0,1,\dots,k-1$, i.i.d. standard exponential.
2.3. Derive the limit distribution of $X_{n,n}$ from the point process convergence of Theorem 2.1.2. Do the same for the joint distribution of $(X_{n-1,n},X_{n,n})$.
2.4. What are the possible limit distributions of $(X_{1,n},X_{n,n})$?
2.5. Let $Y_1,Y_2,\dots$ be independent and identically distributed with distribution function $1-1/x$, $x\ge 1$. Using the point process convergence of Theorem 2.1.2, find the limit distribution under a trend, i.e., the limit distribution of $\max_{1\le i\le n}(Y_i-i)$.
Hint: Recall what convergence of point processes means (cf. last paragraph of Section 2.1).
2.6. Let $U_{1,n}\le U_{2,n}\le\dots\le U_{n,n}$ be the order statistics from a standard uniform distribution. Let $k=k(n)$ be a sequence of integers such that for some $p\in(0,1)$, $\lim_{n\to\infty}\sqrt{n}\,(k/n-p)=0$. Prove that $\sqrt{n}\,(U_{k,n}-p)/\sqrt{p(1-p)}$ has a standard normal limit distribution as $n\to\infty$.
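This convergence is easy to check by a quick Monte Carlo sketch (illustrative only; the function and parameter names below are ours, not the book's). With $p=k/n$ held exactly constant, the standardized k-th order statistic should be approximately standard normal:

```python
import random

def standardized_uniform_os(n, k, reps, seed=0):
    """Sample sqrt(n) * (U_{k,n} - p) / sqrt(p (1 - p)) with p = k/n,
    repeated reps times; Exercise 2.6 says these are asymptotically N(0, 1)."""
    rng = random.Random(seed)
    p = k / n
    scale = (n / (p * (1.0 - p))) ** 0.5
    draws = []
    for _ in range(reps):
        u = sorted(rng.random() for _ in range(n))
        draws.append(scale * (u[k - 1] - p))   # U_{k,n} is the k-th smallest
    return draws

vals = standardized_uniform_os(n=400, k=100, reps=2000)
mean = sum(vals) / len(vals)
var = sum((v - mean) ** 2 for v in vals) / len(vals)
```

Already for n = 400 and p = 1/4 the empirical mean and variance are close to 0 and 1.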
2.7. Show that the distribution function F defined by $1-F(x)=x^{-1}\left(1+x^{-1}\exp(\sin\log x)\right)$ satisfies the domain of attraction condition but not the second-order relation (2.3.24).
2.8. Find the second-order relation for the Cauchy distribution.
2.9. Check the second-order condition for the normal distribution: note that with $\Phi$ the standard normal distribution function,
\[
1-\Phi(t)=(2\pi)^{-1/2}e^{-t^{2}/2}\left(1/t-1/t^{3}+o(1/t^{3})\right).
\]
Write $\psi(t):=1/(1-\Phi(t))$. Prove that for $x\in\mathbb{R}$,
\[
\lim_{t\to\infty}t^{2}\left(\psi(t+x/t)/\psi(t)-e^{x}\right)=\left(x^{2}/2+x\right)e^{x}
\]
locally uniformly, and conclude that for $x>0$,
\[
\lim_{t\to\infty}\left(\Psi(t)\right)^{3}\left\{\Psi(tx)-\Psi(t)-(\log x)/\Psi(t)\right\}=-(\log x)^{2}/2-\log x\,,
\]
with $\Psi$ the inverse function of $\psi$. Finally, for $x>0$,
\[
\lim_{t\to\infty}(2\log t)^{3/2}\left(\Psi(tx)-\Psi(t)-\frac{\log x}{\left(2\log t-\log\log t-\log 4\pi\right)^{1/2}}\right)=-(\log x)^{2}/2-\log x\,.
\]
Hint: For the last step use (and prove)
\[
\Psi(t)=\left(2\log t-\log\log t-\log 4\pi\right)^{1/2}+o\left((2\log t)^{-1/2}\right),\qquad t\to\infty\,.
\]

2.10. Check that the gamma distribution satisfies the second-order regular variation condition with $\gamma=\rho=0$ and determine possible auxiliary functions a and A.
2.11. Prove the equivalence of (2.3.22) and (2.3.24) by noting that (2.3.22) is equivalent to
\[
\lim_{t\to\infty}\frac{\frac{U\left(U^{\leftarrow}(t)\,x^{1/\gamma}\right)}{t}-x}{A\left(U^{\leftarrow}(t)\right)}=x\,\frac{x^{\rho/\gamma}-1}{\rho}
\]
(with $U^{\leftarrow}$ the left-continuous inverse of U) and then applying Vervaat's lemma (Appendix A).

2.12. Let $U(t)=t^{\gamma}-k/\gamma+t^{\gamma+\tau}/(\gamma+\tau)+o\left(t^{\gamma+\tau}\right)$ for $\gamma>0$ and $\tau<0$, $t\to\infty$. Check that $U(t)$ satisfies the second-order condition for γ positive (2.3.22) with $A(t)=t^{\tau}$ if $k=0$ or $\gamma+\tau>0$, and $A(t)=kt^{-\gamma}$ if $\gamma+\tau<0$. Discuss the case $\gamma+\tau=0$.
2.13. Let $\gamma>0$. Check that for $\gamma+\rho>0$, or $\gamma+\rho<0$ and $\lim_{t\to\infty}U(t)-a(t)/\gamma=0$: if $A^{*}(t)$ is the auxiliary function in (2.3.22), then possible first- and second-order auxiliary functions for (2.3.5) are $a(t)=\gamma U(t)\left(1+A^{*}(t)/\gamma\right)$ and $A(t)=(\gamma+\rho)A^{*}(t)/\gamma$, respectively (and vice versa).
2.14. Verify that if $U(t)=c_0+c_1t^{\gamma}\left(1+c_2t^{\rho}+o(t^{\rho})\right)$ for $\gamma<0$, $\rho<0$, $\gamma+\rho\ne 0$, $c_0,c_2\ne 0$, and $c_1<0$, as $t\to\infty$, then the second-order condition (2.3.5) holds with $A(t)=\rho\gamma^{-1}(\gamma+\rho)c_2t^{\rho}$ and $a(t)=\gamma c_1t^{\gamma}\left(1+\rho^{-1}A(t)\right)$.
2.15. The Student t-distribution with ν degrees of freedom satisfies
\[
1-F(t)=c_{\nu}\,\nu^{\nu/2}\,t^{-\nu}+d_{\nu}\,\nu^{\nu/2+1}\,t^{-\nu-2}+O\left(t^{-\nu-4}\right),\qquad t\to\infty\,,
\]
where

ν : 1, 2, 3, 4, 5
$c_\nu$ : $1/\pi$, $1/4$, $2/(3\pi)$, $3/16$, $8/(15\pi)$
$d_\nu$ : $-1/(3\pi)$, $-3/16$, $-4/(5\pi)$, $-10/32$, $-8/(7\pi)$

(Martins (2000)). Hence this model satisfies the second-order condition (2.3.5) with $\gamma=1/\nu$ and $\rho=-2/\nu$. Obtain the auxiliary functions.
2.16. Lemma 2.4.10 implies that
\[
\sqrt{k}\left(\frac kn\,Y_{n-[ks],n}-s^{-1}\right)\to_d s^{-2}W(s)
\]
in $D(0,1]$, with W denoting Brownian motion. Let $F_n$ be the empirical distribution function. Note that $(n/k)\{1-F_n(n/(kx))\}$ is the inverse function of $\left((k/n)Y_{n-[ks],n}\right)^{-1}$. Use Vervaat's lemma (Lemma A.0.2) to conclude that
\[
\sqrt{k}\left(\frac nk\left\{1-F_n\!\left(\frac{n}{kx}\right)\right\}-x\right)\to_d W(x)
\]
in $D[1,\infty)$.
2.17. Let s be some fixed positive constant. Deduce under the conditions of Theorem 2.4.2 that
\[
\sqrt{k}\left(\frac{X_{n-[ks],n}-U(\frac nk)}{a_0(\frac nk)}-\frac{s^{-\gamma}-1}{\gamma}\right)-\sqrt{k}\,A_0\!\left(\frac nk\right)\Psi_{\gamma,\rho}(s^{-1})
\]
converges to a normal random variable with mean zero and variance $s^{-2\gamma-1}$ as $n\to\infty$.

2.18. Formulate an analogous weak convergence result for the empirical distribution function in the situation of Theorem 2.4.2. Assume $\sqrt{k}A_0(n/k)\to 0$. Prove the result.
2.19. Prove that under the conditions of Theorem 2.4.8 and $\sqrt{k}A_0(n/k)\to\lambda$,
\[
\frac{\log X_{n-k,n}-\log X_{n-2k,n}}{\log 2}
\]
has asymptotically a normal distribution. This is a first example of an estimator of γ: notice that $(\log X_{n-k,n}-\log X_{n-2k,n})/\log 2$ is a consistent and asymptotically normally distributed estimator of γ.

3 Estimation of the Extreme Value Index and Testing

3.1 Introduction
The alternative conditions of Theorem 1.1.6 (Section 1.1.3) serve as a basis for statistical applications of extreme value theory.
Consider relation (1.1.22) (Section 1.1.3): there exists a positive nondecreasing function f such that
\[
\lim_{t\uparrow x^{*}}\frac{1-F(t+xf(t))}{1-F(t)}=(1+\gamma x)^{-1/\gamma} \tag{3.1.1}
\]
for all x for which $1+\gamma x>0$, where $x^{*}=\sup\{x:F(x)<1\}$.
Let X be a random variable with distribution function F and let $F\in\mathcal{D}(G_{\gamma})$ for some real γ. Then (3.1.1) tells us that for $x>0$, $x<(0\vee(-\gamma))^{-1}$,
\[
\lim_{t\uparrow x^{*}}P\left(\frac{X-t}{f(t)}>x\;\Big|\;X>t\right)=(1+\gamma x)^{-1/\gamma}\,. \tag{3.1.2}
\]
That is, the conditional distribution of $(X-t)/f(t)$ given $X>t$ has the limit distribution, as $t\uparrow x^{*}$,
\[
H_{\gamma}(x):=1-(1+\gamma x)^{-1/\gamma}\,,\qquad 0<x<(0\vee(-\gamma))^{-1}\,, \tag{3.1.3}
\]
where for $\gamma=0$ the right-hand side is interpreted as $1-e^{-x}$. This class of distribution functions is called the class of the generalized Pareto distributions (GP). Figure 3.1 illustrates this class for some values of γ.
Relation (3.1.1) means, loosely speaking, that from some high threshold t onward (i.e., X > t) the distribution function can be written approximately as
\[
1-F(x)\approx\left(1-F(t)\right)\left\{1-H_{\gamma}\!\left(\frac{x-t}{f(t)}\right)\right\}\,,\qquad x>t\,,
\]
which is a parametric family of distribution tails. One can expect this approximation to hold for intermediate and extreme order statistics. Let $X_1,X_2,\dots$ be independent


[Figure 3.1: distribution functions $H_\gamma$ for $\gamma=-1$, $\gamma=0$, and $\gamma=1$.]
Fig. 3.1. Family of GP distributions: for $\gamma=-1$ and $\gamma=-2$ the right endpoints are 1 and 0.5 respectively; for $\gamma\ge 0$ the right endpoint equals infinity.

and identically distributed random variables with distribution function F, and $F_n$ the corresponding empirical distribution function, i.e., $F_n(x):=n^{-1}\sum_{i=1}^{n}1_{\{X_i\le x\}}$. Let us apply the last approximation with $t:=X_{n-k,n}$, where we choose $k=k(n)\to\infty$, $k/n\to 0$, $n\to\infty$. Then
\[
1-F(x)\approx\left(1-F(X_{n-k,n})\right)\left\{1-H_{\gamma}\!\left(\frac{x-X_{n-k,n}}{f(\frac nk)}\right)\right\}
\]
and, since $1-F(X_{n-k,n})\approx 1-F_n(X_{n-k,n})=k/n$,
\[
1-F(x)\approx\frac kn\left\{1-H_{\gamma}\!\left(\frac{x-X_{n-k,n}}{f(\frac nk)}\right)\right\}\,. \tag{3.1.4}
\]
In order to make this approximation applicable, we need to estimate γ and the function f at the point $n/k$. The approximation is valid for any x larger than $X_{n-k,n}$ and can be used even for $x>X_{n,n}$, i.e., outside the range of the observations, and this is in fact the basis for applications of extreme values.
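In code, approximation (3.1.4) becomes a plug-in tail estimator once γ and f(n/k) are replaced by estimates. The following sketch is ours, not the book's; `gamma_hat` and `scale_hat` stand for estimators of γ and f(n/k) that are only constructed later in this chapter:

```python
import math

def tail_prob(x, x_nk, gamma_hat, scale_hat, k, n):
    """Approximation (3.1.4): 1 - F(x) ~ (k/n) * {1 - H_gamma((x - X_{n-k,n})/f(n/k))}
    for x above the intermediate order statistic X_{n-k,n}."""
    z = (x - x_nk) / scale_hat
    if gamma_hat == 0.0:
        survivor = math.exp(-z)              # gamma = 0 case of (3.1.3)
    else:
        base = 1.0 + gamma_hat * z
        if base <= 0.0:                      # past the right endpoint when gamma < 0
            return 0.0
        survivor = base ** (-1.0 / gamma_hat)
    return (k / n) * survivor
```

For the strict Pareto distribution $1-F(x)=1/x$ (so $\gamma=1$, $f(t)=t$) the approximation is exact: with n = 1000, k = 100, $X_{n-k,n}\approx n/k=10$ and $f(n/k)=10$, `tail_prob(50.0, 10.0, 1.0, 10.0, 100, 1000)` returns 1/50.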
Next we consider a similar application of relation (1.1.20) (Section 1.1.3), that there exists a positive function a such that for all $x>0$,
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma}-1}{\gamma}\,. \tag{3.1.5}
\]
Relation (3.1.5) leads to the following approximation:
\[
U(x)\approx U(t)+a(t)\,\frac{\left(\frac xt\right)^{\gamma}-1}{\gamma}\,,\qquad x>t\,. \tag{3.1.6}
\]

This approximation is useful when one wants to estimate a quantile $F^{\leftarrow}(1-p)=U(1/p)$ with p very small, since this quantile is then related to a much lower quantile $U(t)=F^{\leftarrow}(1-1/t)$, which can be estimated by an intermediate order statistic. Hence we choose $t:=n/k$ with $k=k(n)\to\infty$, $k/n\to 0$, $n\to\infty$. Then for large y, for example $y=1/p$ with p small,
\[
U(y)\approx U\!\left(\frac nk\right)+a\!\left(\frac nk\right)\frac{\left(\frac{ky}{n}\right)^{\gamma}-1}{\gamma}\,. \tag{3.1.7}
\]
In order to make this approximation applicable, we need to estimate γ, the function a at the point $n/k$, and $U(n/k)$. The latter quantity can be estimated by an intermediate order statistic. Again the approximation will be used not only for $y\le n$ but also for extrapolation outside the sample.
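Approximation (3.1.7) likewise yields a quantile extrapolation routine once estimators are plugged in. A sketch under the same caveats (names are ours; `x_nk` plays the role of the intermediate order statistic $X_{n-k,n}$ estimating $U(n/k)$):

```python
import math

def extreme_quantile(p, x_nk, gamma_hat, a_hat, k, n):
    """Approximation (3.1.7): U(1/p) ~ U(n/k) + a(n/k) * ((k/(n p))^gamma - 1)/gamma,
    with U(n/k) replaced by x_nk = X_{n-k,n}."""
    ratio = k / (n * p)                      # equals (1/p) / (n/k)
    if gamma_hat == 0.0:
        incr = math.log(ratio)               # limit of (x^g - 1)/g as g -> 0
    else:
        incr = (ratio ** gamma_hat - 1.0) / gamma_hat
    return x_nk + a_hat * incr
```

Again exact in the strict Pareto case $U(t)=t$, $a(t)=t$: `extreme_quantile(0.001, 10.0, 1.0, 10.0, 100, 1000)` returns $1000=U(1/p)$.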
Let us go back to the examples of applications of extreme value theory discussed in Section 1.1.4.

Sea Level

As mentioned previously, the Dutch government requires the sea dikes to be so high that in any year a flood occurs with probability 1/10 000. In order to estimate the height of the dike that corresponds to that probability, 1877 high-tide water levels are available, monitored at the coast, one for each severe storm of a certain type, over a period of 111 years. The observations can be considered as realizations of independent and identically distributed random variables. So, if F is the distribution of these random variables (the ones taken during storms), we need to estimate $U\left((1877/111)\times 10^{4}\right)\approx U(17\times 10^{4})$. The largest observation roughly corresponds to the exceedance probability 1/1878, i.e., to $U(19\times 10^{2})$. This means that we have to estimate the quantile function outside the range of the available data, and for this the extreme value conditions can be used. So we suppose $F\in\mathcal{D}(G_{\gamma})$ for some $\gamma\in\mathbb{R}$. Then by (3.1.7),
\[
U\left(17\times 10^{4}\right)\approx U\!\left(\frac nk\right)+a\!\left(\frac nk\right)\frac{\left(17\times 10^{4}\,\frac kn\right)^{\gamma}-1}{\gamma}\,. \tag{3.1.8}
\]

We propose the estimator
\[
\hat U\left(17\times 10^{4}\right):=\hat U\!\left(\frac nk\right)+\hat a\!\left(\frac nk\right)\frac{\left(17\times 10^{4}\,\frac kn\right)^{\hat\gamma}-1}{\hat\gamma} \tag{3.1.9}
\]
based on suitable estimators $\hat\gamma$, $\hat U(n/k)$, and $\hat a(n/k)$. In the rest of this and in the next chapter we shall meet various estimators of these quantities for which the vector
\[
\sqrt{k}\left(\hat\gamma-\gamma,\;\frac{\hat U(\frac nk)-U(\frac nk)}{a(\frac nk)},\;\frac{\hat a(\frac nk)}{a(\frac nk)}-1\right) \tag{3.1.10}
\]
is asymptotically normal under suitable conditions. Using this relation we will prove in Chapter 4 that
$\hat U(17\times 10^{4})-U(17\times 10^{4})$, suitably normalized, is asymptotically normal. This leads to an asymptotic confidence interval for $U(17\times 10^{4})$.
S&P 500

On the basis of observations of loss returns of the S&P 500 index, we want to estimate the probability that a certain given large loss is exceeded. If F is the distribution of the loss log-returns and supposing $F\in\mathcal{D}(G_{\gamma})$, relation (3.1.4) suggests
\[
\widehat{1-F}(x):=\frac kn\left(1+\hat\gamma\,\frac{x-X_{n-k,n}}{\hat a(\frac nk)}\right)^{-1/\hat\gamma}
\]
with $\hat\gamma$ and $\hat a(n/k)$ suitable estimators. Recall the relation $f(t)=a(1/(1-F(t)))$ (cf. Theorem 1.1.6) with a the positive function in (3.1.5). Then for $t:=X_{n-k,n}$ we get $f(X_{n-k,n})=a\left(1/(1-F(X_{n-k,n}))\right)\approx a(n/k)$. Again we see that the estimation of γ is a crucial step, which is the main subject of the present chapter. Next, in Chapter 4 we shall prove asymptotic normality of the tail estimator suitably normalized.
Life Span

The life span of people born in the Netherlands in the years 1877–1881 is assumed to be random. Based on life spans of about 10 000 people, we want to decide whether the underlying distribution has a finite upper limit $U(\infty)$, and if so, we want to estimate $U(\infty)$. The asymptotic normality of (3.1.10) provides a confidence interval for γ that enables us to test the hypothesis $H_0:\gamma\ge 0$ versus $H_1:\gamma<0$. It will later turn out that the null hypothesis is rejected. Then we want to estimate the finite value $U(\infty)$, and for that we use the limit relation (1.2.14) of Lemma 1.2.9:
\[
\lim_{t\to\infty}\frac{U(\infty)-U(t)}{a(t)}=-\frac{1}{\gamma}\,,
\]
i.e.,
\[
U(\infty)\approx U(t)-\frac{a(t)}{\gamma}\,.
\]
We propose the estimator
\[
\hat U(\infty):=\hat U\!\left(\frac nk\right)-\frac{\hat a(\frac nk)}{\hat\gamma}\,,
\]
and we will prove, using the joint asymptotic normality in (3.1.10), that $\hat U(\infty)-U(\infty)$, suitably normalized, is asymptotically normal. An asymptotic confidence interval for $U(\infty)$ ensues.
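The endpoint estimator can be sketched in the same plug-in style (function name ours; meaningful only when the estimate of γ is negative):

```python
def endpoint(x_nk, gamma_hat, a_hat):
    """U(inf) ~ U(n/k) - a(n/k)/gamma for gamma < 0, with U(n/k), a(n/k) and gamma
    replaced by estimates (x_nk, a_hat, gamma_hat)."""
    if gamma_hat >= 0.0:
        raise ValueError("a finite right endpoint requires gamma < 0")
    return x_nk - a_hat / gamma_hat
```

For example, for $U(t)=1-1/t$ (so $\gamma=-1$, $a(t)=1/t$, right endpoint 1), plugging in the exact values at $t=n/k=10$ recovers the endpoint: `endpoint(0.9, -1.0, 0.1)` returns 1.0.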


The above examples show that it is useful to develop estimators of γ, $U(n/k)$, $a(n/k)$, $U(1/p)$ with p small, $U(\infty)$, and $1-F(x)$ with x large. In this chapter we discuss estimators for γ (and occasionally for $a(n/k)$), and in the next chapter we shall discuss estimators for the other quantities.

3.2 A Simple Estimator for the Tail Index (γ > 0): The Hill Estimator
In order to introduce the Hill estimator, a simple and widely used estimator, let us start from Theorem 1.2.1: $F\in\mathcal{D}(G_{\gamma})$ for $\gamma>0$ if and only if
\[
\lim_{t\to\infty}\frac{1-F(tx)}{1-F(t)}=x^{-1/\gamma}\,,\qquad x>0\,.
\]
In this case the parameter $\alpha:=1/\gamma>0$ is called the tail index of F. Theorem 1.2.2 gives an equivalent form of this condition:
\[
\lim_{t\to\infty}\frac{\int_{t}^{\infty}\left(1-F(s)\right)\frac{ds}{s}}{1-F(t)}=\gamma\,.
\]
Now partial integration yields
\[
\int_{t}^{\infty}\frac{ds}{s}\left(1-F(s)\right)=\int_{t}^{\infty}\left(\log u-\log t\right)dF(u)\,.
\]
Hence we have
\[
\lim_{t\to\infty}\frac{\int_{t}^{\infty}(\log u-\log t)\,dF(u)}{1-F(t)}=\gamma\,. \tag{3.2.1}
\]
In order to develop an estimator based on this asymptotic result, replace in (3.2.1) the parameter t by the intermediate order statistic $X_{n-k,n}$ and F by the empirical distribution function $F_n$. We then get Hill's (1975) estimator $\hat\gamma_H$, defined by
\[
\hat\gamma_H:=\frac{\int_{X_{n-k,n}}^{\infty}\left(\log u-\log X_{n-k,n}\right)dF_n(u)}{1-F_n(X_{n-k,n})}
\]
or
\[
\hat\gamma_H=\frac 1k\sum_{i=0}^{k-1}\log X_{n-i,n}-\log X_{n-k,n}\,. \tag{3.2.2}
\]
For the proof of the following theorems we need this auxiliary result.

Lemma 3.2.1 Let $Y_1,Y_2,\dots$ be i.i.d. random variables with distribution function $1-1/y$, $y\ge 1$, and let $Y_{1,n}\le Y_{2,n}\le\dots\le Y_{n,n}$ be the order statistics. Then with $k=k(n)$,
\[
\lim_{n\to\infty}Y_{n-k,n}=\infty\qquad\text{a.s.},
\]
provided $k(n)=o(n)$.

Proof. The strong law of large numbers implies
\[
\frac 1n\sum_{i=1}^{n}1_{\{Y_i>r\}}\to\frac 1r\qquad\text{a.s.},\quad n\to\infty\,,
\]
for $r=1,2,\dots$. We proceed by contradiction. Suppose that for some r we have $Y_{n-k,n}\le r$ infinitely often. This implies
\[
\frac kn=\frac 1n\sum_{i=1}^{n}1_{\{Y_i>Y_{n-k,n}\}}\ge\frac 1n\sum_{i=1}^{n}1_{\{Y_i>r\}}
\]
infinitely often; since the right-hand side tends to $1/r>0$ a.s. while $k/n\to 0$, this is the desired contradiction. $\Box$

Theorem 3.2.2 Let $X_1,X_2,\dots$ be i.i.d. random variables with distribution function F. Suppose $F\in\mathcal{D}(G_{\gamma})$ with $\gamma>0$. Then as $n\to\infty$, $k=k(n)\to\infty$, $k/n\to 0$,
\[
\hat\gamma_H\to_P\gamma\,.
\]
Proof. By Corollary 1.2.10, $F\in\mathcal{D}(G_{\gamma})$ with $\gamma>0$ implies
\[
\lim_{t\to\infty}\frac{U(tx)}{U(t)}=x^{\gamma}
\]
for $x>0$, i.e. (Proposition B.1.9), for $x\ge 1$ and $t\ge t_0$,
\[
(1-\varepsilon)\,x^{\gamma-\varepsilon'}\le\frac{U(tx)}{U(t)}\le(1+\varepsilon)\,x^{\gamma+\varepsilon'}\,,
\]
or equivalently,
\[
\log(1-\varepsilon)+(\gamma-\varepsilon')\log x\le\log U(tx)-\log U(t)\le\log(1+\varepsilon)+(\gamma+\varepsilon')\log x\,. \tag{3.2.3}
\]
Let $Y_1,Y_2,\dots$ be independent and identically distributed, with common distribution function $1-1/y$, $y\ge 1$. Note that $U(Y_i)=_d X_i$, $i=1,2,\dots$. So it is sufficient to prove the result for $\hat\gamma_H:=k^{-1}\sum_{i=0}^{k-1}\log U(Y_{n-i,n})-\log U(Y_{n-k,n})$. Apply (3.2.3) with $t=Y_{n-k,n}$, $x=Y_{n-i,n}/Y_{n-k,n}$. Since by Lemma 3.2.1, $Y_{n-k,n}\to\infty$ a.s., $n\to\infty$, we have eventually,
\[
\log(1-\varepsilon)+(\gamma-\varepsilon')\log\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)\le\log U(Y_{n-i,n})-\log U(Y_{n-k,n})\le\log(1+\varepsilon)+(\gamma+\varepsilon')\log\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)
\]
for $i=0,1,\dots,k-1$; hence
\[
\log(1-\varepsilon)+(\gamma-\varepsilon')\,\frac 1k\sum_{i=0}^{k-1}\log\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)\le\hat\gamma_H\le\log(1+\varepsilon)+(\gamma+\varepsilon')\,\frac 1k\sum_{i=0}^{k-1}\log\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)\,.
\]
It now suffices to prove that as $n\to\infty$,
\[
\frac 1k\sum_{i=0}^{k-1}\log\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)\to_P 1\,.
\]
This is part of a separate lemma (note that $\log Y_i$ has a standard exponential distribution), which we give next. $\Box$

Lemma 3.2.3 Let $E,E_1,E_2,\dots$ be i.i.d. standard exponential and let $E_{1,n}\le E_{2,n}\le\dots\le E_{n,n}$ be the order statistics. Let f be such that $\operatorname{Var}f(E)<\infty$. Then
\[
\sqrt{k}\left(\frac 1k\sum_{i=0}^{k-1}f\left(E_{n-i,n}-E_{n-k,n}\right)-Ef(E)\right)
\]
is independent of $E_{n-k,n}$ and asymptotically normal with mean zero and variance $\operatorname{Var}f(E)$ as $n\to\infty$, provided $k=k(n)\to\infty$ and $k/n\to 0$.

Proof. Rényi's (1953) representation implies the independence statement, and it gives for each n,
\[
\left\{E_{n-i,n}-E_{n-k,n}\right\}_{i=0}^{k-1}=_d\left\{\sum_{j=i+1}^{k}\frac{E_j^{*}}{j}\right\}_{i=0}^{k-1}
\]
with $E_1^{*},E_2^{*},\dots$ independent and identically distributed standard exponential. It follows that the distribution of the left-hand side does not depend on n and that
\[
\left\{E_{n-i,n}-E_{n-k,n}\right\}_{i=0}^{k-1}=_d\left\{E_{k-i,k}\right\}_{i=0}^{k-1}\,.
\]
It follows that
\[
\sqrt{k}\left(\frac 1k\sum_{i=0}^{k-1}f\left(E_{n-i,n}-E_{n-k,n}\right)-Ef(E)\right)=_d\sqrt{k}\left(\frac 1k\sum_{i=1}^{k}f(E_i)-Ef(E)\right)\,,
\]
since we take the average of all k order statistics. The result follows from the central limit theorem. $\Box$


A somewhat surprising converse of Theorem 3.2.2 was proved by Mason (1982). The following is a somewhat stronger result, with a different proof.

Theorem 3.2.4 Let $X_1,X_2,\dots$ be i.i.d. random variables with distribution function F. Suppose that for some sequence of integers $k=k(n)\to\infty$, $k(n)/n\to 0$, and $k(n+1)/k(n)\to 1$, as $n\to\infty$,
\[
\hat\gamma_H\to_P\gamma>0\,.
\]
Then $F\in\mathcal{D}(G_{\gamma})$.

Proof. Let $F_n$ be the empirical distribution function of $X_1,X_2,\dots,X_n$ and $G_n$ the empirical distribution function of $Y_1,Y_2,\dots,Y_n$, which are independent and identically distributed with distribution function $1-1/x$, $x\ge 1$. Then for each n,
\[
\left\{1-F_n(u)\right\}=_d\left\{1-G_n\!\left(\frac{1}{1-F(u)}\right)\right\}\,.
\]
We write
\[
\hat\gamma_H=\frac nk\int_{X_{n-k,n}}^{\infty}\left(1-F_n(u)\right)\frac{du}{u}=_d\frac nk\int_{Y_{n-k,n}}^{\infty}\left(1-G_n(s)\right)d\log U(s)
\]
with $Y_{1,n}\le Y_{2,n}\le\dots\le Y_{n,n}$ the order statistics of $Y_1,Y_2,\dots,Y_n$.

We are going to use the following results:

1. For the empirical distribution function $G_n$,
\[
P\left\{\sup_{s\ge 1}s\left(1-G_n(s)\right)>b\right\}\le\frac 1b\,,\qquad b>1\,,
\]
\[
P\left\{\inf_{1\le s\le Y_{n,n}}s\left(1-G_n(s)\right)<a\right\}\le e\,a^{-1}e^{-1/a}\,,\qquad 0<a<1
\]
(Shorack and Wellner (1986), pp. 345 and 415).

2. Note that
\[
\inf_{Y_{n-k,n}\le s\le Y_{n,n}}s\left(1-G_n(s)\right)=_d\frac kn\,Y_{n-k,n}\cdot\inf_{1\le s\le Y_{k,k}^{*}}s\left(1-G_k^{*}(s)\right)\,,
\]
where $G_k^{*}$ is the empirical distribution function of $Y_1^{*},\dots,Y_k^{*}$, independent of and equal in distribution to the Y's, and where the two factors on the right-hand side are independent.

3. From Corollary 2.2.2,
\[
\lim_{n\to\infty}P\left\{1-\varepsilon\le\frac kn\,Y_{n-k,n}\le 1+\varepsilon\right\}=1\,.
\]

For sufficiently large t let $n=n(t)$ be the integer satisfying
\[
\frac{n(t)}{k(n(t))}\le t<\frac{n(t)+1}{k(n(t)+1)}\,.
\]
Consider
\[
t\int_{t}^{\infty}\frac{d\log U(s)}{s}
\le\frac{n(t)+1}{k(n(t)+1)}\int_{t}^{\infty}\frac{d\log U(s)}{s}
\le\frac{1+\varepsilon}{1-\varepsilon}\cdot\frac nk\int_{Y_{n-k,n}}^{\infty}\frac{d\log U(s)}{s}
\le\frac{1+\varepsilon}{(1-\varepsilon)^{2}}\cdot\frac nk\int_{Y_{n-k,n}}^{\infty}\left(1-G_n(s)\right)d\log U(s)
\le\frac{1+\varepsilon}{(1-\varepsilon)^{2}}\,(\gamma+\varepsilon)\,.
\]
The first inequality is true by definition. The second and fourth inequalities are true with probabilities tending to 1 (for the second we use result (3); the fourth is true by assumption). By results (1) and (2) the third inequality is true with probability at least $1-e(1-\varepsilon)^{-1}\exp\left(-1/(1-\varepsilon)\right)>0$.
Hence we reach the conclusion that for each $\varepsilon>0$,
\[
P\left\{t\int_{t}^{\infty}\frac{d\log U(s)}{s}\le\frac{1+\varepsilon}{(1-\varepsilon)^{2}}(\gamma+\varepsilon)\right\}>0
\]
for t sufficiently large. Since the integral is nonrandom, it follows that for sufficiently large t, $t\int_{t}^{\infty}s^{-1}\,d\log U(s)\le\gamma+\varepsilon'$, where $\varepsilon'\to 0$ as $\varepsilon\to 0$. We get the other inequality in a similar fashion. Hence
\[
\lim_{t\to\infty}t\int_{t}^{\infty}\frac{d\log U(s)}{s}=\gamma\,.
\]
Now by partial integration
\[
t\int_{t}^{\infty}\frac{d\log U(s)}{s}=t\int_{t}^{\infty}\log U(s)\,\frac{ds}{s^{2}}-\log U(t)\,.
\]
Hence by Remark B.2.14(2) we find that for $x>0$,
\[
\lim_{t\to\infty}\left(\log U(tx)-\log U(t)\right)=\gamma\log x\,.
\]
That is, U is regularly varying with index γ, which implies (Proposition B.1.9) that the function $1-F$ is regularly varying with index $-1/\gamma$. $\Box$


Next we formulate conditions that lead to asymptotic normality for $\hat\gamma_H$.

Theorem 3.2.5 Suppose that the distribution function F satisfies the second-order condition of Section 2.3, i.e., for $x>0$,
\[
\lim_{t\to\infty}\frac{\frac{U(tx)}{U(t)}-x^{\gamma}}{A(t)}=x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\,, \tag{3.2.4}
\]
or equivalently,
\[
\lim_{t\to\infty}\frac{\frac{1-F(tx)}{1-F(t)}-x^{-1/\gamma}}{A\left(\frac{1}{1-F(t)}\right)}=x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\rho/\gamma}\,, \tag{3.2.5}
\]
where $\gamma>0$, $\rho\le 0$, and A is a positive or negative function with $\lim_{t\to\infty}A(t)=0$. Then
\[
\sqrt{k}\left(\hat\gamma_H-\gamma\right)\to_d\gamma N+\frac{\lambda}{1-\rho}
\]
with N standard normal, provided $k=k(n)\to\infty$, $k/n\to 0$, $n\to\infty$, and
\[
\lim_{n\to\infty}\sqrt{k}\,A\!\left(\frac nk\right)=\lambda \tag{3.2.6}
\]
with λ finite.
Remark 3.2.6 It may happen that the convergence of $U(tx)/U(t)$ to $x^{\gamma}$ is faster than any negative power of t, i.e.,
\[
\lim_{t\to\infty}t^{\alpha}\left(\frac{U(tx)}{U(t)}-x^{\gamma}\right)=0
\]
for all $x>0$ and $\alpha>0$. In that case, the result of Theorem 3.2.5 holds with the bias part $\lambda/(1-\rho)$ replaced by zero. A similar remark can be made for all the other estimators.
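For the strict Pareto distribution $1-F(x)=x^{-1/\gamma}$ we have $U(tx)/U(t)=x^{\gamma}$ exactly, so by Remark 3.2.6 the bias term vanishes and $\sqrt{k}(\hat\gamma_H-\gamma)$ should be approximately $N(0,\gamma^{2})$. A small simulation sketch (ours, not the book's, with deliberately loose tolerances):

```python
import math
import random

def hill_sorted(xs, k):
    """Hill's estimator (3.2.2) from an ascending sample xs."""
    n = len(xs)
    log_ref = math.log(xs[n - k - 1])
    return sum(math.log(xs[n - 1 - i]) - log_ref for i in range(k)) / k

def normalized_errors(gamma, n, k, reps, seed=1):
    """Replicates of sqrt(k) * (hill - gamma) for strict Pareto samples,
    generated by inversion: X = (1 - V)^(-gamma) for V uniform(0, 1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(reps):
        xs = sorted((1.0 - rng.random()) ** (-gamma) for _ in range(n))
        out.append(math.sqrt(k) * (hill_sorted(xs, k) - gamma))
    return out

errs = normalized_errors(gamma=1.0, n=2000, k=100, reps=500)
m = sum(errs) / len(errs)
v = sum((e - m) ** 2 for e in errs) / len(errs)
```

The empirical mean comes out near 0 and the variance near $\gamma^{2}=1$, as the theorem predicts.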
Proof (of Theorem 3.2.5). We write the second-order condition as
\[
\lim_{t\to\infty}\frac{\frac{U(tx)}{x^{\gamma}U(t)}-1}{A(t)}=\frac{x^{\rho}-1}{\rho}\,.
\]
Since $\lim_{t\to\infty}A(t)=0$, this is equivalent to
\[
\lim_{t\to\infty}\frac{\log U(tx)-\log U(t)-\gamma\log x}{A(t)}=\frac{x^{\rho}-1}{\rho}\,.
\]
We apply the inequality given in Theorem B.2.18: for a possibly different function $A_0$, with $A_0(t)\sim A(t)$, $t\to\infty$, and for each $\varepsilon>0$ there exists a $t_0$ such that for $t\ge t_0$, $x\ge 1$,
\[
\left|\frac{\log U(tx)-\log U(t)-\gamma\log x}{A_0(t)}-\frac{x^{\rho}-1}{\rho}\right|\le\varepsilon\,x^{\rho+\varepsilon}\,. \tag{3.2.7}
\]


As in the proof of Theorem 3.2.2, we note that
\[
\hat\gamma_H=_d\frac 1k\sum_{i=0}^{k-1}\log U(Y_{n-i,n})-\log U(Y_{n-k,n})\,,
\]
where the $Y_i$'s are independent and have common distribution function $1-1/x$, $x\ge 1$. Hence we work with this representation for $\hat\gamma_H$. Apply (3.2.7) with $t:=Y_{n-k,n}\to\infty$ a.s., $n\to\infty$ (Lemma 3.2.1), and $x:=Y_{n-i,n}/Y_{n-k,n}$. Then we get eventually,
\[
\hat\gamma_H=\frac 1k\sum_{i=0}^{k-1}\left\{\gamma\log\frac{Y_{n-i,n}}{Y_{n-k,n}}+A_0\left(Y_{n-k,n}\right)\frac{\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho}-1}{\rho}\right\}+O_P(1)\,\varepsilon\left|A_0\left(Y_{n-k,n}\right)\right|\frac 1k\sum_{i=0}^{k-1}\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho+\varepsilon}\,;
\]
hence
\[
\sqrt{k}\left(\hat\gamma_H-\gamma\right)=\gamma\sqrt{k}\left(\frac 1k\sum_{i=0}^{k-1}\log\frac{Y_{n-i,n}}{Y_{n-k,n}}-1\right)+\sqrt{k}\,A_0\left(Y_{n-k,n}\right)\frac 1k\sum_{i=0}^{k-1}\frac{\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho}-1}{\rho}+O_P(1)\,\varepsilon\sqrt{k}\left|A_0\left(Y_{n-k,n}\right)\right|\frac 1k\sum_{i=0}^{k-1}\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho+\varepsilon}\,. \tag{3.2.8}
\]
The first term, suitably normalized, is asymptotically normal by Lemma 3.2.3. As in the proof of Lemma 3.2.3, we have
\[
\frac 1k\sum_{i=0}^{k-1}\frac{\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho}-1}{\rho}\to_P\frac{EY^{\rho}-1}{\rho}=\frac{\frac{1}{1-\rho}-1}{\rho}=(1-\rho)^{-1}
\]
by the law of large numbers, so that the second term of (3.2.8) tends to $\lambda(1-\rho)^{-1}$. Similarly,
\[
\frac 1k\sum_{i=0}^{k-1}\left(\frac{Y_{n-i,n}}{Y_{n-k,n}}\right)^{\rho+\varepsilon}\to_P EY^{\rho+\varepsilon}=\frac{1}{1-\rho-\varepsilon}\,.
\]
It remains to prove that
\[
\frac{A_0\left(Y_{n-k,n}\right)}{A_0\left(\frac nk\right)}\to_P 1\,.
\]
This follows from Lemma 2.2.3, the fact that the function $|A_0|$ is regularly varying, and Potter's inequalities (Proposition B.1.9). $\Box$


Proof (Second proof of Theorem 3.2.5, via the tail quantile process). By the second statement of Theorem 2.4.8, with $\{W_n(s)\}_{s>0}$ a sequence of Brownian motions and for each $\varepsilon>0$,
\[
\log X_{n-[ks],n}-\log X_{n-k,n}=-\gamma\log s+\frac{\gamma}{\sqrt{k}}\left(s^{-1}W_n(s)-W_n(1)\right)+A_0\!\left(\frac nk\right)\frac{s^{-\rho}-1}{\rho}+\frac{o_P(1)}{\sqrt{k}}\,s^{-1/2-\varepsilon}\,,
\]
where the $o_P$-term tends to zero uniformly for $0<s\le 1$. Hence
\[
\hat\gamma_H=\int_{0}^{1}\left(\log X_{n-[ks],n}-\log X_{n-k,n}\right)ds
=\gamma\int_{0}^{1}(-\log s)\,ds+\frac{\gamma}{\sqrt{k}}\int_{0}^{1}\left(s^{-1}W_n(s)-W_n(1)\right)ds+A_0\!\left(\frac nk\right)\int_{0}^{1}\frac{s^{-\rho}-1}{\rho}\,ds+o_P\!\left(\frac{1}{\sqrt{k}}\right)\,.
\]
It follows that
\[
\sqrt{k}\left(\hat\gamma_H-\gamma\right)=\gamma\int_{0}^{1}\left(s^{-1}W_n(s)-W_n(1)\right)ds+\sqrt{k}\,A_0\!\left(\frac nk\right)\frac{1}{1-\rho}+o_P(1)\,.
\]
The result follows, since
\[
\operatorname{Var}\left(\int_{0}^{1}\left(s^{-1}W_n(s)-W_n(1)\right)ds\right)=1\,. \qquad\Box
\]

Remark 3.2.7 A third proof of Theorem 3.2.5, using an expansion for the tail empirical distribution function, will be given in Section 5.1.

Examples of distributions satisfying the second-order condition are abundant. For example, the Cauchy distribution satisfies
\[
1-F(x)=(x\pi)^{-1}-(3\pi)^{-1}x^{-3}+o\left(x^{-3}\right)\,,\qquad x\to\infty\,.
\]
Hence it satisfies the second-order condition (3.2.5) with $\gamma=1$ and $\rho=-2$. In fact, it is easy to see that if
\[
1-F(x)=c_1x^{-1/\gamma}+c_2x^{-1/\gamma+\rho/\gamma}\left(1+o(1)\right)\,,\qquad x\to\infty\,, \tag{3.2.9}
\]
for constants $c_1>0$, $c_2\ne 0$, $\gamma>0$, and $\rho<0$, then the second-order condition (3.2.5) holds with the indicated γ and ρ.
The second-order framework used in Theorem 3.2.5 provides the most natural
approach to the asymptotic normality of estimators like Hill's estimator. However,
next we discuss some of the problems related to second-order conditions of this type.


The parameter ρ controls the speed of convergence to asymptotic normality of $\hat\gamma_H$. For instance, a distribution function F satisfying (3.2.9) satisfies the second-order condition (3.2.5) with $A(1/(1-F(t)))=\rho\gamma^{-1}c_2c_1^{-1}t^{\rho/\gamma}$ and hence (3.2.4) with $A(t)=\rho\gamma^{-1}c_2c_1^{\rho-1}t^{\rho}$. Moreover, if (3.2.6) holds with $\lambda\ne 0$, then a simple calculation shows that this is true if and only if
\[
k(n)\sim\left(\frac{\lambda^{2}\gamma^{2}c_1^{2-2\rho}}{\rho^{2}c_2^{2}}\right)^{1/(1-2\rho)}n^{-2\rho/(1-2\rho)}\,. \tag{3.2.10}
\]
Then the convergence rate $1/\sqrt{k}$ in Theorem 3.2.5 is of order $n^{\rho/(1-2\rho)}$.

Now consider three types of sequences:

1. Suppose $\sqrt{k}\,|A(n/k)|\to\infty$. Then it is not difficult to see, using the inequalities in the proof of Theorem 3.2.5, that
\[
\frac{\hat\gamma_H-\gamma}{A\left(\frac nk\right)}\to_P\frac{1}{1-\rho}\,.
\]
Since for large n, k must be much larger than $n^{-2\rho/(1-2\rho)}$, we have for large n, $n/k$ much smaller than $n^{1+2\rho/(1-2\rho)}=n^{1/(1-2\rho)}$. Hence the convergence rate $|A(n/k)|$ is slower than the rate $n^{\rho/(1-2\rho)}$ found after (3.2.10).
2. Suppose $\sqrt{k}A(n/k)\to 0$. Then
\[
k(n)=o\left(n^{-2\rho/(1-2\rho)}\right)\,,
\]
and the convergence rate $1/\sqrt{k}$ is again slower than $n^{\rho/(1-2\rho)}$.
3. Suppose $\sqrt{k}A(n/k)\to\lambda\ne 0,\infty$. Then by (3.2.10) the convergence rate $1/\sqrt{k}$ is of order $n^{\rho/(1-2\rho)}$. This is the optimal situation.
The above discussion leads to the question, what is the best choice for k? Theorem 3.2.5 tells us that if $\sqrt{k}A(n/k)\to\lambda$, then
\[
\sqrt{k}\left(\hat\gamma_H-\gamma\right)\to_d\gamma N+\frac{\lambda}{1-\rho} \tag{3.2.11}
\]
with N standard normal, and hence
\[
\hat\gamma_H-\gamma\approx_d\frac{\gamma N}{\sqrt{k}}+\frac{A\left(\frac nk\right)}{1-\rho}\,,\qquad n\to\infty\,. \tag{3.2.12}
\]
We want to know for which choices of $k=k(n)$ this approximation is best, i.e., for which k its mean square error
\[
\frac{\gamma^{2}}{k}+\frac{A^{2}\left(\frac nk\right)}{(1-\rho)^{2}}
\]


is minimal. For the time being we continue to consider the special case $A(t)=ct^{\rho}$. Write $r:=n/k$. This leads us to
\[
\operatorname{argmin}_{r>0}\left(\frac{r\gamma^{2}}{n}+\frac{c^{2}r^{2\rho}}{(1-\rho)^{2}}\right)\,,
\]
and for simplicity we consider
\[
\operatorname{argmin}_{t>0}\left(\frac{t\gamma^{2}}{n}+\frac{c^{2}t^{2\rho}}{(1-\rho)^{2}}\right)\,.
\]
The infimum is attained by setting the derivative equal to zero, i.e.,
\[
\frac{\gamma^{2}}{n}=\frac{-2\rho\,c^{2}t^{2\rho-1}}{(1-\rho)^{2}}\,,
\]
or
\[
t=\left(\frac{\gamma^{2}(1-\rho)^{2}}{-2\rho\,c^{2}\,n}\right)^{1/(2\rho-1)}=\left(\frac{-2\rho\,c^{2}\,n}{\gamma^{2}(1-\rho)^{2}}\right)^{1/(1-2\rho)}\,.
\]
Equivalently, by setting $t=r=n/k$,
\[
k(n)=\left[\left(\frac{\gamma^{2}(1-\rho)^{2}}{-2\rho\,c^{2}}\right)^{1/(1-2\rho)}n^{-2\rho/(1-2\rho)}\right]\,, \tag{3.2.13}
\]
where $[x]$ means the integer part of x. Hence
\[
k_0(n)=\left[\left(\frac{\gamma^{2}(1-\rho)^{2}}{-2\rho\,c^{2}}\right)^{1/(1-2\rho)}n^{-2\rho/(1-2\rho)}\right] \tag{3.2.14}
\]
and we call $k_0(n)$ the optimal choice for the sequence $k(n)$ under the given conditions.
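For $A(t)=ct^{\rho}$ the optimality of (3.2.14) is easy to check numerically (a sketch with names of our choosing): the brute-force minimizer of $\gamma^{2}/k+A^{2}(n/k)/(1-\rho)^{2}$ coincides with the formula up to integer rounding.

```python
def amse(k, n, gamma, c, rho):
    """gamma^2 / k + A^2(n/k) / (1 - rho)^2 with A(t) = c * t^rho."""
    return gamma ** 2 / k + (c * (n / k) ** rho) ** 2 / (1.0 - rho) ** 2

def k0(n, gamma, c, rho):
    """Optimal k from (3.2.14), ignoring the integer-part bracket's edge cases."""
    base = gamma ** 2 * (1.0 - rho) ** 2 / (-2.0 * rho * c ** 2)
    return int(base ** (1.0 / (1.0 - 2.0 * rho)) * n ** (-2.0 * rho / (1.0 - 2.0 * rho)))

n, gamma, c, rho = 10 ** 6, 1.0, 1.0, -1.0
k_brute = min(range(1, 100001), key=lambda k: amse(k, n, gamma, c, rho))
k_formula = k0(n, gamma, c, rho)
```

Here both values are approximately $2^{1/3}n^{2/3}\approx 12599$.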
Now we go back to (3.2.12). We would like to consider $\min_k E\left(\hat\gamma_H-\gamma\right)^{2}$, but since the expectation may not exist, we consider the minimum of the substitute expression
\[
E\left(\frac{\gamma N}{\sqrt{k}}+\frac{A\left(\frac nk\right)}{1-\rho}\right)^{2}\,, \tag{3.2.15}
\]
and the sequence $k_0$ that optimizes (3.2.15) will serve as the optimal choice for the estimator $\hat\gamma_H$ too. Note that
\[
\lim_{n\to\infty}\sqrt{k_0}\,A\!\left(\frac{n}{k_0}\right)=\lim_{n\to\infty}c\,n^{\rho}k_0^{1/2-\rho}=\operatorname{sign}(c)\,\frac{\gamma(1-\rho)}{\sqrt{-2\rho}}\,, \tag{3.2.16}
\]
where $\operatorname{sign}(c)=1$ if $c>0$ and $\operatorname{sign}(c)=-1$ if $c<0$. Hence for this choice of k we have
\[
\sqrt{k_0}\left(\hat\gamma_H-\gamma\right)\to_d N\left(\operatorname{sign}(c)\,\frac{\gamma}{\sqrt{-2\rho}}\,,\;\gamma^{2}\right)
\]


and
\[
\lim_{n\to\infty}k_0\min_{k}E\left(\frac{\gamma N}{\sqrt{k}}+\frac{A\left(\frac nk\right)}{1-\rho}\right)^{2}=\lim_{n\to\infty}k_0\min_{k}\left(\frac{\gamma^{2}}{k}+\frac{A^{2}\left(\frac nk\right)}{(1-\rho)^{2}}\right)
\]
by (3.2.13), which equals
\[
\gamma^{2}\left(1-\frac{1}{2\rho}\right)
\]
by (3.2.16).
So far we have considered only the special case $A(t)=ct^{\rho}$ with $\rho<0$. This is often assumed in applications of extreme value theory. However, such a simplification is not feasible in the case $\rho=0$. So next we shall consider the optimality problem in the more general case of the second-order condition, not just the special case $A(t)=ct^{\rho}$.

As it will turn out in the end, if a sequence $k_0(n)$ is optimal, then any sequence $k(n)\sim k_0(n)$, as $n\to\infty$, is also optimal. This implies that we can replace the function A by any function $A^{*}$ with $A^{*}(t)\sim A(t)$, as $t\to\infty$, without loss of generality.

Similarly as before, we are faced with finding
\[
\operatorname{argmin}_{k}\left(\frac{\gamma^{2}}{k}+\frac{A^{2}\left(\frac nk\right)}{(1-\rho)^{2}}\right)\,, \tag{3.2.17}
\]
where the function $|A|$ is regularly varying with index $\rho\le 0$. For $\rho=0$ it is reasonable to assume that there exists a function $A^{*}$ with $|A^{*}(t)|\sim|A(t)|$, as $t\to\infty$, and $|A^{*}|$ monotone decreasing (see Theorem C.1 in Dekkers and de Haan (1993)). In that case we can assume without loss of generality that the function $A^{2}$ satisfies
\[
\lim_{t\to\infty}\frac{A^{2}(t)-A^{2}(tx)}{q(t)}=\log x\,,\qquad x>0\,,
\]
with q a suitable positive function.

Then for each $\rho\le 0$ (cf. Proposition B.2.15) there exists a positive decreasing function $s\in RV_{2\rho-1}$ such that as $t\to\infty$,
\[
A^{2}(t)\sim\int_{t}^{\infty}s(u)\,du\,. \tag{3.2.18}
\]
We have for $c>1$ and sufficiently large t,
\[
\frac{t\gamma^{2}}{n}+\frac{c^{-1}}{(1-\rho)^{2}}\int_{t}^{\infty}s(u)\,du\le\frac{t\gamma^{2}}{n}+\frac{A^{2}(t)}{(1-\rho)^{2}}\le\frac{t\gamma^{2}}{n}+\frac{c}{(1-\rho)^{2}}\int_{t}^{\infty}s(u)\,du\,. \tag{3.2.19}
\]


The infimum over $t > 0$ of the right- and left-hand sides can be calculated by just setting the derivative equal to zero. For the right-hand side we get
\[
\frac{\gamma^2 (1-\rho)^2}{cn} = s(t) ,
\]
which is equivalent to
\[
t = s^{\leftarrow}\!\left( \frac{\gamma^2(1-\rho)^2}{cn} \right) ,
\]
and the infimum is
\[
\frac{\gamma^2}{n}\, s^{\leftarrow}\!\left( \frac{\gamma^2(1-\rho)^2}{cn} \right)
+ \frac{c}{(1-\rho)^2} \int_{s^{\leftarrow}(\gamma^2(1-\rho)^2/(cn))}^{\infty} s(u)\,du
= \frac{c}{(1-\rho)^2} \int_0^{\gamma^2(1-\rho)^2/(cn)} s^{\leftarrow}(u)\,du ,
\]
where for the last step we used
\[
v\, s^{\leftarrow}(v) + \int_{s^{\leftarrow}(v)}^{\infty} s(u)\,du = \int_0^{v} s^{\leftarrow}(u)\,du .
\tag{3.2.20}
\]
For the left-hand side of (3.2.19) we have the same result but with $c$ replaced by $c^{-1}$. It follows that the infimum (3.2.17) is asymptotically
\[
\frac{1}{(1-\rho)^2} \int_0^{\gamma^2(1-\rho)^2/n} s^{\leftarrow}(u)\,du ,
\]
and it is attained at
\[
t \sim s^{\leftarrow}\!\left( \frac{\gamma^2(1-\rho)^2}{n} \right) ,
\]
i.e. (since $t$ replaced $n/k$),
\[
k(n) \sim \frac{n}{s^{\leftarrow}\!\left( \gamma^2(1-\rho)^2/n \right)} .
\]
Hence an optimal sequence $k_0 = k_0(n)$ in the sense of minimizing $\gamma^2/k + A^2(n/k)/(1-\rho)^2$ is given by
\[
k_0 = \left[ \frac{n}{s^{\leftarrow}\!\left( \gamma^2(1-\rho)^2/n \right)} \right] .
\]
What can we say about the asymptotic distribution of $\sqrt{k_0}\,(\hat\gamma_H - \gamma)$? As before,


we have to evaluate $\sqrt{k_0}\, A(n/k_0)$ for $n$ large. By (3.2.18) and (3.2.20),
\[
k_0\, A^2\!\left( \frac{n}{k_0} \right)
\sim \frac{n}{s^{\leftarrow}\!\left( \gamma^2(1-\rho)^2/n \right)} \int_{s^{\leftarrow}(\gamma^2(1-\rho)^2/n)}^{\infty} s(u)\,du
= \frac{ n \int_0^{\gamma^2(1-\rho)^2/n} s^{\leftarrow}(u)\,du }{ s^{\leftarrow}\!\left( \gamma^2(1-\rho)^2/n \right) } - \gamma^2(1-\rho)^2 .
\]
Now by Theorem B.1.5, since $(1/s)^{\leftarrow} \in RV_{1/(1-2\rho)}$,
\[
\lim_{x\to\infty} \frac{ \int_0^{1/x} s^{\leftarrow}(u)\,du }{ x^{-1}\, s^{\leftarrow}(1/x) } = \frac{1-2\rho}{-2\rho}
\tag{3.2.21}
\]
(note that $s^{\leftarrow}(1/x) = (1/s)^{\leftarrow}(x)$). Hence for $\rho < 0$,
\[
\lim_{n\to\infty} k_0\, A^2\!\left( \frac{n}{k_0} \right)
= \gamma^2(1-\rho)^2 \left( \frac{1-2\rho}{-2\rho} - 1 \right)
= \frac{ \gamma^2(1-\rho)^2 }{ -2\rho }
\]
and
\[
\sqrt{k_0}\,(\hat\gamma_H - \gamma) \xrightarrow{d} N\!\left( \pm\frac{\gamma}{\sqrt{-2\rho}},\; \gamma^2 \right) ,
\]
the sign of the mean being that of $A$.
For $\rho = 0$ the limit in (3.2.21) must be interpreted as infinity. This means that for $\rho = 0$, by minimizing the mean squared error we get an optimal sequence $k_0$ for which
\[
\sqrt{k_0}\,(\hat\gamma_H - \gamma) + b_n \xrightarrow{d} N(0, \gamma^2) ,
\]
where $b_n$ is a slowly varying sequence tending to plus or minus infinity. This statement is not useful for obtaining an asymptotic confidence interval for $\gamma$. All we can say is that
\[
\frac{\sqrt{k_0}}{b_n}\,(\hat\gamma_H - \gamma) \xrightarrow{P} \pm 1 .
\]

If for $\rho = 0$ we take the sequence $k(n)$ a bit smaller, we do get asymptotic normality. Take $k_1 := k_1(n)$ such that
\[
\lim_{n\to\infty} k_1\, A^2\!\left( \frac{n}{k_1} \right) = \lambda^2 > 0 .
\tag{3.2.22}
\]
Write $f(t) := \lambda^2 t / \int_t^{\infty} s(u)\,du$ with $s$ as in (3.2.18). Then $f(t) = \lambda^2 / \int_1^{\infty} s(tu)\,du$ is increasing and is $RV_1$. Moreover, we have
\[
\frac{n}{k_1} \sim f^{\leftarrow}(n) .
\]
In contrast to the case $\rho < 0$, we have for the optimal choice
\[
\frac{ \gamma^2(1-\rho)^2 }{ n } = s\!\left( \frac{n}{k_0} \right) .
\]
Now, the functions $f(t) = \lambda^2 t / \int_t^{\infty} s(u)\,du$ and $1/s(t)$ are both $RV_1$, but by Theorem B.1.5,
\[
\lim_{t\to\infty} s(t)\, f(t) = 0 .
\]

Consider as an example a distribution function for which $U(t) = t^{\gamma}(\log t + 1)$. Then
\[
U(tx) - x^{\gamma} U(t) = t^{\gamma} x^{\gamma} \log x ,
\]
hence for $x > 0$,
\[
\lim_{t\to\infty} \frac{ \frac{U(tx)}{U(t)} - x^{\gamma} }{ 1/\log t } = x^{\gamma} \log x .
\]
Consequently,
\[
A(t) = \frac{1}{\log t}
\]
and
\[
s(t) = \frac{2}{t (\log t)^3} .
\]
It follows for the optimal sequence $k_0(n)$ that
\[
k_0(n) \sim \frac{\gamma^2}{2} (\log n)^3 ,
\]
but the limit relation
\[
\sqrt{k}\,(\hat\gamma_H - \gamma) \xrightarrow{d} N(\lambda, \gamma^2)
\]
holds for the sequence
\[
k(n) \sim \lambda^2 (\log n)^2 .
\]
Remark 3.2.8 An adaptive choice of $k_0(n)$ is possible. That is, one can obtain an estimator $\hat k_0(n)$ such that $\hat k_0(n)/k_0(n) \to_P 1$. We refer to Drees and Kaufmann (1998), Danielsson, de Haan, Peng, and de Vries (2001), and Beirlant, Vynckier, and Teugels (1996). Similarly, adaptive choices of $k_0(n)$ are possible for the other estimators of the extreme value index discussed in the next sections, as well as for estimating high quantiles and tail probabilities (cf. Chapter 4).
Some final words about the Hill estimator (but this is true for most other estimators too). As we have seen, in the case $\rho = 0$ of the second-order condition we have asymptotic normality of $\sqrt{k}\,(\hat\gamma_H - \gamma)$ only if $k(n)$ grows very slowly with $n$. This means


that for even moderate sample sizes the estimator may give the wrong impression
since the bias takes over very rapidly.
Another disadvantage of the Hill estimator is the fact that it is not shift invariant. A shift of the observations does not affect the first-order parameter $\gamma$, but it may affect the second-order parameter. Consider the special case
\[
U(t) = c_0 + c_1 t^{\gamma} + c_2 t^{\gamma+\tau} + o\!\left( t^{\gamma+\tau} \right)
\tag{3.2.23}
\]
with $c_1$ positive, $c_0$ and $c_2$ not zero, $\gamma > 0$, and $\tau < 0$. Then
\[
\frac{U(tx)}{U(t)} - x^{\gamma} \sim \frac{c_0}{c_1}\, t^{-\gamma} x^{\gamma} \left( x^{-\gamma} - 1 \right) + \frac{c_2}{c_1}\, t^{\tau} x^{\gamma} \left( x^{\tau} - 1 \right) .
\]
For $\tau > -\gamma$ the second term dominates, and for $\tau < -\gamma$ the first term dominates. Hence in the second case ($\tau < -\gamma$) one can improve the rate of convergence by applying a shift $-c_0$ to the observations, so that the first term of (3.2.23) disappears and the second-order parameter changes from $-\gamma$ to $\tau < -\gamma$. This simple trick (due to Holger Drees) works in surprisingly many cases and results in a much less disturbing bias. Of course the trick also works when $t^{\tau}$ is replaced by any $\tau$-varying function.

3.3 General Case $\gamma \in \mathbb{R}$: The Pickands Estimator


The simplest and oldest estimator for $\gamma$ is the Pickands estimator (1975):
\[
\hat\gamma_P := (\log 2)^{-1} \log \frac{ X_{n-k,n} - X_{n-2k,n} }{ X_{n-2k,n} - X_{n-4k,n} } .
\tag{3.3.1}
\]
We shall prove weak consistency and asymptotic normality of $\hat\gamma_P$.


Theorem 3.3.1 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$. Suppose $F \in \mathcal{D}(G_\gamma)$ with $\gamma \in \mathbb{R}$. Then, as $n \to \infty$, $k = k(n) \to \infty$, $k/n \to 0$,
\[
\hat\gamma_P \xrightarrow{P} \gamma .
\]
For the proof we need the following auxiliary result.


Lemma 3.3.2 Let $Y_1, Y_2, \ldots$ be i.i.d. random variables with distribution function $1 - 1/x$, $x \ge 1$. Then, as $n \to \infty$, $k = k(n) \to \infty$, $k/n \to 0$, the random vector
\[
\sqrt{2k} \left( \frac12\, \frac{Y_{n-k,n}}{Y_{n-2k,n}} - 1,\; \sqrt{2}\left( \frac12\, \frac{Y_{n-2k,n}}{Y_{n-4k,n}} - 1 \right) \right)
\tag{3.3.2}
\]
is asymptotically bivariate standard normal.

Proof. Since by Rényi's representation (with $Y_{0,n} := 1$)


\[
\left\{ \frac{ Y_{n-i,n} }{ Y_{n-k,n} } \right\}_{i=0}^{k-1} \stackrel{d}{=} \left\{ Y^*_{k-i,k} \right\}_{i=0}^{k-1}
\tag{3.3.3}
\]
with $Y^*_1, \ldots, Y^*_k$ independent of $Y_{n-k,n}$ and with common distribution function $1 - 1/x$, the two components of the random vector in (3.3.2) are independent. By restricting attention to $0 \le i < k$ in (3.3.3) one also sees that the distribution of
\[
\left\{ \frac{ Y_{n-i,n} }{ Y_{n-k,n} } \right\}_{i=0}^{k-1}
\]
does not depend on $n$. So the first component in (3.3.2) is equal in distribution to
\[
\sqrt{2k} \left( \tfrac12\, Y_{k,2k} - 1 \right)
\tag{3.3.4}
\]
and the second component to $\sqrt{4k}\,(\tfrac12 Y_{2k,4k} - 1)$. Now use the fact that by Lemma 2.2.3,
\[
\sqrt{2k} \left( \tfrac12\, Y_{k,2k} - 1 \right)
\]
is asymptotically standard normal (note that $Y_{k,2k}$ is as in Lemma 2.2.3). Similarly for the second component in (3.3.2). $\square$

Corollary 3.3.3 Denote the limit vector of (3.3.2) by $(Q, R)$. Then
\[
\sqrt{2k} \left( \frac14\, \frac{Y_{n-k,n}}{Y_{n-4k,n}} - 1 \right) \xrightarrow{d} Q + \frac{R}{\sqrt{2}} .
\]

Proof.
\[
\frac14 \frac{Y_{n-k,n}}{Y_{n-4k,n}} - 1
= \left( \frac12 \frac{Y_{n-k,n}}{Y_{n-2k,n}} - 1 \right)\!\left( \frac12 \frac{Y_{n-2k,n}}{Y_{n-4k,n}} - 1 \right)
+ \left( \frac12 \frac{Y_{n-k,n}}{Y_{n-2k,n}} - 1 \right)
+ \left( \frac12 \frac{Y_{n-2k,n}}{Y_{n-4k,n}} - 1 \right) . \quad \square
\]
Proof (of Theorem 3.3.1). We use the domain of attraction condition (see Theorem 1.1.6, Section 1.1.3)
\[
\lim_{t\to\infty} \frac{ U(tx) - U(t) }{ a(t) } = \frac{ x^{\gamma} - 1 }{ \gamma } , \qquad x > 0 .
\]
Since $U$ is monotone, the relation holds locally uniformly. It follows that locally uniformly for $0 < x, y < \infty$, $y \ne 1$,
\[
\lim_{t\to\infty} \frac{ U(tx) - U(t) }{ U(ty) - U(t) } = \frac{ x^{\gamma} - 1 }{ y^{\gamma} - 1 } .
\tag{3.3.6}
\]

As in the proof of Theorem 3.2.2 we can write $X_{n-i,n} = U(Y_{n-i,n})$ for $i = 1, 2, \ldots, n$, with the $Y_i$'s from Lemma 3.3.2. Now observe that
\[
\frac{ X_{n-k,n} - X_{n-2k,n} }{ X_{n-2k,n} - X_{n-4k,n} }
= \frac{ U(Y_{n-k,n}) - U(Y_{n-4k,n}) }{ U(Y_{n-2k,n}) - U(Y_{n-4k,n}) } - 1 ,
\]
that $Y_{n-4k,n} \to \infty$ a.s., $n \to \infty$, and that by Lemma 3.3.2,
\[
\frac{ Y_{n-k,n} }{ Y_{n-4k,n} } \xrightarrow{P} 4
\qquad \text{and} \qquad
\frac{ Y_{n-2k,n} }{ Y_{n-4k,n} } \xrightarrow{P} 2 .
\tag{3.3.8}
\]
Combining (3.3.6)-(3.3.8), we find that
\[
\frac{ X_{n-k,n} - X_{n-2k,n} }{ X_{n-2k,n} - X_{n-4k,n} }
\xrightarrow{P} \frac{ 4^{\gamma} - 1 }{ 2^{\gamma} - 1 } - 1 = 2^{\gamma} .
\]
The result follows. $\square$


Remark 3.3.4 Note that $(2^{\gamma}-1)/\gamma$ is the median of the limiting generalized Pareto distribution (3.1.3) and $(4^{\gamma}-1)/\gamma$ is its 0.75 quantile. Hence the Pickands estimator estimates $\gamma$ via the quantiles of the limiting distribution, in contrast to the Hill estimator, which estimates $\gamma$ via a moment of the limiting generalized Pareto distribution.
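A direct implementation of (3.3.1) takes only a few lines. The quick check below uses deterministic Pareto-type quantiles of an assumed toy model $U(t) = t^{1/2}$ (so $\gamma = 1/2$) rather than a random sample, so the numbers are only a sketch:

```python
import math

# Pickands estimator (3.3.1) from X_{n-k,n}, X_{n-2k,n}, X_{n-4k,n};
# xs is assumed sorted in ascending order and k must satisfy 4k < n.
def pickands(xs, k):
    n = len(xs)
    x1, x2, x4 = xs[n - 1 - k], xs[n - 1 - 2 * k], xs[n - 1 - 4 * k]
    return math.log((x1 - x2) / (x2 - x4)) / math.log(2.0)

# Deterministic quantiles X_j = U((n+1)/(n+1-j)) of U(t) = t**0.5.
gamma, n, k = 0.5, 10000, 250
xs = [((n + 1) / (n + 1 - j)) ** gamma for j in range(1, n + 1)]
est = pickands(xs, k)   # close to the true value 0.5
```

Since only three order statistics enter, the estimator is simple but, as the variance formula of Theorem 3.3.5 shows, far less efficient than estimators using all $k$ upper order statistics.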
Theorem 3.3.5 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$. Suppose $F$ satisfies the second-order condition of Theorem 2.3.8, i.e.,
\[
\lim_{t \uparrow x^*} \frac{ \dfrac{1 - F(t + x f(t))}{1 - F(t)} - Q_\gamma(x) }{ \alpha(t) }
= \left( Q_\gamma(x) \right)^{\gamma+1} H_{\gamma,\rho}\!\left( \left( Q_\gamma(x) \right)^{-1} \right)
\]
with $Q_\gamma(x) := (1+\gamma x)^{-1/\gamma}$, $f$ some positive function, and $\alpha$ some positive or negative function with $\lim_{t \uparrow x^*} \alpha(t) = 0$. Recall the equivalent relation in terms of $U := (1/(1-F))^{\leftarrow}$:
\[
\lim_{t\to\infty} \frac{ \frac{U(tx) - U(t)}{a(t)} - D_\gamma(x) }{ A(t) }
= H_{\gamma,\rho}(x) := \int_1^x s^{\gamma-1} \int_1^s u^{\rho-1}\,du\,ds
\]
for all $x > 0$, with $D_\gamma(x) = (x^{\gamma}-1)/\gamma$, $a(t) = f(U(t))$, and $A(t) = \alpha(U(t))$.
Then, for $k = k(n) \to \infty$, $k/n \to 0$, and
\[
\lim_{n\to\infty} \sqrt{k}\, A\!\left( \frac{n}{k} \right) = \lambda
\tag{3.3.9}
\]
with $\lambda$ finite,
\[
\sqrt{k}\,(\hat\gamma_P - \gamma) \xrightarrow{d} N\!\left( \lambda\, b_{\gamma,\rho},\; \mathrm{var}_\gamma \right)
\]
with $N(\cdot,\cdot)$ normal, where
\[
b_{\gamma,\rho} := \begin{cases}
\dfrac{ \gamma\, 4^{-\rho} \left( (4^{\gamma+\rho}-1) - (2^{\gamma}+1)(2^{\gamma+\rho}-1) \right) }{ \rho\, (\gamma+\rho)\, 2^{\gamma} (2^{\gamma}-1) \log 2 } , & \rho < 0 \ne \gamma ,\\[2ex]
\dfrac{ 1 - 2^{-\rho+1} + 4^{-\rho} }{ \rho^2 (\log 2)^2 } , & \rho < 0 = \gamma ,\\[2ex]
1 , & \rho = 0 ,
\end{cases}
\]
and
\[
\mathrm{var}_\gamma := \begin{cases}
\dfrac{ \gamma^2 \left( 2^{2\gamma+1} + 1 \right) }{ 4 (\log 2)^2 (2^{\gamma}-1)^2 } , & \gamma \ne 0 ,\\[2ex]
\dfrac{ 3 }{ 4 (\log 2)^4 } , & \gamma = 0 .
\end{cases}
\]
Proof. We repeat the inequalities of Theorem 2.3.6: there exist $a_0$ and $A_0$ such that for any $\varepsilon, \delta > 0$ there exists $t_0$ such that for $t, tx \ge t_0$,
\[
\left| \frac{ \frac{U(tx) - U(t)}{a_0(t)} - \frac{x^{\gamma}-1}{\gamma} }{ A_0(t) } - \Psi_{\gamma,\rho}(x) \right|
\le \varepsilon\, x^{\gamma+\rho} \max\!\left( x^{\delta}, x^{-\delta} \right) =: \varepsilon\, q_{\gamma,\rho,\delta}(x) ,
\]
with
\[
\Psi_{\gamma,\rho}(x) = \begin{cases}
\dfrac{ x^{\gamma+\rho} - 1 }{ \gamma+\rho } , & \gamma+\rho \ne 0 ,\ \rho < 0 ,\\[1.5ex]
\log x , & \gamma+\rho = 0 ,\ \rho < 0 ,\\[1ex]
\frac{1}{\gamma}\, x^{\gamma} \log x , & \rho = 0 \ne \gamma ,\\[1ex]
\frac12 (\log x)^2 , & \rho = 0 = \gamma .
\end{cases}
\]
It follows that
\[
\sqrt{k}\left( \frac{ U(Y_{n-k,n}) - U(Y_{n-4k,n}) }{ a_0(Y_{n-4k,n}) } - \frac{ 4^{\gamma} - 1 }{ \gamma } \right)
= \sqrt{k}\left( \frac{ \left( \frac{Y_{n-k,n}}{Y_{n-4k,n}} \right)^{\gamma} - 1 }{ \gamma } - \frac{ 4^{\gamma} - 1 }{ \gamma } \right)
+ \sqrt{k}\, A_0(Y_{n-4k,n})\, \Psi_{\gamma,\rho}\!\left( \frac{Y_{n-k,n}}{Y_{n-4k,n}} \right)
\tag{3.3.10}
\]
\[
\qquad {}+ o_P(1)\, \sqrt{k}\, A_0(Y_{n-4k,n})\, q_{\gamma,\rho,\delta}\!\left( \frac{Y_{n-k,n}}{Y_{n-4k,n}} \right) .
\tag{3.3.11}
\]
By Cramér's delta method and Corollary 3.3.3, the first term in (3.3.10) converges in distribution to $4^{\gamma}\left( Q/\sqrt{2} + R/2 \right)$.
Recall from Corollary 2.3.5, Theorem 2.3.6, and assumption (3.3.9) that if $\rho < 0$,
\[
\lim_{n\to\infty} \sqrt{k}\, A_0\!\left( \frac{n}{k} \right) = \lim_{n\to\infty} \sqrt{k}\, \frac{1}{\rho}\, A\!\left( \frac{n}{k} \right) = \frac{\lambda}{\rho} ,
\]
and if $\rho = 0$,
\[
\lim_{n\to\infty} \sqrt{k}\, A_0\!\left( \frac{n}{k} \right) = \lim_{n\to\infty} \sqrt{k}\, A\!\left( \frac{n}{k} \right) = \lambda .
\]
Hence the second term in (3.3.10) converges to $\left( 1_{\{\rho<0\}} \rho^{-1} + 1_{\{\rho=0\}} \right) 4^{-\rho} \lambda\, \Psi_{\gamma,\rho}(4)$. Finally, it is easy to see that (3.3.11) is asymptotically negligible. Therefore
\[
\sqrt{k}\left( \frac{ U(Y_{n-k,n}) - U(Y_{n-4k,n}) }{ a_0(Y_{n-4k,n}) } - \frac{ 4^{\gamma} - 1 }{ \gamma } \right)
\xrightarrow{d} 4^{\gamma}\left( \frac{Q}{\sqrt{2}} + \frac{R}{2} \right)
+ \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) 4^{-\rho} \lambda\, \Psi_{\gamma,\rho}(4) .
\tag{3.3.12}
\]
Similarly,
\[
\sqrt{k}\left( \frac{ U(Y_{n-2k,n}) - U(Y_{n-4k,n}) }{ a_0(Y_{n-4k,n}) } - \frac{ 2^{\gamma} - 1 }{ \gamma } \right)
\xrightarrow{d} 2^{\gamma-1} R
+ \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) 4^{-\rho} \lambda\, \Psi_{\gamma,\rho}(2) .
\tag{3.3.13}
\]
Combining (3.3.12) and (3.3.13), and applying Cramér's delta method to the quotient, we get
\[
\sqrt{k}\left( 2^{\hat\gamma_P} - 2^{\gamma} \right)
= \sqrt{k}\left( \frac{ U(Y_{n-k,n}) - U(Y_{n-4k,n}) }{ U(Y_{n-2k,n}) - U(Y_{n-4k,n}) } - \frac{ 4^{\gamma} - 1 }{ 2^{\gamma} - 1 } \right)
\]
\[
\xrightarrow{d} \frac{ \gamma }{ 2^{\gamma} - 1 } \left( 4^{\gamma} \left( \frac{Q}{\sqrt{2}} + \frac{R}{2} \right) - \frac{ 4^{\gamma} - 1 }{ 2^{\gamma} - 1 }\, 2^{\gamma-1} R \right)
+ \frac{ \gamma }{ 2^{\gamma} - 1 } \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) 4^{-\rho} \lambda \left( \Psi_{\gamma,\rho}(4) - \frac{ 4^{\gamma} - 1 }{ 2^{\gamma} - 1 }\, \Psi_{\gamma,\rho}(2) \right) .
\]
A final application of Cramér's delta method (dividing by $2^{\gamma} \log 2$) yields the asserted limit; computing the mean and variance of the right-hand side gives $\lambda b_{\gamma,\rho}$ and $\mathrm{var}_\gamma$. The result follows. The particular case $\gamma = 0$ and $\rho < 0$ is left to the reader (cf. Exercise 3.6). $\square$

Proof (second proof of Theorem 3.3.5, via the tail quantile process). Rewrite (3.3.1) as
\[
\hat\gamma_P = (\log 2)^{-1} \log \frac{ X_{n-[k'/4],n} - X_{n-[k'/2],n} }{ X_{n-[k'/2],n} - X_{n-k',n} }
\]
for the sequence of integers $k' = k'(n) := 4k$. Using Theorem 2.4.2 with $\{W_n(s)\}_{s>0}$ a sequence of Brownian motions, at $s = \frac14$ and $s = \frac12$,
\[
\frac{ X_{n-[k'/4],n} - X_{n-[k'/2],n} }{ X_{n-[k'/2],n} - X_{n-k',n} }
= \frac{ \frac{ 4^{\gamma} - 2^{\gamma} }{ \gamma } + \frac{1}{\sqrt{k'}} \left( 4^{\gamma+1} W_n(\tfrac14) - 2^{\gamma+1} W_n(\tfrac12) + \sqrt{k'} A_0(\tfrac{n}{k'}) \left( \Psi_{\gamma,\rho}(4) - \Psi_{\gamma,\rho}(2) \right) + o_P(1) \right) }{ \frac{ 2^{\gamma} - 1 }{ \gamma } + \frac{1}{\sqrt{k'}} \left( 2^{\gamma+1} W_n(\tfrac12) - W_n(1) + \sqrt{k'} A_0(\tfrac{n}{k'}) \Psi_{\gamma,\rho}(2) + o_P(1) \right) }
\]
\[
= 2^{\gamma} \left( 1 + \frac{ \gamma }{ (2^{\gamma}-1)\sqrt{k'} } \left( 2^{\gamma+2} W_n(\tfrac14) + W_n(1) - 2(2^{\gamma}+1) W_n(\tfrac12) \right)
+ A_0\!\left( \frac{n}{k'} \right) \frac{ \gamma }{ 2^{\gamma}-1 } \left( 2^{-\gamma} \left( \Psi_{\gamma,\rho}(4) - \Psi_{\gamma,\rho}(2) \right) - \Psi_{\gamma,\rho}(2) \right) + o_P\!\left( \frac{1}{\sqrt{k'}} \right) \right) .
\]
Hence, as $n \to \infty$,
\[
\sqrt{k}\,(\hat\gamma_P - \gamma)
- \frac{ \gamma }{ 2 (2^{\gamma}-1) \log 2 } \left( 2^{\gamma+2} W_n(\tfrac14) + W_n(1) - 2(2^{\gamma}+1) W_n(\tfrac12) \right)
- \sqrt{k}\, A_0\!\left( \frac{n}{k'} \right) \frac{ \gamma }{ (2^{\gamma}-1) \log 2 } \left( 2^{-\gamma} \left( \Psi_{\gamma,\rho}(4) - \Psi_{\gamma,\rho}(2) \right) - \Psi_{\gamma,\rho}(2) \right) = o_P(1) .
\]
The result follows. $\square$

3.4 The Maximum Likelihood Estimator ($\gamma > -\frac12$)


The class of distribution functions satisfying $F \in \mathcal{D}(G_\gamma)$, for some $\gamma \in \mathbb{R}$, cannot be parametrized by a finite number of parameters; hence a straightforward maximum likelihood estimator does not exist. However, let us look at the limit relation (3.1.2) given in the introduction of this chapter: for $0 < x < (0 \vee (-\gamma))^{-1}$ and $f$ a positive nondecreasing function,
\[
\lim_{t \uparrow x^*} P\!\left( \frac{X - t}{f(t)} > x \,\Big|\, X > t \right) = 1 - H_\gamma(x) := (1 + \gamma x)^{-1/\gamma} .
\tag{3.4.1}
\]
This relation suggests that the larger observations (reflected in the condition $X > t$) approximately follow a generalized Pareto (GP) distribution. Since the class of GP distributions is parametrized by just one parameter $\gamma$, this suggests that if we apply the maximum likelihood procedure to the largest observations using the GP distribution as a model, we could obtain a useful estimator for $\gamma$.
This idea, which we are now going to explain in detail, leads to what is generally called the maximum likelihood estimator of $\gamma$ in extreme value theory (although sometimes a slightly different definition is used). After determining the estimator, we shall (and have to, since the general asymptotic theory of maximum likelihood estimators does not apply to this approximate model) prove asymptotic normality. In order to use the condition "$X > t$" in (3.4.1) properly we need the following lemma.
Lemma 3.4.1 Let $X, X_1, X_2, \ldots, X_n$ be i.i.d. random variables with common distribution function $F$, and let $X_{1,n} \le X_{2,n} \le \cdots \le X_{n,n}$ be the $n$th order statistics. The joint distribution of $\{X_{i,n}\}_{i=n-k+1}^{n}$ given $X_{n-k,n} = t$, for some $k = 1, \ldots, n-1$, equals the joint distribution of the set of order statistics $\{X^*_{i,k}\}_{i=1}^{k}$ of i.i.d. random variables $\{X^*_i\}_{i=1}^{k}$ with distribution function
\[
F_t(x) = P(X \le x \mid X > t) = \frac{ F(x) - F(t) }{ 1 - F(t) } , \qquad x > t .
\]
Proof. Let $E_{1,n} \le \cdots \le E_{n,n}$ be the order statistics from an i.i.d. sample with standard exponential distribution, $P(E > x) = e^{-x}$, $x > 0$. Then it is easy to see that the conditional distribution of $(E_{n-k+1,n}, \ldots, E_{n,n})$ given $\{E_{n-k,n} = t\}$ equals the distribution of the order statistics $(E^*_{1,k}, \ldots, E^*_{k,k})$ of i.i.d. random variables with
\[
P(E^* > x) = e^{-(x-t)} , \qquad x > t .
\tag{3.4.2}
\]
Hence with $V := (-\log(1-F))^{\leftarrow}$, the conditional distribution of $(V(E_{n-k+1,n}), \ldots, V(E_{n,n}))$ given $\{V(E_{n-k,n}) = V(t)\}$ equals the distribution of $(V(E^*_{1,k}), \ldots, V(E^*_{k,k}))$.
Now for $x > V(t)$,
\[
P\left( V(E^*) > x \right) = P\left( E^* > -\log(1 - F(x)) \right) ,
\]
which, by (3.4.2), equals
\[
e^{ -(-\log(1-F(x)) - t) } = (1 - F(x))\, e^{t} ,
\]
and with $t = -\log(1 - F(V(t)))$ equals
\[
\frac{ 1 - F(x) }{ 1 - F(V(t)) } = P\left( X > x \mid X > V(t) \right) .
\]
Hence we have proved the lemma for any distribution function $F$ that is continuous and strictly increasing. We leave the more general case to the reader. $\square$

Let $X_1, \ldots, X_n$ be an i.i.d. sample with common distribution function $F$. As in the previous sections, to estimate $\gamma$ we shall concentrate on some set of upper order statistics $(X_{n-k,n}, X_{n-k+1,n}, \ldots, X_{n,n})$ or, equivalently, on
\[
(Z_0, Z_1, \ldots, Z_k) := \left( X_{n-k,n},\; X_{n-k+1,n} - X_{n-k,n},\; \ldots,\; X_{n,n} - X_{n-k,n} \right) .
\]
The likelihood function is obtained from the conditional distribution of $(Z_1, \ldots, Z_k)$ given $Z_0 = t$, which, according to Lemma 3.4.1, equals the distribution of the $k$th order statistics from a sample $(Z^*_1, \ldots, Z^*_k)$ with common distribution function $F_t(t+x) = (F(t+x) - F(t))/(1 - F(t))$, $x > 0$. That is, we disregard the marginal information contained in $Z_0 = X_{n-k,n}$ (this approach is commonly referred to as the conditional likelihood approach). Recall that the order is irrelevant to the likelihood, and since the $X_i$'s are assumed to be independent, the $Z^*_i$'s are independent as well. Consequently we consider the resulting $k$ i.i.d. random variables with distribution function $F_t(x+t) = (F(x+t) - F(t))/(1 - F(t))$, $x > 0$.


Now consider the usual asymptotic setting, where $k = k(n) \to \infty$ and $n/k \to \infty$, as $n \to \infty$, and hence $X_{n-k,n} \to x^*$ a.s. In view of the generalized Pareto approximation we apply the maximum likelihood procedure to the limiting GP distribution, which is explicit. Hence, the maximum likelihood estimator of $\gamma$ (and consequently of the scale) is obtained by maximizing with respect to $\gamma$ (and $\sigma$) the approximate likelihood $\prod_{i=1}^{k} h_{\gamma,\sigma}(z_i)$ with $z_i = x_{n-i+1,n} - x_{n-k,n}$ and $h_{\gamma,\sigma}(x) = dH_\gamma(x/\sigma)/dx$.
Note that this approximate conditional likelihood function tends to $\infty$ if $\gamma < -1$ and $-\gamma/\sigma \downarrow (X_{n,n} - X_{n-k,n})^{-1}$, so that a maximum over the full range of possible values for $(\gamma, \sigma)$ does not exist. We shall concentrate on the region $(\gamma, \sigma) \in (-\frac12, \infty) \times (0, \infty)$, since the maximum likelihood estimator behaves irregularly if $\gamma \le -\frac12$.
The likelihood equations are given in terms of the partial derivatives
\[
\frac{ \partial \log h_{\gamma,\sigma}(z) }{ \partial \gamma }
= \frac{1}{\gamma^2} \log\!\left( 1 + \frac{\gamma z}{\sigma} \right) - \left( \frac{1}{\gamma} + 1 \right) \frac{ z/\sigma }{ 1 + \gamma z/\sigma } ,
\]
\[
\frac{ \partial \log h_{\gamma,\sigma}(z) }{ \partial \sigma }
= \frac{1}{\sigma} \left( -1 + \left( \frac{1}{\gamma} + 1 \right) \frac{ \gamma z/\sigma }{ 1 + \gamma z/\sigma } \right) ,
\]
where for $\gamma = 0$ these should be interpreted as
\[
\frac12 \left( \frac{z}{\sigma} \right)^{2} - \frac{z}{\sigma}
\qquad \text{and} \qquad
-\frac{1}{\sigma} + \frac{z}{\sigma^2} ,
\]
respectively. The resulting likelihood equations in terms of the excesses $X_{n-i+1,n} - X_{n-k,n}$ are as follows:
\[
\sum_{i=1}^{k} \frac{1}{\gamma^2} \log\!\left( 1 + \frac{\gamma}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) \right)
- \left( \frac{1}{\gamma} + 1 \right) \sum_{i=1}^{k} \frac{ \frac{1}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) }{ 1 + \frac{\gamma}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) } = 0 ,
\]
\[
\left( \frac{1}{\gamma} + 1 \right) \frac{1}{k} \sum_{i=1}^{k} \frac{ \frac{\gamma}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) }{ 1 + \frac{\gamma}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) } = 1
\tag{3.4.3}
\]
(with a similar interpretation when $\gamma = 0$), which for $\gamma \ne 0$ can be simplified to
\[
\frac{1}{k} \sum_{i=1}^{k} \log\!\left( 1 + \frac{\gamma}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) \right) = \gamma ,
\]
\[
\frac{1}{k} \sum_{i=1}^{k} \frac{ 1 }{ 1 + \frac{\gamma}{\sigma} \left( X_{n-i+1,n} - X_{n-k,n} \right) } = \frac{1}{\gamma + 1} .
\tag{3.4.4}
\]
Note that the maximum likelihood estimator of $\gamma$ is shift and scale invariant, and the maximum likelihood estimator of $\sigma$ is shift invariant.
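For illustration, the maximization behind these likelihood equations can also be carried out numerically on the excesses. The crude grid search below (a sketch under assumed toy parameter values, not an efficient or recommended implementation) recovers the parameters from an exact generalized Pareto quantile sample:

```python
import math

# Negative log-likelihood of the GP density h_{gamma,sigma}(z)
# = (1/sigma) * (1 + gamma*z/sigma)**(-1/gamma - 1), here for gamma > 0.
def gp_nll(z, g, s):
    total = 0.0
    for x in z:
        u = 1.0 + g * x / s
        if u <= 0.0:
            return float("inf")
        total += math.log(s) + (1.0 / g + 1.0) * math.log(u)
    return total

# Deterministic GP(gamma = 0.5, sigma = 2) quantiles standing in for excesses.
gamma, sigma, m = 0.5, 2.0, 500
z = [sigma * ((1 - i / (m + 1)) ** (-gamma) - 1) / gamma for i in range(1, m + 1)]

# Crude grid search over (gamma, sigma) in place of a proper optimizer.
grid_g = [0.05 * j for j in range(1, 21)]   # 0.05 .. 1.00
grid_s = [0.1 * j for j in range(5, 41)]    # 0.5 .. 4.0
g_hat, s_hat = min(((g, s) for g in grid_g for s in grid_s),
                   key=lambda p: gp_nll(z, p[0], p[1]))
```

In practice one would solve (3.4.4) with a Newton-type iteration rather than a grid, but the sketch makes visible that the criterion has its minimum near the true $(\gamma, \sigma)$.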


Theorem 3.4.2 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$. Suppose $F$ satisfies the second-order condition of Theorem 2.3.8 with $\gamma > -\frac12$, or equivalently,
\[
\lim_{t\to\infty} \frac{ \frac{U(tx) - U(t)}{a(t)} - \frac{x^{\gamma}-1}{\gamma} }{ A(t) }
= \int_1^x s^{\gamma-1} \int_1^s u^{\rho-1}\,du\,ds
\tag{3.4.5}
\]
for all $x > 0$, with $\gamma > -\frac12$. Then, for $k = k(n) \to \infty$, $k/n \to 0$ ($n \to \infty$), and
\[
\lim_{n\to\infty} \sqrt{k}\, A\!\left( \frac{n}{k} \right) = \lambda
\tag{3.4.6}
\]
with $\lambda$ finite, the system of likelihood equations (3.4.3) has a sequence of solutions $(\hat\gamma_{MLE}, \hat\sigma_{MLE})$ that satisfies
\[
\sqrt{k} \left( \hat\gamma_{MLE} - \gamma,\; \frac{ \hat\sigma_{MLE} }{ a(n/k) } - 1 \right) \xrightarrow{d} N(\lambda\, b_{\gamma,\rho},\, \Sigma) ,
\tag{3.4.7}
\]
with $N(\cdot,\cdot)$ normal,
\[
b_{\gamma,\rho} = \begin{cases}
\left( \dfrac{ \gamma+1 }{ (1-\rho)(1+\gamma-\rho) },\; \dfrac{ -\rho }{ (1-\rho)(1+\gamma-\rho) } \right) , & \rho < 0 ,\\[2ex]
(1, 0) , & \rho = 0 ,
\end{cases}
\]
and the covariance matrix is given by
\[
\Sigma = \begin{pmatrix} (1+\gamma)^2 & -(1+\gamma) \\ -(1+\gamma) & 1 + (1+\gamma)^2 \end{pmatrix} .
\]
Moreover, for any sequence of solutions $(\hat\gamma_{MLE}, \hat\sigma_{MLE})$ for which the convergence (3.4.7) does not hold, one must have $\sqrt{k}\,\left| \hat\gamma_{MLE} - \gamma \right| \to_P \infty$ or $\sqrt{k}\,\left| \hat\sigma_{MLE}/a(n/k) - 1 \right| \to_P \infty$.
Recall the relation between the parameter $\sigma$ and the function $f$ in (3.4.1): this function was first introduced in Theorem 1.1.6 (Section 1.1.3), from where it is known that $f(t)$ can be chosen as $a(1/(1-F)(t))$. Then we see that $\sigma$ must be close to $a(n/k)$ as $n \to \infty$.
We now give the line of reasoning for proving Theorem 3.4.2. A detailed proof will be given only for $\gamma > 0$. The proof for the other cases is similar.
For the true $\gamma > 0$ we rewrite equations (3.4.4), which we want to solve, as
\[
\int_0^1 \log\!\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right) ds = \gamma' ,
\qquad
\int_0^1 \left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right)^{-1} ds = \frac{1}{\gamma' + 1} ,
\tag{3.4.8}
\]
where $a_0$ is a suitably chosen positive function (more specifically the one from Theorem 2.3.6) and $\sigma_0' := \sigma'/a_0(n/k)$.


Under the second-order condition given in Theorem 3.4.2, from Corollary 2.4.6, we have that
\[
\frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } = \frac{ s^{-\gamma} - 1 }{ \gamma } + \frac{ Z_n(s) }{ \sqrt{k} } ,
\tag{3.4.9}
\]
where $\{Z_n(s)\}_{s\in(0,1]}$ is an asymptotically Gaussian process with known mean and covariance function, and $\gamma$ is the true parameter. Then we write
\[
\frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) }
= s^{-\gamma} - 1 + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{ s^{-\gamma} - 1 }{ \gamma } + \frac{\gamma'}{\sigma_0'} \frac{ Z_n(s) }{ \sqrt{k} } .
\tag{3.4.10}
\]
When multiplied by $s^{\gamma}$, this becomes
\[
s^{\gamma} \left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right)
= 1 + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{ 1 - s^{\gamma} }{ \gamma } + s^{\gamma}\, \frac{\gamma'}{\sigma_0'} \frac{ Z_n(s) }{ \sqrt{k} } .
\]
Hence
\[
\log\!\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right)
= -\gamma \log s + \log\!\left( 1 + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{ 1 - s^{\gamma} }{ \gamma } + s^{\gamma}\, \frac{\gamma'}{\sigma_0'} \frac{ Z_n(s) }{ \sqrt{k} } \right) .
\]
Now
\[
\gamma = -\gamma \int_0^1 \log s\, ds ,
\]
and hence
\[
\gamma' - \gamma = \int_0^1 \log\!\left( 1 + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{ 1 - s^{\gamma} }{ \gamma } + s^{\gamma}\, \frac{\gamma'}{\sigma_0'} \frac{ Z_n(s) }{ \sqrt{k} } \right) ds
\approx \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{1}{\gamma+1} + \gamma \int_0^1 s^{\gamma}\, \frac{ Z_n(s) }{ \sqrt{k} }\, ds ,
\]
using $\int_0^1 (1 - s^{\gamma})/\gamma\, ds = 1/(\gamma+1)$.

Starting again from (3.4.10), for the second equation in (3.4.8) we have
\[
\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right)^{-1}
= s^{\gamma} - \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{ s^{\gamma} - s^{2\gamma} }{ \gamma } - s^{2\gamma}\, \frac{\gamma'}{\sigma_0'} \frac{ Z_n(s) }{ \sqrt{k} } + \cdots ,
\]
and so
\[
\int_0^1 \left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right)^{-1} ds
\approx \frac{1}{\gamma+1} - \frac{ \frac{\gamma'}{\sigma_0'} - \gamma }{ (\gamma+1)(2\gamma+1) } - \gamma \int_0^1 s^{2\gamma}\, \frac{ Z_n(s) }{ \sqrt{k} }\, ds .
\]
Summing up, we show that equations (3.4.8) are equivalent to linear equations in the unknown parameters $\gamma'$ and $\sigma_0'$, which can be solved readily.
For the proof of Theorem 3.4.2 we start by proving some auxiliary results.
Lemma 3.4.3 Assume (3.4.5) with $\gamma > 0$ and (3.4.6). Let $(\gamma', \sigma_0') := (\gamma'(n), \sigma_0'(n))$ be such that
\[
\gamma' - \gamma = O_P(k^{-1/2}) \qquad \text{and} \qquad \sigma_0' - 1 = O_P(k^{-1/2}) .
\tag{3.4.11}
\]
Then
\[
P\!\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \ge C_n\, s^{-\gamma} \ \text{for all } s \in \left[ (2k)^{-1}, 1 \right] \right) \to 1 ,
\tag{3.4.12}
\]
$n \to \infty$, for some random variables $C_n > 0$ such that $1/C_n = O_P(1)$.
Proof. Let $U_{k,n}$, $k = 1, \ldots, n$, denote the order statistics from an i.i.d. uniform-(0,1) sample of size $n$. By Shorack and Wellner (1986) (Chapter 10, Section 3, Inequality 2, p. 416),
\[
\sup_{1/(2k) \le s \le 1} \frac{ n U_{[ks]+1,n} }{ ks } = O_P(1) ,
\qquad
\sup_{0 < s \le 1} \frac{ ks }{ n U_{[ks]+1,n} } = O_P(1)
\tag{3.4.13}
\]
as $n \to \infty$. Combining these bounds with the bounds given in Theorem 2.3.6, for some functions $a_0(t) \sim a(t)$ and $A_0(t) \sim A(t)$, $t \to \infty$, and for all $x_0 > 0$ and $\delta > 0$, one obtains a decomposition
\[
1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } = I + II + III + IV + V + VI ,
\]
in which $III$ is the leading term, of exact order $s^{-\gamma}$, and the remaining terms collect the deviations stemming from (3.4.11), from (3.4.13), and from the second-order error bounds. By (3.4.13), $s^{\gamma} III$ is bounded away from zero uniformly for $s \in [(2k)^{-1}, 1]$. We will show that all the other terms tend to $0$ uniformly when multiplied by $s^{\gamma}$, so that assertion (3.4.12) follows with $C_n := \inf_{s \in [(2k)^{-1},1]} s^{\gamma} III - \varepsilon_n$ for a suitable sequence $\varepsilon_n \downarrow 0$.
By the asymptotic normality of intermediate order statistics (see Theorem 2.2.1), part $I$ is $O_P(k^{-1/2})$, hence $s^{\gamma} I = o_P(1)$. By (3.4.13) and assumption (3.4.11), part $II$ is $O_P(k^{-1/2})$, so that $s^{\gamma} II = o_P(1)$. Next note that $s^{\gamma} \Psi_{\gamma,\rho}(s^{-1}) = o(s^{-1/2})$ as $s \downarrow 0$. This, combined with (3.4.6) and (3.4.13), gives that $s^{\gamma} IV$ and $s^{\gamma} V$ are $o_P(1)$. Finally, $s^{\gamma} VI = o_P(1)$, provided one chooses $\delta < \frac12$. Hence we have proved (3.4.12). $\square$
Define
\[
Z_n(s) := \sqrt{k} \left( \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } - \frac{ s^{-\gamma} - 1 }{ \gamma } \right)
\tag{3.4.14}
\]
(read $(s^{-\gamma}-1)/\gamma$ as $-\log s$ when $\gamma = 0$). Then, from Corollary 2.4.6, for suitably chosen functions $a_0$ and $A_0$, and for all $\varepsilon > 0$,
\[
Z_n(s) = s^{-\gamma-1} W_n(s) - W_n(1) + \sqrt{k}\, A_0\!\left( \frac{n}{k} \right) \Psi_{\gamma,\rho}\!\left( s^{-1} \right) + o_P(1)\, s^{-\gamma-1/2-\varepsilon}
\tag{3.4.15}
\]
as $n \to \infty$, where $\{W_n(s)\}_{s \ge 0}$ is a sequence of Brownian motions and the $o_P$-term is uniform for $s \in (0,1]$. Moreover, under the conditions of Theorem 3.4.2, for all $\varepsilon > 0$,
\[
Z_n(s) = O_P(1)\, s^{-\gamma-1/2-\varepsilon} ,
\tag{3.4.16}
\]
as $n \to \infty$, where the $O_P$-term is uniform for $s \in (0,1]$.
Proposition 3.4.4 Assume condition (3.4.5) with $\gamma > 0$ and (3.4.6). Then any solution $(\gamma', \sigma_0')$ of (3.4.8) satisfying (3.4.11) admits the approximation
\[
\sqrt{k}(\gamma' - \gamma) - \frac{ (\gamma+1)^2 }{ \gamma } \int_0^1 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) Z_n(s)\, ds = o_P(1) ,
\]
\[
\sqrt{k}(\sigma_0' - 1) - \frac{ \gamma+1 }{ \gamma } \int_0^1 \left( (\gamma+1)(2\gamma+1) s^{2\gamma} - s^{\gamma} \right) Z_n(s)\, ds = o_P(1) ,
\tag{3.4.17}
\]
as $n \to \infty$. Conversely, there exists a solution of (3.4.8) that satisfies (3.4.17), and hence also (3.4.11).


Remark 3.4.5 Though we prove Proposition 3.4.4 only for $\gamma > 0$, in fact the statement is true for any $\gamma > -\frac12$. For more details see Drees, Ferreira, and de Haan (2003).
Proof (of Proposition 3.4.4). We start by obtaining an expansion for the left-hand side of the first equation of (3.4.8). Rewrite it as
\[
\int_0^{(2k)^{-1}} \log\!\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right) ds
+ \int_{(2k)^{-1}}^{1} \log\!\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right) ds =: I_1 + I_2 .
\]
First we prove that $I_1$ is negligible. Since $X_{n-[ks],n}$ is constant when $s \in (0, (2k)^{-1}]$, from (3.4.12), with probability tending to 1,
\[
1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) }
= 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n,n} - X_{n-k,n} }{ a_0(n/k) } \ge (2k)^{\gamma} C_n
\]
for all $s \in (0, (2k)^{-1}]$. On the other hand, from (3.4.14), (3.4.16), and (3.4.11),
\[
1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n,n} - X_{n-k,n} }{ a_0(n/k) } = O_P\!\left( k^{\gamma+\varepsilon} \right) ,
\]
so that $|I_1| \le (2k)^{-1} O_P(\log k)$. Hence it follows that $I_1 = o_P(k^{-1/2})$.

Next we turn to the main term $I_2$. We will apply the inequality $0 \le x - \log(1+x) \le x^2/(2(1 \wedge (1+x)))$, valid for all $x > -1$, to
\[
x = \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{ 1 - s^{\gamma} }{ \gamma } + s^{\gamma}\, \frac{\gamma'}{\sigma_0'} \frac{ Z_n(s) }{ \sqrt{k} } .
\]
Then, from (3.4.12), it follows that $0 \le 1/(1 \wedge (1+x)) \le 1 \vee 1/C_n = O_P(1)$ with probability tending to one. Moreover, note that relation (3.4.16) implies
\[
\int_0^{(2k)^{-1}} s^{\gamma} Z_n(s)\, ds = O_P\!\left( \int_0^{(2k)^{-1}} s^{-1/2-\varepsilon}\, ds \right) = O_P\!\left( (2k)^{-1/2+\varepsilon} \right) = o_P(1) .
\]
Hence from (3.4.14) and (3.4.16), as $n \to \infty$,
\[
\int_{(2k)^{-1}}^{1} \frac{ x^2 }{ 2(1 \wedge (1+x)) }\, ds
= O_P\!\left( k^{-1} + k^{-1}(2k)^{2\varepsilon} + k^{-1}(2k)^{-1/2+\varepsilon} \right) = o_P(k^{-1/2}) ,
\]
where for the last equality we choose $\varepsilon < \frac14$. To sum up, we have proved that
\[
\int_0^1 \log\!\left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right) ds
= \gamma + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{1}{\gamma+1} + \frac{\gamma'}{\sigma_0'}\, k^{-1/2} \int_0^1 s^{\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) .
\]
The second equation of (3.4.8) can be treated with somewhat similar arguments. Then one gets
\[
\int_0^1 \left( 1 + \frac{\gamma'}{\sigma_0'} \frac{ X_{n-[ks],n} - X_{n-k,n} }{ a_0(n/k) } \right)^{-1} ds
= \frac{1}{\gamma+1} - \frac{ \frac{\gamma'}{\sigma_0'} - \gamma }{ (\gamma+1)(2\gamma+1) } - \frac{\gamma'}{\sigma_0'}\, k^{-1/2} \int_0^1 s^{2\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) .
\]

Hence, under the given conditions, the system of likelihood equations (3.4.8) is equivalent to
\[
\gamma + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{1}{\gamma'+1} + \frac{\gamma'}{\sigma_0'}\, k^{-1/2} \int_0^1 s^{\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) = \gamma' ,
\]
\[
\frac{1}{\gamma+1} - \frac{ \frac{\gamma'}{\sigma_0'} - \gamma }{ (\gamma+1)(2\gamma+1) } - \frac{\gamma'}{\sigma_0'}\, k^{-1/2} \int_0^1 s^{2\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) = \frac{1}{\gamma'+1} .
\tag{3.4.19}
\]
Then, in view of (3.4.11) and (3.4.16), (3.4.19) implies
\[
\gamma + \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{1}{\gamma+1} + \gamma\, k^{-1/2} \int_0^1 s^{\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) = \gamma' ,
\]
\[
\frac{1}{\gamma+1} - \frac{ \frac{\gamma'}{\sigma_0'} - \gamma }{ (\gamma+1)(2\gamma+1) } - \gamma\, k^{-1/2} \int_0^1 s^{2\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) = \frac{1}{\gamma'+1} .
\tag{3.4.20}
\]
The first equation and (3.4.11) show that $|\gamma' - \gamma| = O_P(k^{-1/2})$; hence $|\gamma' - \gamma|^2 = o_P(k^{-1/2})$. Therefore $1/(\gamma+1) - 1/(\gamma'+1) = (\gamma' - \gamma)/(\gamma+1)^2 + o_P(k^{-1/2})$, and so (3.4.20) implies
\[
\gamma' - \gamma - \left( \frac{\gamma'}{\sigma_0'} - \gamma \right) \frac{1}{\gamma+1} - \gamma\, k^{-1/2} \int_0^1 s^{\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) = 0 ,
\]
\[
\frac{ \gamma' - \gamma }{ (\gamma+1)^2 } - \frac{ \frac{\gamma'}{\sigma_0'} - \gamma }{ (\gamma+1)(2\gamma+1) } - \gamma\, k^{-1/2} \int_0^1 s^{2\gamma} Z_n(s)\, ds + o_P(k^{-1/2}) = 0 .
\tag{3.4.21}
\]
Straightforward calculations show that a solution of this linear system in $\gamma' - \gamma$ and $\gamma'/\sigma_0' - \gamma$ satisfies (3.4.17).
Since conversely a solution of type (3.4.17) obviously satisfies condition (3.4.11), it is easily seen that it also solves (3.4.19) and thus (3.4.8). $\square$

Proof (of Theorem 3.4.2). We shall prove the theorem only for the case $\gamma > 0$. The case $-\frac12 < \gamma < 0$ requires somewhat similar arguments; the proof in the case $\gamma = 0$ requires longer expansions, but the arguments are also similar. For the complete proof we refer to Drees, Ferreira, and de Haan (2003). A different proof, but only for the case $\gamma > 0$, can be found in Drees (1998), and for a slightly different approach we refer to Smith (1987).
Hence suppose $\gamma > 0$. Take $a_0(n/k)$ and $A_0(n/k)$ from Theorem 2.3.6. From Proposition 3.4.4 and (3.4.15), the sequence of solutions of (3.4.4), $(\hat\gamma_{MLE}, \hat\sigma_{MLE})$ say, satisfies
\[
\sqrt{k}(\hat\gamma_{MLE} - \gamma)
- \frac{ (\gamma+1)^2 }{ \gamma }\, \sqrt{k}\, A_0\!\left( \frac{n}{k} \right) \int_0^1 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) \Psi_{\gamma,\rho}\!\left( s^{-1} \right) ds
\xrightarrow{d} \frac{ (\gamma+1)^2 }{ \gamma } \int_0^1 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) \left( s^{-\gamma-1} W(s) - W(1) \right) ds
\]
and
\[
\sqrt{k}\left( \frac{ \hat\sigma_{MLE} }{ a_0(n/k) } - 1 \right)
- \frac{ \gamma+1 }{ \gamma }\, \sqrt{k}\, A_0\!\left( \frac{n}{k} \right) \int_0^1 \left( (\gamma+1)(2\gamma+1) s^{2\gamma} - s^{\gamma} \right) \Psi_{\gamma,\rho}\!\left( s^{-1} \right) ds
\xrightarrow{d} \frac{ \gamma+1 }{ \gamma } \int_0^1 \left( (\gamma+1)(2\gamma+1) s^{2\gamma} - s^{\gamma} \right) \left( s^{-\gamma-1} W(s) - W(1) \right) ds ,
\]
as $n \to \infty$, and the convergence holds jointly with the same limiting standard Brownian motion $W$.
as n - oo, and the convergence holds jointly with the same limiting standard Brownian motion W.
Next from Corollary 2.3.5 and Theorem 2.3.6 it follows that
\[
\lim_{t\to\infty} \frac{ \frac{a_0(t)}{a(t)} - 1 }{ A(t) } = \begin{cases}
1 - 1/\rho , & \rho < 0 ,\\
-1/\gamma , & \rho = 0 \ne \gamma ,\\
0 , & \rho = 0 = \gamma ,
\end{cases}
\; =: L
\]
and
\[
A_0(t) \sim A(t) \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) ,
\]
as $t \to \infty$. Hence the above relations in terms of $a$ and $A$ become
\[
\sqrt{k}(\hat\gamma_{MLE} - \gamma)
\xrightarrow{d} \lambda \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) \frac{ (\gamma+1)^2 }{ \gamma } \int_0^1 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) \Psi_{\gamma,\rho}\!\left( s^{-1} \right) ds
+ \frac{ (\gamma+1)^2 }{ \gamma } \int_0^1 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) \left( s^{-\gamma-1} W(s) - W(1) \right) ds
\]
and
\[
\sqrt{k}\left( \frac{ \hat\sigma_{MLE} }{ a(n/k) } - 1 \right)
\xrightarrow{d} \lambda L
+ \lambda \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) \frac{ \gamma+1 }{ \gamma } \int_0^1 \left( (\gamma+1)(2\gamma+1) s^{2\gamma} - s^{\gamma} \right) \Psi_{\gamma,\rho}\!\left( s^{-1} \right) ds
+ \frac{ \gamma+1 }{ \gamma } \int_0^1 \left( (\gamma+1)(2\gamma+1) s^{2\gamma} - s^{\gamma} \right) \left( s^{-\gamma-1} W(s) - W(1) \right) ds .
\]
Therefore, the components of the left-hand side of (3.4.7), minus deterministic bias terms, converge to certain integrals of a Gaussian process, which are normal random variables. If $\sqrt{k} A(n/k) \to \lambda$, the bias term of $\sqrt{k}(\hat\gamma_{MLE} - \gamma)$ tends to
\[
\lambda \left( 1_{\{\rho<0\}} \frac{1}{\rho} + 1_{\{\rho=0\}} \right) \gamma^{-1} (\gamma+1)^2 \int_0^1 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) \Psi_{\gamma,\rho}\!\left( s^{-1} \right) ds .
\]
Using the definition of $\Psi_{\gamma,\rho}$, the result follows by simple calculations. The asymptotic bias of the second component can be derived similarly.


To calculate the variance of the limiting normal random variable of $\sqrt{k}(\hat\gamma_{MLE} - \gamma)$, let
\[
X(s) := \gamma^{-1} (\gamma+1)^2 \left( s^{\gamma} - (2\gamma+1) s^{2\gamma} \right) \left( s^{-\gamma-1} W(s) - W(1) \right) .
\]
Then straightforward calculations show that
\[
\mathrm{Var}\left( \int_0^1 X(s)\, ds \right) = \int_0^1 \!\! \int_0^1 E\big( X(s) X(t) \big)\, ds\, dt = (\gamma+1)^2 .
\]
Likewise, to obtain the asymptotic covariance of $\sqrt{k}(\hat\gamma_{MLE} - \gamma)$ and $\sqrt{k}(\hat\sigma_{MLE}/a(n/k) - 1)$, let
\[
Z(s) := \gamma^{-1} (\gamma+1) \left( (\gamma+1)(2\gamma+1) s^{2\gamma} - s^{\gamma} \right) \left( s^{-\gamma-1} W(s) - W(1) \right) .
\]
Then
\[
\mathrm{Cov}\left( \int_0^1 X(s)\, ds,\; \int_0^1 Z(s)\, ds \right) = \int_0^1 \!\! \int_0^1 E\big( X(s) Z(t) \big)\, ds\, dt = -(1+\gamma) .
\]
The limiting variance of the scale estimator is obtained similarly. $\square$

3.5 A Moment Estimator ($\gamma \in \mathbb{R}$)


Next we want to develop an estimator similar to the Hill estimator but one that can be used for general $\gamma \in \mathbb{R}$, not only for $\gamma > 0$. In order to introduce the estimator, let us look at the behavior of the Hill estimator for general $\gamma$; in fact we look at a slightly more general statistic.
An immediate problem with applying the Hill estimator for the case $\gamma < 0$ is that $U(\infty) \le 0$ is possible, in which case the logarithm of the observations is not defined. In order to deal with this we shall assume throughout that $U(\infty) > 0$, which can be achieved by applying a shift to the data. However, one should be aware that this shift influences the behavior of the estimator.
Lemma 3.5.1 Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$, and suppose $F \in \mathcal{D}(G_\gamma)$, $x^* = U(\infty) > 0$, i.e., for $x > 0$,
\[
\lim_{t\to\infty} \frac{ U(tx) - U(t) }{ a(t) } = \frac{ x^{\gamma} - 1 }{ \gamma } .
\tag{3.5.1}
\]
Define for $j = 1, 2$,
\[
M_n^{(j)} := \frac{1}{k} \sum_{i=0}^{k-1} \left( \log X_{n-i,n} - \log X_{n-k,n} \right)^{j} .
\tag{3.5.2}
\]
Then for $k = k(n) \to \infty$, $k/n \to 0$, $n \to \infty$,
\[
\frac{ M_n^{(j)} }{ \left( a(n/k)/U(n/k) \right)^{j} } \xrightarrow{P} \frac{ j! }{ \prod_{i=1}^{j} (1 - i\gamma_-) } ,
\tag{3.5.3}
\]
with $\gamma_- = \min(0, \gamma)$.

Proof. For $\gamma > 0$ relation (3.5.1) simplifies (cf. Corollary 1.2.10) to
\[
\lim_{t\to\infty} \frac{ U(tx) }{ U(t) } = x^{\gamma} , \qquad x > 0 ,
\]
hence
\[
\lim_{t\to\infty} \left( \log U(tx) - \log U(t) \right) = \gamma \log x .
\]
For $\gamma < 0$ we have (cf. Lemma 1.2.9) $\lim_{t\to\infty} U(tx)/U(t) = 1$ for all $x > 0$. Hence (3.5.1) is equivalent to
\[
\lim_{t\to\infty} \frac{ \log U(tx) - \log U(t) }{ a(t)/U(t) } = \frac{ x^{\gamma} - 1 }{ \gamma } .
\]
Summarizing, we get that relation (3.5.1) is equivalent to
\[
\lim_{t\to\infty} \frac{ \log U(tx) - \log U(t) }{ a(t)/U(t) } = \frac{ x^{\gamma_-} - 1 }{ \gamma_- }
\tag{3.5.4}
\]
with $\gamma_- := \min(0, \gamma)$, and (cf. Lemma 1.2.9)
\[
\lim_{t\to\infty} \frac{ a(t) }{ U(t) } = \gamma_+
\tag{3.5.5}
\]
with $\gamma_+ := \max(0, \gamma)$.

Next we use the inequalities of Theorem B.2.18: for each $\varepsilon > 0$ there exists $t_0$ such that for $t \ge t_0$, $x \ge 1$,
\[
\frac{ x^{\gamma_-} - 1 }{ \gamma_- } - \varepsilon\, x^{\gamma_- + \varepsilon}
\le \frac{ \log U(tx) - \log U(t) }{ q_0(t) }
\le \frac{ x^{\gamma_-} - 1 }{ \gamma_- } + \varepsilon\, x^{\gamma_- + \varepsilon} ,
\tag{3.5.6}
\]
where $q_0$ is a positive function satisfying $q_0(t) \sim a(t)/U(t)$, as $t \to \infty$.
Let $Y_1, Y_2, \ldots, Y_n$ be independent and identically distributed with distribution function $1 - 1/x$, $x \ge 1$. We apply these inequalities with $t := Y_{n-k,n}$ (tending to infinity a.s., as $n \to \infty$) and $x := Y_{n-i,n}/Y_{n-k,n}$. We then have eventually, uniformly for $0 \le i \le k-1$,
\[
\left| \frac{ \log U(Y_{n-i,n}) - \log U(Y_{n-k,n}) }{ q_0(Y_{n-k,n}) } - \frac{ \left( \frac{Y_{n-i,n}}{Y_{n-k,n}} \right)^{\gamma_-} - 1 }{ \gamma_- } \right|
\le \varepsilon \left( \frac{ Y_{n-i,n} }{ Y_{n-k,n} } \right)^{\gamma_- + \varepsilon} .
\]
It follows, by adding the inequalities for $i = 0, 1, \ldots, k-1$, that
\[
\frac{1}{k} \sum_{i=0}^{k-1} \frac{ \log U(Y_{n-i,n}) - \log U(Y_{n-k,n}) }{ q_0(Y_{n-k,n}) }
\le \frac{1}{k} \sum_{i=0}^{k-1} \frac{ \left( \frac{Y_{n-i,n}}{Y_{n-k,n}} \right)^{\gamma_-} - 1 }{ \gamma_- }
+ \frac{\varepsilon}{k} \sum_{i=0}^{k-1} \left( \frac{ Y_{n-i,n} }{ Y_{n-k,n} } \right)^{\gamma_- + \varepsilon}
\stackrel{d}{=} \frac{1}{k} \sum_{i=1}^{k} \frac{ (Y^*_i)^{\gamma_-} - 1 }{ \gamma_- }
+ \frac{\varepsilon}{k} \sum_{i=1}^{k} (Y^*_i)^{\gamma_- + \varepsilon} ,
\]
with $Y^*_1, \ldots, Y^*_k$ independent and identically distributed with distribution function $1 - 1/x$, $x \ge 1$, by the reasoning from the proof of Lemma 3.3.2. By the law of large numbers the right-hand side converges in probability to the mean, i.e., to
\[
E\, \frac{ Y^{\gamma_-} - 1 }{ \gamma_- } + \varepsilon\, E\, Y^{\gamma_- + \varepsilon}
= \frac{1}{1 - \gamma_-} + \frac{ \varepsilon }{ 1 - \gamma_- - \varepsilon } .
\]
A similar lower bound applies; this proves (3.5.3) for $j = 1$.
Next, by squaring the expansions (3.5.6), we find that
\[
\left( \frac{ \log U(tx) - \log U(t) }{ q_0(t) } \right)^{2}
\]
is bounded above and below by expressions of the same type. Starting from these inequalities, we follow the same reasoning as before; this leads to (3.5.3) for $j = 2$. $\square$

It follows from Lemma 3.5.1 (cf. (3.5.5)) that the Hill estimator converges to zero
for y <0; hence this estimator is noninformative in this range. However, this lemma
helps us to find a consistent estimator of Y for y < 0, since under its conditions,

K) 2

l - -2 K _
zu -Y-)y-)

(3.5.7)

As mentioned before we also>have


have
* P
yH-+ Y+

(3.5.8)

This leads to the following combination of the Hill estimator and the statistic in
(3.5.7):

u\_m
v

2\

f(l)
yM:=MX>
+ \--\\-^-\

2I

for which we have proved the following:

MP

"I

(3.5.9)

3.5 A Moment Estimator

103

Theorem 3.5.2 Let X\, X 2 , . . . be i.i.d. random variables with distribution function
F. Suppose F e V(GY) andx* > 0. Then
* P
YM^Y

for y e E provided the sequence k is intermediate, i.e., k = k(n)


as n -> 00, i.e., YM is consistent for y.

00, k/n -> 0

Remark 3.5.3 The estimator y^f is called moment estimator. The name stems from
the fact that the left-hand side of (3.5.3) converges to E(Yy- - l)j/yL j = 1, 2,
which is the j\h moment of the limiting generalized Pareto distribution. In contrast,
remember that the Pickands estimator is a quantile estimator (Remark 3.3.4).
Next we prove that $\hat\gamma_M$ is asymptotically normal under appropriate conditions. In particular we need a second-order condition for the function $\log U(t)$. From Lemma B.3.16 (see Appendix B) we know that under the usual second-order condition for $U$,

$$\lim_{t\to\infty} \frac{\frac{U(tx)-U(t)}{a(t)} - \frac{x^{\gamma}-1}{\gamma}}{A(t)} = \int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\,du\,ds \tag{3.5.10}$$

for all $x > 0$, and if $\gamma \ne \rho$, with $\rho < 0$ in case $\gamma > 0$, a second-order condition for $\log U(t)$ holds:

$$\lim_{t\to\infty} \frac{\frac{\log U(tx)-\log U(t)}{q(t)} - \frac{x^{\gamma_-}-1}{\gamma_-}}{Q(t)} = \int_1^x s^{\gamma_--1}\int_1^s u^{\rho'-1}\,du\,ds =: H_{\gamma_-,\rho'}(x) , \tag{3.5.11}$$

with $\gamma_- := \min(0,\gamma)$ as before, $q := a/U$ a positive function, and $Q$ not changing sign eventually with $Q(t) \to 0$, $t \to \infty$. When $\gamma > 0$ and $\rho = 0$ the limit in (3.5.11) vanishes. One possible choice of $Q(t)$ is
$$Q(t) := \begin{cases}
A(t) , & \gamma < \rho \le 0 ,\\[2pt]
\frac{a(t)}{U(t)} - \gamma_+ , & \rho < \gamma < 0 \text{ or } (0 < \gamma < -\rho \text{ and } \ell \ne 0) \text{ or } \gamma = \rho ,\\[2pt]
\frac{\rho\,A(t)}{\gamma+\rho} , & (0 < \gamma < -\rho \text{ and } \ell = 0) \text{ or } \gamma > -\rho > 0 ,\\[2pt]
A(t) , & \gamma > \rho = 0 ,
\end{cases} \tag{3.5.12}$$

where $\ell := \lim_{t\to\infty}\bigl(U(t) - a(t)/\gamma\bigr)$ (cf. Appendix B).


Then, according to Theorem 2.3.6 with suitably chosen $Q_0$,

$$q_0(t) := \begin{cases}
c\,t^{\gamma_-} , & \rho' < 0 ,\\[2pt]
-\gamma_-\bigl(\log U(\infty) - \log U(t)\bigr) , & \gamma_- < \rho' = 0 ,\\[2pt]
(\log U(t))^+ + (\log U(t))^- , & \gamma_- = \rho' = 0
\end{cases} \tag{3.5.13}$$

(see the theorem for the meaning of $(\log U(t))^+$ and $(\log U(t))^-$) and $c := \lim_{t\to\infty} t^{-\gamma_-}q(t) > 0$, we have that for each $\varepsilon, \delta > 0$ there exists $t_0 = t_0(\varepsilon,\delta) > 0$ such that for all $t, tx \ge t_0$,


3 Estimation of the Extreme Value Index and Testing


$$\left| \frac{\frac{\log U(tx)-\log U(t)}{q_0(t)} - \frac{x^{\gamma_-}-1}{\gamma_-}}{Q_0(t)} - \Psi_{\gamma_-,\rho'}(x) \right| \le \varepsilon\, x^{\gamma_-+\rho'}\max\bigl(x^{\delta}, x^{-\delta}\bigr) .$$

Theorem 3.5.4 (Dekkers, Einmahl, and de Haan (1989)) Let $X_1, X_2, \ldots$ be i.i.d. random variables with distribution function $F$ with $x^* > 0$. Suppose the second-order condition (3.5.10) holds with $\gamma \ne \rho$. If the sequence of integers $k = k(n)$ satisfies $k \to \infty$, $k/n \to 0$, and

$$\lim_{n\to\infty} \sqrt{k}\,Q\Bigl(\frac{n}{k}\Bigr) = \lambda \tag{3.5.14}$$

with $Q$ from (3.5.11) and $\lambda$ finite, then

$$\sqrt{k}\,(\hat\gamma_M - \gamma) \xrightarrow{d} \lambda\,b_{\gamma,\rho} + \sqrt{\operatorname{var}_\gamma}\,N \tag{3.5.15}$$

with $N$ standard normal, where


(1-K)(1-2K)

by,p :=

< Q

y(i+y)
(l-K)(l-3y)
Y
(1+K)2 '
y-yp+P
Pd-P)2 '

0 < y < p and I ^ 0,

1,

y > P = 0,

< y < 0,

(0 < y < p and

and

[ y2 + l ,
varv :=

(3.5.16)

l^0)ory> - p > 0

y > 0,
(3.5.17)

1 (l- K )2(i_2 K )(l- y +6 K 2 )


.
[
(l-3y)(l-4y)
' > ^

Q
U

'

For the proof we start with an extension of Lemma 3.5.1. Since in the proof we
use the result of Theorem 2.3.6 (the uniform inequalities connected to (3.5.11)), we
use the function qo from that theorem for the formulation of the lemma.
Lemma 3.5.5 Assume the conditions of Theorem 3.5.4. Write Xt = U(Y(), i =
1,2,..., where Y\, Y2,... are i.i.d. with distribution function 1 1/JC, x > 1. With
the notation of Lemma 3.5.1:
1> ify<0orp^0the

random vector
(2)

M"n
1 ~ y- ql<Xn-k,n)

\qo(Yn-k,n)

(l-y-)(l-2y_)

(3.5.18)

converges in distribution to a random vector, (P, Q) say, normally distributed


with mean
* ( V < 0 } ^ 7 + V = 0 } ) (E*y-,AY),
K(

p' \l-y--p"
x

(__\

1(0,0),

IE ( 1 ^ - ^ 1 * ^ , ( 7 ) ^

2(2-2y_-p')

(1-K_)(1-/_-P')(1-2K--/O')/

2(2-3y_)

,
'

Q
<

'

' - fW v

/o' = 0 = Y- ,

3.5 A Moment Estimator

105

and covariance matrix


1
( l - y _ ) 2 ( l - 2 y - ) I __4

4(5-lly_)
I>
\ l-3y_ (l-2y_)(l-3y_)(l-4y_) /

recall (cf. (2.3.16))

* y - . / / ( * ) ==

y_+p'
'
^jc^-logx,

P <U'
/O' = 0 > K _

2. i/y > 0 and p =0the random vector


r-l
Af(1)
Vt[-^=

M (2)
1 , - ^

\
2

(3.5.19)

converges in distribution to a random vector, (P, Q) say, normally distributed


with mean (0,0) and covariance matrix

(420)
Proof. The proof is somewhat similar to that of the corresponding result for the
Hill estimator (Theorem 3.2.5). Theorem 2.3.6 tells us that one can choose functions
qo > 0 and Qo such that for any s > 0 there exists to such that for all t > to and
x > 1,
*-

+ Go(0*y..P'to - lGo(OI* K - + / / +
<

log U(tx)-

log t/(Q

4o(0
<

^ i l + eo(o*y-.P'(*) + eieo(oi*y-+p/+fi,,

(3.5.20)

Let us concentrate on the upper inequality. We apply this inequality with t replaced by Yn-k,n (tending a.s. to infinity) and x replaced by Yn-iin/Yn-k,n for
i = 0 , 1 , . . . , k 1. Then we eventually get that

MP ^ig.ter- 1
qo(Yn-k,n)

k^

y.

i=0

1 k~l /Y

\ln-k,n/

\y-^

+s|c.-,.|iE(fe)

106

3 Estimation of the Extreme Value Index and Testing

As in the proof of Lemma 3.5.1, theright-handside is equal in distribution to

k
Y

+ Qo(Yn-k,n)-j- J2 %-A i)

\Qo(Yn-k,n)\

/= 1

J^VF)*-*'**
1=1

with y*, y*, y , . . . , Y independent and identically distributed with distribution


function 1 1/JC, x > 1, and independent of Yn-k,n- Hence

rjl

1_\

M?
\qo(Yn.k<n)

l-y-J

1 *

+*Vk \Qo(Yn-k,n)\

- Y^(Y*)y-+>'+
K

=i

One easily verifies that the conditions of the central limit theorem are fulfilled for
the first term and the conditions of the law of large numbers for the other two terms.
Then the last term vanishes in the limit. Moreover, since by Corollary 2.2.2
k

Jtl

-tn-k,n

-* 1

n
and since g o is a regularly varying function, we have
Qo(Yn-k,n) P Co(f)

""*

Recall from Corollary 2.3.5, Theorem 2.3.6, and assumption (3.5.14) that if p' < 0,
H m V S e 0 ( J ) = H m ^ f i ( ; ) = At
*-oo
\k/
t-+oo p' \k)
p1
andifp' = 0,
lim VkQo (y) = lim VkQ (7) = k .
Hence, since a similar lower bound applies, as n - 00,

\qo(Yn-k,n)

l-Y-J
-

\kf^
k

Y-

Y-

(V<0]^7 + V=0}) ^*K-.P'( F *) "^ 0 (3.5.21)

3.5 A Moment Estimator

107

Since in particular M^ }/qo(Yn-k,n) -> 1/(1 - Y-) and

\q%(Yn-k,n)

\l-Y-J

*Al)

_rJ

1 \(

\qo(Yn-k,n)

\-Y-J

\qo(Yn-k,n)

1 \
1 - Y- ) '

we also have that as n -* oo,

q^Yn-k,n)

(1 - Y-r

_ 2Vk
1

/ 1 y . (Y*y- - 1 _

- Y- \

~{

(Y*y--i\

y-

r-

- YZ^Z ( V < o } ^ + V=o})

*y-,P'( y *)"^0 (3.5.22)

Now we turn to M . The inequalities (3.5.20) yield


Qogt/frt) -logt/(Q) 2 /**- - 1 V , x *?- - 1 ,
?o(0

V y-

y-

+2e|i2oWI^-^^-+p,+
y-

+Qo(0 ( < W ( * ) + sign2o) e^- 1 ""'-") 2 .


Hence eventually,

ql<X,
k

1 J ^ (F*)y- 1
+ 2Go(K n - M )- ^ y-- %_,P>(Y*)
1=1

+2* ig0(yw-M)i 1 E ( 1 ? ) y " " * (^) K -^ +


1

i=l

Again we can apply the central limit theorem to the first term on the right-hand
side and the law of large numbers to the other terms. The last two terms vanish in the
limit. A similar lower bound applies. We conclude that as n -> oo,

108

3 Estimation of the Extreme Value Index and Testing

Vk

MP
( ^(Yn-k,n)

(1 - y - ) d - 2y_) )

- 2X (l{p><o)jf + V=o } ) ( ( n ^ " 1 V ^ < r > ) ^ (3 - 5 ' 23)


In view of (3.5.22) and (3.5.23) it suffices for the proof to find the joint limit
distribution of

which can be found in a routine way by applying the Cramer-Wold device, Lyapunov'S
theorem, and the central limit theorem.
The proof of the second statement is similar.

Corollary 3.5.6 Under the conditions of Theorem 3.5.4,

V*

U(>-m

-Y-

M< 2 >

IV

= v*

M (2)

('40-^H^D|
(1 - y-)(l -

-i (1 - 2y_)(l - K_)2

KM

e-2P

w/iere (P, 2 ) is the limit vector of Lemma 3.5.5. Hence the limiting random variable
is normal with mean XbYfP where
-

a-X)(l-2r)

(fe&>
by,p :=

1
(1+K)2
1
(1-p) 2

10,

<O<0

P<Y<O,
0 < y < p and I ^ 0,
(0 < y < p and l =
y > p = 0,

0)ory>p>0,

3.5 A Moment Estimator

109

and variance
Y >0,
2
3
varv := < (l-y)^2 (l-2K)(l-ll>/+48)/
" -2y)(l-ll)/+48"22-44
-^"
) / )^

(l-3y)(l-4y)

< Q

The bias of the limiting random variable in terms of (y_, p') is given in Exercise 3.10.
Proof. The result is straightforward from Lemma 3.5.5 and application of Cramer's
delta method.

Remark 3.5.7 In subsequent chapters we shall sometimes work with

Proof (of Theorem 3.5.4). It remains to study the first part of the estimator, i.e.,

We know that

W!)

'->-/

with P the first component of the limit vector from Lemma 3.5.5. Therefore

^ (Jtf -, + )-^{.g)( I ^ + i + .,())- w }


= r^-((i)-^)+(r)',+'"<1)From the relations between the functions qo and q given in Corollary 2.3.5 and
Theorem 2.3.6, and from qY,p = lim r ^oo(^(0 X+)/Q(0 given in (B.3.46), Appendix B, we have
o
,. 4o(t)-y+
qr. n = hm
*Y*P

,_,

Q(t)

-l,
I
= { l,

< y < 0,

> P = o,

0,

otherwise .

Then the limit distribution of Vk(yM y) is the distribution of

l(K)

X^-f^- + K + P + (1 - 2y-)(l - y_) z {( - - y_ ) Q - IP \ .

(3.5.24)

From this and Lemma 3.5.5 one gets the asymptotic distribution by straightforward
but lengthy calculations.

110

3 Estimation of the Extreme Value Index and Testing

3.6 Other Estimators


In this section we briefly review two further estimators of y, the probability weighted
moment estimator and what we call the "negative Hill estimator."
3.6.1 The Probability-Weighted Moment Estimator (y < 1)
First let us consider the probability-weighted moment estimator of Hosking and Wallis
(1987). The starting point is the observation that if V is a random variable with a
generalized Pareto distribution, i.e., with distribution function
yx\-i/Y
/
yx\-l/y
HYt0e(x) := 1 - h + j
,

0<JC<

a
Ov(-y)

where a > 0 and y are real parameters, then for y < 1,


EV=

(1-H y t (jc)) dx = .
l - y

Jo

(3.6.1)

Moreover,
1

E [V (1 - HYta(V))}

rl/(0v(-y))

= -J

(I - Hy,a(x)f

dx =

2{2_yy

(3.6.2)

which can be called a probability-weighted moment.


We can solve relations (3.6.1) and (3.6.2) and obtain
EV-4E{V(l-HY,a(V))}
EV-2E[V(l-Hy,a(y))}
and
a =

2EV E {V (\ - HYy a(V))\


LV
* ''I .
EV-2E{V(l-Hy,a(V))}

(3.6.4)

A sample analogue of the right-hand sides of (3.6.3) and (3.6.4) will provide estimators
for y and a.
Consider independent and identically distributed random variables Xi, X 2 , . . .
with distribution function F and suppose that F is in the domain of attraction of an
extreme value distribution GY. Then we know (e.g., Section 3.1) that for x > 0 and
a suitably chosen positive function / ,

If we want to build analogues of (3.6.3) and (3.6.4), we can try to replace E V by


fl-F(t

J0

+ xf(t)) J

i-Fm

dx =

W)l
1

[ 1 - F(u)

du

f00

T=m
u-t

3.6 Other Estimators

111

wdE{V(l-HYta(V))}by

2 Jo \

1-F(0

dX

2/(0 X u - F ( o / "
=-77T /
/(OJr

("-0

~F(M),
(l-F(f))2

(3.6.6)

^F(M).

Next we need sample analogues of (3.6.5) and (3.6.6). These can be obtained
by replacing t with the intermediate order statistic Xn-k,n and replacing F with the
empirical distribution function Fn. Then (3.6.5) becomes after normalization (note
that 1 - F ( X _ M ) = k/n),
Pn := J. J^ *\ ~ Xn-k,n,

(3.6.7)

and (3.6.6) becomes


. k-\ .
i=0

This leads us to consider the estimators


-l

YPWM

-= -f^Wn

=l

-\OQn-1)

(3 69)

and

: = ^T2G:= p fe- l )

(3610)

which are the probability-weighted moment estimators.


In fact, the definition encompasses variants whereby in the definition of Qn the
factor i/k is replaced with (i + c)/(k + J), which may improve the finite sample
behavior but does not affect the asymptotic behavior. Hosking and Wallis mention
the choice c = 0.35 and d = 0.
In order to derive the asymptotic behavior, note that by Corollary 2.4.6, under the
second-order condition for y < j ,

*(4-jt , c H
= f s-r-lWn(s)~Wn(l)ds

+ y/kAo(^)J

VY,p(s-l)ds + oP(l)
(3.6.11)

112

3 Estimation of the Extreme Value Index and Testing

with {W(.y)} a sequence of standard Brownian motions. Similarly,


Qn

Vk

i:~<)

= I s-yWn(s)-sWn(l)ds

+ VkA0(^

f sVy,p(s-l)ds

op(l).
(3.6.12)

For the above relations in terms of the functions a and A instead of ao and Ao,
see the proof of Theorem 3.4.2. Then by working out the probability distribution of
the corresponding right-hand sides and applying Cramer's delta method, one gets the
following result:
Theorem 3.6.1 Let X\, X2,... be Ltd. with distribution function F.
1. IfFe

V{GY) with y < 1, then


P
YPWM -> Y

,
and

&PWM P ,
m\ ~ * l

provided k = k(n) > 00, k(n)/n 0, n > 00.


2. If the second-order condition (2.3.5) is fulfilled with y < | , k = k(n) > 00,
k/n > 0, and limw^oo VkA (n/k) = K
VA: yp^M - y,

- 1 I

w asymptotically normal with mean vector


(1.MVy.p)((i-y)(2-y),-rt

A(1,0) ,
A(I,-^)

,/><o,
y ^ = 0,

y = /> = 0 ,

anJ covariance matrix


(l-y)(2-y)2(l-y+2y2)

(2->/)(-2+6K-7)/2+2K3)

(1-2 K )(3-2 K )

(l-2y)(3-2y)

(2-K)(-2+6K-7>/2+2K3)
(1-2K)(3-2K)

31-94y+102>/ 2 -126K 3 +144?/ 4 -80y 5 +16>/ 6


(l-2 K )(3-2>/)

Remark 3.6.2 For | < y < 1 the convergence of pp WM to y is slower than that for
y<\Remark 3.6.3 The statistic Pn is commonly called the empirical mean excess function. Its main use seems to be to distinguish between subexponential and superexponential distributions. It plays a role in insurance theory; see, e.g., Embrechts,
Kluppelberg, and Mikosch (1997). The statistic is often discussed in books on extreme value theory; see e.g., Falk, Hiisler, and Reiss (1994) and Beirlant, Teugels,
and Vynckier (1996).

3.6 Other Estimators

113

3.6.2 The Negative Hill Estimator (y < - )


The "negative Hill estimator" was proposed by Falk (1995). It can serve as a complement, to be used when y < \, for the maximum likelihood estimator of Section 3.4.
One starts by observing that if the distribution of X is in the domain of attraction
of Gy with y < 0, then the upper endpoint of the distribution of X, x* is finite, and

is in the domain of attraction of G-Y (cf. Theorem 1.2.1). Hence we could apply the
Hill estimator to X, but this way we do not obtain a statistic, since x* is not known.
Fortunately, for y < ^ the endpoint JC* can be very well approximated by the largest
order statistic Xn^n (cf. Remark 4.5.5 below). This leads to what we call the negative
Hill estimator
k-i
1O

YF'-=TJ2

8(

"' " x-.) " l0S (*. ~ X - ^ ) *

(3A13)

Note that this estimator is shift and scale invariant.


For the asymptotic analysis of this estimator we once again use the theory of
Section 2.4, in this case Corollary 2.4.5: for y < \ and 0 < $ < 1,

--

v -X>n,n

Xn[ks],n

Mir-

= 1 + ~~= (s-lWn(s) + VkA0 ( ) sy*y,p(s-1)

+ op(\) s" 1/2 - )

with {Wn(s)} a sequence of standard Brownian motions.


One is tempted to take logarithms on both sides and then expand the right-hand
side. However, the second term on the right-hand side does not go to zero uniformly in
s. Nevertheless, the convergence is uniform for sn <s< 1 withs^ | 0 an appropriate
sequence. Then one can prove that
fT f

l io g I Xn,n

^io l

~ Xn-[ks],n \

ood) r

= Y I s- Wn(s)ds^yVkA0(^)

f syVYiP(s-l)ds

+ op(l),

which, together with a similar expansion with Xn-[ks],n replaced by Xn-k,n, leads to
the following result:
Theorem 3.6.4 Let X\, X 2 , . . . be i.i.d. random variables with distribution function F.

114

3 Estimation of the Extreme Value Index and Testing

1. IfF V(Gy)

with y < - \ , then


~

provided k = k(n) -> oo, k/n 0, k^/logn > oo, for some small rj, as
n -> oo.
2. If the second-order condition (2.3.5) is fulfilled with 1 < y < | , k =
k(n) -> oo, &/n -> 0, k^/logn -> oo, for some small n and y/kA (n/k) - A,
as oo, then
Vk(yF-y)
has asymptotically a normal distribution with variance y2 and mean

Xy J syVy,p(s-1)ds =
Jo

p(l+y)(l-p)
A,

'^<0'
p = 0.

We give the proof of the asymptotic normality. The proof of the consistency is
left to the reader (Exercise 3.11).
Proof (via the tail quantile process (Holger Drees)). We shallfirstconsider the sum
in the definition of yp for i = 1 , . . . , j , where j = j(n) is some sequence with
1 < j(n) < k(n) - 1. Note that
^ 1 V^i
" k ^ g

/ Xn,n ~ Xni,n \
\Xn,n - Xn-ktn)
'

J ~~ 1 .
f Xn,n ~~ Xn-l,n
k
g V 1/(00) - Xn-k9n)

\
'

From Theorem 2.1.1,


Xn,n ~~ Xnl,n

Xn^n ~" n

^n1,/t ~~ "n

a{n)

a(n)

a(n)

0P{\)

for some positive sequence a(n) e RVY. Consequently log(Xn> Xn-\yn)


OpQogn), n -> oo. On the other hand, from Lemma 1.2.9 and Theorem 2.4.1,
[/(oo) - Xn-Kn

U(oo) - /(f)

(f)

Xn-k,n - E/(f) p

<)

*(f)

~* y '

hence log(/(oo) Xn-k,n) = Op(log(n/k)). Therefore

i|>'(M)- 0 '(W- <"14>


Next we consider
k-i
1 \~^ ,

/ Xnn

IZS1O&\Y
K

~~ Xni,n \

y
A

\ n,n

/,/* IOg (

~~ nk,n /

I
=

/
Jj/k

I ^n,n ~~ AB-[fcs],w \ ,

Z\^
\

T^)

d s

n,n ~~ ^nktn /

00(f) J dS ~Ll08 (

o,(f) ) *

(36 15)

3.6 Other Estimators

and take j = k~*+\ i.e., j/k = k~s with 0 < 8 < (-(ly)'1
A (1 + Is)'1),
8
some s < \. Then from Corollary 2.4.5, for k~ < s < 1 we have

l0

H"^

o,(f)

115

for

= log A + -= j * " 1 W(*) + VfcAo ( ) ^'Py.pC*-1) + oj(*- 1/2 )})


= ^= \s-lWn(s)

+ ^ A 0 ( ) ^ y . p ^ - 1 ) + 0/>(* _1/2_e )) (1 + op(D),

where {W (s)} is a sequence of standard Brownian motions and the op -term is uniform
in s.
Hence if, moreover, 5 > \ so that / 0 log(ysy) ds = o(l/*Jic), then

. l ,_ lot(rt+ .(^) + ^lj i (.-. w . w


+ V*A0(J)*)'*y.p(s~1)+O|(*-1/2-,)j(l+O|(l))</*
= 7 7 = / (s- 1 W(5) + ^/fcAoQ5>'* y ,p( S - , )) <fc

-log(-K) + y + * ( - L ) .
The second term in (3.6.15) is similar but simpler. Again using Corollary 2.4.5,
we obtain

.(,-t-oj h ,(, + ^., + .,(-^))- bf( - K ,


_ ( ,_0(^ir.(i) + .,(-J c ) -.og<-,,)
Y_ (l)-log(-y)
= ^W
n
sfk

Hence

+ oP(-j=\

116
I

3 Estimation of the Extreme Value Index and Testing


i

I Xn,n ~ Xn[ks],n \ ,

/ l0gl~?

= Y+
+

d s

7ni s~x ^Wn(s) ~ Wn{l) + ^Ao sY^-p(s~l))

o,(-L).

(3.6.16)

Combining (3.6.14) with (j/k)\ogn = k~8logn = o(l/Vk),


(-(2y)~l A (1 4- 2s)'1), s < \, and (3.6.16), we arrive at

Vk(yF-y)-f

ds

l
Ys- Wn(s)-yWn{\)

for some \ < 8 <

+ ^fkA0(^YSY%As~l)ds

as n -> oo, from which the asymptotic normality follows.

= oP(l) ,

3.7 Simulations and Applications


We consider the various estimators of the extreme value index introduced and discussed in the previous sections: Hill (/#), Pickands (yp), maximum likelihood
(^MLEX moment ( / M X probability weighted moment (KPWM) and negative Hill (yp).
Recall that under appropriate conditions, the Hill estimator is consistent only for
positive values of y, the MLE is defined for y > \, the Pickands and moment
estimators are defined and consistent for all real values of y, the PWM is consistent
fory < 1, and the negative Hill for y < \. Moreover, note that the Hill and moment
estimators are not shift invariant though they are scale invariant, and the Pickands,
MLE, PWM and negative Hill are shift and scale invariant.
We start by comparing the asymptotic properties of these estimators. For instance,
we shall see that for an important range of values of y, the Pickands estimator has
larger asymptotic variance than the others, whereas the moment and maximum likelihood estimators compete with each other when y is around zero. Afterward, we
give some simulation results for some common distributions. Finally, we apply the
extreme value index estimators to the three data sets: sea level, S&P 500, and life span.
3.7.1 Asymptotic Properties
In the previous sections, under appropriate conditions we have proved the asymptotic
normality of all the estimators. That is, for an independent and identically distributed
sample of size n and with k = k(ri) an intermediate sequence,
y/k(y

- y) % JV2^N

+ kbytP

with N standard normal, where the constants A, var^, and bYiP are known (cf. Theorems 3.2.5, 3.3.5, 3.4.2, 3.5.4, 3.6.1, and 3.6.4).

3.7 Simulations and Applications

117

For the asymptotic normality of the Pickands, MLE, PWM, and negative Hill
estimators we require the second-order condition of Section 2.3, (2.3.5), with auxiliary
second-order function A e RVp<o, and that the intermediate sequence k satisfy
As is most common for the asymptotic normality of Hill's estimator, we require the
second-order condition (2.3.22) of Section 2.3 with auxiliary second-order function
A* e RVp*<o, say, and that the intermediate sequence k satisfy \fkA*{n/k) > A.*,
say, with A* e R.
Finally, for the asymptotic normality of the moment estimator we require the
second-order condition (2.3.5) with p ^ y. Then we have the second-order condition
in terms of log U; cf. (3.5.11)recall U := (1/(1 - F))*~ with F the underlying
distribution functionwith auxiliary second-order function Q e RVp> with p' known
(cf. TheoremB.3.1) and the intermediate sequence k = k(n) satisfying VkQ(n/k) ->
A/, say, A/ e E.
Therefore, to compare the estimators we should first of all compare the orders of
k. In Table 3.1 are the relations among the several second-order auxiliary functions
and respective indices, for some combinations of y and p. We have (i) for some
cases the auxiliary functions are all of the same order but (ii) for some other cases
A(t) = o(A*{t)) or A(t) = o(Q(t)), t -> oo. In terms of the growth conditions
for k, in case (i) they are the same for all the estimators, but in (ii), for instance
\fkA(n/k) -> k > 0 corresponds to k of larger order than if VkA*(n/k) -> A.* > 0
or VkQ(n/k) - A/ > 0, meaning a slower rate of convergence for the former. We
shall come to this point later on, in the optimal mean square error analysis.
Table 3.1. Second-order items related to the nondegenerate behavior of the estimators;
l:=\imt-+oo(U(t)-a(t)/y)

^ YMLE>YPWM
yp>
/v

2nd-order condition

/s

(2.3.5)

2nd-order
auxiliary
if
function
for fc's

0 < y < p and/ ^ 0


0 < y < p and / = 0

A
A
A
A

growth

y>-p^0

index of
2nd-order
auxiliary if
function

y <P
<y
0 < y
0< y

P
P
P
P
P

y <P < o
<y < o

<0
<0
< p and / ^ 0
< p and / = 0

Y>-P=O

^
yn

^
yu

^
YF

(2.3.22) (3.5.11) (2.3.5)

A
A

yA_
Y+P

-y
p
p

A
A
pA
Y+P
pA
Y+P

P
y
-y
p
p

A
A

:
P
P

P
P

118

3 Estimation of the Extreme Value Index and Testing


MLE
Mom

rllll
rlCK
i

-2

-1

PWM
Falk
i

_..

25 20 -

\
\

15 -

10 5 -

^ ^ ^ ^ ^

0 H

I
4

Fig. 3.2. Asymptotic variances.

Next, in Figure 3.2 we compare the asymptotic variances. The Hill estimator has
systematically the smallest variance in its range of possible values. Hill's and the
moment estimators have the smallest asymptotic variances for positive values of y.
The MLE and negative Hill estimators have the smallest asymptotic variances for
negative values of y.
It is more complicated to compare the bias of the estimators, since in general the
bias depends on both parameters y and p among other characteristics of the underlying
distribution. Nonetheless, we state some general comments:
When y = 0,
[ 2-2P-2-P+l+\
p2l0g22

1,

'

'

P = 0,
1

bo,p(YMLE) =

(1-P)2'

bo,p<o(YM) = 0 ,

*0,p(KPWM) = { { ^

(see also Figure 3.3). When p = 0,


by(>0),o(YH) = byto(j>p) = bY{>-\/2),oiyMLE)

= y(/0),o(/M)

= ^>/(<1/2),0(PPWA/) = y(<-l/2),0(yF) = 1

When p > oo,

3.7 Simulations and Applications

Pick"

MLE

Mom

119

"PWM

Fig. 3.3. Comparison of bias when p < 0 and y = 0.


by(>0),-oo(YH) = x(>-l/2),-oo(KML) = ^y(<l/2),-oo(KPWA/)
= y(<-l/2),-oo(/F) = 0

bY,-ooiyp) = oo ,
(l-y)(l-3 K )

' ^ -

by,-oo(YM) = i "(Info '

y>0and/#0
y > 0 and / = 0 ,

0,

where / := limt-+oo(U(t) a(t)/y).


Finally, we determine an optimal sequence ko(n) following the reasoning presented in Section 3.2 for the Hill estimator. Consider the representation
N

Xby.
y 0
y,p

Vk

Vk

Y y/VZTy =. +

/n\

with k = k(n) an intermediate sequence such that \fkA{n/k) -> X e R and N


standard normal. For simplicity suppose A(t) = ctp, for some constants c real and
p < 0. We look for a sequencefcfor which
varv
2 :
it
\k>
is minimal. Similarly, as in Section 3.2 we get

+^

1/(1-2/0)

k0(n) =

I (#

vvar.
ar

where [JC] denotes the integer part of x.


Then for this choice of k we have

l\

-2p/(l-2p)

120

3 Estimation of the Extreme Value Index and Testing

and

,.

. ^/vKyN

A(f) \ 2

1 \

Hence note that under the given assumptions, a comparison of this quantity for the
different estimators reduces to a comparison of the asymptotic variances.
Applying this reasoning to all the estimators we have that for some cases depending
mainly on y and p, the order of the optimal sequence is the same for all the estimators.
But for some other cases, namely when A(t) = o(A*(t)) or A(t) = o(Q(t)),t -> oo,
the optimal order ko is smaller (cf. also Table 3.1).
3.7.2 Simulations
To illustrate the finite-sample behavior of the estimators we give some simulation
results for the distribution functions given in Table 3.2, namely standard Cauchy, normal, and uniform. Note that the uniform distribution satisfies the first-order extreme
value condition (2.1.2) but does not satisfy the second-order condition (2.3.5) because
the rate of convergence in (2.1.2) is too fast (cf. Exercise 1.3).
Table 3.2. Extreme value index and second-order parameter.
Distributions y
Cauchy
Normal
Uniform

1 -2
0 0
1 oo

We generate pseudorandom independent and identically distributed samples of


size n = 1000 and replicate them r = 5000 times independently. For some independent estimates y\,...,
yr obtained from some estimator of y, in the following by
mean square error we mean r~l Y?i=i (Pi ~Y)2- Recall that all the above estimators
use k upper order statistics out of the initial sample of size n.
Figures 3.4-3.6 show the so-called diagram of estimates, i.e., averages of yt estimates, i = 1 , . . . , r, for each number of upper order statistics k, and the corresponding
mean square error.
The maximum likelihood estimates are obtained with numerical methods, since
the maximization of the likelihood function does not have an explicit formula. We
just used a naive modified Newton method with a user-supplied Hessian matrix (i.e.,
something that the reader can implement by itself and does not depend on any particular package). In the case of the Cauchy distribution, though very rarely, it happened
that the algorithm did not reach any solution. The maximization seems to be not so
straightforward for the normal distribution, so we decided to omit the results for this
case.

3.7 Simulations and Applications

Pick

Hill

50

121

Moml

MLE

100

Fig. 3.4. Standard Cauchy distribution: (a) diagram of estimates of y (the true value 1 is
indicated by the horizontal line); (b) mean square error (see the text for details).

Pick

- Mom

PWM

300

Fig. 3.5. Standard normal distribution: (a) diagram of estimates of y (the true value 0 is
indicated by the horizontal line); (b) mean square error (see the text for details).

3.7.3 Case Studies


Next we apply the estimators, Hill (y#), Pickands (yp), moment (/MX and probability
weighted moment (PPWM) to the three case studies introduced in Section 1.1.4 and
further discussed at the beginning of this chapter.
Sea Level
As described before, we have 1873 observations from the sea level (cm) at Delfzijl,
the Netherlands, corresponding to winter storms during the years 1882-1991. The
diagram of estimates of the extreme value index, i.e., the estimates against the number
of upper order statistics k, is shown in Figure 3.7. Note the variability of the estimators
for small values of .

122

3 Estimation of the Extreme Value Index and Testing


Pick

Mom

PWM

Falk

200

1
300

r
400

Fig. 3.6. Standard uniform distribution: (a) diagram of estimates of y (the true value - 1 is
indicated by the horizontal line); (b) mean square error (see the text for details).
From the asymptotic theory one obtains the correspondent asymptotic confidence intervals. The most common approach is to assume VkA(n/k) -* 0 (or
VkQ(n/k) -* 0 in case of the moment estimator), so that the limiting distribution has zero mean. This avoids the bias estimation, which generally requires the
estimation of the second-order parameter p (for more on this we refer to Ferreira
and de Vries (2004) and for the estimation of p see, e.g., Fraga Alves, Gomes, and
de Haan (2003)). The (1 a) 100% approximating confidence interval is then given by

V?

Y - Za/2

<Y <Y + Z<*/2

where varp is the respective asymptotic variance with y replaced by its estimate and
Za/2 is the 1 a / 2 quantile of the standard normal distribution.
In Table 3.3 we give the 95% asymptotic confidence intervals for some values of
k. The value zero belongs to all these confidence intervals, which does not contradict
the hypothesis that the extreme value index is zero.
Table 33. Sea level data: 95% asymptotic confidence intervals for y.
25

50

100

Pickands (-0.42, 1.06) (-0.93, 0.04) (-0.56, 0.14)


Moment (-0.78, 0.11) (-0.54, 0.04) (-0.21, 0.18)
PWM (-0.64, 0.32) (-0.50, 0.18) (-0.17, 0.30)

S&P 500 Total Returns


Recall that from 5835 daily price quotes pt (from 01/01/1980 to 14/05/2002) we analyze the log-returns rt = log(pt/pt-i). We focus on the loss log-returns comprising
2643 observations.

3.7 Simulations and Applications

I
i

rlCK
i

PWM

M01T1
i

123

1.0 0.5 0.0 -0.5 -1.0 -

IP"*'

*"""

150

250

300

it

Fig. 3.7. Sea level data, diagram of estimates of y.


Hill

Pick

Moml

Fig. 3.8. S&P 500 data, diagram of estimates of y.


Economists believe that this kind offinancialseries is heavy tailed, i.e., the underlying distribution function is in the domain of attraction of some GY with y positive.
Henceforth it is believed that therightendpoint of the underlying distribution is infinite. Moreover it seems that moments up to third order might exist but probably not
for higher order. This means that y should be approximately larger than ^.
From the diagram of estimates shown in Figure 3.8, we observe a large variability
of the estimates for small values of k, and then the dominance of the bias over the
variance for large values offc,though more evident for the Pickands estimator. Recall
Table 3.4. S&P 500 data: 95% asymptotic confidence intervals for y.
50

100

300

Hill
(0.25,0.45) (0.24, 0.36) (0.28, 0.36)
Pickands (-0.90, 0.07) (-0.72, -0.04) (-0.70, -0.31)
Moment (0.16,0.77) (0.24, 0.67) (0.23, 0.47)

124

3 Estimation of the Extreme Value Index and Testing

Pick

200

Mom

400

600

PWlfl

800

1000

Fig. 3.9. Life span data, diagram of estimates of y.


that the Pickands estimator has large variance for positive y, when compared to the
others. The Hill and moment estimators clearly give positive estimates for y. The
confidence intervals given in Table 3.4 also point in this direction.
Life Span
The data set consists of the total life span (in days) of all people born in the Netherlands
in the years 1877-1881, still alive on January 1, 1971, and who died as resident of
the Netherlands. The size of this data set is 10 391.
We want to decide whether the right endpoint of the distribution of the underlying
population is finite, and for that we test whether the extreme value index is negative: Ho : y > 0 versus H\ : y < 0. The diagram of estimates for the Pickands,
moment, and probability-weighted moment estimators is shown in Figure 3.9. The
null hypothesis is rejected if for a significance level of 5%, yp < -2.96/\fk for the
Pickands estimator (cf. Theorem 3.3.5), J>M < 1.64/VT for the moment estimator
(cf. Theorem 3.5.4), and > W M < 1.90/V& for the probability-weighted moment
estimator (cf. Theorem 3.6.1). For example, for k = 400 the null hypothesis is rejected for the three estimators. The moment estimator is rather clear in this respect,
and for practically all k the null hypothesis is rejected. The Pickands estimator is not
so clear. Though the probability weighted moment has slightly larger variability than
the moment estimator they behave similarly.

Exercises
3.1. Note that the generalized Pareto distributions HY (Section 3.1) satisfy the following property: if X is a random variable with probability distribution Hy there exists
a positive function a such that for x > 0 and all t with HY (t) < 1,
P (^^-

V a(t)

> x\X >t)=

P(X >x) .

3.7 Simulations and Applications

125

Prove that this property characterizes the class of generalized Pareto distributions (cf.
proof of Theorem 1.1.3).
3.2. Let Xi, X2,... be independent and identically distributed random variables with
distribution function in the domain of attraction of some extreme value distribution
with y > 0. Let X\,n < X2,n < < ^n,n be the nth order statistics. Prove that if
k = k(n) -> 00, k/n > 0, n > 00,
Xn-k,n
j- > p X
P _1
n-k,n
^ rk Xnin

3.3. Can you prove an asymptotic normality result for this estimator?
Hint: see Theorem 2.4.8.
3.4. Assume the conditions of Exercise 3.2. By using the methods in the proof of
Theorem 3.2.2 and Lemma 3.2.3, prove that
log Xn,n - log Xn-k>n P
log A:
and that the distribution of
log Xn,n - log Xn~k,n ~ Y log k
converges to Go; note that in the notation of the proof of Theorem 3.2.2 the distribution
of log Yk,k log k converges to Go(x) = exp(e~x). Is this estimator better or worse
than the one in Exercise 2.19?
3.5. Define
y := 1 - 2" 1 ( l - ( m ) 2 / m ^ ) " 1
with nin = k l X^?=i(^n-i,n Xn-k,nV for j = 1,2. Prove that y is consistent
for y provided y < \ and that y/k (y y) is asymptotically normal for y < | under
appropriate conditions. Calculate the asymptotic variance and bias.
3.6. Prove Theorem 3.3.5 for y = 0. Check that the variance and bias of the limiting
random variable are the same as taking the limits of the given variance and bias of
Theorem 3.3.5 when y converges to zero.
3.7. Check that when p = 1 the Pickands estimator, conveniently normalized as in
Theorem 3.3.5, has asymptotic bias \bY,-\, where

bY-\ =

U ) 1

(y-l)(2y-l)log2

' ^

2,

y= h

'

126

3 Estimation of the Extreme Value Index and Testing

3.8. Let {W_n(s)}_{s≥0} be a sequence of Brownian motions.

(a) Using (3.5.11) and under the conditions of Corollary 2.4.6, show that

\[
\sup_{0<s\le 1} \min\left(1,\,s^{\gamma_-+1/2+\varepsilon}\right)
\left| \sqrt{k}\,\frac{\log X_{n-[ks],n} - \log X_{n-k,n}}{q_0(n/k)}
- \frac{s^{-\gamma_-}-1}{\gamma_-}
- s^{-\gamma_--1}\,W_n(s) + W_n(1)
- \sqrt{k}\,Q_0\!\left(\frac nk\right)\Psi_{\gamma_-,\rho'}\!\left(s^{-1}\right)
\right| \xrightarrow{P} 0 .
\]

(b) With the notation of Lemma 3.5.5, prove that

\[
\sqrt{k}\left(\frac{M_n^{(1)}}{q_0(n/k)} - \frac{1}{1-\gamma_-}\right) = P(W_n) + o_P(1) ,
\]

with

\[
P(W_n) = \int_0^1 \left( s^{-\gamma_--1}\,W_n(s) - W_n(1) + \sqrt{k}\,Q_0\!\left(\frac nk\right)\Psi_{\gamma_-,\rho'}\!\left(s^{-1}\right) \right) ds .
\]

3.9. Prove the second statement of Lemma 3.5.5.


3.10. (a) Check that the expected value of the limiting random variable of Corollary 3.5.6, as a function of (γ_-, ρ′), satisfies

\[
E\left( (1-2\gamma_-)(1-\gamma_-)^2 \left\{ \left(\tfrac12-\gamma_-\right) Q - 2P \right\} \right)
= \frac{(1-\gamma_-)(1-2\gamma_-)}{(1-\gamma_--\rho')(1-2\gamma_--\rho')}\,,
\]

where the random variables (P, Q) are defined in Lemma 3.5.5 and the mean values and covariance matrix are given in terms of (γ_-, ρ′).
(b) Use this to provide the asymptotic bias of the moment estimator in terms of γ and ρ′.
3.11. Assume the first-order regular variation condition for some γ < −1/2. For a sample of size n, prove the consistency of the negative Hill estimator for some intermediate sequence k with k^{γ} n^{2ε} → 0, ε > 0.
Hint: use the methods used in the proofs of the consistency of the Hill or the Pickands estimators.
3.12. Assume the second-order regular variation condition for some −1 < γ < −1/2. For a sample of size n, prove the asymptotic normality of the negative Hill estimator for some intermediate sequence k with k^{γ−ε} n^{2ε} → 0, ε > 0, using the methods used, for example, in the first proof of Theorem 3.3.5.

4 Extreme Quantile and Tail Estimation

4.1 Introduction
With the sea level case study introduced in Section 1.1.4 and further discussed in Section 3.1, we illustrated the role of extreme value theory in extreme quantile estimation.
In the sequel we explore this example a bit further.
The Dutch government requires the sea dikes to be so high that in a given year a flood occurs with probability 1/10,000. In order to estimate the height corresponding to that probability, 1877 high tide water levels are available, monitored at the coast, one for each severe storm of a certain type, over a period of 111 years. The observations are considered as realizations of independent and identically distributed random variables. Figure 4.1 shows the empirical distribution function F_n based on these observations, i.e., we assign mass 1/n to each observation, where n represents the sample size.
One possibility to estimate a quantile is via the empirical quantile, that is, one of
the order statistics. As shown in Figure 4.1 for the 0.9 quantile, following the curve this

Fig. 4.1. Empirical distribution function of the sea level data.

Fig. 4.2. Extrapolation function: (a) for U(t); (b) for −log(1 − F).
quantile is just the level corresponding to the given probability. Now we are interested in estimating the sea level, say u, with probability (111/1877) × 10^{−4} ≈ 6 × 10^{−6} of being exceeded, that is, 1 − F(u) ≈ 6 × 10^{−6}. So this is the probability of a flood during one windstorm of a certain (severe) type. But for the given data set the highest order statistic corresponds roughly to F^←(1 − 1/1878), that is, 1 − F(X_{1877,1877}) = 1/1878 ≈ 5 × 10^{−4}. Hence in order to give a nontrivial answer one needs somehow to extrapolate beyond the range of the available data.

Recall that we want to estimate the level u such that 1 − F(u) ≈ 6 × 10^{−6}. In terms of the function U = (1/(1 − F))^← this means that u = U(1/(6 × 10^{−6})) ≈ U(17 × 10^4). Remember that (cf. (1.1.27), and also (3.1.6) and (3.1.8))

\[
U(17\times 10^4) = U(tx) \approx U(t) + a(t)\,\frac{x^{\gamma}-1}{\gamma} . \tag{4.1.1}
\]

We see that the function (x^γ − 1)/γ plays a crucial role (cf. Theorem 1.1.6, Section 1.1.3). Basically, the extrapolation beyond the quantile U(t) is via this function multiplied by the scale factor a(t). Roughly speaking, apart from a scale factor, the function (x^γ − 1)/γ approximates U = (1/(1 − F))^←, or log(((x^γ − 1)/γ)^←) approximates log(1 − F). Figure 4.2 shows these functions for γ = −1, 0, 1. We see that the real parameter γ determines their shape; for instance, on the −log(1 − F) scale one gets a straight line when γ equals zero.
Hence (4.1.1) motivates the quantile estimator

\[
\hat U(17\times 10^4) = \hat b\!\left(\frac nk\right) + \hat a\!\left(\frac nk\right)\,
\frac{\left(17\times 10^4 \times \frac kn\right)^{\hat\gamma} - 1}{\hat\gamma}\,,
\]

where n is the sample size and k is an intermediate sequence, i.e., k → ∞ and k/n → 0.
Figure 4.3 displays the empirical distribution on a log scale, i.e., the step function −log(1 − F_n) (which is a convenient scale when one is interested in the largest values of the sample), and the quantile we are interested in.

Fig. 4.3. Step function −log(1 − F_n) of the sea level sample and estimated model attached to the intermediate order statistic X_{1699,1873}.

Moreover, it shows one possible model fitted to the tail of the distribution, which gives the following estimate of the sea level for a failure probability of 6 × 10^{−6}: if we take k = 174,

\[
\hat U(17\times 10^4) = X_{1873-174,1873} + \hat a\!\left(\frac{1873}{174}\right)\,
\frac{\left(\frac{17\times 10^4\times 174}{1873}\right)^{\hat\gamma} - 1}{\hat\gamma} .
\]

Then, for instance, using the moment-type estimators discussed in Section 3.5 above and Section 4.2 below, with k = 174, we get γ̂ = γ̂_M = 0.02 and â(1873/174) = â_M = 40.3, hence

\[
\hat U(17\times 10^4) = 286 + 40.3\;
\frac{\left(\frac{17\times 10^4\times 174}{1873}\right)^{0.02} - 1}{0.02} = 715.6 .
\]

The adjusted model in Figure 4.3 represents this formula with the quantile as a function of the given probability and with the other components fixed. It is attached to the empirical function at the intermediate order statistic X_{1873−174,1873} = X_{1699,1873} = 286. More details on the data analysis are given in Section 4.6.
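The arithmetic of this estimate can be checked directly; the small discrepancy with the reported 715.6 presumably comes from rounding of the reported γ̂ and â (this check is a sketch using only the rounded values quoted in the text):

```python
# Plug-in computation of the sea-level quantile estimate with the reported
# (rounded) values k = 174, X_{1699,1873} = 286, gamma_hat = 0.02, a_hat = 40.3.
n, k = 1873, 174
b_hat = 286.0          # X_{n-k,n}
a_hat = 40.3           # scale estimate a_hat(n/k)
gamma_hat = 0.02       # moment estimate of gamma
d = 17e4 * k / n       # extrapolation factor d = (1/p) * (k/n), p about 6e-6
u_hat = b_hat + a_hat * (d ** gamma_hat - 1.0) / gamma_hat
# u_hat is approximately 715.8 cm; the text reports 715.6.
```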
Thus we see that a key issue in quantile estimation is the estimation of the extreme
value index y. Moreover, we have to deal with the estimation of U(t) and the scale
function a(t). Recall that in Chapter 3 we have already discussed two estimators
of the scale: the maximum likelihood estimator (cf. Section 3.4) and the probability
weighted moment estimator (cf. Section 3.6). In the next section we discuss another
possibility, this time related to the moment estimator of Section 3.5.
In Section 4.3 we develop some limiting theory of extreme quantile estimation.
The dual problem of the estimation of tail probability is discussed in Section 4.4.
A related problem is the endpoint estimation, which we also address in Section 4.5.
Finally, in Section 4.6 we give some simulations and continue discussing the three
case studies: sea level, S&P 500, and life span.


4.2 Scale Estimation


Suppose F is in the domain of attraction of an extreme value distribution, i.e., for sequences a_n > 0 and b_n, and some γ ∈ ℝ,

\[
\lim_{n\to\infty} F^n(a_n x + b_n) = \exp\left(-(1+\gamma x)^{-1/\gamma}\right), \qquad 1+\gamma x > 0 .
\]
Next suppose, for some positive function a and A a function not changing sign and such that A(t) → 0, t → ∞, the second-order condition (2.3.5),

\[
\lim_{t\to\infty} \frac{\dfrac{U(tx)-U(t)}{a(t)} - \dfrac{x^{\gamma}-1}{\gamma}}{A(t)}
= \frac1\rho\left(\frac{x^{\gamma+\rho}-1}{\gamma+\rho} - \frac{x^{\gamma}-1}{\gamma}\right) \tag{4.2.1}
\]

for x > 0 with ρ ≤ 0. Then, for γ ≠ ρ (and ρ < 0 if γ > 0), we know that a second-order condition for log U(t) holds, namely

\[
\lim_{t\to\infty} \frac{\dfrac{\log U(tx)-\log U(t)}{a(t)/U(t)} - \dfrac{x^{\gamma_-}-1}{\gamma_-}}{Q(t)}
= \frac{1}{\rho'}\left(\frac{x^{\gamma_-+\rho'}-1}{\gamma_-+\rho'} - \frac{x^{\gamma_-}-1}{\gamma_-}\right) \tag{4.2.2}
\]

with γ_- = min(0, γ) and Q not changing sign eventually, Q(t) → 0, t → ∞ (cf. Lemma B.3.16 in Appendix B). When γ > 0 and ρ = 0 the limit (4.2.2) vanishes for all Q satisfying A(t) = O(Q(t)), t → ∞.
We now study an estimator for the scale a, related to the moment estimator of γ discussed in Section 3.5. Recall the notation introduced in that section,

\[
M_n^{(j)} := \frac1k \sum_{i=0}^{k-1} \left(\log X_{n-i,n} - \log X_{n-k,n}\right)^j
\]

for j = 1, 2. Define

\[
\hat\gamma_- := 1 - \frac12\left(1 - \frac{\left(M_n^{(1)}\right)^2}{M_n^{(2)}}\right)^{-1} . \tag{4.2.3}
\]

Note that γ̂_- + γ̂_H = γ̂_M, with γ̂_M the moment estimator of γ and γ̂_H = M_n^{(1)} the Hill estimator. We define the estimator

\[
\hat a_M := X_{n-k,n}\, M_n^{(1)}\, (1 - \hat\gamma_-) . \tag{4.2.4}
\]
In the next theorem we give the consistency and asymptotic normality of â_M. Note that for the nondegenerate limit the conditions are the same as those in Theorem 3.5.4, which states the asymptotic normality of γ̂_M.

Theorem 4.2.1 Let X_1, X_2, … be i.i.d. random variables with distribution function F.

1. If F ∈ D(G_γ), x* > 0, then

\[
\frac{\hat a_M}{a(n/k)} \xrightarrow{P} 1 ,
\]

provided k = k(n) → ∞, k/n → 0, as n → ∞.

2. Suppose F ∈ D(G_γ), x* > 0, and that the second-order condition (4.2.1) holds with γ ≠ ρ. If the sequence of integers k = k(n) satisfies k → ∞, k/n → 0, and

\[
\lim_{n\to\infty} \sqrt{k}\,\tilde Q\!\left(\frac nk\right) = \lambda \tag{4.2.5}
\]

with Q̃ := A from (4.2.1) if γ > 0 and ρ = 0, Q̃ := Q from (4.2.2) otherwise, and λ finite, then

\[
\sqrt{k}\left(\frac{\hat a_M}{a(n/k)} - 1\right) \xrightarrow{d} N\!\left(\lambda\, b_{\gamma,\rho},\ \mathrm{var}_\gamma\right) \tag{4.2.6}
\]

with N standard normal,

\[
b_{\gamma,\rho} :=
\begin{cases}
\dfrac{(1-2\gamma)(1-3\gamma)}{(1-\gamma-\rho)(1-2\gamma-\rho)} - 1, & \rho<\gamma<0,\\[8pt]
1, & \lim_{t\to\infty} U(t)-a(t)/\gamma \neq 0 \ \text{and}\ 0<\gamma<-\rho,\\[8pt]
\dfrac{(1+\gamma)\rho}{(1-\rho)^2}, & \left(\lim_{t\to\infty} U(t)-a(t)/\gamma = 0 \ \text{and}\ 0<\gamma<-\rho\right) \ \text{or}\ \gamma>-\rho>0,\\[8pt]
0, & \gamma>0,\ \rho=0,
\end{cases}
\tag{4.2.7}
\]

and

\[
\mathrm{var}_\gamma :=
\begin{cases}
\gamma^2+2, & \gamma\ge 0,\\[6pt]
\dfrac{2-16\gamma+51\gamma^2-69\gamma^3+50\gamma^4-24\gamma^5}{(1-2\gamma)(1-3\gamma)(1-4\gamma)}, & \gamma<0 .
\end{cases}
\tag{4.2.8}
\]
Proof. The proof of the consistency is left to the reader. Next we prove the asymptotic normality.

First observe that, with q := a/U,

\[
\frac{\hat a_M}{a(n/k)} \overset{d}{=}
\frac{(1-\gamma_-)\,M_n^{(1)}}{q_0(Y_{n-k,n})}\cdot
\frac{q_0(Y_{n-k,n})}{q(Y_{n-k,n})}\cdot
\frac{a(Y_{n-k,n})}{a(n/k)}\cdot
\frac{1-\hat\gamma_-}{1-\gamma_-}\,,
\]

where the function q_0 is from Theorem 2.3.6, which gives the uniform inequalities connected with the second-order regular variation condition for the function log U.

For the first factor we know from Lemma 3.5.5 that

\[
\sqrt{k}\left(\frac{(1-\gamma_-)\,M_n^{(1)}}{q_0(Y_{n-k,n})} - 1\right) \xrightarrow{d} (1-\gamma_-)\,P , \tag{4.2.9}
\]

where the random variable P is normally distributed. For the second factor use Corollary 2.3.5, Theorem 2.3.6, and (4.2.5) to get

\[
\sqrt{k}\left(\frac{q_0(Y_{n-k,n})}{q(Y_{n-k,n})} - 1\right) \xrightarrow{P}
-\lambda\,\frac{1_{\{\gamma<0\ \text{or}\ \rho'<0\}}}{\rho'+\gamma\,1_{\{\rho'=0\}}} .
\]

For the third factor note that

\[
\sqrt{k}\left(\frac{a(Y_{n-k,n})}{a(n/k)} - 1\right)
= \sqrt{k}\left(\frac{a(Y_{n-k,n})}{a(n/k)} - \left(\frac kn\,Y_{n-k,n}\right)^{\gamma}\right)
+ \sqrt{k}\left(\left(\frac kn\,Y_{n-k,n}\right)^{\gamma} - 1\right). \tag{4.2.10}
\]

The second term in (4.2.10) converges to γB with B a standard normal random variable (Corollary 2.2.2). By the inequalities (2.3.18) of Theorem 2.3.6 the first term is bounded by √k A(n/k)(1 + o(1)) times a factor that is bounded in probability. Hence, since A = O(Q) by Lemma B.3.16 and (k/n)Y_{n−k,n} →_P 1 by Corollary 2.2.2, the first term on the right-hand side of (4.2.10) tends to zero in probability as n → ∞.

Finally, for the fourth factor, from Corollary 3.5.6 we know that

\[
\sqrt{k}\left(\frac{1-\hat\gamma_-}{1-\gamma_-} - 1\right) \xrightarrow{d}
2(1-2\gamma_-)(1-\gamma_-)\,P - \tfrac12(1-2\gamma_-)^2(1-\gamma_-)\,Q ,
\]

where the random variables P and Q are normally distributed, and P is the same as in (4.2.9).

Hence the limiting distribution of √k(â_M/a(n/k) − 1) is the distribution of

\[
(1-\gamma_-)(3-4\gamma_-)\,P - \tfrac12(1-\gamma_-)(1-2\gamma_-)^2\,Q + \gamma B
- \lambda\,\frac{1_{\{\gamma<0\ \text{or}\ \rho'<0\}}}{\rho'+\gamma\,1_{\{\rho'=0\}}}\,, \tag{4.2.11}
\]

where the random variables P and Q are from Lemma 3.5.5 and B is standard normal independent of P and Q (recall that {Y_{n−k+i,n}/Y_{n−k,n}}_{i=0}^{k} is independent of Y_{n−k,n}; cf. proof of Lemma 3.2.3).
The following corollary is now easy to obtain.


Corollary 4.2.2 Under the conditions of Theorem 4.2.1(2),

\[
\left(\sqrt{k}\left(\hat\gamma_M-\gamma\right),\ \sqrt{k}\left(\frac{\hat a_M}{a(n/k)}-1\right),\ \sqrt{k}\,\frac{X_{n-k,n}-U(n/k)}{a(n/k)}\right) \xrightarrow{d} (R, S, T),
\]

where the random vector (R, S, T) has a multivariate normal distribution with mean vector λ(b^{γ̂_M}_{γ,ρ}, b^{â_M}_{γ,ρ}, 0), where b^{γ̂_M}_{γ,ρ} and b^{â_M}_{γ,ρ} are respectively given by (3.5.16) and (4.2.7), variances given by (3.5.17), (4.2.8), and 1, and

\[
\mathrm{Cov}(R,S) =
\begin{cases}
\gamma-1, & \gamma\ge 0,\\[6pt]
\dfrac{(1-\gamma)^2\left(-1+4\gamma-12\gamma^2\right)}{(1-3\gamma)(1-4\gamma)}, & \gamma<0,
\end{cases}
\]

Cov(R, T) = 0, and Cov(S, T) = γ.

Proof. From Theorem 2.4.1,

\[
\sqrt{k}\,\frac{X_{n-k,n}-U(n/k)}{a(n/k)} \xrightarrow{d} B , \tag{4.2.12}
\]

where B is a standard normal random variable. Moreover, from the proof of Theorem 3.5.4 we know that √k(γ̂_M − γ) has the distribution of

\[
-\frac{\lambda}{\gamma+\rho} + (1-2\gamma_-)(1-\gamma_-)^2\left\{\left(\tfrac12-\gamma_-\right)Q - 2P\right\}, \tag{4.2.13}
\]

where the random variables P, Q, and B are the same as those in the proof of Theorem 4.2.1. Combining (4.2.11), (4.2.12), and (4.2.13), the result follows. □

4.3 Quantile Estimation


Suppose F is in the domain of attraction of an extreme value distribution, i.e., for
sequences an > 0 and bn, and some y e R,
lim Fn(anx + bn) = exp (-(I
n-*oo

+ yx)~l/y)

1 + yx > 0 .

Then for some positive function a and moreover taking b(t) = U(t) F^~(l 1/0
we have
,. u(tx) - u(t)
xy - I
hm
=
'-oo

a(t)

for x > 0 (Theorem 1.1.6).


Next suppose the second-order condition (2.3.5),

*-x

A(0

P\

r+p

y J

134

4 Extreme Quantile and Tail Estimation

holds for x > 0 with p < 0 and A a function not changing sign and such that
A(t) -* 0, t -> oo.
Let k be an intermediate sequence, i.e., k = k(n) → ∞, k(n)/n → 0 (n → ∞). Suppose that for suitable estimators γ̂, â(n/k), and b̂(n/k),

\[
\left(\sqrt{k}\,(\hat\gamma-\gamma),\ \sqrt{k}\left(\frac{\hat a(n/k)}{a(n/k)}-1\right),\ \sqrt{k}\,\frac{\hat b(n/k)-b(n/k)}{a(n/k)}\right) \xrightarrow{d} (\Gamma, \Lambda, B) \tag{4.3.2}
\]

with (Γ, Λ, B) jointly normal random variables with known mean vector, possibly depending on γ and ρ, and known covariance matrix depending on γ (not on ρ).

Now we are ready to consider extreme quantile estimation. Note that in the examples we needed to estimate a (1 − p) quantile on the basis of a sample of size n, where in fact p is much smaller than 1/n. Let x_p := U(1/p) be the quantile we want to estimate. We are particularly interested in the cases in which the mean number of observations above x_p, namely np, equals a very small number. This means that we are looking for a number x_p that is to the right of all (or almost all) observations, or, what is the same, we want to extrapolate outside the range of the available observations. Since this is the central issue in our problem, we want to preserve in the asymptotic analysis the fact that np should be much smaller than any positive constant. Hence we are "forced," when applying asymptotic methods, to assume that p in fact depends on n, p = p_n, and that

\[
\lim_{n\to\infty} p_n = 0 .
\]

That is, we want to estimate x_{p_n} with 1 − F(x_{p_n}) = p_n, or equivalently, x_{p_n} = U(1/p_n) with p_n → 0, as n → ∞.
Theorem 4.3.1 Suppose for some function A with A(t) → 0, t → ∞, the second-order condition (4.3.1) holds. Suppose:

1. the second-order parameter ρ is negative, or zero with γ negative;
2. k = k(n) → ∞, n/k → ∞, and √k A(n/k) → λ ∈ ℝ, n → ∞;
3. condition (4.3.2) holds for suitable estimators of γ, a(n/k), and U(n/k);
4. np_n = o(k) and log(np_n) = o(√k), n → ∞.

Define

\[
\hat x_{p_n} := \hat b\!\left(\frac nk\right) + \hat a\!\left(\frac nk\right)\frac{d_n^{\hat\gamma}-1}{\hat\gamma}
\quad\text{and}\quad
x_{p_n} := U\!\left(\frac1{p_n}\right). \tag{4.3.3}
\]

Then, as n → ∞,

\[
\sqrt{k}\,\frac{\hat x_{p_n}-x_{p_n}}{a(n/k)\,q_\gamma(d_n)} \xrightarrow{d}
\Gamma + (\gamma_-)^2 B - \gamma_-\Lambda - \lambda\,\frac{\gamma_-}{\gamma_-+\rho}\,, \tag{4.3.4}
\]

with d_n := k/(np_n), γ_- := min(0, γ), and where for t > 1,

\[
q_\gamma(t) := \int_1^t s^{\gamma-1}\log s\, ds .
\]

Corollary 4.3.2 The conditions of Theorem 4.3.1 imply

\[
\frac{q_{\hat\gamma}(d_n)}{q_\gamma(d_n)} \xrightarrow{P} 1 ,
\]

hence an equivalent statement is

\[
\sqrt{k}\,\frac{\hat x_{p_n}-x_{p_n}}{\hat a(n/k)\,q_{\hat\gamma}(d_n)} \xrightarrow{d}
\Gamma + (\gamma_-)^2 B - \gamma_-\Lambda - \lambda\,\frac{\gamma_-}{\gamma_-+\rho} .
\]

This version is more useful for constructing an asymptotic confidence interval for x_{p_n}.
Remark 4.3.3 Note that as t → ∞,

\[
q_\gamma(t) \sim
\begin{cases}
\dfrac{t^{\gamma}\log t}{\gamma}, & \gamma>0,\\[6pt]
\tfrac12(\log t)^2, & \gamma=0,\\[4pt]
1/\gamma^2, & \gamma<0 .
\end{cases}
\]

Moreover, from the definition of q_γ(t) it is clear that q_γ(t) is increasing in t and that q_γ(t) is also an increasing function of γ when t > 1.
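The function q_γ and these asymptotics can be checked numerically; the closed form below follows from integration by parts and is an assumption of this sketch, not a formula from the text:

```python
import math

def q_gamma(t, gamma):
    """q_gamma(t) = int_1^t s^{gamma-1} log s ds, in closed form."""
    if gamma == 0.0:
        return 0.5 * math.log(t) ** 2
    # integration by parts: d/dt [t^g log t / g - (t^g - 1)/g^2] = t^{g-1} log t
    return (t ** gamma) * math.log(t) / gamma - (t ** gamma - 1.0) / gamma ** 2

def q_gamma_numeric(t, gamma, steps=100000):
    """Midpoint-rule integration of the defining integral, for comparison."""
    h = (t - 1.0) / steps
    total = 0.0
    for i in range(steps):
        s = 1.0 + (i + 0.5) * h
        total += s ** (gamma - 1.0) * math.log(s) * h
    return total

# closed form agrees with numerical integration
for g in (0.5, 0.0, -0.5):
    assert abs(q_gamma(10.0, g) - q_gamma_numeric(10.0, g)) < 1e-4

# gamma < 0: q_gamma(t) -> 1/gamma^2 as t -> infinity (here 1/0.25 = 4)
assert abs(q_gamma(1e8, -0.5) - 4.0) < 1e-2
```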
Remark 4.3.4 The condition np_n = o(k), i.e., p_n ≪ k/n, is quite natural, since if it is not satisfied, nonparametric methods can be employed (Einmahl (1990)); see also Remark 4.3.7 below. In particular, when np_n → 0 the condition log(np_n) = o(√k), i.e., p_n > n^{−1}e^{−ε√k} for each ε > 0 and sufficiently large n, means that the extrapolation cannot be pushed too far.
Note that by checking the components of the asymptotic variance in the theorem, one sees that for γ near zero the uncertainty in the estimation of x_{p_n} is to a large extent caused by the uncertainty in the estimation of γ, and not so much by that in the estimation of b(n/k) or a(n/k).
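Putting the pieces together, the estimator (4.3.3) with b̂(n/k) = X_{n−k,n}, the moment estimator for γ̂, and the scale estimator â_M of Section 4.2 can be sketched as follows (the Pareto example and function names are illustrative assumptions):

```python
import math
import random

def extreme_quantile(sample, k, p):
    """Estimate U(1/p) via (4.3.3) with moment-type estimators."""
    n = len(sample)
    xs = sorted(sample)
    base = xs[-k - 1]                                   # b_hat(n/k) = X_{n-k,n}
    log_excess = [math.log(x) - math.log(base) for x in xs[-k:]]
    m1 = sum(log_excess) / k
    m2 = sum(v * v for v in log_excess) / k
    gamma_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    gamma = m1 + gamma_minus                            # moment estimator of gamma
    scale = base * m1 * (1.0 - gamma_minus)             # a_hat_M, (4.2.4)
    d = k / (n * p)                                     # d_n = k/(n p_n)
    return base + scale * (d ** gamma - 1.0) / gamma

# Example: Pareto(1) sample (gamma = 1); the true p-quantile is 1/p.
random.seed(1)
sample = [1.0 / random.random() for _ in range(20000)]
x_hat = extreme_quantile(sample, 1000, 1e-5)
```

As the theory predicts, the relative error of such an extrapolated estimate is driven mostly by the error in γ̂, amplified through the factor d^{γ̂}.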
For the proof we need the following lemma:
Lemma 4.3.5 If (4.3.1) holds with ρ < 0, or ρ = 0 and γ < 0, then

\[
\lim_{\substack{t\to\infty\\ x\to\infty}}
\frac{\dfrac{U(tx)-U(t)}{a(t)} - \dfrac{x^{\gamma}-1}{\gamma}}{A(t)\,q_\gamma(x)}
= \frac{\gamma_-}{\rho+\gamma_-} .
\]

Proof. Use the inequalities of Theorem B.3.10 for ρ < 0, or ρ = 0 and γ < 0. □
Proof (of Theorem 4.3.1).

*P.-^.=*+^nri-l'()

136

4 Extreme Quantile and Tail Estimation

-*G)-GH(i)Pr-^)
n\\ dvn - 1

-Hi)-"-*?1
Hence
l

Pn

Vk^

Kk

I (6 part)

Vk

II (y part)

Kk)

>
a (I)

VkI (4-1
a(g)

III (a part)

qy(d)

,\

4T-l\|

dZ-i

a (l)

)yqy(dn)

Vk (U(idn)-U(i)
qy(dn)
(f)

IV (nonrandom bias)

<#-l\

Recall Remark 4.3.3 and that as t oo,


tY-1

logt ,

y=0,

-7.

K<-

(4.3.5)

Hence from (4.3.2), as n - oo, I->^(y_)2Z? and III>^ y_^l with y_


min(0, y).
Next consider II, which is basically

Vk (4-1

d%-\\

qY(dn)
We write this as
Vk(y-y)
qy(dtW

[d

)]o s
le^'y ^ -l

Jl

(y

\(y -y)logs\

< \Vk(y

-y)logs

logs ds .

Since for any 1 < s < dn,


-y)

logdn P
yfk

"*

4.3 Quantile Estimation

137

by assumption, we have for n - oo,


\e(y-y)logs

sup

_ i

1.

(y - y ) l o g , s

It follows that

ut-\

Vk

di~\\

qy{dn)
has the same limit distribution as

Vk(y - y) ,
i.e., T.
Finally, we deal with part IV:

(uM)-u(i)
qY(d)

tf-A
y

(f)
n\

= -V~kAQ

dYn-\

U(dnl)-U{j)

S^T-1

<i)

yqy(dn)

^(i)

Recall (4.3.5). So part IV converges to Ay_(y_ + p)


sumption (2) of Theorem 4.3.1.

by Lemma 4.3.5 and as

Proof (of Corollary 4.3.2). Start with

\[
\left|\frac{q_{\hat\gamma}(d_n)}{q_\gamma(d_n)}-1\right|
= \left|\frac{\int_1^{d_n} s^{\hat\gamma-1}\log s\, ds - \int_1^{d_n} s^{\gamma-1}\log s\, ds}{\int_1^{d_n} s^{\gamma-1}\log s\, ds}\right|
= \left|\frac{\int_1^{d_n}\left(s^{\hat\gamma-\gamma}-1\right) s^{\gamma-1}\log s\, ds}{\int_1^{d_n} s^{\gamma-1}\log s\, ds}\right| . \tag{4.3.6}
\]

We have for any 1 < s < d_n,

\[
\left|(\hat\gamma-\gamma)\log s\right| \le |\hat\gamma-\gamma|\,|\log d_n|
= \left|\sqrt{k}\,(\hat\gamma-\gamma)\right|\,\frac{\log d_n}{\sqrt{k}} \xrightarrow{P} 0 .
\]

Hence for sufficiently large n, with high probability,

\[
\left|s^{\hat\gamma-\gamma}-1\right| \le 2\,|\hat\gamma-\gamma|\,|\log d_n| .
\]

It follows that (4.3.6) is at most 2|γ̂ − γ| |log d_n|, and hence it tends to zero. □
Remark 4.3.6 Note that when γ < −1/2, if one takes the sample maximum X_{n,n} to estimate an extreme quantile (np_n = O(1)), one gets a better rate of convergence than that in Theorem 4.3.1 (cf. Exercise 1.15).


Remark 4.3.7 For quantile estimation in a less extreme region (e.g., if np_n/k → c ∈ (0, ∞)) one can use the results of Section 2.4 (in particular (2.4.2); see Exercise 2.17). On the other hand, it is possible to relax the condition np_n = o(k) to np_n = O(k) in Theorem 4.3.1. That is, under the same conditions of Theorem 4.3.1 but with d_n = k/(np_n) → r > 0, one can show by similar arguments that

\[
\sqrt{k}\,\frac{\hat x_{p_n}-x_{p_n}}{a(n/k)\,q_\gamma(r)} \xrightarrow{d}
\Gamma + \frac{B}{q_\gamma(r)} + \Lambda\,\frac{r^{\gamma}-1}{\gamma\,q_\gamma(r)} - \lambda\,\frac{H_{\gamma,\rho}(r)}{q_\gamma(r)}\,,
\]

as n → ∞, where H_{γ,ρ} denotes the limit function on the right-hand side of (4.3.1). So one could follow the approach of Theorem 2.4.2 or that of Theorem 4.3.1. The rates of convergence are the same in both cases.
A simpler version is valid when γ is positive:

Theorem 4.3.8 Suppose for some function A, with A(t) → 0, t → ∞,

\[
\lim_{t\to\infty} \frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{A(t)} = x^{\gamma}\,\frac{x^{\rho}-1}{\rho} .
\]

Suppose:
1. the second-order parameter ρ is negative;
2. k = k(n) → ∞, n/k → ∞, and √k A(n/k) → λ ∈ ℝ, n → ∞;
3. np_n = o(k) and log(np_n) = o(√k), n → ∞.

Define, with γ̂ an estimator satisfying √k(γ̂ − γ) →_d Γ,

\[
\hat x_{p_n} := X_{n-k,n}\left(\frac{k}{np_n}\right)^{\hat\gamma}
\quad\text{and}\quad
x_{p_n} := U\!\left(\frac1{p_n}\right).
\]

Then, as n → ∞,

\[
\frac{\sqrt{k}}{\log d_n}\,\log\frac{\hat x_{p_n}}{x_{p_n}} \xrightarrow{d} \Gamma ,
\]

with d_n := k/(np_n).
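A minimal sketch of this positive-γ estimator with the Hill estimator plugged in for γ̂ (the Pareto test sample is an assumption for illustration):

```python
import math
import random

def hill(sample, k):
    """Hill estimator based on the top k order statistics."""
    xs = sorted(sample)
    log_base = math.log(xs[-k - 1])
    return sum(math.log(x) - log_base for x in xs[-k:]) / k

def quantile_positive_gamma(sample, k, p):
    """x_hat = X_{n-k,n} * (k/(n p))^{gamma_hat}, as in Theorem 4.3.8."""
    n = len(sample)
    xs = sorted(sample)
    return xs[-k - 1] * (k / (n * p)) ** hill(sample, k)

random.seed(2)
sample = [1.0 / random.random() for _ in range(20000)]   # Pareto(1), gamma = 1
x_hat = quantile_positive_gamma(sample, 1000, 1e-5)      # true quantile: 1e5
```

Note the multiplicative structure: the estimator extrapolates the intermediate order statistic by the factor d_n^{γ̂}, which is why the natural error scale is logarithmic.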
Proof. The proof is similar to that of Theorem 4.3.1. First note that

\[
\frac{\sqrt{k}}{\log d_n}\left(d_n^{\hat\gamma-\gamma}-1\right)
\]

has the same limit behavior as √k(γ̂ − γ). Next note (Theorem 2.4.1) that

\[
\sqrt{k}\left(\frac{X_{n-k,n}}{U(n/k)}-1\right) \xrightarrow{d} \gamma B ,
\]

and finally that (use Theorem 2.3.9)

\[
\lim_{n\to\infty}\frac{\dfrac{U(d_n\,\frac nk)}{U(n/k)}\,d_n^{-\gamma}-1}{A(n/k)}
\]

is finite. Hence

\[
\frac{\sqrt{k}}{\log d_n}\,\log\frac{\hat x_{p_n}}{x_{p_n}}
= \frac{\sqrt{k}}{\log d_n}\left(\log\frac{X_{n-k,n}}{U(n/k)} + (\hat\gamma-\gamma)\log d_n - \log\frac{U(d_n\,\frac nk)}{U(n/k)\,d_n^{\gamma}}\right).
\]

The result follows. □


Corollary 4.3.9 Under the conditions of Theorem 4.3.8,

\[
\frac{\hat x_{p_n}}{x_{p_n}} \xrightarrow{P} 1 \qquad (n\to\infty).
\]

The previous results are quite general in the sense that they are valid for any estimators of γ, a(n/k), and b(n/k) satisfying (4.3.2). For the estimation of the location b(n/k) = U(n/k) the natural estimator is its empirical counterpart X_{n−k,n}. Then, from Theorem 2.4.1,

\[
\sqrt{k}\,\frac{X_{n-k,n}-U(n/k)}{a(n/k)} \xrightarrow{d} B ,
\]

with B a standard normal random variable independent of (Γ, Λ). Next we shall find the parameters of the limit distribution in Theorem 4.3.1 for some of the estimators for γ and a(n/k) introduced before. A similar exercise will be done in Sections 4.4 and 4.5.
4.3.1 Maximum Likelihood Estimators

We start with the maximum likelihood estimators of Section 3.4. Recall from Theorem 3.4.2 the joint limit distribution of (Γ, Λ) for γ > −1/2.
Hence if γ̂ := γ̂_MLE and â(n/k) := â_MLE in (4.3.3) are the maximum likelihood estimators, under the conditions of Theorem 4.3.1 with γ > −1/2,

\[
\sqrt{k}\,\frac{\hat x_{p_n}-x_{p_n}}{\hat a(n/k)\,q_{\hat\gamma}(d_n)}
\]

converges in distribution to a normal random variable with mean

\[
\begin{cases}
\lambda\left(\dfrac{1+\gamma+\gamma_-\rho}{(1-\rho)(1+\gamma-\rho)} - \dfrac{\gamma_-}{\gamma_-+\rho}\right), & \rho<0,\\[8pt]
0, & \gamma<0=\rho,
\end{cases}
\tag{4.3.7}
\]

and variance

\[
\begin{cases}
(1+\gamma)^2, & \gamma\ge 0,\\[4pt]
1+4\gamma+5\gamma^2+2\gamma^3+2\gamma^4, & \gamma<0 .
\end{cases}
\tag{4.3.8}
\]
4.3.2 Moment Estimators

Another possibility is to use in (4.3.3) the moment estimator of Section 3.5. To estimate the scale use â_M = X_{n−k,n} M_n^{(1)} (1 − γ̂_-), introduced earlier in Section 4.2, and take as usual b̂(n/k) = X_{n−k,n}.
When using the moment estimator in Theorem 4.3.1 one needs to take into account that the conditions of Theorem 3.5.4 (asymptotic normality of the moment estimator) and the conditions of Theorem 4.3.1 (asymptotic normality of the quantile estimator) are not the same. For the asymptotic normality of the moment estimator (and for the scale as well) the extra conditions are U(∞) > 0, so that the estimator is well defined, and γ ≠ ρ, so that a second-order condition for log U holds, with auxiliary second-order function Q. Besides, one needs

\[
\sqrt{k}\,Q\!\left(\frac nk\right) \to \lambda \in \mathbb R, \qquad n\to\infty , \tag{4.3.9}
\]

instead of √k A(n/k) → λ (for more details see Remark 4.3.10 below).
Consequently, under the given conditions, if γ̂ := γ̂_M and â(n/k) := â_M in (4.3.3) are the moment estimators, then

\[
\sqrt{k}\,\frac{\hat x_{p_n}-x_{p_n}}{\hat a_M\,q_{\hat\gamma_M}(d_n)} \tag{4.3.10}
\]

converges in distribution to a normal random variable with mean

\[
\lambda\cdot
\begin{cases}
\dfrac{\rho(1-\gamma)}{(\gamma+\rho)(1-\gamma-\rho)(1-2\gamma-\rho)}, & \gamma<\rho\le 0,\\[8pt]
-\dfrac{\gamma}{\gamma+\rho}, & \lim_{t\to\infty} U(t)-a(t)/\gamma\neq 0 \ \text{and}\ 0<\gamma<-\rho,\\[8pt]
\dfrac{\rho}{(1-\rho)^2}, & \left(\lim_{t\to\infty} U(t)-a(t)/\gamma=0 \ \text{and}\ 0<\gamma<-\rho\right) \ \text{or}\ \gamma>-\rho>0,
\end{cases}
\tag{4.3.11}
\]

and variance

\[
\begin{cases}
\gamma^2+1, & \gamma\ge 0,\\[6pt]
\dfrac{(1-\gamma)^2\left(1-3\gamma+4\gamma^2\right)}{(1-2\gamma)(1-3\gamma)(1-4\gamma)}, & \gamma<0 .
\end{cases}
\tag{4.3.12}
\]

Remark 4.3.10 From Theorems 3.5.4 and 4.2.1 and (4.2.12) we have that (Γ, Λ, B) has distribution

\[
\left(
-\frac{\lambda}{\gamma+\rho} + (1-2\gamma_-)(1-\gamma_-)^2\left\{\left(\tfrac12-\gamma_-\right)Q-2P\right\},\ \
(3-4\gamma_-)(1-\gamma_-)P - \tfrac12(1-2\gamma_-)^2(1-\gamma_-)Q + \gamma B - \lambda\,\frac{1_{\{\gamma<0\ \text{or}\ \rho'<0\}}}{\rho'+\gamma\,1_{\{\rho'=0\}}},\ \
B
\right),
\]

where B is a standard normal random variable and the random vector (P, Q) is the one from Lemma 3.5.5 (Section 3.5). Further, B and (P, Q) are independent.
As mentioned above, Theorem 4.3.1 is not straightforward with respect to the moment estimators. Recall that for the asymptotic normality of √k(γ̂_M − γ) and √k(â_M/a(n/k) − 1) we require √k Q(n/k) = O(1), as n → ∞, where Q is the auxiliary second-order function in the second-order condition for log U. In contrast, in Theorem 4.3.1 we require √k A(n/k) = O(1), where A is the auxiliary second-order function in the second-order condition for U. From the proof of Theorem 4.3.1 we see that the second-order condition for U is necessary for the b part (I) and for the nonrandom bias part (IV). Hence, if one wants to use the moment estimator in quantile estimation, assume √k Q(n/k) → λ ∈ ℝ, n → ∞, and Lemma B.3.16 provides the (finite) limit of A(t)/Q(t). Then the limiting distribution of (4.3.10) is

\[
\Gamma + (\gamma_-)^2 B - \gamma_-\Lambda - \lambda\,\frac{\gamma_-\,1_{\{\rho<0\}}}{\gamma_-+\rho}\,.
\]

Use Corollary 4.2.2 to obtain the bias and the variance given in (4.3.11) and (4.3.12).

4.4 Tail Probability Estimation

On the basis of an independent and identically distributed sample of size n, we now solve the dual problem: given a large value x, how can one estimate

\[
p = 1-F(x)\,?
\]

As in quantile estimation, take (4.3.1)-(4.3.2) as our point of departure, with the same considerations on p = p_n. Hence in particular, x should in fact depend on n, x = x_n, and p_n = 1 − F(x_n) → 0. In the sequel an important auxiliary quantity is

\[
d_n := \frac{k}{n\,(1-F(x_n))}\,,
\]

where, as usual, k is an intermediate sequence.

To estimate the tail probability we use (cf. (3.1.4))

\[
\hat p_n := \frac kn\left(\max\left(0,\ 1+\hat\gamma\,\frac{x_n-\hat b(n/k)}{\hat a(n/k)}\right)\right)^{-1/\hat\gamma} \tag{4.4.1}
\]

with x_n known.
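The estimator (4.4.1) can be sketched as follows, here fed with the moment-type estimates of Sections 3.5 and 4.2 on a Pareto(1) sample (an illustrative assumption; any estimators satisfying (4.3.2) could be plugged in the same way):

```python
import math
import random

def tail_probability(x, n, k, gamma_hat, a_hat, b_hat):
    """p_hat = (k/n) * max(0, 1 + gamma_hat*(x - b_hat)/a_hat)^{-1/gamma_hat}."""
    t = 1.0 + gamma_hat * (x - b_hat) / a_hat
    if t <= 0.0:
        return 0.0
    return (k / n) * t ** (-1.0 / gamma_hat)

random.seed(3)
n, k = 20000, 1000
xs = sorted(1.0 / random.random() for _ in range(n))   # Pareto(1): 1 - F(x) = 1/x
b_hat = xs[-k - 1]
log_excess = [math.log(v / b_hat) for v in xs[-k:]]
m1 = sum(log_excess) / k
m2 = sum(v * v for v in log_excess) / k
gamma_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
gamma_hat = m1 + gamma_minus
a_hat = b_hat * m1 * (1.0 - gamma_minus)
p_hat = tail_probability(1e4, n, k, gamma_hat, a_hat, b_hat)  # true value: 1e-4
```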
Theorem 4.4.1 Suppose for γ > −1/2 and some function A with A(t) → 0, t → ∞, the second-order condition (4.3.1) holds. Write as before d_n = k/(np_n). Suppose:

1. the second-order parameter ρ is negative, or zero with γ negative;
2. k = k(n) → ∞, n/k → ∞, and √k A(n/k) → λ ∈ ℝ, n → ∞;
3. d_n → ∞ and w_γ(d_n) = o(√k), n → ∞, where for t > 0,

\[
w_\gamma(t) := \int_1^t s^{-\gamma-1}\log\frac ts\, ds\,;
\]

4. condition (4.3.2) holds for some estimators of γ, a(n/k), and U(n/k), say γ̂, â(n/k), and b̂(n/k), respectively.

Then, as n → ∞,

\[
\frac{\sqrt{k}}{w_\gamma(d_n)}\left(\frac{\hat p_n}{p_n}-1\right) \xrightarrow{d}
\Gamma + (\gamma_-)^2 B - \gamma_-\Lambda - \lambda\,\frac{\gamma_-}{\gamma_-+\rho} \tag{4.4.2}
\]

with γ_- := min(0, γ).


Remark 4.4.2 Note that as t → ∞,

\[
w_\gamma(t) \sim
\begin{cases}
(\log t)/\gamma, & \gamma>0,\\[4pt]
\tfrac12(\log t)^2, & \gamma=0,\\[4pt]
t^{-\gamma}/\gamma^2, & \gamma<0 .
\end{cases}
\]

Moreover, since

\[
w_\gamma(t) = \int_1^t s^{-1-\gamma}\left(\log t-\log s\right) ds
= \int_1^t s^{-1-\gamma}\int_s^t u^{-1}\,du\,ds ,
\]

it is clear that w_γ(t) is increasing in t and that w_γ(t) is a decreasing function of γ when t > 1.
Remark 4.4.3 Result (4.4.2) is valid only for γ > −1/2: for γ < 0, condition (3) of Theorem 4.4.1 implies k^{−1/2}d_n^{−γ} = k^{−1/2−γ}(np_n)^{γ} → 0, as n → ∞. Hence, since γ < 0 and np_n = O(1), we must have k^{−1/2−γ} → 0. This implies γ > −1/2. In fact, for γ < −1/2 no result of this type exists.

Corollary 4.4.4 Condition (3) of Theorem 4.4.1 implies what we may call consistency:

\[
\frac{\hat p_n}{p_n} \xrightarrow{P} 1 .
\]

Corollary 4.4.5 The conditions of Theorem 4.4.1 imply

\[
\frac{w_{\hat\gamma}(\hat d_n)}{w_\gamma(d_n)} \xrightarrow{P} 1 ,
\]

where d̂_n := k(n p̂_n)^{−1}. Hence an equivalent statement is

\[
\frac{\sqrt{k}}{w_{\hat\gamma}(\hat d_n)}\left(\frac{\hat p_n}{p_n}-1\right) \xrightarrow{d}
\Gamma + (\gamma_-)^2 B - \gamma_-\Lambda - \lambda\,\frac{\gamma_-}{\gamma_-+\rho} .
\]

The latter form of the result is more useful for constructing a confidence interval for p_n.

Note that the limiting random variable is the same as the one in Theorem 4.3.1, as could be expected from the Bahadur-Kiefer representation. For the proper nondegenerate limit distribution of the difference of the normalized left-hand sides of (4.3.4) and (4.4.2) we refer to Einmahl (1995).
The condition d_n → ∞ means that we extrapolate outside or near the boundary of the range of the available observations; cf. Remark 4.3.4 above.
Finally, note that condition (3) of Theorem 4.4.1 implies for all real γ that

\[
\log d_n = o\!\left(\sqrt{k}\right). \tag{4.4.4}
\]

Proof (of Theorem 4.4.1). Define

\[
\hat d_n := \left(\max\left(0,\ 1+\hat\gamma\,\frac{x_n-\hat b(n/k)}{\hat a(n/k)}\right)\right)^{1/\hat\gamma} .
\]

Then

\[
\frac{\hat p_n}{p_n}
= d_n\left(\max\left(0,\ 1+\hat\gamma\,\frac{x_n-\hat b(n/k)}{\hat a(n/k)}\right)\right)^{-1/\hat\gamma}
= \left(\max\left(0,\ d_n^{-\hat\gamma}\left(1+\hat\gamma\,\frac{x_n-\hat b(n/k)}{\hat a(n/k)}\right)\right)\right)^{-1/\hat\gamma} .
\]

Now, as in the proof of Theorem 4.3.1,

\[
\sqrt{k}\,\frac{\hat x_{p_n}-x_n}{\hat a(n/k)\,q_{\hat\gamma}(d_n)} \xrightarrow{d}
\Gamma + (\gamma_-)^2 B - \gamma_-\Lambda - \lambda\,\frac{\gamma_-}{\gamma_-+\rho}
\]

(note that x_n = x_{p_n}), and by assumption

\[
\frac{q_\gamma(d_n)}{\sqrt{k}} \to 0 \qquad\text{and}\qquad \frac{w_\gamma(d_n)}{\sqrt{k}} \to 0 .
\]

Hence p̂_n/p_n →_P 1, and we can expand

\[
\frac{\hat p_n}{p_n} - 1 = \frac{w_\gamma(d_n)}{q_\gamma(d_n)}\cdot\frac{\hat x_{p_n}-x_n}{\hat a(n/k)}\,\left(1+o_P(1)\right).
\]

The result follows. □

Proof (of Corollary 4.4.5). Consider

\[
\frac{w_{\hat\gamma}(\hat d_n)}{w_\gamma(d_n)}
= \frac{w_{\hat\gamma}(d_n)}{w_\gamma(d_n)}\cdot\frac{w_{\hat\gamma}(\hat d_n)}{w_{\hat\gamma}(d_n)} .
\]

The first factor converges in probability to 1 by the argument of Corollary 4.3.2 and assumption (3) of Theorem 4.4.1 (cf. (4.4.4)). The second factor also converges in probability to 1, since d̂_n/d_n →_P 1 and the function w_γ(t) is regularly varying. □

Remark 4.4.6 For estimation of exceedance probabilities in a "less extreme" region (e.g., if np_n/k → c ∈ (0, ∞)) one can use the results of Section 5.1 below (in particular (5.1.12)). On the other hand, it is possible to relax the condition np_n = o(k) to np_n = O(k) in Theorem 4.4.1. That is, under the same conditions of Theorem 4.4.1 but with d_n = k/(np_n) → r > 0,

\[
\frac{\sqrt{k}}{w_\gamma(r)}\left(\frac{\hat p_n}{p_n}-1\right) \xrightarrow{d}
\Gamma + \frac{B}{w_\gamma(r)} + \Lambda\,\frac{r^{\gamma}-1}{\gamma\,w_\gamma(r)} - \lambda\,\frac{H_{\gamma,\rho}(r)}{w_\gamma(r)}\,,
\]

as n → ∞, where H_{γ,ρ} denotes the limit function on the right-hand side of (4.3.1). So one could follow the approach of Theorem 4.4.1 or that of Theorem 5.1.2. The rate of convergence is the same in both cases.
A simpler version is valid when γ is positive.


Theorem 4.4.7 Suppose for some function A, with A(t) → 0, t → ∞,

\[
\lim_{t\to\infty} \frac{\dfrac{U(tx)}{U(t)}-x^{\gamma}}{A(t)} = x^{\gamma}\,\frac{x^{\rho}-1}{\rho} .
\]

Suppose as well that:
1. the second-order parameter ρ is negative;
2. k = k(n) → ∞, n/k → ∞, and √k A(n/k) → λ ∈ ℝ, n → ∞;
3. np_n = o(k) and log(np_n) = o(√k), n → ∞.

Define

\[
\hat p_n := \frac kn\left(\frac{x_n}{X_{n-k,n}}\right)^{-1/\hat\gamma}
\quad\text{and}\quad
p_n := 1-F(x_n) .
\]

Then, as n → ∞,

\[
\frac{\gamma\sqrt{k}}{\log d_n}\left(\frac{\hat p_n}{p_n}-1\right) \xrightarrow{d} \Gamma ,
\]

with d_n := k/(np_n).
The proof of the theorem is left to the reader.
4.4.1 Maximum Likelihood Estimators

Recall that the limiting distributions of the suitably normalized quantile and tail probability estimators, i.e., the left-hand sides of (4.3.4) and (4.4.2) respectively, are the same. Therefore if the maximum likelihood estimators γ̂_MLE and â_MLE are used in (4.4.1), then under the conditions of Theorem 4.4.1 the limiting random variable (4.4.2) is normal with mean (4.3.7) and variance (4.3.8).

4.4.2 Moment Estimators

When using the moment estimators (cf. Sections 3.5 and 4.2) one needs extra conditions; the same considerations of Section 4.3.2 apply here. Then, if the moment estimators γ̂_M and â_M are used in (4.4.1), the limiting random variable (4.4.2) is normal with mean (4.3.11) and variance (4.3.12).

4.5 Endpoint Estimation

Next we turn to the problem of estimating the endpoint x* of the distribution function F. We assume F ∈ D(G_γ) for some negative γ, since in this case the endpoint x* is known to be finite (cf. Lemma 1.2.9).
An estimator of x* can be motivated from the quantile estimator. Replacing p_n by zero in (4.3.3) we get

\[
\hat x^* := \hat b\!\left(\frac nk\right) - \frac{\hat a(n/k)}{\hat\gamma} . \tag{4.5.1}
\]
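A sketch of (4.5.1) with the moment-type estimators, applied to a uniform(0,1) sample, for which γ = −1 and x* = 1 (the sample and helper names are illustrative assumptions):

```python
import math
import random

def endpoint_estimate(sample, k):
    """x*_hat = b_hat - a_hat/gamma_hat, with moment-type plug-in estimates."""
    xs = sorted(sample)
    b_hat = xs[-k - 1]                       # X_{n-k,n}
    log_excess = [math.log(v / b_hat) for v in xs[-k:]]
    m1 = sum(log_excess) / k
    m2 = sum(v * v for v in log_excess) / k
    gamma_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    gamma_hat = m1 + gamma_minus             # moment estimator, here negative
    a_hat = b_hat * m1 * (1.0 - gamma_minus)
    return b_hat - a_hat / gamma_hat

random.seed(4)
sample = [random.random() for _ in range(20000)]
x_star_hat = endpoint_estimate(sample, 1000)   # true endpoint: 1.0
```

Since γ̂ < 0 here, the correction −â/γ̂ is positive and pushes the estimate beyond the largest observations, as it should.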

As in the previous sections, we assume in the following that (γ̂, â, b̂), when suitably normalized, are asymptotically normal. Denote the limiting random vector by (Γ, Λ, B) (cf. (4.3.2)).

Theorem 4.5.1 Suppose that for some function A(t) → 0, t → ∞, the second-order condition (4.3.1) holds with γ negative. Suppose k = k(n) → ∞, n/k → ∞, and √k A(n/k) → λ ∈ ℝ, n → ∞. Then

\[
\sqrt{k}\,\gamma^2\,\frac{\hat x^*-x^*}{a(n/k)} \xrightarrow{d}
\Gamma + \gamma^2 B - \gamma\Lambda - \lambda\,\frac{\gamma}{\gamma+\rho} . \tag{4.5.2}
\]

Corollary 4.5.2 Under the given assumptions an equivalent statement is

\[
\sqrt{k}\,\hat\gamma^2\,\frac{\hat x^*-x^*}{\hat a(n/k)} \xrightarrow{d}
\Gamma + \gamma^2 B - \gamma\Lambda - \lambda\,\frac{\gamma}{\gamma+\rho} .
\]

This version is more useful for constructing asymptotic confidence intervals for x*.

Corollary 4.5.3 Under the conditions of Theorem 4.5.1,

\[
\hat x^* \xrightarrow{P} x^* .
\]
For the proof of Theorem 4.5.1 we need the following lemma:

Lemma 4.5.4 If the second-order condition (4.3.1) holds with γ negative, then

\[
\lim_{t\to\infty} \frac{\dfrac{U(\infty)-U(t)}{a(t)} + \dfrac1\gamma}{A(t)} = \frac{1}{\gamma(\rho+\gamma)} .
\]
Proof. From the second-order condition (4.3.1) and the second-order condition (2.3.7) for the function a, it follows that

\[
\lim_{t\to\infty}
\frac{\left(U(tx)-\dfrac{a(tx)}{\gamma}\right)-\left(U(t)-\dfrac{a(t)}{\gamma}\right)}{a(t)A(t)/\gamma}
= \gamma\lim_{t\to\infty}\frac{\dfrac{U(tx)-U(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)}
- \lim_{t\to\infty}\frac{\dfrac{a(tx)}{a(t)}-x^{\gamma}}{A(t)}
= -\frac{x^{\gamma+\rho}-1}{\gamma+\rho} .
\]

Now Lemma 1.2.9(2) implies that lim_{t→∞}(U(t) − a(t)/γ) exists. This limit must be U(∞), since the function a is regularly varying with negative index and hence tends to zero. Lemma 1.2.9(2) also implies

\[
\lim_{t\to\infty}\frac{U(\infty)-\left(U(t)-\dfrac{a(t)}{\gamma}\right)}{a(t)A(t)/\gamma} = \frac{1}{\gamma+\rho}\,,
\]

hence the result. □


Proof (of Theorem 4.5.1). Write

\[
\hat x^*-x^*
= \left(\hat b\!\left(\frac nk\right)-U\!\left(\frac nk\right)\right)
- \left(\frac{\hat a(n/k)}{\hat\gamma}-\frac{a(n/k)}{\gamma}\right)
- \left(U(\infty)-U\!\left(\frac nk\right)+\frac{a(n/k)}{\gamma}\right).
\]

Hence by (4.3.2) and Lemma 4.5.4,

\[
\sqrt{k}\,\gamma^2\,\frac{\hat x^*-x^*}{a(n/k)} \xrightarrow{d}
\Gamma + \gamma^2 B - \gamma\Lambda - \lambda\,\frac{\gamma}{\gamma+\rho} . \qquad\Box
\]
Remark 4.5.5 As pointed out in Aarssen and de Haan (1994), for estimating x* when γ < −1/2 it is more efficient to use different k's for the various estimators involved in x̂*. For instance, the authors suggest, with k_1 fixed and k = k(n) → ∞, k/n → 0 as usual, to take

\[
\hat x^* := \hat b\!\left(\frac n{k_1}\right) - \frac{\hat a(n/k_1)}{\hat\gamma_M}
\quad\text{with}\quad
\hat a\!\left(\frac n{k_1}\right) := X_{n-k_1,n}\,M_{n,k_1}^{(1)}\,(1-\hat\gamma_M), \tag{4.5.3}
\]

where M_{n,k_1}^{(1)} is like M_n^{(1)} but with k fixed equal to k_1, b̂(n/k_1) = X_{n−k_1,n}, and γ̂_M is the moment estimator of Section 3.5 with the intermediate sequence k(n). When γ < −1/2 this estimator converges more quickly than the one from Theorem 4.5.1 (Exercise 4.6).
Another way to estimate x* when γ < −1/2 is simply to use the sample maximum X_{n,n}, similarly as in Remark 4.3.6. Under natural conditions the rates of convergence of x̂* in (4.5.3) and X_{n,n} are the same (cf. Corollary 1.2.4).
4.5.1 Maximum Likelihood Estimators

Note that for γ negative the limiting distribution in (4.5.2) is still the same as that of Theorem 4.3.1 on quantile estimation. Hence if the maximum likelihood estimators γ̂_MLE and â_MLE are used in (4.5.1), then under the conditions of Theorem 4.5.1 with γ > −1/2 the limiting random variable (4.5.2) is normal with mean (4.3.7) and variance (4.3.8).
4.5.2 Moment Estimators

When using the moment estimators (cf. Sections 3.5 and 4.2) one needs extra conditions; the same considerations of Section 4.3.2 apply here. Then, if the moment estimators γ̂_M and â_M are used in (4.5.1), the limiting random variable (4.5.2) is normal with mean (4.3.11) and variance (4.3.12).
Another option would be to use γ̂_-, i.e., (4.2.3), to estimate γ, since we assume the latter negative, but it turns out that this is no better than using γ̂_M. The extra conditions needed are the same as for the moment estimator. Then, if γ̂ := γ̂_- and â(n/k) := â_M are used in (4.5.1),

\[
\sqrt{k}\,\hat\gamma^2\,\frac{\hat x^*-x^*}{\hat a(n/k)} \tag{4.5.4}
\]

converges in distribution to a normal random variable with mean

\[
\lambda\cdot
\begin{cases}
\dfrac{\rho(1-\gamma)}{(\gamma+\rho)(1-\gamma-\rho)(1-2\gamma-\rho)}, & \gamma<\rho<0,\\[8pt]
\dfrac{\gamma\left(1-3\gamma+3\gamma^2\right)}{(\gamma+\rho)(1-\gamma-\rho)(1-2\gamma-\rho)}, & \rho<\gamma<0,
\end{cases}
\tag{4.5.5}
\]

and variance

\[
\frac{(1-\gamma)^2\left(1-3\gamma+4\gamma^2\right)}{(1-2\gamma)(1-3\gamma)(1-4\gamma)} . \tag{4.5.6}
\]

The variance of the limiting variable is the same as when one uses the moment estimator γ̂_M. The bias is also the same for γ < ρ < 0; otherwise, the bias is larger (in absolute value) than the one with γ̂_M. Therefore this latter option of using γ̂_- shows no advantage.

4.6 Simulations and Applications


4.6.1 Simulations
In what follows we show simulation results for quantiles corresponding to p_n =
1/(n log n) = 1.086 × 10⁻⁴, from independent and identically distributed samples of
size n = 1000, with r = 5000 independent replications, for the Cauchy,
normal, and uniform distributions (cf. Table 3.2). For independent estimates
x̂_{p_n,1}, …, x̂_{p_n,r} obtained from some estimator of the quantile, in the following by
mean square error we mean r⁻¹ Σᵢ₌₁ʳ (x̂_{p_n,i} − x_{p_n})².
We use the following notation: Hill means the quantile estimator restricted to
positive γ (as in Theorem 4.3.8) with the Hill estimator, Mom means the quantile
estimator from Theorem 4.3.1 with the moment estimator and the scale estimator
from Section 4.2, PWM means again the quantile estimator from Theorem 4.3.1 but
with the probability-weighted moment estimators (3.6.9)–(3.6.10), and finally Falk
means the quantile estimator from Theorem 4.3.1 with the negative Hill estimator
(3.6.13) and the same scale estimator as for Mom. Recall that all the estimators
use k upper order statistics out of the initial sample of size n. Figures 4.4–4.6 show
the diagrams of estimates of the quantile, i.e., averages of the 5000 quantile
estimates for each number of upper order statistics k, and the corresponding mean
square error.
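The simulation design just described can be sketched as follows. The sketch uses the Hill-based Weissman-type quantile estimator x̂_p = X_{n−k,n}(k/(np))^{γ̂_H} (in the spirit of Theorem 4.3.8) on standard Cauchy data, and far fewer replications than the text's r = 5000, purely to keep the run short.

```python
import math
import random

def hill_quantile(sample, k, p):
    """Weissman-type quantile estimator for gamma > 0 (in the spirit of
    Theorem 4.3.8): x_p-hat = X_{n-k,n} * (k/(n p))**gamma_hat, with
    gamma_hat the Hill estimator from the k upper order statistics."""
    xs = sorted(sample)
    n = len(xs)
    gamma = sum(math.log(xs[n - 1 - i] / xs[n - 1 - k]) for i in range(k)) / k
    return xs[n - 1 - k] * (k / (n * p)) ** gamma

random.seed(7)
n, k, r = 1000, 100, 200            # r is far below the text's 5000, for speed
p = 1.0 / (n * math.log(n))         # tail probability as in the text
true_q = math.tan(math.pi * (0.5 - p))   # true standard Cauchy quantile
ests = []
for _ in range(r):
    sample = [math.tan(math.pi * (random.random() - 0.5)) for _ in range(n)]
    ests.append(hill_quantile(sample, k, p))
mean_est = sum(ests) / r
mse = sum((e - true_q) ** 2 for e in ests) / r
print(mean_est, true_q, mse)
```

Repeating this for a grid of k values reproduces the kind of "diagram of estimates" and mean-square-error curves shown in Figures 4.4–4.6.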

Fig. 4.4. Standard Cauchy distribution: (a) diagram of estimates of the quantile (the true quantile
2931.7 is indicated by the horizontal line); (b) mean square error (see the text for details).


Fig. 4.5. Standard normal distribution: (a) diagram of estimates of the quantile (the true quantile
3.62 is indicated by the horizontal line); (b) mean square error (see the text for details).
4.6.2 Case Studies
Sea Level
We continue the data analysis of Section 3.7. Recall that we want to estimate the
quantile corresponding to a tail probability of (1/17) × 10⁻⁴, on the basis of 1873
observations of the sea level during severe storms. We give results for Mom (the
quantile estimator from Theorem 4.3.1 with the moment estimator and the scale
estimator from Section 4.2) and PWM (the quantile estimator from Theorem 4.3.1
with the probability-weighted moment estimators (3.6.9)–(3.6.10)). Moreover, from
Section 3.7 we know that γ is close to zero. Then one can consider the following
options: use the quantile estimator as given in Theorem 4.3.1, or assume γ = 0
and use
\[
\hat x_{p_n}=X_{n-k,n}+\hat\sigma\!\left(\frac nk\right)\log\frac{k}{n\,p_n}\,.
\]




Fig. 4.6. Standard uniform distribution: (a) diagram of estimates of the quantile (the true
quantile .99985 is indicated by the horizontal line); (b) mean square error (see the text for
details).

Fig. 4.7. Sea level data, diagram of estimates of the quantile (cm).
The corresponding diagrams of estimates for both options are shown in Figure 4.7. As
expected, one finds less volatility when γ is fixed at 0.
In any case one has to estimate the scale. The corresponding diagram of estimates
is shown in Figure 4.8.
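The γ = 0 option can be sketched in code. The estimator x̂_p = X_{n−k,n} + σ̂(n/k) log(k/(np)) is the γ → 0 limit of the quantile estimator of Theorem 4.3.1; the scale is estimated here by the mean excess over X_{n−k,n}, a simple stand-in for the Section 4.2 scale estimator, used only to make the sketch self-contained.

```python
import math
import random

def quantile_gamma_zero(sample, k, p):
    """x_p-hat = X_{n-k,n} + sigma_hat * log(k/(n p)) for gamma = 0.
    sigma_hat is the mean excess over X_{n-k,n} (an assumption standing
    in for the Section 4.2 scale estimator)."""
    xs = sorted(sample)
    n = len(xs)
    threshold = xs[n - 1 - k]
    sigma = sum(xs[n - 1 - i] - threshold for i in range(k)) / k
    return threshold + sigma * math.log(k / (n * p))

random.seed(3)
n, k, p = 2000, 200, 1e-4
# Standard exponential data: gamma = 0, true quantile is -log(p).
ests = [quantile_gamma_zero([random.expovariate(1.0) for _ in range(n)], k, p)
        for _ in range(20)]
print(sum(ests) / len(ests), -math.log(p))
```

For exponential data the two printed numbers should be close, illustrating why fixing γ = 0 is attractive when the index is believed to be near zero.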
Under the conditions of Theorem 4.3.1 with λ = 0 we have
\[
\sqrt{k}\;\frac{\hat x_{p_n}-x_{p_n}}{\hat\sigma(n/k)\,q_{\hat\gamma}(d_n)}
\;\xrightarrow{d}\;\sqrt{\operatorname{var}_{\gamma}}\;N
\]
with N standard normal. The approximate (1 − α)100% confidence intervals are
given by
\[
\hat x_{p_n}-z_{\alpha/2}\,\hat\sigma\!\left(\frac nk\right)q_{\hat\gamma}(d_n)\,\sqrt{\frac{\operatorname{var}_{\hat\gamma}}{k}}
\;<\;x_{p_n}\;<\;
\hat x_{p_n}+z_{\alpha/2}\,\hat\sigma\!\left(\frac nk\right)q_{\hat\gamma}(d_n)\,\sqrt{\frac{\operatorname{var}_{\hat\gamma}}{k}}\,,
\]


Fig. 4.8. Sea level data, diagram of estimates of the scale.


where var_γ̂ is the respective asymptotic variance with γ replaced by its estimate,
and z_{α/2} is the (1 − α/2)-quantile of the standard normal distribution. Table 4.1 gives
confidence intervals for the quantile corresponding to a tail probability of (1/17) × 10⁻⁴,
for some values of k and taking γ = 0 (cf. Exercise 4.7 for PWM).
Table 4.1. Sea level data: 95% asymptotic confidence intervals for the quantile.

k        100           200           300
Moment   (615., 764.)  (656., 778.)  (669., 773.)
PWM      (579., 737.)  (641., 779.)  (655., 774.)

S&P500
We continue the S&P 500 data analysis of Section 3.7. Recall that we focus on
the log-loss returns comprising 2643 observations. A short summary of the largest
observations is in Table 4.2.
Table 4.2. S&P 500 data.

3rd Quartile   X_{n−1,n}   X_{n,n}
0.009          0.086       0.228

Next we show estimates of the probability that the log-loss return exceeds the
value 0.20, using the tail probability estimator from Theorem 4.4.7 with the Hill estimator
(we simply call it Hill) and the quantile estimator from Theorem 4.3.1 with the moment
estimator and the scale estimator from Section 4.2 (we call it Mom). The diagram of
estimates is in Figure 4.9.

Fig. 4.9. S&P 500 data, diagram of estimates of the tail probability.
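A sketch of the γ > 0 tail-probability computation: inverting the Hill-based Weissman quantile estimator gives p̂(x) = (k/n)(x/X_{n−k,n})^{−1/γ̂_H}; treat this as an illustrative stand-in for the estimator of Theorem 4.4.7, not a quote of it.

```python
import math
import random

def tail_prob(sample, k, x):
    """Exceedance probability sketch for gamma > 0:
    p_hat(x) = (k/n) * (x / X_{n-k,n})**(-1/gamma_hat), with gamma_hat
    the Hill estimator (an illustrative stand-in, cf. Theorem 4.4.7)."""
    xs = sorted(sample)
    n = len(xs)
    gamma = sum(math.log(xs[n - 1 - i] / xs[n - 1 - k]) for i in range(k)) / k
    return (k / n) * (x / xs[n - 1 - k]) ** (-1.0 / gamma)

random.seed(5)
n, k, x = 5000, 250, 1000.0
# Pareto(1) data: 1 - F(x) = 1/x for x >= 1, so P(X > 1000) = 1e-3 exactly.
probs = [tail_prob([1.0 / (1.0 - random.random()) for _ in range(n)], k, x)
         for _ in range(30)]
print(sum(probs) / len(probs))
```

For the Pareto sample the printed average should be near the true value 10⁻³; plotting p̂ against k gives diagrams like Figure 4.9.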

Fig. 4.10. Life span data, diagram of estimates of the right endpoint.

Life Span
We continue the life span data analysis of Section 3.7. The data set consists of the
total life span (in days) of 10 391 residents of the Netherlands, and we are interested
in the estimation of the right endpoint of the underlying distribution. In Section 3.7
we analyzed its existence by not rejecting the hypothesis that the underlying extreme
value index is negative. So now we assume that the right endpoint exists and proceed
with its estimation.
We give results for Mom (the quantile estimator from Theorem 4.3.1 with the
moment estimator and the scale estimator from Section 4.2) and PWM (the quantile estimator from Theorem 4.3.1 with the probability-weighted moment estimators
(3.6.9)–(3.6.10)). Figure 4.10 shows the diagram of estimates of the right endpoint.


Table 4.3. Life span data: upper limit of 95% asymptotic confidence intervals for the endpoint
(in years).

k        100   200   400
Moment   115.  130.  125.
PWM      121.  208.  141.

Similarly as before, the approximate (1 − α)100% one-sided confidence interval
is given by
\[
x^{*}\;<\;\hat x^{*}+z_{\alpha}\;\hat\sigma\!\left(\frac nk\right)\frac{\sqrt{\operatorname{var}_{\hat\gamma}}}{\hat\gamma^{2}\,\sqrt{k}}\,,
\]
where var_γ̂ is the respective asymptotic variance with γ replaced by its estimate and
z_α is the (1 − α)-quantile of the standard normal distribution. In Table 4.3 we give the
one-sided confidence intervals for the endpoint for k = 100, 200, 400 (cf. Exercise
4.7 for PWM).

Exercises
4.1. Prove the consistency of σ̂_M, that is, the first part of Theorem 4.2.1(1).
4.2. If one replaces pn by cpn (c > 0) in Theorem 4.3.1, how does the limit result
change?
4.3. Prove that w_γ(t) t^{−γ} q_γ(t) = H_{γ,0}(t), with q_γ from Theorem 4.3.1, w_γ from
Theorem 4.4.1, and H_{γ,ρ} from Corollary 2.3.4.
4.4. Let γ < 0 or ρ ≠ 0. Verify that the expected value of the limiting random
variable of Theorem 4.2.1(2),
\[
\Bigl((1-\gamma_-)(3-4\gamma_-)\,P-2^{-1}(1-\gamma_-)(1-2\gamma_-)^{2}\,Q\Bigr)\,
\mathbf 1_{\{\gamma\neq0\ \text{and}\ (\gamma<0\ \text{or}\ \rho<0)\}}\,
\bigl(\rho'+\gamma\,\mathbf 1_{\{\gamma_-=0\}}\bigr)^{-1},
\]
as a function of (γ₋, ρ′), equals −λ(1 − γ₋ − ρ′)⁻¹(1 − 2γ₋ − ρ′)⁻¹ (consider the random variables (P, Q) of Lemma 3.5.5
and recall that in the statement of this lemma the mean values and covariance matrix
are in terms of γ₋ and ρ′).
4.5. Prove Theorem 4.4.7.
4.6. (Aarssen and de Haan (1994)) Let X₁, X₂, … be i.i.d. random variables with
distribution function F. Suppose U(∞) > 0 and (4.2.2), i.e., the second-order condition for log U, with γ negative and auxiliary second-order function Q. Let k₁ be
fixed and k = k(n) → ∞, k/n → 0, and √k Q(n/k) → 0, as n → ∞. Then, with
the notation of Remark 4.5.5, the appropriately normalized endpoint estimator converges in distribution to a random variable that can be written in terms of i.i.d. random variables Z₁, Z₂, … with a standard exponential distribution.
4.7. Consider the quantile estimator of Theorem 4.3.1 with γ̂_PWM and σ̂(n/k) = σ̂_PWM
from Section 3.6.1. Check that the variance of the limiting random variable in (4.3.4)
is given by
\[
\begin{cases}
\dfrac{(1-\gamma)(2-\gamma)^{2}(1-\gamma+2\gamma^{2})}{(1-2\gamma)(3-2\gamma)}\,, & \gamma>0\,,\\[3mm]
\dfrac{2-2\gamma+12\gamma^{2}-38\gamma^{3}+47\gamma^{4}-66\gamma^{5}+74\gamma^{6}-40\gamma^{7}+8\gamma^{8}}{(1-2\gamma)(3-2\gamma)}\,, & \gamma\le0\,.
\end{cases}
\]

5 Advanced Topics

Chapters 1–4 constitute the basic probabilistic and statistical theory of one-dimensional
extremes. In this chapter we shall present additional material that can be skipped at
first reading. It is not used in the rest of the book.
Section 5.1 is a mirror image of Sections 2.3 and 2.4: it offers an expansion of the
tail empirical distribution function rather than the tail quantile function as in Section
2.4.
Section 5.2 offers various ways to check the extreme value condition when facing a
data set. Some procedures use the tail quantile function and others the tail empirical
distribution function.
Section 5.3 uses an expansion for the tail distribution function (not empirical
distribution function) developed in Section 5.1 in order to obtain uniform speed of
convergence results in the convergence of maxima toward the limit distribution. This
also leads to a large deviation result.
Some classical results are presented in Sections 5.3.1 and 5.4: convergence of
moments and weak (in probability) and strong (a.s.) behavior of the sequence of
maxima.
In Sections 5.5 and 5.6 the conditions "independent and identically distributed"
that we used throughout Chapters 1-4 are relaxed: in Section 5.5 the assumption
of independence of the initial random variables is relaxed and in Section 5.6 the
assumption of stationarity is relaxed.

5.1 Expansion of the Tail Distribution Function and Tail Empirical Process
Let us go back to the approximation of the inverse tail empirical process of Theorem
2.4.2. This theorem implies in particular (for simplicity consider γ > −1/2 and k large
enough that the bias component vanishes)
\[
\sqrt{k}\left(\frac{X_{n-[ks],n}-U(n/k)}{a_0(n/k)}-\frac{s^{-\gamma}-1}{\gamma}\right)
\;\xrightarrow{d}\;s^{-\gamma-1}W(s)
\tag{5.1.1}
\]
in D(0, 1], with a₀ a suitable positive function and {W(s)}_{0<s≤1} standard Brownian
motion. We are going to translate this result into a result for the empirical distribution
function. Let us invoke a Skorohod construction and pretend that (5.1.1) holds almost
surely in D(0, 1].
This result has the following structure: for some nonincreasing random functions H_n(s)
(i.e., (X_{n−[ks],n} − U(n/k))/a₀(n/k)) and some function M (i.e., (s^{−γ} − 1)/γ) with
continuous negative derivative we have
\[
\lim_{n\to\infty}\frac{H_n(s)-M(s)}{\delta_n}=P(s)
\tag{5.1.2}
\]
locally uniformly in (0, 1], where P is a continuous function (i.e., s^{−γ−1}W(s)) and
δ_n a positive sequence tending to zero (i.e., δ_n = 1/√k). From (5.1.2) and Lemma
A.0.2 (Vervaat's lemma) we get
\[
\lim_{n\to\infty}\frac{H_n^{\leftarrow}(x)-M^{\leftarrow}(x)}{\delta_n}
=-\bigl(M^{\leftarrow}\bigr)'(x)\;P\bigl(M^{\leftarrow}(x)\bigr)\,.
\]

Specializing to the random functions just mentioned, this implies
\[
\sqrt k\left(\frac1k\sum_{i=1}^{n}\mathbf 1_{\{X_i>U(n/k)+x\,a_0(n/k)\}}-(1+\gamma x)^{-1/\gamma}\right)
\;\to\;W\bigl((1+\gamma x)^{-1/\gamma}\bigr),
\]
a.s. locally uniformly for those x for which 1 + γx > 0. The conclusion is that
\[
\sqrt k\left(\frac nk\Bigl(1-F_n\bigl(U(n/k)+x\,a_0(n/k)\bigr)\Bigr)-(1+\gamma x)^{-1/\gamma}\right)
\;\xrightarrow{d}\;W\bigl((1+\gamma x)^{-1/\gamma}\bigr)
\tag{5.1.3}
\]
in D(0, 1/max(0, −γ)).
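Relation (5.1.3) can be illustrated numerically. For a Pareto(1) sample (γ = 1) one has U(t) = t, so one may take b₀(n/k) = a₀(n/k) = n/k (an assumption made only for this sketch) and compare (n/k)(1 − F_n(b₀ + x a₀)) with (1 + x)^{−1}; the deviation should be of order 1/√k.

```python
import random

random.seed(11)
n, k = 20000, 100
sample = [1.0 / (1.0 - random.random()) for _ in range(n)]  # Pareto(1): gamma = 1
t = n / k          # for U(t) = t one may take b0(n/k) = a0(n/k) = n/k
max_dev = 0.0
for i in range(101):                       # grid over x in [0, 10]
    x = 0.1 * i
    # (n/k)(1 - F_n(b0 + x*a0)) = (1/k) * #{observations above t*(1+x)}
    tail = sum(1 for v in sample if v > t * (1.0 + x)) / k
    max_dev = max(max_dev, abs(tail - 1.0 / (1.0 + x)))
print(max_dev)
```

With k = 100 the printed supremum deviation is typically a few tenths of 1/√k's scale, i.e., of order 0.1, consistent with the Brownian limit in (5.1.3).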
The present section aims at obtaining a weighted uniform version of (5.1.3). That
is, we discuss expansions for the (empirical) distribution function similar to those
of Section 2.4. Since the proofs are quite technical and lengthy, some parts will be
sketched rather than carried out in detail. For a full account of the theory and full
proofs, see Drees, de Haan, and Li (2006). We begin with an expansion of the tail of
the distribution function analogous to Section 2.3.
Theorem 5.1.1 Let F be a probability distribution function. Suppose the second-order condition (2.3.5) holds for the function U, the inverse of 1/(1 − F). Let a₀, b₀,
and A₀ be the functions defined in Corollary 2.3.7.
1. If not γ = ρ = 0, then for all c, δ, ε > 0,
\[
\lim_{t\to\infty}\ \sup_{x\in D_{t,\rho,\delta,c}}
\Bigl((1+\gamma x)^{-1/\gamma}\Bigr)^{\rho-1}
\exp\Bigl(-\varepsilon\bigl|\log(1+\gamma x)^{-1/\gamma}\bigr|\Bigr)
\left|\frac{t\bigl(1-F(b_0(t)+x\,a_0(t))\bigr)-(1+\gamma x)^{-1/\gamma}}{A_0(t)}
-(1+\gamma x)^{-1/\gamma-1}\,\Psi_{\gamma,\rho}\bigl((1+\gamma x)^{1/\gamma}\bigr)\right|=0
\tag{5.1.4}
\]
with
\[
\Psi_{\gamma,\rho}(x):=
\begin{cases}
\dfrac{x^{\gamma+\rho}}{\gamma+\rho}\,, & \rho<0\neq\gamma+\rho\,,\\[2mm]
\log x\,, & \rho<0=\gamma+\rho\,,\\[2mm]
x^{\gamma}\log x\,, & \rho=0\neq\gamma\,,
\end{cases}
\]
and
\[
D_{t,\rho,\delta,c}:=
\begin{cases}
\bigl\{x:(1+\gamma x)^{-1/\gamma}<c\,t^{1-\delta}\bigr\}\,, & \rho<0\,,\\[1mm]
\bigl\{x:(1+\gamma x)^{-1/\gamma}\le|A_0(t)|^{-\delta}\bigr\}\,, & \rho=0\,.
\end{cases}
\]
2. If γ = ρ = 0, then for all c, δ, ε > 0,
\[
\lim_{t\to\infty}\ \sup_{x}\;
e^{-\varepsilon|x|}
\left|\frac{t\bigl(1-F(b_0(t)+x\,a_0(t))\bigr)-e^{-x}}{A_0(t)}-e^{-x}\,\frac{x^{2}}{2}\right|=0\,.
\tag{5.1.5}
\]

Proof. Here is a proof for ρ < 0 and F continuous and strictly increasing. As usual,
we revert to the properties of U rather than F to obtain the necessary approximations.
Hence we define
\[
y:=\frac{1}{t\,\{1-F(b_0(t)+x\,a_0(t))\}}\,,
\]
so that
\[
x=\frac{U(ty)-b_0(t)}{a_0(t)}\,.
\]
This leads to the following expansion, where the notation q_t(x) := (U(tx) − b₀(t))/a₀(t) − (x^γ − 1)/γ is used:
\[
t\bigl(1-F(b_0(t)+x\,a_0(t))\bigr)-(1+\gamma x)^{-1/\gamma}-y^{-1-\gamma}q_t(y)
=-(1+\gamma)\int_0^{q_t(y)}\bigl(q_t(y)-u\bigr)
\Bigl(1+\gamma\Bigl(\frac{y^{\gamma}-1}{\gamma}+u\Bigr)\Bigr)^{-1/\gamma-2}du\,.
\]
Now the factor (1 + γ((y^γ − 1)/γ + u))^{−1/γ−2} in the integrand always lies between
its value for u = 0 and for u = q_t(y); i.e., it lies between
\[
\Bigl(1+\gamma\,\frac{y^{\gamma}-1}{\gamma}\Bigr)^{-1/\gamma-2}=y^{-1-2\gamma}
\quad\text{and}\quad
\Bigl(1+\gamma\Bigl(\frac{y^{\gamma}-1}{\gamma}+q_t(y)\Bigr)\Bigr)^{-1/\gamma-2}
=y^{-1-2\gamma}\bigl(1+\gamma\,y^{-\gamma}q_t(y)\bigr)^{-1/\gamma-2}\,.
\]
By examining the uniform inequalities of Corollary 2.3.7 for the quantile function
one sees that for any c, δ > 0 (recall ρ < 0),
\[
\lim_{t\to\infty}\ \sup_{y\ge c\,t^{\delta-1}}\ y^{-\gamma}\,|q_t(y)|=0\,.
\tag{5.1.6}
\]
Hence
\[
\bigl(1+\gamma\,y^{-\gamma}q_t(y)\bigr)^{-1/\gamma-2}\le 2
\]
for y ≥ c t^{δ−1} and t sufficiently large, and we obtain
\[
\Bigl|t\bigl(1-F(b_0(t)+x\,a_0(t))\bigr)-(1+\gamma x)^{-1/\gamma}-y^{-1-\gamma}q_t(y)\Bigr|
\le 2\,|1+\gamma|\;y^{-1-2\gamma}\,q_t^{2}(y)
\tag{5.1.7}
\]
for all y ≥ c t^{δ−1} and t sufficiently large. Upon multiplying the left- and right-hand
sides in (5.1.7) by y we see that these two statements easily lead to
\[
\lim_{t\to\infty}\ \sup_{y\ge c\,t^{\delta-1}}\ \bigl|y\,(1+\gamma x)^{-1/\gamma}-1\bigr|=0
\tag{5.1.8}
\]
and hence (by checking the definition of Ψ_{γ,ρ}) to
\[
\lim_{t\to\infty}\ \sup_{y\ge c\,t^{\delta-1}}\ y^{1-\rho}\,e^{\varepsilon|\log y|}\,
\Bigl|(1+\gamma x)^{-1/\gamma-1}\,\Psi_{\gamma,\rho}\bigl((1+\gamma x)^{1/\gamma}\bigr)
-y^{-(1+\gamma)}\,\Psi_{\gamma,\rho}(y)\Bigr|=0\,.
\tag{5.1.9}
\]
The rest is straightforward: by (5.1.8),
\[
\begin{aligned}
&\bigl((1+\gamma x)^{-1/\gamma}\bigr)^{\rho-1}\exp\Bigl(-\varepsilon\bigl|\log(1+\gamma x)^{-1/\gamma}\bigr|\Bigr)
\left|\frac{t\{1-F(b_0(t)+x\,a_0(t))\}-(1+\gamma x)^{-1/\gamma}}{A_0(t)}
-(1+\gamma x)^{-1/\gamma-1}\Psi_{\gamma,\rho}\bigl((1+\gamma x)^{1/\gamma}\bigr)\right|\\
&\quad\le\bigl(1+o(1)\bigr)\,y^{1-\rho}\,e^{-\varepsilon|\log y|}\Biggl(
\frac{\bigl|t\bigl(1-F(b_0(t)+x\,a_0(t))\bigr)-(1+\gamma x)^{-1/\gamma}-y^{-1-\gamma}q_t(y)\bigr|}{|A_0(t)|}\\
&\qquad\qquad
+y^{-1-\gamma}\left|\frac{q_t(y)}{A_0(t)}-\Psi_{\gamma,\rho}(y)\right|
+\Bigl|y^{-(1+\gamma)}\Psi_{\gamma,\rho}(y)-(1+\gamma x)^{-1/\gamma-1}\Psi_{\gamma,\rho}\bigl((1+\gamma x)^{1/\gamma}\bigr)\Bigr|\Biggr).
\end{aligned}
\tag{5.1.10}
\]
By (5.1.7) the first term on the right-hand side is at most
\[
2\,|1+\gamma|\,\Bigl(y^{-\gamma-\rho}\,e^{-\varepsilon|\log y|}\,\frac{|q_t(y)|}{|A_0(t)|}\Bigr)\,\bigl(y^{-\gamma}\,|q_t(y)|\bigr)\,.
\]
For the last factor we use (5.1.6). The uniform boundedness of the other factor on
y ≥ c t^{δ−1} is again a consequence of the uniform inequalities of Corollary 2.3.7 for the
quantile function.
The second term on the right-hand side of (5.1.10) converges to zero uniformly on
y ≥ c t^{δ−1}: this is just Corollary 2.3.7.
The third term on the right-hand side of (5.1.10) converges to zero uniformly on y ≥
c t^{δ−1}, by (5.1.9).
Finally, note that y ≥ c t^{δ−1} if and only if x ∈ D_{t,ρ,δ,c} for ρ < 0.

This result will be used later in the chapter but also for the proof of the next result:
an expansion of the tail empirical distribution function, that is, the tail empirical
process.
Theorem 5.1.2 Let X₁, X₂, … be i.i.d. random variables with distribution function
F. Let F_n be the empirical distribution function based on X₁, X₂, …, X_n. Suppose
that the function U, the inverse of 1/(1 − F), satisfies the second-order condition
(2.3.5) with γ ∈ ℝ, ρ ≤ 0. Let k = k(n) be a sequence of integers such that k → ∞
and √k A₀(n/k) is bounded, as n → ∞, with A₀ from Corollary 2.3.7. We also use
a₀ and b₀ from that corollary. Then the underlying sample space can be enlarged to
include a sequence of Brownian motions W_n such that for all x₀ larger than the lower
endpoint −1/(γ ∨ 0) of the limiting extreme value distribution:
1. If not γ = ρ = 0, then as n → ∞,
\[
\sup_{x_0<x<1/((-\gamma)\vee0)}
\Bigl((1+\gamma x)^{-1/\gamma}\Bigr)^{-1/2+\varepsilon}
\Bigl|\sqrt k\Bigl(\frac nk\Bigl(1-F_n\bigl(b_0(n/k)+x\,a_0(n/k)\bigr)\Bigr)-(1+\gamma x)^{-1/\gamma}\Bigr)
-W_n\bigl((1+\gamma x)^{-1/\gamma}\bigr)
-\sqrt k\,A_0\!\Bigl(\frac nk\Bigr)(1+\gamma x)^{-1/\gamma-1}\Psi_{\gamma,\rho}\bigl((1+\gamma x)^{1/\gamma}\bigr)\Bigr|
\;\xrightarrow{P}\;0\,.
\]
2. If γ = ρ = 0, then as n → ∞, for all τ > 0,
\[
\sup_{x\ge-\tau}\;e^{(1/2-\varepsilon)x}
\Bigl|\sqrt k\Bigl(\frac nk\bigl(1-F_n\bigl(b_0(n/k)+x\,a_0(n/k)\bigr)\bigr)-e^{-x}\Bigr)
-W_n\bigl(e^{-x}\bigr)-\sqrt k\,A_0\!\Bigl(\frac nk\Bigr)\,e^{-x}\,\frac{x^{2}}{2}\Bigr|
\;\xrightarrow{P}\;0\,.
\]

Proof. The result is well known in the case of the standard uniform distribution (see,
e.g., Einmahl (1997), Corollary 3.3), and this result will be our point of departure. Let
U_n be the uniform empirical distribution function. Then the underlying sample space
can be enlarged to include a sequence of Brownian motions W_n such that
\[
\sup_{t>0}\;t^{-1/2}\,e^{-\varepsilon|\log t|}\,
\Bigl|\sqrt k\Bigl(\frac nk\,U_n\Bigl(\frac{k\,t}{n}\Bigr)-t\Bigr)-W_n(t)\Bigr|\;\xrightarrow{P}\;0\,,
\tag{5.1.11}
\]
as n → ∞ with k → ∞, k/n → 0. By the well-known quantile transformation, 1 − F_n
has the same distribution as U_n(1 − F). Hence by (5.1.11), for suitable versions of
F_n,
\[
\sup_{\{x:z_n(x)>0\}}\bigl(z_n(x)\bigr)^{-1/2}e^{-\varepsilon|\log z_n(x)|}
\Bigl|\sqrt k\Bigl(\frac nk\bigl(1-F_n\bigl(b_0(n/k)+x\,a_0(n/k)\bigr)\bigr)-z_n(x)\Bigr)-W_n\bigl(z_n(x)\bigr)\Bigr|
\;\xrightarrow{P}\;0\,,
\tag{5.1.12}
\]
with
\[
z_n(x):=\frac nk\Bigl(1-F\bigl(b_0(n/k)+x\,a_0(n/k)\bigr)\Bigr)\,.
\]
In order to get the result of the theorem, we are going to replace z_n(x) by (1 +
γx)^{−1/γ} in (5.1.12), and we shall see how this can be done for ρ < 0.
First note that by (5.1.8), for 0 < δ < 1 and sufficiently large t, with x₀ >
−1/(γ ∨ 0) and c > 0,
\[
\bigl\{x:0<(1+\gamma x)^{-1/\gamma}<q_0\bigr\}
\subset\bigl\{x:(1+\gamma x)^{-1/\gamma}<c\,t^{1-\delta}\bigr\}=D_{t,\rho,\delta,c}\,,
\qquad q_0:=(1+\gamma x_0)^{-1/\gamma}<\infty\,,
\tag{5.1.13}
\]
with D as in Theorem 5.1.1.
It follows that we can replace the supremum over {x : z_n(x) > 0} in (5.1.12) by
the supremum over {x : x₀ < x < 1/((−γ) ∨ 0)} and that we can use the inequalities
of Theorem 5.1.1 in this range of x-values.
Also, (5.1.8) allows us to replace the factor (z_n(x))^{−1/2} exp(−ε|log z_n(x)|) by
((1 + γx)^{−1/γ})^{−1/2+ε} (since we know that (1 + γx)^{−1/γ} is bounded). The result of Theorem 5.1.1, along with the condition "√k A₀(n/k) bounded," allows us
to replace the term z_n(x) in the curly brackets by (1 + γx)^{−1/γ} + A₀(n/k)(1 +
γx)^{−1/γ−1}Ψ_{γ,ρ}((1 + γx)^{1/γ}).
Finally, we need to prove that as n → ∞,
\[
\sup_{x_0<x<1/((-\gamma)\vee0)}\bigl((1+\gamma x)^{-1/\gamma}\bigr)^{-1/2+\varepsilon}
\Bigl|W_n\bigl(z_n(x)\bigr)-W_n\bigl((1+\gamma x)^{-1/\gamma}\bigr)\Bigr|\;\xrightarrow{P}\;0\,.
\tag{5.1.14}
\]
First note that, by the law of the iterated logarithm, lim_{t→∞} t^{−1/2−ε}W(t) = 0 a.s.
for ε > 0. Hence by time reversal, for ε > 0,
\[
\lim_{s\downarrow0}\ s^{-1/2+\varepsilon}\,W(s)=0\qquad\text{a.s.}
\tag{5.1.15}
\]
Now by (5.1.8) and (5.1.13) we have
\[
\lim_{n\to\infty}\ \sup_{x_0<x<1/((-\gamma)\vee0)}\left|\frac{z_n(x)}{(1+\gamma x)^{-1/\gamma}}-1\right|=0\,.
\tag{5.1.16}
\]
Hence for (5.1.14) it is sufficient to prove that if
\[
\lim_{n\to\infty}\ \sup_{0<s\le s_0}\left|\frac{t_n(s)}{s}-1\right|=0\,,
\]
then
\[
\lim_{n\to\infty}\ \sup_{0<s\le s_0}\frac{\bigl|W(t_n(s))-W(s)\bigr|}{\bigl(t_n(s)\bigr)^{1/2-\varepsilon}}=0\qquad\text{a.s.}
\tag{5.1.17}
\]
Take a sequence s_n → s₀ ≥ 0, n → ∞. Then also t_n(s_n) → s₀, n → ∞. For s₀ > 0,
(5.1.17) is true by continuity of Brownian motion. For s₀ = 0, by
(5.1.15) both W(t_n(s_n))/(t_n(s_n))^{1/2−ε} and W(s_n)/(t_n(s_n))^{1/2−ε} converge
to zero, and (5.1.17) follows.

Remark 5.1.3 The Brownian motions in Theorem 5.1.2 (on tail empirical distribution
functions) are the same as the Brownian motions in Theorem 2.4.2 (on tail empirical
quantile functions). This can be seen most easily by applying Vervaat's lemma (see
Appendix A) to the functions in Theorem 5.1.2, restricted to a compact interval.
The result of Theorem 5.1.2 can be simplified in the case γ > 0 and reads as
follows.
Theorem 5.1.4 Let X₁, X₂, … be i.i.d. random variables with distribution function
F. Let F_n be the empirical distribution function based on X₁, X₂, …, X_n. Suppose
that the function U, the inverse of 1/(1 − F), satisfies the second-order condition of
Theorem 2.3.9; hence in particular γ > 0. Let k = k(n) be a sequence of integers
such that k → ∞ and √k A₀(n/k) is bounded, as n → ∞, with A₀ from this theorem. Then the
underlying sample space can be enlarged to include a sequence of Brownian motions
W_n such that for all x₀ > 0,
\[
\sup_{x\ge x_0}\;x^{1/(2\gamma)-\varepsilon}
\Bigl|\sqrt k\Bigl(\frac nk\bigl(1-F_n\bigl(x\,U(n/k)\bigr)\bigr)-x^{-1/\gamma}\Bigr)
-W_n\bigl(x^{-1/\gamma}\bigr)-\sqrt k\,A_0\!\Bigl(\frac nk\Bigr)\,x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho}\Bigr|
\;\xrightarrow{P}\;0\,,
\tag{5.1.18}
\]
as n → ∞.
Proof. The proof follows the lines of the proof of Theorem 5.1.2, but now we use the
inequalities
\[
\left|\frac{\dfrac{1-F(tx)}{1-F(t)}-x^{-1/\gamma}}{A(t)}-x^{-1/\gamma}\,\frac{x^{\rho/\gamma}-1}{\gamma\rho}\right|
\le\varepsilon\,x^{-1/\gamma+\rho/\gamma}\max\bigl(x^{\delta},x^{-\delta}\bigr)
\]
(cf. Theorem 2.3.9) and the result of Theorem 2.4.8.


Example 5.1.5 As an example let us apply this result to get another proof of the
asymptotic normality of the Hill estimator:
\[
\hat\gamma_H:=\frac1k\sum_{i=0}^{k-1}\log X_{n-i,n}-\log X_{n-k,n}
=\frac nk\int_{X_{n-k,n}}^{\infty}\bigl(\log s-\log X_{n-k,n}\bigr)\,dF_n(s)
=\frac nk\int_{X_{n-k,n}/U(n/k)}^{\infty}\Bigl(1-F_n\bigl(s\,U(n/k)\bigr)\Bigr)\frac{ds}{s}\,.
\]
Hence
\[
\sqrt k\,\bigl(\hat\gamma_H-\gamma\bigr)
=\sqrt k\,\frac nk\int_{X_{n-k,n}/U(n/k)}^{1}\Bigl(1-F_n\bigl(sU(n/k)\bigr)\Bigr)\frac{ds}{s}
+\sqrt k\left(\frac nk\int_{1}^{\infty}\Bigl(1-F_n\bigl(sU(n/k)\bigr)\Bigr)\frac{ds}{s}-\gamma\right)
=:I+II\,.
\]
For part I note that by Theorem 2.4.8,
\[
\sqrt k\left(\frac{X_{n-k,n}}{U(n/k)}-1\right)-\gamma\,W_n(1)\;\xrightarrow{P}\;0\,.
\tag{5.1.19}
\]
Hence X_{n−k,n}/U(n/k) →_P 1. Using the approximation of Theorem 5.1.4 for the
integrand, we see that
\[
\sqrt k\int_{X_{n-k,n}/U(n/k)}^{1}\Bigl(\frac nk\bigl(1-F_n(sU(n/k))\bigr)-s^{-1/\gamma}\Bigr)\frac{ds}{s}\;\xrightarrow{P}\;0\,.
\tag{5.1.20}
\]
When we combine (5.1.19) and (5.1.20), we get
\[
I=\sqrt k\int_{X_{n-k,n}/U(n/k)}^{1}s^{-1/\gamma-1}\,ds+o_P(1)=-\gamma\,W_n(1)+o_P(1)\,.
\tag{5.1.21}
\]
Next we consider II. By the uniform convergence in (5.1.18) we can interchange
limit and integral when applying (5.1.18) to II. It follows that
\[
\sqrt k\,\bigl(\hat\gamma_H-\gamma\bigr)
=-\gamma\,W_n(1)+\int_1^{\infty}W_n\bigl(s^{-1/\gamma}\bigr)\frac{ds}{s}
+\sqrt k\,A_0\!\Bigl(\frac nk\Bigr)\int_1^{\infty}s^{-1/\gamma}\,\frac{s^{\rho/\gamma}-1}{\gamma\rho}\,\frac{ds}{s}
+o_P(1)\,.
\]
Now
\[
-\gamma\,W_n(1)+\int_1^{\infty}W_n\bigl(s^{-1/\gamma}\bigr)\frac{ds}{s}
=-\gamma\Bigl(W_n(1)-\int_0^1W_n(u)\,\frac{du}{u}\Bigr)
\]
has a normal distribution with mean zero, and its variance is
\[
\gamma^{2}\,E\Bigl(-W(1)+\int_0^1W(u)\,\frac{du}{u}\Bigr)^{2}
=\gamma^{2}\Bigl(1-2\int_0^1\frac{E\bigl(W(1)W(u)\bigr)}{u}\,du
+\int_0^1\!\!\int_0^1\frac{\min(u,v)}{u\,v}\,du\,dv\Bigr)
=\gamma^{2}\,(1-2+2)=\gamma^{2}\,.
\]
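The γ² limiting variance can be checked by simulation. For exact Pareto(1) data the Hill estimator is a mean of k i.i.d. standard exponential log-spacings (Rényi's representation), so √k(γ̂_H − 1) should have mean near 0 and variance near γ² = 1. A quick sketch:

```python
import math
import random

def hill(sample, k):
    """Hill estimator from the k upper order statistics."""
    xs = sorted(sample)
    n = len(xs)
    return sum(math.log(xs[n - 1 - i] / xs[n - 1 - k]) for i in range(k)) / k

random.seed(2)
n, k, r = 1000, 100, 300
zs = []
for _ in range(r):
    sample = [1.0 / (1.0 - random.random()) for _ in range(n)]  # Pareto(1)
    zs.append(math.sqrt(k) * (hill(sample, k) - 1.0))
mean = sum(zs) / r
var = sum((z - mean) ** 2 for z in zs) / r
print(mean, var)
```

The printed sample mean and variance of the normalized errors should be close to 0 and 1, respectively, as the example predicts.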

5.2 Checking the Extreme Value Condition


The extreme value condition is the only realistic framework for estimating quantiles
and distribution tails outside the range of the available data if one does not have
sufficient previous knowledge of the underlying distribution function. However, the
condition is not always fulfilled. Hence it is useful to develop a test to see whether
the tail of the empirical distribution function really looks like one of the generalized
Pareto distributions (GP) that we want to use for extrapolation. Such a goodness-of-fit
criterion will also be useful to guide us in deciding which part of the tail empirical
distribution function (or tail empirical quantile process) can be well approximated
by the corresponding distribution function or quantile function of a GP distribution.
That is, it helps us to decide from which intermediate order statistic onward the
approximation can be trusted.
We shall discuss four tests, three based on the quantile function and one on the
distribution function. We start with the former.
Let X₁, X₂, X₃, … be independent and identically distributed random variables
with distribution function F and let X_{1,n} ≤ X_{2,n} ≤ ⋯ ≤ X_{n,n} be the order
statistics. Suppose F is in the domain of attraction of an extreme value distribution
G_γ, γ ∈ ℝ.

Recall the representation of the tail quantile process via a special construction
(Corollary 2.4.6) that holds under the second-order condition: for ε > 0,
\[
\frac{X_{n-[ks],n}-X_{n-k,n}}{a_0(n/k)}
=\frac{s^{-\gamma}-1}{\gamma}
+\frac{1}{\sqrt k}\Bigl(s^{-\gamma-1}W_n(s)-W_n(1)
+\sqrt k\,A_0\!\Bigl(\frac nk\Bigr)\bigl(\Psi_{\gamma,\rho}(s^{-1})-\Psi_{\gamma,\rho}(1)\bigr)
+o_P(1)\max\bigl(1,s^{-\gamma-1/2-\varepsilon}\bigr)\Bigr)\,,
\tag{5.2.1}
\]
where the o_P(1) term tends to zero uniformly for 0 < s ≤ 1. Since the left-hand side is small
uniformly in s, we use it for the test. However, it contains the two unknown quantities
γ and a₀. We replace these by estimators γ̂ and â(n/k). The test statistic becomes
\[
I_{k,n}:=\int_0^1\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right)^{2}s^{2\hat\gamma_++1}\,ds\,,
\tag{5.2.2}
\]
with γ̂₊ := max(0, γ̂). The weight function is necessary to ensure that all integrals
converge.
We are going to see that I_{k,n} →_P 0 under the extreme value condition, so that it
can be used as a goodness-of-fit criterion, and that k I_{k,n} has a nondegenerate limit
distribution under the second-order condition.
Theorem 5.2.1 Let X₁, X₂, … be i.i.d. random variables and suppose that their distribution function F is in the domain of attraction of some extreme value distribution.
1. Let k = k(n) → ∞, k/n → 0, as n → ∞. Assume γ̂ →_P γ and â(n/k)/a(n/k) →_P 1. Then
I_{k,n} →_P 0.
2. Suppose moreover that the second-order condition (2.3.5) holds and that
√k A(n/k) → 0, as n → ∞, where A is the second-order auxiliary function.
Further assume that
\[
\Bigl(\sqrt k\,\bigl(\hat\gamma-\gamma\bigr),\;\sqrt k\Bigl(\frac{\hat a(n/k)}{a(n/k)}-1\Bigr)\Bigr)
-\bigl(\Gamma(W_n),\,\alpha(W_n)\bigr)\;\xrightarrow{P}\;0\,,
\tag{5.2.3}
\]
where Γ and α are measurable real-valued functionals of the Brownian motions
from (5.2.1). Then
k I_{k,n} →_d I_γ,
where
\[
I_\gamma:=\int_0^1\Bigl(s^{-\gamma-1}W(s)-W(1)-\alpha(W)\,\frac{s^{-\gamma}-1}{\gamma}
+\Gamma(W)\int_s^1u^{-\gamma-1}\log u\,du\Bigr)^{2}s^{2\gamma_++1}\,ds
\]
with W Brownian motion.

Remark 5.2.2 The reader may want to check that (5.2.3) holds for all the estimators
discussed in Chapter 3.
For the proof we need some lemmas. We have seen the following one previously:
Lemma 5.2.3 For ε > 0,
\[
\lim_{\delta\downarrow0}\ \sup_{0<s\le\delta}\ s^{-1/2+\varepsilon}\,W(s)=0\qquad\text{a.s.}
\]
with W Brownian motion.


Lemma 5.2.4 If γ̂ is any consistent estimator of γ, then
\[
\frac{s^{-\hat\gamma}-1}{\hat\gamma}-\frac{s^{-\gamma}-1}{\gamma}
=(\gamma-\hat\gamma)\int_s^1u^{-\gamma-1}\log u\,du
+|\hat\gamma-\gamma|\;O_P(1)\,\bigl(s^{-|\hat\gamma-\gamma|}-1\bigr)\int_s^1u^{-\gamma-1}|\log u|\,du\,,
\]
where the O_P(1) term is bounded uniformly for s ∈ (0, 1].
Proof. Since for x ∈ ℝ,
\[
\Bigl|\frac{e^{x}-1}{x}-1\Bigr|\le e^{|x|}-1
\]
(check by series expansion), we have
\[
\Bigl|\frac{s^{-\hat\gamma}-1}{\hat\gamma}-\frac{s^{-\gamma}-1}{\gamma}-(\gamma-\hat\gamma)\int_s^1u^{-\gamma-1}\log u\,du\Bigr|
=\Bigl|\int_s^1 u^{-\gamma-1}\bigl(u^{\gamma-\hat\gamma}-1-(\gamma-\hat\gamma)\log u\bigr)\,du\Bigr|
\le|\hat\gamma-\gamma|\int_s^1\bigl(u^{-|\hat\gamma-\gamma|}-1\bigr)\,u^{-\gamma-1}\,|\log u|\,du
\le|\hat\gamma-\gamma|\,\bigl(s^{-|\hat\gamma-\gamma|}-1\bigr)\int_s^1u^{-\gamma-1}|\log u|\,du\,.
\]

Lemma 5.2.5 Under the assumptions of Theorem 5.2.1(1), for ε > 0,
\[
\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}
=o_P(1)\,s^{-\gamma-1/2-\varepsilon}+o_P(1)\,\frac{s^{-\gamma}-1}{\gamma}+o_P(1)\,s^{-\gamma-\varepsilon}|\log s|\,,
\]
where the o_P(1) terms tend to zero uniformly for 0 < s ≤ 1.
Proof. Write
\[
\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}
\tag{5.2.4}
\]
\[
=\frac{a_0(n/k)}{\hat a(n/k)}\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{a_0(n/k)}-\frac{s^{-\gamma}-1}{\gamma}\right)
\tag{5.2.5}
\]
\[
\;+\;\left(\frac{a_0(n/k)}{\hat a(n/k)}-1\right)\frac{s^{-\gamma}-1}{\gamma}
\tag{5.2.6}
\]
\[
\;+\;\left(\frac{s^{-\gamma}-1}{\gamma}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right).
\tag{5.2.7}
\]
For (5.2.5) we proceed as follows: for convenience we replace X₁, X₂, … by
U(Y₁), U(Y₂), …, where U := (1/(1 − F))^← and Y₁, Y₂, … are independent and
identically distributed with distribution function 1 − 1/x, x > 1. Then (X_{1,n}, …, X_{n,n}) =_d
(U(Y_{1,n}), …, U(Y_{n,n})). We write
\[
\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}-\frac{s^{-\gamma}-1}{\gamma}
=\left(\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}-\frac{\bigl(Y_{n-[ks],n}/Y_{n-k,n}\bigr)^{\gamma}-1}{\gamma}\right)
+\left(\frac{\bigl(Y_{n-[ks],n}/Y_{n-k,n}\bigr)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right).
\tag{5.2.8}
\]
We start with the second term on the right side of (5.2.8) and use Lemma 2.4.10.
As in the proof of Corollary 2.4.10 one sees that the supremum in Lemma 2.4.10 can
be taken over 0 < s ≤ 1. By combining the expansions for Y_{n−[ks],n} and Y_{n−k,n} in
Lemma 2.4.10 we get
\[
\sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}
\left|\sqrt k\left(\frac{\bigl(Y_{n-[ks],n}/Y_{n-k,n}\bigr)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right)
-s^{-\gamma-1}W_n(s)+s^{-\gamma}W_n(1)\right|=o_P(1)
\tag{5.2.9}
\]
(check separately for γ > 0, γ < 0, and γ = 0). It follows by Lemma 5.2.3 that
\[
\sup_{0<s\le1}s^{\gamma+1/2+\varepsilon}
\left|\frac{\bigl(Y_{n-[ks],n}/Y_{n-k,n}\bigr)^{\gamma}-1}{\gamma}-\frac{s^{-\gamma}-1}{\gamma}\right|=o_P(1)\,.
\tag{5.2.10}
\]
For the first part of the right-hand side of (5.2.8) we use the uniform inequalities
for extended regularly varying functions (Theorem B.2.18): there exists a₀(t) ∼ a(t),
t → ∞, such that for all ε > 0 there exists t₀(ε) such that for t ≥ t₀ and x ≥ 1,
\[
\left|\frac{U(tx)-U(t)}{a_0(t)}-\frac{x^{\gamma}-1}{\gamma}\right|\le\varepsilon\,x^{\gamma+\varepsilon}\,.
\]
We apply this with t := Y_{n−k,n} (→ ∞ a.s., as n → ∞, cf. Lemma 3.2.1) and
x := Y_{n−[ks],n}/Y_{n−k,n} and get
\[
\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}-\frac{\bigl(Y_{n-[ks],n}/Y_{n-k,n}\bigr)^{\gamma}-1}{\gamma}
=o_P(1)\left(\frac{Y_{n-[ks],n}}{Y_{n-k,n}}\right)^{\gamma+\varepsilon},
\]
with the o_P(1) term tending to zero uniformly for 0 < s ≤ 1. Next we apply (5.2.10)
to get, for θ ∈ ℝ,
\[
\left(\frac{Y_{n-[ks],n}}{Y_{n-k,n}}\right)^{\theta}=O_P(1)\,s^{-\theta}+o_P(1)\,s^{-\theta-1/2-\varepsilon}\,.
\]
We have
\[
\frac{U(Y_{n-[ks],n})-U(Y_{n-k,n})}{a_0(Y_{n-k,n})}-\frac{\bigl(Y_{n-[ks],n}/Y_{n-k,n}\bigr)^{\gamma}-1}{\gamma}
=o_P(1)\,s^{-\gamma-1/2-\varepsilon}\,.
\]
For (5.2.6) use â(n/k)/a(n/k) · a(n/k)/a₀(n/k) →_P 1. For (5.2.7) use Lemma
5.2.4. We obtain
\[
\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}
=(1+o_P(1))\,o_P(1)\,s^{-\gamma-1/2-\varepsilon}
+(1+o_P(1))\,o_P(1)\,\frac{s^{-\gamma}-1}{\gamma}
+o_P(1)\,s^{-\varepsilon-\gamma}\,|\log s|\,.
\]

Lemma 5.2.6 Under the assumptions of Theorem 5.2.1(2), for ε > 0,
\[
\sqrt k\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right)
=s^{-\gamma-1}W_n(s)-W_n(1)-\alpha(W_n)\,\frac{s^{-\gamma}-1}{\gamma}
+\Gamma(W_n)\int_s^1u^{-\gamma-1}\log u\,du
+o_P(1)\,s^{-\gamma_+-1/2-\varepsilon}\,,
\tag{5.2.11}
\]
where the o_P(1) term tends to zero uniformly for 0 < s ≤ 1.
Proof. First note that since (2.3.5) holds with a replaced by a₀, we have a₀(t) ∼ a(t)
and even a₀(t)/a(t) − 1 = o(A₀(t)), t → ∞. Hence (5.2.3) implies
\[
\sqrt k\left(\frac{\hat a(n/k)}{a_0(n/k)}-1\right)-\alpha(W_n)\;\xrightarrow{P}\;0\,.
\]
The left-hand side of (5.2.11) is, according to (5.2.1) and Lemma 5.2.4,
\[
\frac{a_0(n/k)}{\hat a(n/k)}\,\sqrt k\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{a_0(n/k)}-\frac{s^{-\gamma}-1}{\gamma}\right)
\tag{5.2.12}
\]
\[
\;-\;\frac{a_0(n/k)}{\hat a(n/k)}\,\sqrt k\left(\frac{\hat a(n/k)}{a_0(n/k)}-1\right)\frac{s^{-\gamma}-1}{\gamma}
\tag{5.2.13}
\]
\[
\;+\;\sqrt k\left(\frac{s^{-\gamma}-1}{\gamma}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right)
\tag{5.2.14}
\]
\[
=(1+o_P(1))\Bigl\{s^{-\gamma-1}W_n(s)-W_n(1)
+\sqrt k\,A_0\!\Bigl(\frac nk\Bigr)\bigl(\Psi_{\gamma,\rho}(s^{-1})-\Psi_{\gamma,\rho}(1)\bigr)
+o_P(1)\,s^{-\gamma-1/2-\varepsilon}\Bigr\}
-(1+o_P(1))\,\alpha(W_n)\,\frac{s^{-\gamma}-1}{\gamma}
+\Gamma(W_n)\int_s^1u^{-\gamma-1}\log u\,du
+O_P(1)\,\bigl(s^{-|\hat\gamma-\gamma|}-1\bigr)\int_s^1u^{-\gamma-1}|\log u|\,du\,,
\]
where the term with √k A₀(n/k) tends to zero uniformly, since √k A(n/k) → 0.
By Lemma 5.2.3 the error term connected with (5.2.12) is o_P(1)s^{−γ−1/2−ε}. The error
term connected with (5.2.13) is o_P(1)(s^{−γ} − 1)/γ. The error term connected with
(5.2.14) is
\[
O_P(1)\,\bigl(s^{-|\hat\gamma-\gamma|}-1\bigr)\int_s^1u^{-\gamma-1}|\log u|\,du\,.
\]
Now
\[
s^{-|\hat\gamma-\gamma|}-1=|\hat\gamma-\gamma|\int_s^1u^{-|\hat\gamma-\gamma|-1}\,du
\le|\hat\gamma-\gamma|\,s^{-|\hat\gamma-\gamma|}\,|\log s|\,;
\]
i.e., for ε > 0,
\[
s^{-|\hat\gamma-\gamma|}-1=o_P(1)\,s^{-\varepsilon}\,.
\]
Moreover, for γ > 0, γ < 0, and γ = 0 one checks that
\[
\int_s^1u^{-\gamma-1}|\log u|\,du\le c\,(\log s)^{2}\,s^{-\gamma_+}\,.
\]
Hence the combined error term is o_P(1) s^{−γ₊−1/2−ε}.


Proof (of Theorem 5.2.1). (1) Lemma 5.2.5 implies
\[
I_{k,n}=o_P(1)\int_0^1\Bigl(s^{-\gamma-1/2-\varepsilon}\vee s^{-\gamma-\varepsilon}|\log s|\Bigr)^{2}s^{2\hat\gamma_++1}\,ds\,.
\]
Since the integral is finite, I_{k,n} →_P 0.
(2) By Lemma 5.2.6 the integrand of k I_{k,n}, i.e.,
\[
k\left(\frac{X_{n-[ks],n}-X_{n-k,n}}{\hat a(n/k)}-\frac{s^{-\hat\gamma}-1}{\hat\gamma}\right)^{2}s^{2\hat\gamma_++1}\,,
\]
is bounded by a multiple of
\[
\Bigl(\bigl(\alpha(W_n)\bigr)^{2}+O_P(1)\Bigr)\,s^{-2\gamma_+-1-2\varepsilon}\,s^{2\gamma_++1}
=\Bigl(\bigl(\alpha(W_n)\bigr)^{2}+O_P(1)\Bigr)\,s^{-2\varepsilon}\,,
\]
which is integrable (note that the distribution of α(W_n) does not depend on n).
Hence by Lebesgue's theorem on dominated convergence (and a Skorohod construction), k I_{k,n} →_d I_γ, as n → ∞.

Simulations seem to tell us (cf. Hüsler and Li (2005)) that this quite natural test
does not perform as well as a similar one involving the logarithms of the observations
(Dietrich, de Haan, and Hüsler (2002)). The background is the following. The domain
of attraction condition
\[
\lim_{t\to\infty}\frac{U(tx)-U(t)}{a(t)}=\frac{x^{\gamma}-1}{\gamma}\,,\qquad x>0\,,
\]
is easily seen to be equivalent (provided that U(∞) > 0) to
\[
\lim_{t\to\infty}\frac{\log U(tx)-\log U(t)}{a(t)/U(t)}=\frac{x^{\gamma_-}-1}{\gamma_-}
\]
and
\[
\lim_{t\to\infty}\frac{a(t)}{U(t)}=\gamma_+\,.
\]
Now the moment estimator (Section 3.5) provides separate estimators γ̂_H and γ̂₋
for γ₊ and γ₋. Lemma 3.5.1 states that for γ ∈ ℝ, the estimator γ̂_H, which in fact is
the Hill estimator from Section 3.2, satisfies
\[
\frac{\hat\gamma_H}{a(n/k)/U(n/k)}\;\xrightarrow{P}\;\frac{1}{1-\gamma_-}\,,
\]
provided k = k(n) → ∞, k/n → 0 as n → ∞. This suggests that one use the
following test statistic:
\[
D_{k,n}:=\int_0^1\left(\frac{\log X_{n-[ks],n}-\log X_{n-k,n}}{\hat\gamma_H}
-\frac{s^{-\hat\gamma_-}-1}{\hat\gamma_-}\,(1-\hat\gamma_-)\right)^{2}s^{2}\,ds\,,
\]
where γ̂_H and γ̂₋ are as in Section 3.5, with γ̂_H the Hill estimator and γ̂₋ the one in
Remark 3.5.7.
We have the following result.
We have the following result.
Theorem 5.2.7 Let X₁, X₂, … be i.i.d. random variables and suppose that their distribution function F is in the domain of attraction of some extreme value distribution.
1. Let k = k(n) → ∞, k/n → 0 as n → ∞. Then
D_{k,n} →_P 0.
2. Suppose that moreover the second-order condition for log U (3.5.11) (cf. also
Lemma B.3.16) holds. Let k = k(n) be such that
\[
\lim_{n\to\infty}\sqrt k\;Q\!\Bigl(\frac nk\Bigr)=0\,.
\]
Then k D_{k,n} converges in distribution to
\[
D_\gamma:=\int_0^1\Bigl\{(1-\gamma_-)\bigl(s^{-\gamma_--1}W(s)-W(1)\bigr)
-(1-\gamma_-)\,\frac{s^{-\gamma_-}-1}{\gamma_-}\,R(W)
+(1-\gamma_-)\,R(W)\int_s^1u^{-\gamma_--1}\log u\,du
-\frac{s^{-\gamma_-}-1}{\gamma_-}\,(1-\gamma_-)^{2}\,P(W)\Bigr\}^{2}s^{2}\,ds
\]
with W Brownian motion and
\[
P(W):=\int_0^1\bigl(s^{-\gamma_--1}W(s)-W(1)\bigr)\,ds\,,
\qquad
Q(W):=2\int_0^1 s^{-\gamma_-}\bigl(s^{-\gamma_--1}W(s)-W(1)\bigr)\,ds\,,
\]
\[
R(W):=(1-\gamma_-)^{2}\,(1-2\gamma_-)\,\Bigl(\bigl(\tfrac12-\gamma_-\bigr)Q(W)-2\,P(W)\Bigr)\,.
\]

Table 5.1. Quantiles of the asymptotic test statistic kD_{k,n}.

γ       0.10   0.30   0.50   0.70   0.90   0.95   0.975  0.99
> 0     .028   .042   .057   .078   .122   .150   .181   .222
-0.1    .027   .041   .054   .074   .116   .144   .174   .213
-0.2    .027   .040   .053   .072   .114   .141   .169   .208
-0.3    .027   .040   .054   .073   .113   .140   .168   .206
-0.4    .027   .040   .054   .073   .114   .141   .169   .207
-0.5    .027   .040   .054   .073   .115   .141   .169   .208
-0.6    .027   .040   .054   .074   .116   .144   .173   .212
-0.7    .028   .041   .055   .074   .118   .147   .176   .218

For the proof we need some auxiliary results.

Lemma 5.2.8 Under the conditions of Theorem 5.2.7(2), for each ε > 0,
\[
\sup_{0<s\le1}\;s^{\gamma_-+1/2+\varepsilon}
\left|\sqrt k\left(\frac{\log X_{n-[ks],n}-\log X_{n-k,n}}{q_0(n/k)}-\frac{s^{-\gamma_-}-1}{\gamma_-}\right)
-s^{-\gamma_--1}W_n(s)+W_n(1)\right|\;\xrightarrow{P}\;0\,,
\]
with q₀ from (3.5.13).
Proof. Apply Corollary 2.4.6 with the random variables X₁, …, X_n replaced by
log X₁, …, log X_n.

Lemma 5.2.9 Under the assumptions of Theorem 5.2.7(2),
\[
\sqrt k\,\bigl(\hat\gamma_--\gamma_-\bigr)-R(W_n)\;\xrightarrow{P}\;0\,,
\]
with
\[
P(W_n):=\int_0^1\bigl(s^{-\gamma_--1}W_n(s)-W_n(1)\bigr)\,ds\,,
\]
\[
R(W_n):=(1-\gamma_-)^{2}(1-2\gamma_-)\Bigl\{-2\,P(W_n)+(1-2\gamma_-)\int_0^1s^{-\gamma_-}\bigl(s^{-\gamma_--1}W_n(s)-W_n(1)\bigr)\,ds\Bigr\}\,,
\]
and W_n is Brownian motion for all n.
Proof. We use Lemma 5.2.8 with 0 < s ≤ 1:
\[
\frac{\hat\gamma_H}{q_0(n/k)}
=\int_0^1\frac{\log X_{n-[ks],n}-\log X_{n-k,n}}{q_0(n/k)}\,ds
=\int_0^1\frac{s^{-\gamma_-}-1}{\gamma_-}\,ds
+\frac{1}{\sqrt k}\int_0^1\bigl(s^{-\gamma_--1}W_n(s)-W_n(1)\bigr)\,ds
+\frac{o_P(1)}{\sqrt k}\int_0^1s^{-\gamma_--1/2-\varepsilon}\,ds\,.
\]
It follows that
\[
\sqrt k\left(\frac{\hat\gamma_H}{q_0(n/k)}-\frac{1}{1-\gamma_-}\right)-P(W_n)\;\xrightarrow{P}\;0\,.
\tag{5.2.15}
\]
Next note that by Remark B.3.2, (3.5.11) holds with q replaced by q₀; then (5.2.15)
follows with q₀ replaced by q.
Similarly,
\[
\sqrt k\left(\frac{M_n^{(2)}}{\bigl(q_0(n/k)\bigr)^{2}}-\frac{2}{(1-\gamma_-)(1-2\gamma_-)}\right)
-2\int_0^1\frac{s^{-\gamma_-}-1}{\gamma_-}\bigl(s^{-\gamma_--1}W_n(s)-W_n(1)\bigr)\,ds\;\xrightarrow{P}\;0\,.
\]
Since γ̂₋ = 1 − 2⁻¹(1 − (γ̂_H)²/M_n^{(2)})⁻¹ (cf. Remark 3.5.7), Cramér's delta method
finishes the proof.

Lemma 5.2.10 Under the conditions of Theorem 5.2.7(1),
\[
(1-\hat\gamma_-)\,\frac{s^{-\hat\gamma_-}-1}{\hat\gamma_-}-(1-\gamma_-)\,\frac{s^{-\gamma_-}-1}{\gamma_-}
=-(\hat\gamma_--\gamma_-)(1-\gamma_-)\int_s^1u^{-\gamma_--1}\log u\,du
+|\hat\gamma_--\gamma_-|\;O_P(1)\,(\log s)^{2}\,\bigl(s^{-|\hat\gamma_--\gamma_-|}-1\bigr)\,,
\]
where the O_P(1) term is bounded uniformly for s ∈ (0, 1].
Proof. Apply Lemmas 5.2.4 and 5.2.9.

Proof (of Theorem 5.2.7). The proof is similar to that of Theorem 5.2.1, now with the use of Lemmas 5.2.8–5.2.10. It is left to the reader.

Remark 5.2.11 Here and in the next theorem a similar result can be proved when one replaces s² ds in the definition of D_{k,n} with s^η ds, as long as η > 0. Hüsler and Li (2005) recommended the value η = 2.

In the special case that only positive values of γ are possible (for example, if the distribution is not bounded above), a simpler test can be used.

Theorem 5.2.12 Let X_1, X_2, ... be i.i.d. random variables with distribution function F. Suppose F is in the domain of attraction of an extreme value distribution G_γ with γ > 0. Define


$$S_{k,n} := \int_0^1 \left( \frac{\log X_{n-[ks],n} - \log X_{n-k,n}}{\hat\gamma_n} + \log s \right)^2 s^2 \, ds ,$$

where \hat\gamma_n is the Hill estimator from Section 3.2.

1. If k = k(n) → ∞, k/n → 0, n → ∞, then S_{k,n} →P 0.

2. Suppose in addition that a second-order condition holds: for some positive or negative function a and all x > 0, with ρ < 0,

$$\lim_{t\to\infty} \frac{\dfrac{1 - F(tx)}{1 - F(t)} - x^{-1/\gamma}}{a(t)} = x^{-1/\gamma}\, \frac{x^{\rho/\gamma} - 1}{\gamma\rho} .  \qquad (5.2.16)$$

Then for sequences k = k(n) → ∞ with √k a(U(n/k)) → 0, where U is the inverse function of 1/(1 − F),

$$k\, S_{k,n} \xrightarrow{d} S , \qquad n \to \infty ,$$

where

$$S := \int_0^1 \left( B(s) + s \log s \int_0^1 u^{-1} B(u)\, du \right)^2 ds$$

with B Brownian bridge.
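The Hill estimator and the statistic S_{k,n} are easy to approximate numerically. The sketch below is an illustration, not the book's code: it uses a Riemann-midpoint sum, and the s² ds weight is taken as an assumption of this illustration. It samples from a Pareto distribution, which lies in the domain of attraction with γ = 1, so S_{k,n} should be small.

```python
import numpy as np

def hill(xs_sorted, k):
    """Hill estimator from the k upper order statistics (ascending input)."""
    return float(np.mean(np.log(xs_sorted[-k:]) - np.log(xs_sorted[-k - 1])))

def s_statistic(x, k, grid=4000):
    """Riemann-midpoint approximation of S_{k,n} (s^2 ds weight assumed)."""
    xs = np.sort(x)
    n = len(xs)
    g = hill(xs, k)
    s = (np.arange(grid) + 0.5) / grid            # midpoints of (0, 1)
    idx = np.floor(k * s).astype(int)             # [ks], ranges over 0..k-1
    num = np.log(xs[n - 1 - idx]) - np.log(xs[n - k - 1])
    return float(np.mean((num / g + np.log(s)) ** 2 * s ** 2))

rng = np.random.default_rng(1)
x = 1.0 / (1.0 - rng.random(10_000))   # Pareto(1): 1 - F(t) = 1/t, gamma = 1
gamma_hat = hill(np.sort(x), 200)
S = s_statistic(x, 200)
```

Under the null hypothesis kS_{k,n} has a nondegenerate limit, so S_{k,n} itself is of order 1/k here.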


Proof. The proof of part (1) is as before. Next note that the second-order relation (5.2.16) implies (cf. Theorem 2.4.8) that

$$\sup_{0<s\le 1} s^{1/2+\varepsilon} \left| \sqrt{k} \left( \frac{\log X_{n-[ks],n} - \log X_{n-k,n}}{\gamma} + \log s \right) - s^{-1} W_n(s) + W_n(1) - \sqrt{k}\, A_0\!\left( \frac{n}{k} \right) \frac{s^{-\rho} - 1}{\rho} \right| \xrightarrow{P} 0 .  \qquad (5.2.17)$$

The rest of the proof is similar to that of the previous theorem.

According to Hüsler and Li (2005), this test does not perform as well as the others.

Next we discuss the behavior of the test of Theorem 5.2.12 under two types of alternatives.

Example 5.2.13 (Super-heavy tails) Let F(x) = 1 − (log x)^{−β} for x > e and β a positive parameter. Note that log X = Y^{1/β}, where Y has distribution function 1 − 1/x, x > 1. Then, with a = 1/β (cf. proof of Lemma 3.2.3),

$$S_{k,n} = \int_0^1 \left( \frac{\left( Y_{n-[ks],n} / Y_{n-k,n} \right)^a - 1}{\frac{1}{k} \sum_{i=0}^{k-1} \left\{ \left( Y_{n-i,n} / Y_{n-k,n} \right)^a - 1 \right\}} + \log s \right)^2 s^2 \, ds .$$

Note that

$$\left\{ Y_{n-i,n} / Y_{n-k,n} \right\}_{i=0}^{k-1} \stackrel{d}{=} \left\{ Y^*_{k-i,k} \right\}_{i=0}^{k-1} ,$$

with Y*_1, ..., Y*_k independent and identically distributed with distribution function 1 − 1/x, x > 1. For 0 < a < 1, by the law of large numbers,

$$\frac{1}{k} \sum_{i=0}^{k-1} \left\{ \left( Y_{n-i,n} / Y_{n-k,n} \right)^a - 1 \right\} \xrightarrow{P} \int_1^\infty \left( u^a - 1 \right) u^{-2}\, du = \frac{a}{1-a} .$$

Further, by the proof of Lemma 2.4.10, lines 4–5, (Y_{n-[ks],n}/Y_{n-k,n})^a →P s^{-a} uniformly on [k^{-1}, 1]. Hence

$$S_{k,n} \xrightarrow{P} \int_0^1 \left( \left( s^{-a} - 1 \right) \frac{1-a}{a} + \log s \right)^2 s^2 \, ds > 0 .$$
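A quick simulation contrasts the two regimes (a sketch with hypothetical parameter choices; the s² ds weight and the Riemann-sum construction of S_{k,n} are assumptions of this illustration). For β = 2, so a = 1/2 and (1 − a)/a = 1, S_{k,n} should stabilize near the positive constant in the limit above, while for a Pareto sample it stays close to zero.

```python
import numpy as np

def s_statistic_from_logs(log_x, k, grid=4000):
    """S_{k,n} computed from the log-observations (s^2 ds weight assumed)."""
    ls = np.sort(log_x)
    n = len(ls)
    ghat = float(np.mean(ls[-k:] - ls[-k - 1]))   # Hill estimator
    s = (np.arange(grid) + 0.5) / grid
    idx = np.floor(k * s).astype(int)
    num = ls[n - 1 - idx] - ls[n - k - 1]
    return float(np.mean((num / ghat + np.log(s)) ** 2 * s ** 2))

rng = np.random.default_rng(2)
n, k = 200_000, 2000
y = 1.0 / (1.0 - rng.random(n))                   # Pareto(1) sample
s_pareto = s_statistic_from_logs(np.log(y), k)    # null case, gamma = 1
s_heavy = s_statistic_from_logs(np.sqrt(y), k)    # log X = Y^{1/2}, beta = 2

# deterministic limit for a = 1/2, where (1 - a)/a = 1
sg = (np.arange(100_000) + 0.5) / 100_000
limit = float(np.mean(((sg ** -0.5 - 1.0) + np.log(sg)) ** 2 * sg ** 2))
```

The simulated value s_heavy sits near the deterministic limit, an order of magnitude above the Pareto value.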

Example 5.2.14 (Periodicity) Let E_1, E_2, E_3, ... be independent and identically distributed random variables with standard exponential distribution. Then the distribution of exp(E_1) is in the domain of attraction of some extreme value distribution. But the distribution of

X := exp[E_1]

(the exponential of the integer part) is not in any domain of attraction. We show that the statistic S_{k,n} does not tend to zero, as n → ∞, for all sequences k = k(n) → ∞, k/n → 0. Let X_{1,n} ≤ X_{2,n} ≤ ... ≤ X_{n,n} be the order statistics of a sample from this distribution. Then {X_{i,n}}_{i=1}^n =d {exp[E_{i,n}]}_{i=1}^n, where the E_{i,n} are the order statistics for the exponential distribution. Hence, writing {x} := x − [x] for the fractional part,

$$\log X_{n-[ks],n} - \log X_{n-k,n} + \hat\gamma_n \log s
= \left( E_{n-[ks],n} - E_{n-k,n} + \log s \right) - \left\{ E_{n-[ks],n} \right\} + \left\{ E_{n-k,n} \right\} + \left( \hat\gamma_n - 1 \right) \log s ,$$

with

$$\hat\gamma_n = \frac{1}{k} \sum_{i=0}^{k-1} \left[ \left( E_{n-i,n} - E_{n-k,n} \right) - \left\{ E_{n-i,n} \right\} + \left\{ E_{n-k,n} \right\} \right] .$$

Now let us take a subsequence k = k(n) such that

$$E_{n-k,n} - \left[ E_{n-k,n} \right] - \left( \log\frac{n}{k} - \left[ \log\frac{n}{k} \right] \right) \xrightarrow{P} 0$$

as n → ∞. Moreover, apply Lemma 2.4.10. Then the above becomes a functional of E_1, E_2, E_3, ... as before. Note that the expectation of [E_1 + p] is e^p/(e − 1). Hence, as n → ∞ and for this sequence k = k(n),

$$S_{k,n} \xrightarrow{P} (e-1)^2 \int_0^1 \left( \left[ -\log s \right] + (e-1)^{-1} \log s \right)^2 s^2 \, ds > 0 .$$

Simulation results can be found in Hüsler and Li (2005).
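The failure of regular variation in this example can be seen directly. The deterministic check below (a sketch, using the exact survival function of X = exp[E_1]) shows that (1 − F(tx))/(1 − F(t)) oscillates between two values and has no limit, so F belongs to no domain of attraction.

```python
import math

def survival(x):
    """P(X > x) for X = exp([E]), E standard exponential; valid for x >= 1
    with log x not an integer, since [E] >= j iff E >= j."""
    return math.exp(-(math.floor(math.log(x)) + 1))

x = math.exp(0.5)
# along t = e^{m+0.2} the fractional part of log t is 0.2, so log(tx)
# keeps the same integer part and the tail ratio is 1 ...
ratios_a = [survival(math.exp(m + 0.2) * x) / survival(math.exp(m + 0.2))
            for m in range(5, 10)]
# ... while along t = e^{m+0.6} the integer part jumps and the ratio is e^{-1}
ratios_b = [survival(math.exp(m + 0.6) * x) / survival(math.exp(m + 0.6))
            for m in range(5, 10)]
```

Two subsequences with different limits mean 1 − F is not regularly varying at infinity, ruling out every domain of attraction.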


We conclude with a version of the test using the tail empirical distribution function rather than the tail empirical quantile function. This is closer to the usual goodness-of-fit tests of Cramér–von Mises and Anderson–Darling. The proof is omitted (cf. Drees, de Haan, and Li (2006)); it is similar to, but more complicated than, that of Theorem 5.1.2, now using the asymptotic theory of Section 5.1.

Theorem 5.2.15 Under the conditions and with the notation of Theorem 5.2.1(2), with

$$T_{k,n} := \int_0^\infty \left( \frac{n}{k} \left( 1 - F_n\!\left( \hat a(n/k)\, x + \hat b(n/k) \right) \right) - (1 + \hat\gamma x)^{-1/\hat\gamma} \right)^2 x^{\eta-2}\, dx ,$$

we have

$$k\, T_{k,n} - \int_0^\infty \left( W_n(x) + L(x) \right)^2 x^{\eta-2}\, dx \xrightarrow{P} 0$$

for all η > 0 if not γ = ρ = 0, and all η > 1 if γ = ρ = 0. Here

$$L(x) := \begin{cases}
\dfrac{x \log x}{1+\gamma}\, \Gamma(W_n) - x \left( \gamma\, W_n(1) + \Gamma(W_n) - \Lambda(W_n) \right) , & \gamma \ne 0 , \\[1mm]
x \left( -W_n(1) - \Gamma(W_n) \log x + \Lambda(W_n) \log x \right) , & \gamma = 0 ,
\end{cases}$$

with Γ and Λ the limit functionals appearing in the asymptotic expansions of Section 5.1.

Hüsler and Li recommend the value η = 1.


For more information about these tests we refer to Hüsler and Li (2005). In particular, for applications they suggest: (i) to estimate the extreme value index γ with both the maximum likelihood and the moment estimator; (ii) if the extreme value index can be believed to be positive (for example, both estimators of γ are larger than 0.05), then it might be better to use the test statistic kT_{k,n}; otherwise use the test kD_{k,n}.


Table 5.2. Quantiles of the asymptotic test statistic kT_{k,n} with η = 1 and the maximum likelihood estimators.

  γ        P:  0.10   0.30   0.50   0.70   0.90   0.95   0.975  0.99
   4           .086   .123   .161   .212   .322   .393   .462   .558
   3           .085   .120   .156   .205   .307   .372   .440   .532
   2           .083   .116   .150   .195   .286   .344   .402   .489
   1.5         .082   .115   .148   .192   .282   .340   .400   .480
   1           .082   .114   .146   .189   .276   .330   .388   .466
   0.5         .083   .116   .149   .194   .285   .343   .404   .481
   0.25        .085   .119   .153   .120   .295   .355   .415   .499
   0           .089   .126   .163   .213   .319   .388   .455   .542
  −0.1         .091   .129   .168   .221   .330   .400   .471   .569
  −0.2         .093   .133   .174   .231   .350   .425   .500   .604
  −0.3         .096   .139   .183   .242   .369   .449   .531   .653
  −0.4         .100   .145   .192   .256   .393   .484   .576   .690
  −0.45        .103   .150   .199   .320   .416   .511   .605   .735
  −0.499       .107   .157   .210   .338   .439   .546   .652   .799

5.3 Convergence of Moments, Speed of Convergence, and Large Deviations

In this section we shall see that the domain of attraction conditions also imply convergence of the relevant moments of normalized sample maxima. Next we shall prove that the second-order condition allows for a precise uniform speed of convergence result, which in turn will imply a large deviations result.

Let X_1, X_2, ... be independent and identically distributed random variables with distribution function F and assume that F is in the domain of attraction of an extreme value distribution G_γ, γ ∈ R. Let X_{n,n} := max(X_1, X_2, ..., X_n), n = 1, 2, ... .
5.3.1 Convergence of Moments
We have seen in Chapter 1 (Exercise 1.16) that if F ∈ D(G_γ) with γ < 0, then E|X|^a 1_{\{X > x\}} is finite for all a > 0 and all x < U(∞), where U(∞) is the right endpoint of the distribution (recall that U := (1/(1 − F))^←). However, if F ∈ D(G_γ), γ > 0, then

$$E\, |X|^a\, 1_{\{X > x\}} < \infty  \qquad (5.3.1)$$

for 0 < a < 1/γ, but the expectation is infinite for a > 1/γ. This defines the scope of possible convergence of moments in extreme value theory.

Recall that under the domain of attraction condition, for some positive function a,

$$\lim_{t\to\infty} \frac{U(tx) - U(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma}$$


for all x > 0. Then

$$\lim_{n\to\infty} P\!\left( \frac{X_{n,n} - U(n)}{a(n)} \le x \right) = G_\gamma(x)  \qquad (5.3.2)$$

for all x with 1 + γx > 0.


Theorem 5.3.1 Let X be F distributed and assume F ∈ D(G_γ), γ ∈ R. Let k be an integer with 0 < k < 1/γ_+, where γ_+ := max(0, γ). Suppose E|X|^k is finite, which in view of (5.3.1) is implied by E|X|^k 1_{\{X ≤ x\}} finite for some x for which 0 < F(x) < 1. Then

$$\lim_{n\to\infty} E\left( \frac{X_{n,n} - U(n)}{a(n)} \right)^k = \int_{-\infty}^\infty x^k \, dG_\gamma(x) .  \qquad (5.3.3)$$

Proof. Let Z be a random variable with distribution function exp(−1/x), x > 0. By comparing the distribution functions of the left- and right-hand sides of the equation it is easily seen that

$$\frac{X_{n,n} - V(n)}{a(n)} \stackrel{d}{=} \frac{V(nZ) - V(n)}{a(n)}$$

with V := (1/(−log F))^←.

Next note that (5.3.2) is equivalent to

$$\lim_{n\to\infty} n\left( -\log F\left( U(n) + x\, a(n) \right) \right) = (1+\gamma x)^{-1/\gamma}$$

for all x with 1 + γx > 0. Application of Lemma 1.1.1, Theorem 1.1.2, and Lemma 1.2.12 gives

$$\lim_{t\to\infty} \frac{V(tx) - V(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma}$$

for all x > 0. By Theorem B.2.18, for ε, ε' > 0 there exists t_0 such that for t, tx ≥ t_0,

$$\left| \frac{V(tx) - V(t)}{a_0(t)} - \frac{x^\gamma - 1}{\gamma} \right| \le \varepsilon\, x^\gamma \max\left( x^{\varepsilon'}, x^{-\varepsilon'} \right)  \qquad (5.3.4)$$

for some a_0 satisfying a_0(t) ~ a(t), t → ∞. We write for n > t_0,

$$E\left( \frac{V(nZ) - V(n)}{a_0(n)} \right)^k
= E\left[ \left( \frac{V(nZ) - V(n)}{a_0(n)} \right)^k 1_{\{nZ \ge t_0\}} \right]
+ E\left[ \left( \frac{V(nZ) - V(n)}{a_0(n)} \right)^k 1_{\{nZ < t_0\}} \right]
=: I + II .$$

For I we use the inequalities (5.3.4). The upper bound is (note that |a+b|^k ≤ 2^k(|a|^k + |b|^k))

$$2^k \int_0^\infty \left| \frac{x^\gamma - 1}{\gamma} \right|^k d\left( e^{-1/x} \right)
+ (2\varepsilon)^k \int_0^\infty x^{k\gamma}\, e^{k\varepsilon'|\log x|}\, d\left( e^{-1/x} \right) ,$$

which is finite. Hence by Lebesgue's theorem on dominated convergence the limit of part I is

$$\int_0^\infty \left( \frac{x^\gamma - 1}{\gamma} \right)^k d\left( e^{-1/x} \right) = \int_{-\infty}^\infty x^k \, dG_\gamma(x) .$$

For II we proceed as follows. It is bounded by

$$\int_0^{t_0/n} \left| \frac{V(nx) - V(n)}{a_0(n)} \right|^k d\left( e^{-1/x} \right)
\le \frac{2^k \left( |V(t_0)|^k + |V(n)|^k \right)}{|a_0(n)|^k}\, e^{-n/t_0} .$$

Since the sequences V(n) and a(n) are of polynomial growth and since the factor e^{−n/t_0} tends to zero exponentially fast, part II tends to zero.

Since (−log F(x))/(1 − F(x)) → 1, as x ↑ U(∞), we get (U(n) − V(n))/a_0(n) → 0, n → ∞ (Lemma 1.2.12, Chapter 1). Hence E((X_{n,n} − U(n))/a_0(n))^k = E((X_{n,n} − V(n))/a_0(n) + ε_n)^k with ε_n → 0, n → ∞. By going through the proof again with this modification, it is easy to see that also E((X_{n,n} − V(n))/a_0(n) + ε_n)^k → ∫ x^k dG_γ(x) for any ε_n → 0. It follows that (5.3.3) holds with U(n) replaced by V(n). Finally, note that changing from a_0 to a does not affect the result. This finishes the proof.

For γ ≠ 0 we have somewhat simpler results. The proof is very similar to the proof of Theorem 5.3.1 and is omitted.

Theorem 5.3.2 Suppose that the conditions of Theorem 5.3.1 hold.

1. If γ > 0,

$$\lim_{n\to\infty} E\left( \frac{X_{n,n}}{U(n)} \right)^k = \Gamma(1 - k\gamma) .$$

2. If γ < 0 (note that U(∞) := lim_{x→∞} U(x) is finite in this case),

$$\lim_{n\to\infty} E\left( \frac{X_{n,n} - U(\infty)}{U(\infty) - U(n)} \right)^k = (-1)^k\, \Gamma(1 - k\gamma) .$$
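For the standard exponential distribution (γ = 0, a(n) = 1, U(n) = log n) the limit in Theorem 5.3.1 can be checked by simulation; the first two moments of the Gumbel law G_0 are Euler's constant and γ_E² + π²/6. The maximum of n exponentials is sampled exactly by inverting F^n, a sketch, not from the book:

```python
import numpy as np

rng = np.random.default_rng(42)
n, reps = 10_000, 1_000_000

# Exact sampling of X_{n,n} by inversion: F^n(x) = (1 - e^{-x})^n.
u = rng.random(reps)
max_n = -np.log1p(-u ** (1.0 / n))
z = max_n - np.log(n)                  # normalized maxima, approx. Gumbel

euler = 0.5772156649015329             # integral of x dG_0(x)
target2 = euler ** 2 + np.pi ** 2 / 6  # integral of x^2 dG_0(x)
m1 = float(z.mean())
m2 = float((z ** 2).mean())
```

The empirical moments m1 and m2 agree with the Gumbel moments up to Monte Carlo error of a few thousandths.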


5.3.2 Speed of Convergence; Large Deviations

Let X_1, X_2, ... be a sequence of independent and identically F distributed random variables. The basic convergence in extreme value theory is that if the extreme value condition holds, then for some sequence of constants a_n > 0 and b_n real, n = 1, 2, ..., and some γ ∈ R,

$$\lim_{n\to\infty} P\left( \frac{\max(X_1, X_2, \ldots, X_n) - b_n}{a_n} \le x \right) = \exp\left( -(1+\gamma x)^{-1/\gamma} \right) =: G_\gamma(x)$$

for all x with 1 + γx > 0. Then, by the continuity of the limit distribution function,

$$\lim_{n\to\infty} \sup_{x\in\mathbb{R}} \left| F^n(a_n x + b_n) - G_\gamma(x) \right| = 0 .  \qquad (5.3.5)$$

The speed of convergence in (5.3.5) is not the same for all distributions in some domain of attraction. For example, the convergence rate for the exponential distribution is of order n^{−1} (Hall and Wellner (1979)), but for the normal distribution it is of order (log n)^{−1} (de Haan and Resnick (1996)). Rootzén (1984) proves that if the convergence rate is faster than exponential, the initial distribution must be an extreme value distribution. In fact, the convergence rate depends on the second-order behavior. This came out for the first time in a paper by Smith (1982). As we shall see, the second-order condition is sufficient for a uniformly weighted version of a second-order expansion for F^n.
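These rates are easy to observe numerically. The sketch below computes d_n = sup_x |F^n(a_n x + b_n) − G_0(x)| on a grid, with b_n = log n, a_n = 1 for the exponential distribution, and b_n = Φ^{−1}(1 − 1/n), a_n = 1/(nφ(b_n)) for the normal; these are standard choices, used here as illustration assumptions. The exponential distance shrinks roughly like n^{−1}, the normal one only like (log n)^{−1}.

```python
import math

def gumbel(x):
    return math.exp(-math.exp(-x))

def phi(x):        # standard normal density
    return math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def Phi(x):        # standard normal distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def Phi_inv(p):    # quantile by bisection
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sup_dist(F, a, b, n, xs):
    return max(abs(F(a * x + b) ** n - gumbel(x)) for x in xs)

xs = [-3.0 + 13.0 * i / 4000 for i in range(4001)]

def d_exp(n):
    return sup_dist(lambda t: 1.0 - math.exp(-t) if t > 0 else 0.0,
                    1.0, math.log(n), n, xs)

def d_norm(n):
    b = Phi_inv(1.0 - 1.0 / n)
    return sup_dist(Phi, 1.0 / (n * phi(b)), b, n, xs)

rate_exp = d_exp(100) / d_exp(1000)       # near 10: order 1/n
rate_norm = d_norm(100) / d_norm(10_000)  # near 2: order 1/log n
```

Multiplying n by 10 divides the exponential distance by about 10, but barely dents the normal one.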
We are going to assume that the function V := (1/(−log F))^←, i.e., V(t) = F^←(e^{−1/t}), satisfies the second-order condition of Section 2.3. Consequently, for some γ ∈ R, ρ ≤ 0, and all ε, δ > 0 there exists t_0 = t_0(ε, δ) > 0 such that for all t, tx ≥ t_0,

$$\left| \frac{\frac{V(tx) - V(t)}{a_0(t)} - \frac{x^\gamma - 1}{\gamma}}{A_0(t)} - \Psi_{\gamma,\rho}(x) \right| \le \varepsilon\, x^{\gamma+\rho}\, e^{\delta|\log x|} ,  \qquad (5.3.6)$$

with

$$\Psi_{\gamma,\rho}(x) = \begin{cases}
\dfrac{x^{\gamma+\rho} - 1}{\gamma+\rho} , & \rho < 0 , \\[1mm]
\dfrac{1}{\gamma}\, x^\gamma \log x , & \rho = 0 \ne \gamma , \\[1mm]
\dfrac{1}{2} \left( \log x \right)^2 , & \rho = \gamma = 0 ,
\end{cases}  \qquad (5.3.7)$$

and where the functions a_0 and A_0 are from Theorem B.3.10.

In order to get a uniform rate of convergence in (5.3.5) we have to choose the normalizing constants a_n and b_n in a special way. For the first take a_0(n), with the function a_0 from (5.3.6), and for the second,

$$b_0(n) = \begin{cases}
V(n) , & \gamma \ge 0 , \\
V(\infty) + \gamma^{-1} a_0(n) , & \gamma < 0 = \rho , \\
V(\infty) + \gamma^{-1} a_0(n) + (\gamma+\rho)^{-1} a_0(n) A_0(n) , & \gamma < 0 ,\ \rho < 0 ,
\end{cases}$$

with A_0(n) again from (5.3.6). Further, define the function Ψ̃_{γ,ρ} by

$$\tilde\Psi_{\gamma,\rho}(x) := \begin{cases}
\Psi_{\gamma,\rho}(x) + (\gamma+\rho)^{-1} , & \rho < 0 \ne \gamma + \rho , \\
\Psi_{\gamma,\rho}(x) , & \text{otherwise} .
\end{cases}$$


Theorem 5.3.3 Suppose the function V := (1/(−log F))^← satisfies the second-order condition, so that (5.3.6) holds. Then

$$\lim_{n\to\infty} \sup_{x\in\mathbb{R}} \left| \frac{F^n(a_0(n)x + b_0(n)) - G_\gamma(x)}{A_0(n)} + J\!\left( w_\gamma(x) \right) \right| = 0 ,  \qquad (5.3.8)$$

where for x ∈ R,

$$w_\gamma(x) = \left( (1 + \gamma x) \vee 0 \right)^{1/\gamma} ,$$

and for x > 0,

$$J(x) = x^{-(1+\gamma)}\, e^{-1/x}\, \tilde\Psi_{\gamma,\rho}(x) ,$$

and J(0) and J(∞) are defined by continuity, i.e., J(0) = J(∞) = 0.
Remark 5.3.4 The second-order condition for this theorem is imposed on the function V := (1/(−log F))^←, whereas in Theorem 5.1.1, for example, it is imposed on the function U := (1/(1 − F))^←. The relation between these two conditions is discussed in Drees, de Haan, and Li (2003).

For the proof we need some lemmas.
Lemma 5.3.5 Assume the conditions of Theorem 5.3.3. For any ε, δ > 0, there exists n_0 > 0 such that

$$\left| \frac{p_{n,\gamma}(x)}{A_0(n)} - \tilde\Psi_{\gamma,\rho}(x) \right| \le \varepsilon \max\left( x^{\gamma+\rho+\delta}, x^{\gamma+\rho-\delta} \right)  \qquad (5.3.9)$$

for all n, nx ≥ n_0, with

$$p_{n,\gamma}(x) := \frac{V(nx) - b_0(n)}{a_0(n)} - \frac{x^\gamma - 1}{\gamma}$$

for all x > 0.

Proof. It follows from (5.3.6) that for any ε, δ > 0, there exists n_0 = n_0(ε, δ) > 0 such that for all n, nx ≥ n_0,

$$\left| \frac{\frac{V(nx) - V(n)}{a_0(n)} - \frac{x^\gamma - 1}{\gamma}}{A_0(n)} - \Psi_{\gamma,\rho}(x) \right| \le \varepsilon \max\left( x^{\gamma+\rho+\delta}, x^{\gamma+\rho-\delta} \right) .$$

This gives (5.3.9) for γ ≥ 0. For γ < 0, it is easily checked from the definitions of a_0(n), A_0(n), and b_0(n) that

$$\frac{V(n) - b_0(n)}{a_0(n)\, A_0(n)} = \begin{cases} 0 , & \rho = 0 , \\ -(\gamma+\rho)^{-1} , & \rho < 0 , \end{cases}$$

holds for all n, so that (5.3.9) follows also.


Lemma 5.3.6 Under the conditions of Theorem 5.3.3,

$$\lim_{n\to\infty} \sup_{\alpha_n \le x \le \beta_n} x^{-(1+\gamma)}\, e^{-1/x} \left| \frac{p_{n,\gamma}(x)}{A_0(n)} - \tilde\Psi_{\gamma,\rho}(x) \right| = 0 ,  \qquad (5.3.10)$$

$$\lim_{n\to\infty} \sup_{\alpha_n \le x \le \beta_n} \frac{p^2_{n,\gamma}(x)}{A_0(n)\, x^{2\gamma}} = 0 ,  \qquad (5.3.11)$$

where α_n = (−log(A_0(n))²)^{−1} and β_n = (A_0(n))^{−2}.

Proof. Note that |A_0| ∈ RV_ρ and therefore A_0² ∈ RV_{2ρ} with ρ ≤ 0. Hence (Proposition B.1.9(5)) there exist a constant C > 0 and an integer n_0 > 0 such that A_0²(n) ≥ C n^{2ρ−1} for all n ≥ n_0. Hence n α_n ≥ n((1 − 2ρ) log n − log C)^{−1} → ∞, as n → ∞. By Lemma 5.3.5, the latter implies that (5.3.9) holds for all x ∈ [α_n, β_n] and n ≥ n_0. Thus we have for δ ∈ (0, 1),

$$\sup_{\alpha_n \le x \le \beta_n} x^{-(1+\gamma)} e^{-1/x} \left| \frac{p_{n,\gamma}(x)}{A_0(n)} - \tilde\Psi_{\gamma,\rho}(x) \right|
\le \varepsilon \sup_{x>0} x^{-(1+\gamma)} e^{-1/x} \max\left( x^{\gamma+\rho+\delta}, x^{\gamma+\rho-\delta} \right)
= \varepsilon \sup_{x>0} e^{-1/x} \max\left( x^{-1+\rho+\delta}, x^{-1+\rho-\delta} \right) ,$$

so that (5.3.10) holds by noting that sup_{x>0} e^{−1/x} max(x^{−1+ρ+δ}, x^{−1+ρ−δ}) < ∞.

Choosing δ ∈ (0, 1/4), we have from (5.3.9)

$$I_{n,1} := A_0(n) \sup_{\alpha_n \le x \le \beta_n} x^{-2\gamma} \left| \frac{p_{n,\gamma}(x)}{A_0(n)} - \tilde\Psi_{\gamma,\rho}(x) \right|^2
\le \varepsilon^2 A_0(n) \sup_{\alpha_n \le x \le \beta_n} \max\left( x^{2(\rho+\delta)}, x^{2(\rho-\delta)} \right)$$
$$\le \varepsilon^2 A_0(n) \max\left( (A_0(n))^{-4\delta}, \left( \left( -\log (A_0(n))^2 \right)^{-1} \right)^{2(\rho-\delta)} \right) \to 0 .$$

On the other hand, it is easily seen that

$$I_{n,2} := A_0(n) \sup_{\alpha_n \le x \le \beta_n} x^{-2\gamma}\, \tilde\Psi^2_{\gamma,\rho}(x) \to 0 , \qquad n \to \infty .$$

Hence (5.3.11) follows from the inequality

$$\sup_{\alpha_n \le x \le \beta_n} \frac{p^2_{n,\gamma}(x)}{A_0(n)\, x^{2\gamma}} \le 2\left( I_{n,1} + I_{n,2} \right) .$$

Reading γ^{−1} log(1 + γx) = x for all x ∈ R when γ = 0, we denote

$$J_n(x) = G_0\!\left( \gamma^{-1} \log\left( 1 + \gamma\, \frac{V(nx) - b_0(n)}{a_0(n)} \right) \right) - G_0(\log x) , \qquad x > 0 .$$

Moreover, for any function f on (a, b) with −∞ ≤ a < b ≤ ∞, define f(a) := lim_{t↓a} f(t) and f(b) := lim_{t↑b} f(t) if the limits exist, e.g., J(0) = J(∞) = 0.

Lemma 5.3.7 Under the conditions of Theorem 5.3.3,

$$\lim_{n\to\infty} \sup_{0 < x < \infty} \left| \frac{J_n(x)}{A_0(n)} - J(x) \right| = 0 .  \qquad (5.3.12)$$

Proof. We shall prove (5.3.12) only for the case that A_0 is positive near infinity, because the proof for the other case is similar.

Since for every positive integer n and x > 0, there exists θ = θ(n, x) ∈ [0, 1] such that

$$J_n(x) = G_0\left( \log x + q_n(x) \right) - G_0(\log x) = q_n(x)\, G_0'\left( \log x + \theta q_n(x) \right)$$

with q_n(x) = γ^{−1} log{1 + γ(V(nx) − b_0(n))/a_0(n)} − log x, we have

$$\left| (A_0(n))^{-1} J_n(x) - J(x) \right|
\le (A_0(n))^{-1} \left| q_n(x) - x^{-\gamma} p_{n,\gamma}(x) \right| G_0'\left( \log x + \theta q_n(x) \right)$$
$$+ (A_0(n))^{-1} \left| x^{-\gamma} p_{n,\gamma}(x) \right| \left| G_0'\left( \log x + \theta q_n(x) \right) - G_0'(\log x) \right|$$
$$+ x^{-\gamma} G_0'(\log x) \left| (A_0(n))^{-1} p_{n,\gamma}(x) - \tilde\Psi_{\gamma,\rho}(x) \right|
=: J_{n,1}(x) + J_{n,2}(x) + J_{n,3}(x) .$$

Note that for some θ_0 ∈ [0, 1],

$$q_n(x) = \gamma^{-1} \log\left\{ 1 + \gamma (a_0(n))^{-1} (V(nx) - b_0(n)) \right\} - \gamma^{-1} \log\left\{ 1 + \gamma\, \gamma^{-1}(x^\gamma - 1) \right\}
= \frac{x^{-\gamma} p_{n,\gamma}(x)}{1 + \theta_0 \gamma x^{-\gamma} p_{n,\gamma}(x)} .$$

Letting M = max(sup_{x>0} G_0'(log x), sup_{x>0} G_0''(log x)), we have from (5.3.11) that

$$\sup_{\alpha_n \le x \le \beta_n} J_{n,1}(x)
\le M \sup_{\alpha_n \le x \le \beta_n} (A_0(n))^{-1} \left| q_n(x) - x^{-\gamma} p_{n,\gamma}(x) \right|
= M |\gamma| \sup_{\alpha_n \le x \le \beta_n} \frac{p^2_{n,\gamma}(x)}{x^{2\gamma} A_0(n) \left| 1 + \theta_0 \gamma x^{-\gamma} p_{n,\gamma}(x) \right|} \to 0 ,$$

and that

$$\sup_{\alpha_n \le x \le \beta_n} J_{n,2}(x)
\le M \sup_{\alpha_n \le x \le \beta_n} (A_0(n))^{-1} \left| x^{-\gamma} p_{n,\gamma}(x) \right| \left| \theta q_n(x) \right|
\le M \sup_{\alpha_n \le x \le \beta_n} (A_0(n))^{-1} x^{-2\gamma} p^2_{n,\gamma}(x) \left[ 1 + \theta_0 \gamma x^{-\gamma} p_{n,\gamma}(x) \right]^{-1} \to 0 .$$

On the other hand, (5.3.10) gives lim_{n→∞} sup_{α_n ≤ x ≤ β_n} J_{n,3}(x) = 0. So we obtain

$$\lim_{n\to\infty} \sup_{\alpha_n \le x \le \beta_n} \left| (A_0(n))^{-1} J_n(x) - J(x) \right| = 0 .  \qquad (5.3.13)$$

Further,

$$(A_0(n))^{-1} \sup_{\beta_n < x < \infty} |J_n(x)|
\le (A_0(n))^{-1} \sup_{\beta_n < x < \infty} \left( 1 - G_0\!\left( \gamma^{-1} \log\left\{ 1 + \gamma a_0^{-1}(n)(V(nx) - b_0(n)) \right\} \right) \right)
+ (A_0(n))^{-1} \sup_{\beta_n < x < \infty} \left( 1 - G_0(\log x) \right)$$
$$\le (A_0(n))^{-1} \left\{ G_0\!\left( \gamma^{-1} \log\left\{ 1 + \gamma a_0^{-1}(n)(V(n\beta_n) - b_0(n)) \right\} \right) - G_0(\log \beta_n) \right\}
+ 2 (A_0(n))^{-1} \left( 1 - G_0(\log \beta_n) \right) \to 0 , \qquad n \to \infty .$$

Noting that J(∞) = 0, we have lim_{n→∞} sup_{β_n < x < ∞} |(A_0(n))^{−1} J_n(x) − J(x)| = 0. Similarly, it may be shown that lim_{n→∞} sup_{0 < x ≤ α_n} |(A_0(n))^{−1} J_n(x) − J(x)| = 0, completing the proof of the lemma.

Proof (of Theorem 5.3.3). Let x_n(u) := {−n log F(a_0(n)u + b_0(n))}^{−1}, for all u ∈ R. We have

$$\Psi_n(u) := \frac{P\left( X_{n,n} \le a_0(n)u + b_0(n) \right) - G_\gamma(u)}{A_0(n)} + J(\omega_\gamma(u))
= \frac{F^n(a_0(n)u + b_0(n)) - G_\gamma(u)}{A_0(n)} + J(\omega_\gamma(u))$$

$$= \left\{ \frac{G_0(\log x_n(u)) - G_0(\log \omega_\gamma(u))}{A_0(n)} + J(x_n(u)) \right\}
+ \left\{ J(\omega_\gamma(u)) - J(x_n(u)) \right\}
=: K_{n,1}(u) + K_{n,2}(u) .$$

In order to establish (5.3.8) we need only to prove

$$\lim_{n\to\infty} \sup_{0 < F(a_0(n)u + b_0(n)) < 1} |K_{n,i}(u)| = 0 , \qquad i = 1, 2,  \qquad (5.3.14)$$

$$\lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 0} |\Psi_n(u)| = 0 ,  \qquad (5.3.15)$$

and

$$\lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 1} |\Psi_n(u)| = 0 .  \qquad (5.3.16)$$

It follows from the definition of V that if 0 < F(a_0(n)u + b_0(n)) < 1, then

$$\frac{V(n x_n(u)) - b_0(n)}{a_0(n)} \le u \le \frac{V^+(n x_n(u)) - b_0(n)}{a_0(n)} , \qquad n = 1, 2, \ldots$$

(recall that V^+ is the right-continuous version of V). Therefore for u such that 0 < F(a_0(n)u + b_0(n)) < 1, we have

$$\frac{G_0(\log x_n(u)) - G_0\!\left( \gamma^{-1} \log\left\{ 1 + \gamma \frac{V^+(n x_n(u)) - b_0(n)}{a_0(n)} \right\} \right)}{A_0(n)} + J(x_n(u))
\le K_{n,1}(u)$$
$$\le \frac{G_0(\log x_n(u)) - G_0\!\left( \gamma^{-1} \log\left\{ 1 + \gamma \frac{V(n x_n(u)) - b_0(n)}{a_0(n)} \right\} \right)}{A_0(n)} + J(x_n(u)) .$$

Combining this with (5.3.12), we obtain (5.3.14) for i = 1. Since J(·) is continuous on (0, ∞) and J(0) = J(∞) = 0, it is easily seen that (5.3.14) for i = 2 is also true.

Since F(a_0(n)u + b_0(n)) = 0 implies u ≤ (V(0) − b_0(n))/a_0(n), using (5.3.12) once again, we see that the first term of Ψ_n(u) tends to zero uniformly for such u. Note that (5.3.6) implies for x > 0,

$$\lim_{t\to\infty} \frac{V(tx) - V(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} .$$

For any ε > 0, there exists n_0 such that (V(0) − b_0(n))/a_0(n) ≤ ω_γ^←(ε) for all n ≥ n_0, and therefore

$$\lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 0} |J(\omega_\gamma(u))|
\le \sup_{u \le \omega_\gamma^{\leftarrow}(\varepsilon)} |J(\omega_\gamma(u))| = \sup_{0 < x \le \varepsilon} |J(x)| .$$

This then gives

$$\lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 0} |\Psi_n(u)|
\le \lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 0} \frac{G_\gamma(u)}{A_0(n)}
+ \lim_{n\to\infty} \sup_{F(a_0(n)u + b_0(n)) = 0} |J(\omega_\gamma(u))|
\le \sup_{0 < x \le \varepsilon} |J(x)| .$$

Hence (5.3.15) is obtained by letting ε → 0. Similarly we may prove (5.3.16), completing the proof of the theorem.


Remark 5.3.8 The uniform limit (5.3.8) gives an Edgeworth expansion as follows:

$$P\left( X_{n,n} \le a_0(n)u + b_0(n) \right)
= G_\gamma(u) - A_0(n)\, G_\gamma(u) \left( -\log G_\gamma(u) \right)^{1+\gamma} \tilde\Psi_{\gamma,\rho}\!\left( \left( -\log G_\gamma(u) \right)^{-1} \right) + o\left( A_0(n) \right)$$

holds uniformly on R.
Remark 5.3.9 The uniform limit (5.3.8) also gives a rate of convergence, that is,

$$\limsup_{n\to\infty}\, \sup_u \frac{\left| P\left( X_{n,n} \le a_0(n)u + b_0(n) \right) - G_\gamma(u) \right|}{|A_0(n)|}
= \sup_{x>0} x^{-(\gamma+1)}\, e^{-1/x} \left| \tilde\Psi_{\gamma,\rho}(x) \right| .$$

The surprising part is that under a weak extra condition the converse of Theorem 5.3.3 holds.

Theorem 5.3.10 (Cheng and Jiang (2001)) If there exist sequences a_n > 0, b_n real, and A_n > 0 satisfying

$$\lim_{n\to\infty} A_n = 0 \qquad \text{and} \qquad \lim_{n\to\infty} \frac{A_{n+1}}{A_n} = 1$$

and a function K such that

$$\lim_{n\to\infty} \frac{P\left( X_{n,n} \le a_n x + b_n \right) - G_\gamma(x)}{A_n} = K(x)  \qquad (5.3.17)$$

holds locally uniformly on R, and K is not a multiple of x (ω_γ(x))^{−γ−1} G_γ(x), then V satisfies the second-order condition for some ρ < 0.

For the proof we need the following lemma.

Lemma 5.3.11 Suppose that f is a positive measurable function. If there exist sequences a_n > 0, b_n real, and A_n > 0 satisfying

$$\lim_{n\to\infty} A_n = 0 \qquad \text{and} \qquad \lim_{n\to\infty} \frac{A_{n+1}}{A_n} = 1  \qquad (5.3.18)$$

and a function K such that

$$\lim_{n\to\infty} \frac{\frac{f(nx) - b_n}{a_n} - \frac{x^\gamma - 1}{\gamma}}{A_n} = K(x)  \qquad (5.3.19)$$

holds locally uniformly, with K not a multiple of (x^γ − 1)/γ, then f satisfies the second-order condition.
Proof. Let [t] be the integer part of t ∈ R and, for γ ≠ 0, set

$$\alpha(t) = \frac{1 - \left\{ t/([t]+1) \right\}^\gamma}{(t/[t])^\gamma - \left\{ t/([t]+1) \right\}^\gamma} , \qquad
\beta(t) = \frac{(t/[t])^\gamma - 1}{(t/[t])^\gamma - \left\{ t/([t]+1) \right\}^\gamma} ,$$

and, for γ = 0,

$$\alpha(t) = \frac{\log([t]+1) - \log t}{\log([t]+1) - \log[t]} , \qquad
\beta(t) = \frac{\log t - \log[t]}{\log([t]+1) - \log[t]} .$$

Further set

$$a(t) = \alpha(t)\, a_{[t]} + \beta(t)\, a_{[t]+1} , \qquad
b(t) = \alpha(t)\, b_{[t]} + \beta(t)\, b_{[t]+1} ,$$

and A(t) = A_{[t]}. Note that α(t) + β(t) = 1 and α(t)(t/[t])^γ + β(t){t/([t]+1)}^γ = 1, so that the deterministic parts interpolate exactly. It then follows from (5.3.18) and the locally uniform convergence in (5.3.19), applied at the integers [t] and [t]+1, that

$$\lim_{t\to\infty} \frac{\frac{f(tx) - b(t)}{a(t)} - \frac{x^\gamma - 1}{\gamma}}{A(t)} = K(x)$$

locally uniformly. This then gives

$$\lim_{t\to\infty} \frac{\frac{f(tx) - f(t)}{a(t)} - \frac{x^\gamma - 1}{\gamma}}{A(t)} = K(x) - K(1) ,$$

completing the proof of the lemma.

Proof (of Theorem 5.3.10). If (5.3.17) holds locally uniformly on R, then

$$\frac{n \log F(a_n x + b_n) - \log G_\gamma(x)}{A_n}
= \frac{F^n(a_n x + b_n) - G_\gamma(x)}{A_n} \cdot \frac{n \log F(a_n x + b_n) - \log G_\gamma(x)}{F^n(a_n x + b_n) - G_\gamma(x)}
\to \frac{K(x)}{G_\gamma(x)}$$

holds locally uniformly on R, and therefore

$$\frac{1}{A_n} \left[ \frac{1}{-n \log F(a_n x + b_n)} - \frac{1}{-\log G_\gamma(x)} \right]
\to \frac{\omega_\gamma^2(x)\, K(x)}{G_\gamma(x)}$$

locally uniformly on R. By Vervaat's lemma (Appendix A), the above gives

$$\lim_{n\to\infty} \frac{\frac{V(nx) - b_n}{a_n} - \frac{x^\gamma - 1}{\gamma}}{A_n}
= -x^{\gamma+1}\, e^{1/x}\, K\!\left( \frac{x^\gamma - 1}{\gamma} \right) .$$

This implies that V satisfies the second-order condition by Lemma 5.3.11, completing the proof of the theorem.

Theorem 5.3.3 gives a uniform convergence rate. In fact there is a somewhat sharper result: a weighted approximation of the left-hand side of (5.3.8) by the right-hand side, restricted to the right tail. We are not going to prove this result (cf. Drees, de Haan, and Li (2006)), which is a consequence of the weighted approximation of Theorem 5.1.1. We only prove the following large deviations result.

Theorem 5.3.12 (Large deviations) Suppose that the second-order relation for U := (1/(1 − F))^← holds (cf. Section 2.3) with ρ < 0. Then

$$\lim_{n\to\infty} \frac{1 - F^n(a_n x_n + b_n)}{1 - G_\gamma(x_n)} = 1$$

for any sequence x_n ↑ 1/((−γ) ∨ 0).


Proof. Note that l/((y) v 0) is the right endpoint of the distribution function Gy.
Statement (5.1.4) of Theorem 5.1.1 implies
x/y
({\ + y xyx{t))^(l
(0ri/ryy

lim
*(ti/((-y)v0)

IV

(l + yx(t)yl'y

-TT^)^((l +

yx{t))l/y =

)} -

Note that ((1 + } / J C ( 0 ) ~ 1 / K ) P + -> oo if e < -p and^ K , / 0 ((l + yjc(0) 1 / K )/(l +


yx{t)) -> 0, as t -> oo. Hence
r

lim

t{l-F(bo(t)+x(t)ao(t))}

*->00

-lOg

Gy(x(t))

=1 .

Note that for xn t l / ( ( - y ) v 0),


1 - Fn (ao(n)xn + bo(n)) ~ -n log F (ao(n)xn + bo(n))
~n{\F (a0(n)xn 4- b0(n))} .
Finally, it is clear that we can replace ao(n), bo(n) by any set of normalizing constants
an,bn.
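For the standard exponential distribution the large deviations statement can be verified directly by computation; the sketch below uses a_n = 1, b_n = log n, and the particular sequence x_n = log n (any x_n → ∞ would do).

```python
import math

def ratio(n):
    """(1 - F^n(x_n + log n)) / (1 - G_0(x_n)) for F standard exponential
    and x_n = log n, computed stably with log1p/expm1."""
    x = math.log(n)
    tail_fn = -math.expm1(n * math.log1p(-math.exp(-x) / n))   # 1 - F^n
    tail_g = -math.expm1(-math.exp(-x))                        # 1 - G_0(x_n)
    return tail_fn / tail_g

errs = [abs(ratio(10 ** j) - 1.0) for j in (2, 4, 6)]
```

The relative error shrinks rapidly (roughly like n^{−2} in this particular case), illustrating that the two tails are asymptotically equivalent even far out in the range.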


5.4 Weak and Strong Laws of Large Numbers and Law of the Iterated Logarithm

Let X_1, X_2, X_3, ... be independent and identically distributed random variables with distribution function F. Define X_{n,n} := max(X_1, X_2, ..., X_n) for n = 1, 2, ... . We are going to discuss analogues, for partial maxima, of the weak and strong laws of large numbers and the law of the iterated logarithm for partial sums.

Whereas in the partial sum case existence of moments is the right thing to consider, for partial maxima conditions of regular variation type turn out to play an important role. We start with a weak law of large numbers.
Theorem 5.4.1 Suppose F(x) < 1 for all real x. The following assertions are equivalent:

1. There exists a sequence a_n of positive numbers such that

$$\frac{X_{n,n}}{a_n} \xrightarrow{P} 1 .  \qquad (5.4.1)$$

2. With b_n := U(n) = (1/(1 − F))^←(n),

$$\frac{X_{n,n}}{b_n} \xrightarrow{P} 1 .  \qquad (5.4.2)$$

3. For all x > 1,

$$\lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = 0 .  \qquad (5.4.3)$$

4. ∫_0^∞ s dF(s) is finite and

$$\lim_{t\to\infty} \frac{\int_t^\infty s\, dF(s)}{t\left( 1 - F(t) \right)} = 1 .  \qquad (5.4.4)$$

Proof. Relation (5.4.1) can be expressed as

$$\lim_{n\to\infty} F^n(a_n x) = \begin{cases} 1 , & x > 1 , \\ 0 , & 0 < x < 1 . \end{cases}$$

We first show that this is equivalent to

$$\lim_{n\to\infty} n\left( 1 - F(a_n x) \right) = \begin{cases} 0 , & x > 1 , \\ \infty , & 0 < x < 1 . \end{cases}  \qquad (5.4.5)$$

We use the following inequalities: for 0 < t < 1,

$$t \le -\log(1-t) \le \frac{t}{1-t} .  \qquad (5.4.6)$$


Let first x > 1. Then

$$0 \le n\left( 1 - F(a_n x) \right) \le -n \log F(a_n x) \to 0 , \qquad n \to \infty .$$

Next let 0 < x < 1. Then

$$n\left( 1 - F(a_n x) \right) \ge F(a_n x) \left( -n \log F(a_n x) \right) \to \infty , \qquad n \to \infty .$$

Next we prove that (5.4.1) implies (5.4.2). Let U be the (generalized) inverse function of 1/(1 − F). By inversion, (5.4.5) is equivalent to

$$\lim_{n\to\infty} \frac{U(nx)}{a_n} = 1 \qquad \text{for } x > 0 .$$

It follows that a_n ~ U(n), n → ∞.

We continue with the proof of the equivalence between (5.4.2) and (5.4.3). Assume (5.4.2). For some x > 1 take 0 < b < 1 < d < ∞ such that x = d/b. Let n = n(t) := min{m : b_{m+1} > t}. Then

$$0 \le \lim_{t\to\infty} \frac{1 - F(tx)}{1 - F(t)} = \lim_{t\to\infty} \frac{1 - F(td)}{1 - F(tb)}
\le \lim_{n\to\infty} \frac{n+1}{n} \cdot \frac{n\left( 1 - F(b_n d) \right)}{(n+1)\left( 1 - F(b_{n+1} b) \right)} = 0 .$$

For 0 < x < 1 one proceeds in a similar way.

To go from (5.4.3) to (5.4.2), recall the inequalities

$$1 - F^+(U(t)) \le t^{-1} \le 1 - F^-(U(t))  \qquad (5.4.7)$$

with U the inverse function of 1/(1 − F). For x > 1, by (5.4.6) and (5.4.7),

$$\limsup_{n\to\infty} \left( -\log F^n(b_n x) \right) = \limsup_{n\to\infty} \left( -n \log F(b_n x) \right)
\le \limsup_{n\to\infty} \frac{n\left( 1 - F(b_n x) \right)}{F(b_n x)}
\le 2 \limsup_{n\to\infty} \frac{1 - F(b_n x)}{1 - F^-(b_n)} = 0 .$$

For 0 < x < 1 one proceeds in a similar way.
Next we prove the equivalence of (5.4.3) and (5.4.4). First assume (5.4.3). For λ > 1 there exists x_0(λ) such that for x ≥ x_0(λ),

$$1 - F(\lambda x) \le 2^{-1}\left( 1 - F(x) \right) ;$$

hence by repeated application,

$$1 - F(\lambda^n x) \le 2^{-n}\left( 1 - F(x) \right) ,$$

and

$$\int_x^\infty \left( 1 - F(t) \right) dt = \sum_{n=0}^\infty \int_{x\lambda^n}^{x\lambda^{n+1}} \left( 1 - F(t) \right) dt
= \sum_{n=0}^\infty x\lambda^n \int_1^\lambda \left( 1 - F(t x \lambda^n) \right) dt$$
$$\le \sum_{n=0}^\infty x \lambda^n (\lambda - 1) \left( 1 - F(x\lambda^n) \right)
\le x \left( 1 - F(x) \right) (\lambda - 1) \sum_{n=0}^\infty \left( \frac{\lambda}{2} \right)^n .$$

By letting λ approach 1 we see that

$$\limsup_{x\to\infty} \frac{\int_x^\infty \left( 1 - F(t) \right) dt}{x\left( 1 - F(x) \right)} = 0 ,  \qquad (5.4.8)$$

which gives (5.4.4) by partial integration.

Finally we prove that (5.4.4) implies (5.4.3) by contradiction: suppose for some x_0 > 1 and a sequence t_n → ∞,

$$\liminf_{n\to\infty} \frac{1 - F(t_n x_0)}{1 - F(t_n)} \ge c > 0 .$$

Then by Fatou's lemma,

$$\liminf_{n\to\infty} \int_1^{x_0} \frac{1 - F(t_n s)}{1 - F(t_n)}\, ds
\ge \int_1^{x_0} \liminf_{n\to\infty} \frac{1 - F(t_n s)}{1 - F(t_n)}\, ds \ge c\left( x_0 - 1 \right) > 0 ,$$

which contradicts (5.4.8) and hence also (5.4.4).


Corollary 5.4.2 If F is in the domain of attraction of the Gumbel distribution G_0 and F(x) < 1 for all x, the result of Theorem 5.4.1 holds.

Proof. By Theorem 1.2.5,

$$\lim_{t\to\infty} \frac{1 - F(t + x f(t))}{1 - F(t)} = e^{-x}$$

for x ∈ R, where the function f is such that

$$\lim_{t\to\infty} \frac{f(t)}{t} = 0 .$$

Hence (5.4.3) is applicable.

Remark 5.4.3 Clearly if F is in the domain of attraction of G_γ(x) for some γ > 0, relation (5.4.3) does not hold (cf. Theorem 1.2.1(1)); hence (5.4.1) cannot hold.

5 Weak and Strong Laws and Law of the Iterated Logarithm

191

N o w w e turn to strong (a.s.) laws. The validity of the strong law of large numbers
depends o n the finiteness of a certain integral. For the law of the iterated logarithm
the second-order conditions are basically sufficient.
We provide proofs for all the results except for the necessity of the condition for
the strong law, which is lengthy and complicated. We need the following lemma.
Lemma 5.4.4 Let cn be a sequence of positive constants and bn : = (1/(1 F))^(ri).
Suppose that bn+xcn
is an ultimately nondecreasing sequence for all real x > 1.
1. For each distribution function F we have almost surely
,. . Xn n bn
hm inf
< 0.
n-*oo
cn
2. Let c be a finite constant. We have almost surely
Xn,n-bn
h m sup
n>oo

= c
Cn

if and only if
00

J2(l-F(cnx+bn))

(5.4.9)

n=l
converges for all x > c and diverges for all x < c.
3. If for all-1 < x < 0,
00

] T ( 1 - F(cnx +
n=\

fcn))exp(-n(l

- F(cnx + bn))) < oo ,

(5.4.10)

then almost surely


liminf
n-^oo
Proof

Xn

'"-bn
cn

>o,

(5.4.11)

(1) Note that

P(Xn,n < bn infinitely often) > limsupP(X n , < bn)


n->oo
= limsupF n (fc) > lim (1 - \/n)n
rt-oo
n->oo

= e~l > 0 .

Since {X n , n < bn infinitely often} is a tail event, we have


P(Xn,n/cn 5 bn/cn infinitely often) = P(Xn^n < bn infinitely often) = 1 .
(2) Since cnx + bn is a nondecreasing sequence for all real x > 1, we have
Xn,n > cnx + bn infinitely often if and only if Xn > cnx -f bn infinitely often. Since
the Xn are independent, part (2) is a direct consequence of the Borel-Cantelli lemmas.

192

5 Weak and Strong Laws and Law of the Iterated Logarithm

(3) Since ]C^Li(! F(bn)) = oo, we have almost surely Xn,n > bn infinitely
often. Hence also Xn,n > cnx + bn infinitely often for all x < 0. So to prove (5.4.11)
it is sufficient to show that
P(Xn,n

< cnx + bn and X+i,+i > cn+\x + bn+\ finitely often) = 1 ,

or equivalently (since cnx + bn is a nondecreasing sequence for x > 1),


P(Xn,n

< cnx + bn and X+i > cn+\x + &+i finitely often) = 1 .

By the first Borel-Cantelli lemma this is true if


Y^ p(xn,n
n=l

< cnx + bn and Xn+i > cn+\x + bn+\)


oo
= J2 (l ~ F(cn+ix + 6+i)) F rt (cx + bn)
w=l

(5.4.12)

converges. Now
1 - F(cn+\x

+ &+i) < 1 - F ( c n * + bn)

and
F n ( c * + bn) = exp (log F(cnx + &w)) < exp { - n ( l - F ( c n * + &))} ;
hence the convergence of (5.4.12) is implied by (5.4.10).

Theorem 5.4.5 Let F(x) < 1 for all real x. Equivalent are:
1. For some sequence bn,
Xn n

'
bn

-* 1 A.J.

(5.4.13)

< oo .
1 F(vx)

(5.4.14)

2. For all 0 < v < 1,


/l

Proof. We prove that (5.4.14) implies (5.4.13). First note that (5.4.14) implies
lim

7T, 7 = 0

x->oo 1 F(vx)
for 0 < i; < 1. Hence (5.4.14) implies

5 Weak and Strong Laws and Law of the Iterated Logarithm

193

With U := (1/(1 - F))<" this gives

fOO

{1 - F(vU(t))}exp(-t{l

- F(vU(t))}) dt < oo .

By applying bn <U(t)<

bn+\ for n < t < n + 1 this entails

00

J2(l

- F(vbn+i))exp{-(n

+ 1)(1 - F(vbn))} < oo,

n=l

which is easily seen to lead to


oo

( 1 - F(vbn))exV{-n(l

- F(vbn))} < oo

w=l

for 0 < v < 1. This is the condition of Lemma 5.4.4(3) with cn = bn. Hence
liminfH-^oo Xn,n/bn > 1 a.s. In order to get lim s u p ^ ^ Xnfn/bn = 1 we need to
prove that the sum in (5.4.9) with cn = bn converges for x > 0 and diverges for
x < 0. That is, Y1%L\ * ~ F(vbn) is finite for v > 1 and infinite for v < 1. First note
that YOZLi 1 - F(vbn) is finite if and only if f 1 - F(vU(t)) dt is finite. Clearly
by the definition of U(t) this integral is infinite for v < 1. For v > 1 by partial
integration
Jl

Jl

1 - F(VJ)

which is finite by assumption.


We conclude from the result of Lemma 5.4.4 with cn = bn and c 0 that
lim sup ^ = 1 and

lim inf = 1 a.s.

We omit the proof that (5.4.13) implies (5.4.14), which can be found in BarndorffNielsen (1963).

For the law of the iterated logarithm we have conditions that are believed to be
new (cf. Pickands (1967) and de Haan and Hordijk (1972)).
Theorem 5.4.6 Let F(x) < 1 for all x. Define V(x) := U(ex) for real x. Suppose
there is a positive function p such that for all real x and some real f$,
v
V(t+xlost)-V(f)
lim
Then almost surely

'-<*>

e*-\
=

Pit)

lim sup
H^OO

Xn,n-V(logn)

and
liminf

n^-oo

eP-1

P(lOgtt)

Xn,n-V(logn)=()
p(logn)

(5.4.15)

194

5 Weak and Strong Laws and Law of the Iterated Logarithm

Remark 5.4.7 By writing V(x) Q(f^(logs)~l


ds) for x > 1 and noticing
that lim^oo fft+x ogr (log J ) " 1 ds = x for real x, one sees that ]imt-+oo(Q(t + x)
Q(f))/p(t) must exist for real x. Hence the limit function in (5.4.15) is (except for
shift and scale constants) the only possible nonconstant one.
Proof. Let En,n be the maximum of n independent standard exponential random
variables. It is easy to see that in this case the conditions of Lemma 5.4.4 are fulfilled
with c 1, bn = logn, and cn = log logn.
Hence, with
_ Entn -logn
log log n
we have almost surely
lim sup Qn = 1 and lim inf Qn = 0 .
Clearly {X*,,,}^ =d {V (En,n)}=l and
Xn,n - V(log;t)
p(logn)

V (logn + Qn log logn) - Vqogw)


p(logn)

The result follows from (5.4.15).

Corollary 5.4.8 If for some positive function a, a distribution function satisfies the
second-order condition of Section 2.3 with γ = 0 and ρ < 0, then the result of
Theorem 5.4.6 is true with b_n = U(n) and c_n = a(n) log log n.
Proof. The uniform inequalities of Theorem 2.3.6 imply, for x ≥ 0,

    ( V(t + x) − V(t) ) / a₀(e^t) = x + o(1) + O(1) A₀(e^t) e^{(ρ+ε)x} ,

where the o(1) term is uniform for x ≥ 0, as t → ∞. Hence

    ( V(t + x log t) − V(t) ) / ( a₀(e^t) log t ) = x + o(1)/log t + O(1) A₀(e^t) e^{(ρ+ε)x log t} / log t .

It is easily seen that the last two terms tend to zero as t → ∞; hence (5.4.15)
holds (with β = 0).

If the second-order condition holds with γ = 0 and ρ = 0, the result of Theorem
5.4.6 still holds in many cases. We give an example.
Example 5.4.9 (Normal distribution) Since

    1 − Φ(t) = (2π)^{−1/2} e^{−t²/2} ( 1/t − 1/t³ + o(1/t³) )

as t → ∞, with Φ the standard normal distribution function, we have, with V^← the
inverse function of V,

    V^←(t) = log( 1/(1 − Φ(t)) ) = t²/2 + log t + ½ log(2π) + o(1) .

From this one sees that log V^←(t) ~ 2 log t and

    lim_{t→∞} ( V^←(t + x (log t)/t) − V^←(t) ) / log V^←(t) = x/2 .

By inversion one gets relation (5.4.15) with p(t) = (log t)/√(2t).
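The expansion for V^←(t) is easy to check numerically. The sketch below (standard library only; the grid of t values is an arbitrary choice) computes −log(1 − Φ(t)) via the complementary error function and compares it with t²/2 + log t + ½ log 2π; the difference is o(1), in fact of order 1/t².

```python
import math

def V_inv(t):
    """V^{<-}(t) = log 1/(1 - Phi(t)), with 1 - Phi(t) = erfc(t/sqrt 2)/2."""
    return -math.log(0.5 * math.erfc(t / math.sqrt(2.0)))

def approx(t):
    """Leading terms of the expansion: t^2/2 + log t + (1/2) log(2 pi)."""
    return t * t / 2.0 + math.log(t) + 0.5 * math.log(2.0 * math.pi)

for t in (5.0, 10.0, 20.0):
    print(t, V_inv(t) - approx(t))  # shrinks roughly like 1/t^2
```

Using erfc rather than 1 − Φ directly avoids catastrophic cancellation in the extreme tail, which matters here because 1 − Φ(20) is of order 10⁻⁸⁹.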

Finally, we mention a result of a somewhat different nature due to Klass (1985).

Theorem 5.4.10 Let X, X₁, X₂, ... be i.i.d. random variables. Let b_n be a nondecreasing
sequence with P(X > b_n) > 0 and nP(X > b_n) → ∞, as n → ∞. Then

    P( X_{n,n} ≤ b_n i.o. ) = 1 , if ∑_{n=1}^∞ P(X > b_n) exp(−n P(X > b_n)) = ∞ ,

    P( X_{n,n} ≤ b_n i.o. ) = 0 , if ∑_{n=1}^∞ P(X > b_n) exp(−n P(X > b_n)) < ∞ .

5.5 Weak "Temporal" Dependence


Up to now we have considered partial maxima of a sequence of independent and
identically distributed random variables. In many practical situations the assumption
of independence in particular is not realistic. In this section we consider a random
sequence {X_i}_{i=1}^∞ = X₁, X₂, ... that is strictly stationary, i.e., {X_{n+k}}_{n=1}^∞ has the
same distribution as {X_n}_{n=1}^∞ for all positive integers k. Now not only the marginal
distribution function but also the dependence structure comes into play.
We call this type of dependence "temporal," since we imagine the observations
obtained one after the other as time progresses. In the next chapter we shall consider a
sequence of vector-valued observations. In that situation the dependence between the
components of the vector will be the object of study. The latter kind of dependence
will be called "spatial." Of course in practical situations these concepts may not be
connected with real time or space.
Since the topic of dependent observations is vast and rather specialized, and since
excellent books on the subject are available (for example Leadbetter, Lindgren, and
Rootzen (1983), in the sequel LLR), the present section contains hardly any proofs.
We intend to give only a flavor of what is possible. We start with some examples.
Example 5.5.1 Let X be a random variable and consider the sequence X, X, X, ....
The partial maximum is just X and no limit theory is involved.
Example 5.5.2 Let Y₁, Y₂, ... be independent and identically distributed with distribution
function exp(−1/x), x > 0. Consider the random sequence

    X₁, X₂, X₃, ... :=  Y₁, Y₂, Y₂, Y₂, Y₃, Y₃, Y₃, ... ,  with probability 1/3 ,
                        Y₁, Y₁, Y₂, Y₂, Y₂, Y₃, Y₃, ... ,  with probability 1/3 ,
                        Y₁, Y₁, Y₁, Y₂, Y₂, Y₂, Y₃, ... ,  with probability 1/3 .

Hence max(X₁, ..., X_{3n−2}) behaves like max(Y₁, ..., Y_n).


Example 5.5.3 Let Y₀, Y₁, Y₂, ... be independent and identically distributed with
distribution function exp(−1/x), x > 0, and define for n = 1, 2, ...,

    X_n := ½ max(Y_{n−1}, Y_n) .

This is an example of a max-moving average process. Note that the sequence
X₁, X₂, ... is stationary and the marginal distribution is the same as that of the Y_n. Further,

    max(X₁, ..., X_n) = ½ max(Y₀, Y₁, ..., Y_n) ,

so that

    P( max(X₁, ..., X_{2n+1}) ≤ x ) = P( max(Y₀, ..., Y_n) ≤ x ) .

Hence max(X₁, ..., X_n) behaves approximately like the maximum of n/2 independent
and identically distributed random variables Y_i.
The situation is more or less the same if {X_n}_{n=1}^∞ is an infinite-order "max-moving
average" process.
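A quick Monte Carlo check of this example (a sketch; the sample sizes and seed are arbitrary choices): it verifies the standard Fréchet marginal of X_n = ½ max(Y_{n−1}, Y_n) and the halving of the effective sample size in the maximum.

```python
import math
import random

rng = random.Random(42)

def frechet():
    """Standard Frechet variable: distribution function exp(-1/x), x > 0."""
    return -1.0 / math.log(rng.random())

# marginal check: P(X_n <= 1) = exp(-1/2)^2 = exp(-1)
N = 20000
hits = 0
for _ in range(N):
    y0, y1 = frechet(), frechet()
    if 0.5 * max(y0, y1) <= 1.0:
        hits += 1
marginal = hits / N

# maximum check: max(X_1,...,X_n) = (1/2) max(Y_0,...,Y_n), so
# P(max(X_1,...,X_n) <= x) = exp(-(n+1)/(2x)): roughly n/2 effective terms
n, reps, x = 100, 4000, 60.0
hits = 0
for _ in range(reps):
    ys = [frechet() for _ in range(n + 1)]
    xs = [0.5 * max(ys[i - 1], ys[i]) for i in range(1, n + 1)]
    if max(xs) <= x:
        hits += 1
emp = hits / reps
exact = math.exp(-(n + 1) / (2 * x))
print(marginal, emp, exact)
```

The empirical frequencies match exp(−1) ≈ 0.368 and exp(−(n+1)/(2x)) up to Monte Carlo error, illustrating an extremal index of ½ for this process.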
Example 5.5.4 Let Y₁, Y₂, ... be as in the previous example and let V be some
random variable. Define for n = 1, 2, ...,

    X_n := Y_n + V .

Then max(X₁, ..., X_n) = V + max(Y₁, ..., Y_n) and

    P( n^{−1} max(X₁, ..., X_n) ≤ x ) = P( n^{−1} V + n^{−1} max(Y₁, ..., Y_n) ≤ x ) ,

which converges to exp(−1/x), x > 0, as n → ∞.


Example 5.5.5 Let U₁, U₂, ... be independent and identically distributed uniform
(0, 1) random variables and V some random variable. Define for n = 1, 2, ...,

    X_n := U_n + V .

Then P( max(X₁, ..., X_n) ≤ x ) converges to P(V + 1 ≤ x), as n → ∞.

Example 5.5.6 Let E₁, E₂, ... be independent and identically distributed standard
exponential random variables and V some independent random variable. Define for
n = 1, 2, ...,

    X_n := E_n + V .

Then

    P( max(X₁, X₂, ..., X_n) − log n ≤ x ) = P( max(E₁, ..., E_n) − log n + V ≤ x )

converges to P(M + V ≤ x), where M is a random variable with extreme value
distribution G₀ from Theorem 1.1.3, independent of V.


Example 5.5.7 Let X₁, X₂, ... be a stationary Gaussian sequence with EX₁ = 0,
EX₁² = 1. Let r_n be the correlation function, r_n = E(X₁ X_{n+1}). Consider also
a sequence Y₁, Y₂, ... of independent and identically distributed standard normal
random variables. Then we know from Example 1.1.7 that sequences a_n > 0 and b_n
exist such that

    lim_{n→∞} P( (max(Y₁, ..., Y_n) − b_n)/a_n ≤ x ) = exp(−e^{−x})

for all x. If

    lim_{n→∞} r_n log n = γ > 0 ,

then

    lim_{n→∞} P( (max(X₁, X₂, ..., X_n) − b_n)/a_n ≤ x ) = P( M − γ + √(2γ) N ≤ x ) ,

where M and N are independent, M has distribution function exp(−e^{−x}), and N is
standard normal.
If r_n log n → ∞, under certain conditions with different normalizing constants a
normal limit distribution is obtained (Pickands (1967)).
One aim of extreme value theory for stationary dependent sequences is to formulate general conditions that allow most of the theory for sequences of independent
and identically distributed random variables to go through.
The Conditions D and D'
The direction followed in the basic book of LLR is to formulate mixing conditions,
as weak as possible, for probabilities of events connected with large values of the
random variables.
Suppose X₁, X₂, X₃, ... is a stationary stochastic process and the distribution
function F of X₁ is in the domain of attraction of some extreme value distribution;
that is, if Y₁, Y₂, ... is an independent and identically distributed sequence with the
same marginal distribution, there are sequences a_n > 0 and b_n such that for all x and
some γ ∈ ℝ,

    lim_{n→∞} P( (max(Y₁, Y₂, ..., Y_n) − b_n)/a_n ≤ x ) = G_γ(x) .    (5.5.1)

For fixed x define u_n := b_n + a_n x. The mixing conditions for the process {X_n} will
be concerned with events above the level u_n, for fixed x and sample size n.
Let l and p be positive integers. For any random vector (X_{i₁}, ..., X_{i_p}) let F_{i₁,...,i_p}
denote its joint distribution function. The condition D(u_n) will be said to hold if for
any integers

    1 ≤ i₁ < ··· < i_p < j₁ < ··· < j_{p′} ≤ n

for which j₁ − i_p > l, we have

    | F_{i₁,...,i_p,j₁,...,j_{p′}}(u_n) − F_{i₁,...,i_p}(u_n) F_{j₁,...,j_{p′}}(u_n) | ≤ α_{n,l} ,    (5.5.2)

where α_{n,l} → 0, n → ∞, for some sequence l = l_n = o(n).


The condition D′(u_n) will be said to hold for the stationary sequence {X_j} and
the sequence of constants u_n if

    lim sup_{n→∞} n ∑_{j=2}^{[n/k]} P( X₁ > u_n , X_j > u_n ) → 0 ,    (5.5.3)

as k → ∞.
Theorem 5.5.8 (LLR, Theorem 3.4.1) Let X₁, X₂, ... be a stationary random sequence
and let Y₁, Y₂, ... be an independent and identically distributed sequence with
the same marginal distribution for which (5.5.1) holds. Assume that the conditions
D(u_n) and D′(u_n) hold with u_n = b_n + a_n x. Then

    lim_{n→∞} P( (max(X₁, X₂, ..., X_n) − b_n)/a_n ≤ x ) = G_γ(x) .
Next we describe a situation in which the normalizing constants are still the
same as under independence but the limit distribution is slightly different. This is the
situation of Examples 5.5.2 and 5.5.3.
Definition 5.5.9 Let X₁, X₂, ... be a stationary sequence. Let Y₁, Y₂, ... be an independent
and identically distributed sequence with the same marginal distribution for
which (5.5.1) holds. If for some 0 < θ ≤ 1,

    lim_{n→∞} P( (max(X₁, ..., X_n) − b_n)/a_n ≤ x ) = G_γ^θ(x)

for all x, then the sequence X₁, X₂, ... is said to have extremal index θ.
Note that our definition is slightly more specific than the definition in LLR, p. 67.
An example of a sequence with arbitrary extremal index θ ∈ (0, 1] is the process

    X₁ := Y₁ ,   X_{n+1} := max( (1 − θ) X_n , θ Y_{n+1} )   for n ≥ 1 ,

where Y₁, Y₂, ... are independent and identically distributed with distribution function
exp(−1/x), x > 0. It is easy to see that P(X_n ≤ x) = exp(−1/x) and that θ is
the extremal index of the sequence.
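Both claims about this process are easy to check by simulation. The following sketch (the choice θ = 0.5, the sample sizes, and the seed are ours) verifies the standard Fréchet marginal and compares P(max(X₁, ..., X_n) ≤ n) with the limit G₁^θ evaluated at 1, i.e., exp(−θ).

```python
import math
import random

rng = random.Random(7)
theta = 0.5

def frechet():
    """Standard Frechet variable: distribution function exp(-1/x), x > 0."""
    return -1.0 / math.log(rng.random())

def chain(n):
    """X_1 := Y_1, X_{k+1} := max((1 - theta) X_k, theta Y_{k+1})."""
    x = frechet()
    xs = [x]
    for _ in range(n - 1):
        x = max((1 - theta) * x, theta * frechet())
        xs.append(x)
    return xs

n, reps = 500, 3000
marg_hits = max_hits = 0
for _ in range(reps):
    xs = chain(n)
    if xs[-1] <= 1.0:       # marginal: P(X_n <= 1) = exp(-1)
        marg_hits += 1
    if max(xs) <= n:        # P(max <= n) -> exp(-theta) ~ 0.6065
        max_hits += 1
print(marg_hits / reps, max_hits / reps, math.exp(-theta))
```

Compared with exp(−1) ≈ 0.368 one would see for independent Fréchet variables, P(max ≤ n) ≈ exp(−θ) shows the clustering: exceedances come in runs, and only a fraction θ of them start new clusters.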
Many processes have an extremal index between zero and one: well-known examples are a moving average process of stable random variables (LLR, Section 3.8)
and a process satisfying a stochastic difference equation (de Haan, Resnick, Rootzen,
and de Vries (1989)). The point process convergence of Section 2.1 can be generalized
to this case: the epochs of the points in the limiting point process still form a homogeneous
Poisson process. However, the two-dimensional points are now arranged on
vertical lines where the mean number of points on a vertical line is 1/θ.


Several estimators for the extremal index have been developed based on this
interpretation of θ. A general form for those estimators is given by (cf. Ancona-Navarrete
and Tawn (2000))

    θ̂ := C(u_n) / N(u_n) ,    (5.5.4)

where N(u_n) is the number of exceedances of a high threshold u_n and C(u_n) is the
number of clusters. There are two general ways of identifying clusters. The runs
estimator for θ is, for 1 ≤ l ≤ n,

    θ̂_R := (1/N(u_n)) ∑_{i=1}^{n−l} 1_{{X_i > u_n}} 1_{{X_{i+1} ≤ u_n}} ··· 1_{{X_{i+l} ≤ u_n}} .

It recognizes two different clusters of exceedances when there are at least l consecutive
observations below the threshold between them.
The second kind of estimator for 0 is called a blocks estimator. By first dividing
the sample into k blocks of length m, so n approximately equals km, the number of
clusters C(un) in (5.5.4) is estimated as the number of blocks in which at least one
exceedance of un occurs.
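Both cluster-identification schemes are straightforward to implement. The sketch below (function and variable names are ours; the runs version counts cluster starts, which differs from the formula in the text only at the sample boundary) computes the runs and blocks estimators of form (5.5.4).

```python
def runs_estimator(x, u, r):
    """theta-hat = (#clusters)/(#exceedances); a new cluster starts when an
    exceedance of u is preceded by at least r observations below u."""
    n_exc = sum(1 for v in x if v > u)
    if n_exc == 0:
        return float("nan")
    clusters = 0
    gap = r  # treat the start of the sample as an infinite gap
    for v in x:
        if v > u:
            if gap >= r:
                clusters += 1
            gap = 0
        else:
            gap += 1
    return clusters / n_exc

def blocks_estimator(x, u, m):
    """theta-hat = (#blocks of length m with an exceedance)/(#exceedances)."""
    n_exc = sum(1 for v in x if v > u)
    if n_exc == 0:
        return float("nan")
    blocks = sum(1 for i in range(0, len(x), m) if any(v > u for v in x[i:i + m]))
    return blocks / n_exc

data = [0, 10, 10, 0, 0, 10, 10, 0]
print(runs_estimator(data, 5, 2), blocks_estimator(data, 5, 4))  # 0.5 0.5
```

On the toy series there are four exceedances in two clusters, so both estimators return ½; on real data the choice of the run length r, the block length m, and the threshold u all matter.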
This approach is mainly connected with the behavior of extreme order statistics
and point process convergence (as in Section 2.1). For results concerning intermediate order statistics, convergence to Gaussian processes and asymptotic behavior of
estimators we need other conditions than the conditions D and D′, and the existence
of the extremal index. That is what we discuss next.
Mixing Conditions
A different way to deal with dependence has been explored by Rootzén (1995) and developed by Drees. The aim is to formulate rather general but quite specific conditions
under which the approximation of the tail empirical quantile process by Brownian
motion of Section 2.4 can be generalized. As before, this approximation can serve as
a basis for a wealth of statistical results. It can be proved that many specific stochastic
processes used in applications satisfy the conditions.
We consider a version of the conditions and the ensuing result. Let {Xn} be a
stationary sequence. The common distribution function is denoted by F and the
inverse function of 1/(1 − F) is denoted by U. We assume that F is in the domain
of attraction of an extreme value distribution G_γ.
Further, we assume that the sequence {X_n} is β-mixing, i.e.,

    β(l) := sup_{m∈ℕ} E( sup_{A ∈ B^∞_{m+l+1}} | P(A | B₁^m) − P(A) | ) → 0 ,   l → ∞ ,

where B₁^m and B^∞_{m+l+1} denote the σ-fields generated by {X_i}_{1≤i≤m} and {X_i}_{i≥m+l+1},
respectively.
The other conditions are the following:


Condition 5.5.10

    lim_{n→∞} ( (n/l_n) β(l_n) + l_n k_n^{−1/2} log² k_n ) = 0 ,

where l_n and k_n are sequences of integers with l_n, k_n → ∞, n → ∞. The growth of the
sequence k_n has to be restricted by Condition 5.5.13 below.
Condition 5.5.11 There exist ε > 0 and functions c_m, m = 1, 2, ..., such that

    lim_{n→∞} (n/k_n) P( X₁ > U(n/(k_n x)) , X_{m+1} > U(n/(k_n y)) ) = c_m(x, y)

for all x, y ∈ (0, 1 + ε) and each m.


Condition 5.5.12 There exist a constant D₁ > 0 and a sequence ρ_m, with
∑_{m=1}^∞ ρ_m < ∞, such that

    c_{n,m}(x, y) ≤ (y − x) ( ρ_m + D₁ k_n/n )

for all x, y ∈ (0, 1 + ε) and each m, where

    c_{n,m}(x, y) := (n/k_n) P( U(n/(k_n y)) < X₁ ≤ U(n/(k_n x)) , U(n/(k_n y)) < X_{m+1} ≤ U(n/(k_n x)) ) .
Condition 5.5.13

    lim_{n→∞} √k_n  sup_{0<s≤1+ε}  s^{γ+1/2}/(1 + |log s|)  | ( U(n/(k_n s)) − U(n/k_n) ) / a(n/k_n) − (s^{−γ} − 1)/γ | = 0 ,

where a > 0 is a suitable version of the auxiliary function of Theorem 1.1.6.
Theorem 5.5.14 Let X₁, X₂, ... be a β-mixing stationary random sequence with
common marginal distribution function F ∈ D(G_γ), for some real γ. Under Conditions
5.5.10–5.5.13, for some sequence l_n = o(n/k_n), there exist versions of the tail
quantile process, denoted by {X_{n−[k_n s],n}}_{0<s≤1}, and a centered Gaussian process E
with covariance function c defined by

    c(x, y) := min(x, y) + ∑_{m=1}^∞ ( c_m(x, y) + c_m(y, x) ) ,

such that

    sup_{0<s≤1} s^{γ+1/2} (1 + |log s|)^{−1/2} | √k_n ( ( X_{n−[k_n s],n} − D_n ) / a(n/k_n) − (s^{−γ} − 1)/γ ) − s^{−(γ+1)} E(s) | →_P 0 ,   n → ∞ .

Here D₁, D₂, ... is a sequence of random variables, which for γ > −1/2 can be chosen
as U(n/k_n) (hence nonrandom).


Under very general conditions, estimators of γ, a(n/k_n), large quantiles, etc. can
be expressed as functionals of the tail empirical quantile function, and an appropriate
invariance theorem entails asymptotic normality of such statistics, very much in the
same way as for the approximation developed in Section 2.4. For details we refer to
Drees (2000, 2002) and in particular to Drees (2003), where the case γ > 0 receives
special attention. It is shown that under certain conditions a stochastic difference
equation (hence also the ARCH process) satisfies the stated conditions, as well as a
moving average process.

5.6 Mejzler's Theorem


Instead of dropping the assumption of independence in the usual requirement of
independent and identically distributed random variables, one can also drop the assumption
that the random variables all have the same distribution. The most general
and interesting result in this area is due to Mejzler (1956). It is somewhat similar to
Lévy's class L in the theory of partial sums of random variables.
Theorem 5.6.1 Suppose X₁, X₂, ... are independent random variables with distribution
functions F₁, F₂, ..., respectively. Suppose there exist sequences a_n > 0 and
b_n such that (max(X₁, X₂, ..., X_n) − b_n)/a_n has a nondegenerate limit distribution,
which we call G. Suppose that as n → ∞,

    |log a_n| + |b_n| → ∞    (5.6.1)

and

    a_{n+1}/a_n → 1 ,   (b_{n+1} − b_n)/a_n → 0 .    (5.6.2)

Then

    −log G(x) is convex                 if x*(G) = ∞ ,
    −log G(x* − e^{−x}) is convex       if x*(G) < ∞ .    (5.6.3)

Here x*(G) := sup{x : G(x) < 1}.
Conversely, any distribution function G satisfying (5.6.3) occurs as a limit in the
given set-up.


Conversely, any distribution function G satisfying (5.6.3) occurs as a limit in the
given set-up.
Remark 5.6.2 The conditions of Mejzler's version of the theorem are the same except
for (5.6.1) and (5.6.2). Mejzler's condition is that for any x with G(x) > 0 and any
sequence k = k(n) of integers with 1 ≤ k ≤ n,

    lim_{n→∞} F_k(a_n x + b_n) = 1 .    (5.6.4)

By taking k fixed in (5.6.4) one sees that (5.6.1) follows. Next take k(n) = n in
(5.6.4). This implies lim_{n→∞} ∏_{k=1}^{n−1} F_k(a_n x + b_n) = G(x) for continuity points x of
G. Since we also have lim_{n→∞} ∏_{k=1}^{n−1} F_k(a_{n−1} x + b_{n−1}) = G(x), (5.6.2) follows by the
convergence of types theorem.


Proof (Balkema, de Haan, and Karandikar (1993)). First the direct statement. Define

    α_n x := a_n x + b_n ,

    M_n(x) := ∑_{k=1}^n −log F_k(α_n x) ,

    ‖α_n‖ := |log a_n| + |b_n| .

We know that M_n → M := −log G, n → ∞, weakly. Further, M_n is a "tail function"
for any n, i.e., M_n is nonincreasing and lim_{x→∞} M_n(x) = 0. Take ε > 0. By virtue
of (5.6.1) and (5.6.2) we can construct a sequence of integers k(n) such that

    lim_{n→∞} ‖ α_{n+k(n)}^{−1} α_n ‖ = ε .

Then along a subsequence n′ we have convergence of α_{n′+k(n′)}^{−1} α_{n′} to α₀, say. Now

    M_{n+k} ∘ α_{n+k}^{−1} α_n − M_n = ∑_{r=n+1}^{n+k} −log F_r ∘ α_n

is a tail function for n, k = 1, 2, .... Also,

    M_{n′+k(n′)} ∘ α_{n′+k(n′)}^{−1} α_{n′} − M_{n′} → −log G ∘ α₀ + log G .

So −log G ∘ α₀ + log G is a tail function. Let

    A := {α : M ∘ α − M is a tail function} .

We have proved that for all ε > 0 there exists α ∈ A with ‖α‖ = ε. We shall prove
that there is a β ∈ A such that β^t ∈ A for all t > 0 (note that if βx = ax + b, then
β^t x := a^t x + b ∫₀^t a^s ds).
For the proof choose α_n ∈ A, n = 1, 2, ..., such that ‖α_n‖ = 1/n. Choose
r(n) ∈ ℕ, r(n) → ∞, such that ‖α_n^{r(n)}‖ → 1, n → ∞. Note that α_n^{r(n)} ∈ A for all n.
Then along a subsequence n′ we have convergence: α_{n′}^{r(n′)} → β, say. Note that β is
not the identity and β ∈ A. Now α_{n′}^{[t r(n′)]} ∈ A for t > 0, and lim_{n′→∞} α_{n′}^{[t r(n′)] − t r(n′)} is
the identity, since α_{n′} converges to the identity. Since α_{n′}^{t r(n′)} → β^t, also α_{n′}^{[t r(n′)]} → β^t,
and hence β^t ∈ A for all t > 0. This proves the statement. Now, as we have seen, β^t
has a specific form:

    β^t x = a^t x + b ∫₀^t a^s ds ,    (5.6.5)

for some a > 0 and b ∈ ℝ. Hence M(β^t x) − M(x) is a tail function for all t > 0,
i.e., either

    M( a^t (x − x₀) + x₀ ) − M(x)    (with a ≠ 1)    (5.6.6)


or

    M(x − bt) − M(x)    (5.6.7)

is a tail function for all t > 0. Since M(β^t x) − M(x) must be nonnegative, we must
have

    β^t x ≤ x   for all x < x* := x*(G) .    (5.6.8)
Let *x := inf{x : G(x) > 0}. Then if (5.6.6) holds with a > 1, we have x₀ ≥ x*; if
(5.6.6) holds with a < 1, then x₀ ≤ *x; if (5.6.7) holds, then b > 0.
Let us consider the case a > 1. Then x* < ∞. We know that for all t > 0 the
function

    M( a^t (x − x₀) + x₀ ) − M(x)

is nonnegative and nonincreasing for x < x*. That is, for all t > 0,

    M( x₀ − e^t e^y ) − M( x₀ − e^y )

is nonnegative and nonincreasing for e^y > x₀ − x*. It follows that M(x₀ − e^y)
is nondecreasing and convex for e^y > x₀ − x*, i.e., (with M′ the right one-sided
derivative of M) −e^y M′(x₀ − e^y) is nonnegative and nondecreasing for e^y > x₀ − x*.
Write x₀ − e^y = x* − e^x; then also −e^x M′(x* − e^x) is nonnegative and nondecreasing
for x ∈ ℝ, since 1 + e^{−x}(x₀ − x*) is nonnegative and nonincreasing. It follows that
M(x* − e^x) is a convex function and hence so is M(x* − e^{−x}). Similar reasoning
applies in the other two cases.
Conclusions: if (5.6.6) holds with a > 1, then x* < ∞ and M(x* − e^{−x}) is
convex; if (5.6.6) holds with 0 < a < 1, then *x > −∞ and M(x₀ + e^x) is convex
for some x₀ ≤ *x; if (5.6.7) holds, then x* = ∞ and M(x) is convex. Some reflection
shows that in fact the convexity of M(x₀ + e^x) for some x₀ ≤ *x implies the convexity
of M(x). This proves the direct statement of the theorem.
Conversely, suppose that G satisfies (5.6.3). We shall consider only the case
−log G(x) convex. The other case is similar. Relation (5.6.7) implies that the function

    F_n(x) := { G(x − log(n+1)) / G(x − log n) ,   x ≥ *x(G) + log(n+1) ,
              { 0 ,                                x < *x(G) + log(n+1) ,

is a distribution function for n = 1, 2, .... Moreover,

    ∏_{k=1}^n F_k(x + log n) = G( x + log(n/(n+1)) ) / G(x + log n) → G(x) ,

as n → ∞, for all x. Note that (5.6.1) and (5.6.2) apply.
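For a concrete G the telescoping product can be checked directly. The sketch below uses the Gumbel distribution G(x) = exp(−e^{−x}) (for which −log G is convex and x*(G) = ∞, so the lower endpoint *x(G) = −∞ and no truncation is needed); the choice of x and n is arbitrary.

```python
import math

def G(x):
    """Gumbel distribution function; -log G(x) = e^{-x} is convex."""
    return math.exp(-math.exp(-x))

def F(k, x):
    """F_k(x) := G(x - log(k+1)) / G(x - log k); a distribution function
    precisely because -log G is convex."""
    return G(x - math.log(k + 1)) / G(x - math.log(k))

x, n = 0.7, 2000
prod = 1.0
for k in range(1, n + 1):
    prod *= F(k, x + math.log(n))

# the product telescopes to G(x + log(n/(n+1))) / G(x + log n) -> G(x)
print(prod, G(x))
```

Each X_k drawn from F_k is stochastically larger as k grows (its support shifts by log(k+1)), yet the normalized maximum with a_n = 1, b_n = log n converges to the prescribed limit G.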

Corollary 5.6.3 If (5.6.6) holds with a > 1, then the same relation holds with x₀
replaced by any x₁ satisfying x* ≤ x₁ ≤ x₀; in particular, (5.6.6) then holds with
x₀ = x*. If (5.6.6) holds with 0 < a < 1, then the same relation holds
with x₀ replaced by any x₁ ≤ x₀; moreover, (5.6.7) holds.

Proof. Let us consider the case a > 1. The previous proof shows that then (5.6.6)
holds with x₀ replaced by x*. The same proof also gives (5.6.6) with x₀ replaced by
any x₁ satisfying x* ≤ x₁ ≤ x₀. Similar reasoning applies in the other two cases.


Exercises
5.1. Let F be twice differentiable and write r(t) := ((1 − F)/F′)′(t). Recall that
lim_{t→∞} r(t) = 0 implies that F ∈ D(G₀). Prove that if r(t) log log(1/(1 − F(t))) →
0, then the condition of Theorem 5.4.6 holds.
5.2. Prove that Theorem 5.4.6 holds for any gamma distribution.
5.3. Prove that under the conditions of Theorem 5.1.1 with ρ < 0,

    lim_{n→∞}  sup_{x₀ ≤ x < 1/((−γ)∨0)}  n{1 − F(a₀(n)x + b₀(n))} / (−log G_γ(x)) = 1

for any x₀ > −1/(γ ∨ 0).


5.4. Suppose for a sequence of i.i.d. random variables X₁, X₂, ... we have that
(max_{1≤i≤n} X_i − b_n)/a_n converges to an extreme value distribution G_γ(x) with
G_γ(x) < 1 for all x (hence γ ≥ 0). Consider now maxima in the presence of a
trend, i.e., max_{1≤i≤n}(X_i + c b_i) with, say, c > 0. Prove that

    P( ( max_{1≤i≤n}(X_i + c b_i) − (1 + c) b_n ) / a_n ≤ x )

converges to

    P( sup_{i≥1} { (W_i^γ − 1)/γ + c (T_i^γ − 1)/γ } ≤ x ) ,

where {(W_i, T_i)}_{i=1}^∞ is an enumeration of the points of a Poisson point process on
ℝ₊ × [0, 1] with mean measure ν defined by ν{(x, ∞) × (0, t)} = t/x. Hence

    lim_{n→∞} P( ( max_{1≤i≤n}(X_i + c b_i) − (1 + c) b_n ) / a_n ≤ x )
        = exp( −∫₀¹ ( 1 + γx − c(t^γ − 1) )^{−1/γ} dt ) .

Hint: Use the point process convergence of Theorem 2.1.2, the interpretation of point
process convergence given at the end of Section 2.1, and Theorem 1.1.6(2). Note that
for a Poisson point process the probability that a certain set is empty (contains no
points) equals e^{−q}, where q is the mean measure of that set.
5.5. Let E₁, E₂, ... be i.i.d. standard exponential random variables. Define for x > 1,

    N := min{ n : E_k ≤ x log k for all k > n } .

Show that N < ∞ a.s. For what values of x does EN exist?
5.6. Let X₁, X₂, ... be i.i.d. positive random variables with distribution function F.
Prove that max(X₁, X₂, ..., X_n)/√n →_P 0 if and only if lim_{x→∞} x²(1 − F(x)) = 0.
5.7. Let X₁, X₂, ... be i.i.d. positive random variables. Prove that max(X₁, X₂,
..., X_n)/n → 0 a.s. if and only if EX₁ < ∞.
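Exercise 5.7 can be explored numerically before proving it. The sketch below (the Pareto examples and the seed are our choices) contrasts a distribution with EX < ∞, where max/n is tiny for large n, with one with EX = ∞, where it is typically enormous.

```python
import random

rng = random.Random(3)

def pareto(alpha):
    """Pareto variable with P(X > x) = x^{-alpha}, x >= 1; EX < inf iff alpha > 1."""
    return (1.0 - rng.random()) ** (-1.0 / alpha)

def max_over_n(alpha, n):
    return max(pareto(alpha) for _ in range(n)) / n

n = 100_000
small = max_over_n(2.0, n)   # EX finite: the maximum grows like n^{1/2}
big = max_over_n(0.5, n)     # EX infinite: the maximum grows like n^2
print(small, big)
```

A single run is only suggestive, but the orders of magnitude (roughly n^{−1/2} versus n) make the dichotomy of the exercise vivid.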

Part II

Finite-Dimensional Observations

6
Basic Theory

6.1 Limit Laws


6.1.1 Introduction: An Example
In order to see the usefulness of developing multivariate extremes we start with a
problem for which multivariate extreme value theory seems to be relevant.
Consider the situation of the first case study in Section 1.1.4. We want to assess
the safety of a seawall near the town of Petten, in the province of North Holland, the
Netherlands. The seawall is called Pettemer Zeewering or Hondsbossche Zeewering.
The seawall could be in danger of overflowing and eventual collapse in case of high
levels of the water of the North Sea combined with high waves (both mainly due to
wind storm activity). High-tide seawater levels, i.e., the water level with the waves
filtered out, that is, the average level taken over a few minutes, have been monitored
over a long period of time. However, these measure still water levels and do not take
into account wave activity since it involves a kind of time averaging. Hence it makes
sense to introduce the waves as a second variable, thus completing the picture. Since
monitoring waves is much more problematic than that of still water levels, less data
are available.
The wave height (HmO) and still water level (SWL) have been recorded during 828
storm events that are relevant for the Pettemer Zeewering. Also, engineers of RIKZ
(Institute for Coastal and Marine Management) have determined failure conditions,
that is, those combinations of HmO and SWL that result in overtopping the seawall,
thus creating a dangerous situation. The set of those combinations forms a failure set
C. Figure 6.1 shows the observations as well as a simplified failure set C, which is
{(HmO, SWL) : 0.3 HmO + SWL > 7.6} .
Clearly the observations stay clear of C, indicating that there has been no dangerous situation during the observation period.
The question is, what is the probability that during a wind storm of the considered
type, we get an observation (HmO, SWL) in the set C, i.e., what is the probability of
failure?
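The simplified failure criterion is just a linear inequality in the two variables. The sketch below (the example observations are invented for illustration) flags whether a storm event (HmO, SWL) lies in the failure set C.

```python
def in_failure_set(hm0, swl):
    """Failure set C = {(HmO, SWL) : 0.3*HmO + SWL > 7.6} (heights in metres)."""
    return 0.3 * hm0 + swl > 7.6

# hypothetical (wave height, still water level) observations
events = [(4.0, 2.1), (7.5, 3.0), (10.0, 5.0), (2.0, 1.2)]
print([in_failure_set(h, s) for h, s in events])
# only (10.0, 5.0) overtops: 0.3*10 + 5 = 8 > 7.6
```

The statistical difficulty is of course not this membership test but estimating the probability of C when no observation has ever fallen in it — the extrapolation problem the rest of the chapter addresses.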


[Figure 6.1: scatter plot of wave height HmO (m, horizontal axis) against still water level SWL (m, vertical axis) for the 828 storm events, with the failure set C lying above the line 0.3 HmO + SWL = 7.6.]

Fig. 6.1. Observations and failure set C.


This problem is clearly similar to estimation of an exceedance probability considered in Section 4.4, but we need a multivariate extension of the theory that leads
to the estimator used there.
We shall proceed as follows. After stating the definition of multivariate extremes,
we shall identify all possible limit distributions as in the one-dimensional case and
then determine their domains of attraction. Then we go on to discuss the statistical
issues.
6.1.2 The Limit Distribution; Standardization
In order to make the results more accessible, we shall mainly focus on the two-dimensional case. Generalization to higher-dimensional spaces is usually obvious.
When needed, we shall mention the extension explicitly.
It is not obvious how one should define the maximum of a set of vectors. In fact,
there are many possibilities, but it turns out that a very naive definition leads to a rich
enough theory to tackle problems in applications. We just take the maximum of each
coordinate separately and then we assemble these maxima into a vector. Note that the
resulting vector maximum will usually not be one of the constituent vectors, but this
will not disturb us.
Suppose (X₁, Y₁), (X₂, Y₂), ... are independent and identically distributed random
vectors with distribution function F. Suppose that there exist sequences of constants
a_n, c_n > 0, b_n, d_n real, and a distribution function G with nondegenerate
marginals such that for all continuity points (x, y) of G,

    lim_{n→∞} P( (max(X₁, X₂, ..., X_n) − b_n)/a_n ≤ x , (max(Y₁, Y₂, ..., Y_n) − d_n)/c_n ≤ y ) = G(x, y) .    (6.1.1)

Any limit distribution function G in (6.1.1) with nondegenerate marginals is called a


multivariate extreme value distribution.


In this section we are going to determine the class of all possible limit distributions
G. In doing so we will heavily rely on the theory developed in Chapter 1. Since (6.1.1)
implies convergence of the two one-dimensional marginal distributions, we have

    lim_{n→∞} P( (max(X₁, X₂, ..., X_n) − b_n)/a_n ≤ x ) = G(x, ∞)    (6.1.2)

and

    lim_{n→∞} P( (max(Y₁, Y₂, ..., Y_n) − d_n)/c_n ≤ y ) = G(∞, y) .    (6.1.3)

Now we choose the constants a_n, c_n, b_n, and d_n such that (cf. Theorem 1.1.3) for
some γ₁, γ₂ ∈ ℝ,

    G(x, ∞) = exp( −(1 + γ₁ x)^{−1/γ₁} )    (6.1.4)

and

    G(∞, y) = exp( −(1 + γ₂ y)^{−1/γ₂} ) .    (6.1.5)
We note in passing that since the two marginal distributions of G are continuous,
G must be continuous as well.
Next we are going to use the results of Lemma 1.2.9 and Corollary 1.2.10. Let F_i,
i = 1, 2, be the marginal distribution functions of F. Define U_i(t) := F_i^←(1 − 1/t),
t > 1, for i = 1, 2. Then according to Theorem 1.1.6 there are positive functions
a_i(t), i = 1, 2, such that

    lim_{t→∞} ( U_i(tx) − U_i(t) ) / a_i(t) = (x^{γ_i} − 1)/γ_i    (6.1.6)

and

    lim_{t→∞} a_i(tx)/a_i(t) = x^{γ_i}

for all x > 0. Moreover, (6.1.2)–(6.1.5) hold with

    b_n := U₁([n]) ,   d_n := U₂([n]) ,   a_n := a₁([n]) ,   c_n := a₂([n])

(cf. proof of Theorem 1.1.2). It follows from (6.1.6) that for x, y > 0,

    lim_{n→∞} ( U₁(nx) − b_n ) / a_n = (x^{γ₁} − 1)/γ₁ ,
    lim_{n→∞} ( U₂(ny) − d_n ) / c_n = (y^{γ₂} − 1)/γ₂ .    (6.1.7)

Now we return to (6.1.1), which can be written as

    lim_{n→∞} F^n( a_n x + b_n , c_n y + d_n ) = G(x, y) .    (6.1.8)

Note that if x_n → u, y_n → v, then by the continuity of G and the monotonicity of F,

    lim_{n→∞} F^n( a_n x_n + b_n , c_n y_n + d_n ) = G(u, v) .    (6.1.9)

We apply (6.1.9) with

    x_n := ( U₁(nx) − b_n ) / a_n ,   y_n := ( U₂(ny) − d_n ) / c_n ,

x, y > 0, and get by (6.1.7)

    lim_{n→∞} F^n( U₁(nx), U₂(ny) ) = G( (x^{γ₁} − 1)/γ₁ , (y^{γ₂} − 1)/γ₂ ) .

We have proved the following theorem:

Theorem 6.1.1 Suppose that there are real constants a_n, c_n > 0, b_n, and d_n such
that

    lim_{n→∞} F^n( a_n x + b_n , c_n y + d_n ) = G(x, y)

for all continuity points (x, y) of G, and the marginals of G are standardized as in (6.1.4) and (6.1.5).
Then with F₁(x) := F(x, ∞), F₂(y) := F(∞, y), and U_i(x) := F_i^←(1 − 1/x),
i = 1, 2,

    lim_{n→∞} F^n( U₁(nx), U₂(ny) ) = G₀(x, y)    (6.1.10)

for all x, y > 0, where

    G₀(x, y) := G( (x^{γ₁} − 1)/γ₁ , (y^{γ₂} − 1)/γ₂ )

and γ₁, γ₂ are the marginal extreme value indices from (6.1.2)–(6.1.5).
Remark 6.1.2 In case F has continuous marginal distribution functions F₁ and F₂,
relation (6.1.10) can be formulated simply as

    lim_{n→∞} P( max( 1/(1 − F₁(X₁)), ..., 1/(1 − F₁(X_n)) ) ≤ nx ,
                 max( 1/(1 − F₂(Y₁)), ..., 1/(1 − F₂(Y_n)) ) ≤ ny ) = G₀(x, y)

for x, y > 0, i.e., after a transformation of the marginal distributions to a standard
distribution, namely F(x) := 1 − 1/x, x > 1, a simplified limit relation applies. This
means that we have reformulated the problem of identifying the limit distribution in
such a way that the marginal distributions no longer play a role. From now on we can
focus solely on the dependence structure.
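The standardization in this remark is mechanical once the marginals are known. The sketch below (a toy model with independent exponential and uniform coordinates; sample sizes and seed are arbitrary) transforms each coordinate by 1/(1 − F_i) and checks the scaled componentwise maximum against the limit, which for independent coordinates factorizes as G₀(x, y) = exp(−1/x − 1/y).

```python
import math
import random

rng = random.Random(11)

def scaled_max(n):
    """Componentwise maxima after the standardization v -> 1/(1 - F_i(v))."""
    sx = max(math.exp(rng.expovariate(1.0)) for _ in range(n))  # 1/(1-F1(x)) = e^x
    sy = max(1.0 / (1.0 - rng.random()) for _ in range(n))      # 1/(1-F2(y)) = 1/(1-y)
    return sx, sy

n, reps, x, y = 200, 5000, 1.0, 2.0
hits = 0
for _ in range(reps):
    sx, sy = scaled_max(n)
    if sx <= n * x and sy <= n * y:
        hits += 1
print(hits / reps, math.exp(-1.0 / x - 1.0 / y))  # independence: G0 = exp(-1/x - 1/y)
```

For dependent coordinates the same two marginal transformations apply unchanged; only the limit G₀, which carries the dependence structure, differs.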


Corollary 6.1.3 For any (x, y) for which 0 < G₀(x, y) < 1,

    lim_{n→∞} n{ 1 − F( U₁(nx), U₂(ny) ) } = −log G₀(x, y) .    (6.1.11)

Proof. Taking logarithms to the left and to the right of (6.1.10), we get

    lim_{n→∞} −n log{ F( U₁(nx), U₂(ny) ) } = −log G₀(x, y) .    (6.1.12)

Note that (6.1.12) implies F( U₁(nx), U₂(ny) ) → 1; hence

    ( −log F( U₁(nx), U₂(ny) ) ) / ( 1 − F( U₁(nx), U₂(ny) ) ) → 1 ,

and (with (6.1.12)) relation (6.1.11) follows.

We shall also use the following slight extension.


Corollary 6.1.4 For any (x, y) for which 0 < G₀(x, y) < 1,

    lim_{t→∞} t{ 1 − F( U₁(tx), U₂(ty) ) } = −log G₀(x, y) ,    (6.1.13)

where t runs through the real numbers.

Proof. By applying inequalities as in the proof of Theorem 1.1.2 to relation (6.1.11)
one sees that (6.1.13) also holds.

6.1.3 The Exponent Measure


Next we take (6.1.11) as our point of departure. Take a > 0 and define for (x, y) ∈
ℝ²₊ := [0, ∞)² with max(x, y) > a and n = 1, 2, ...,

    H_{n,a}(x, y) := 1 − ( 1 − F( U₁(nx), U₂(ny) ) ) / ( 1 − F( U₁(na), U₂(na) ) ) .

Clearly H_{n,a} is the distribution function of a probability measure, P_{n,a} say, on ℝ²₊ \
[0, a]² for all n, and by (6.1.11),

    lim_{n→∞} H_{n,a}(x, y) =: H_a(x, y)

exists for all x, y with max(x, y) > a. Hence by Billingsley (1979), Theorem 29.1,
we have that H_a is the distribution function of a probability measure, P_a say, and it
follows that

    lim_{n→∞} P_{n,a}(A) = P_a(A)

for all Borel sets A ⊂ ℝ²₊ \ [0, a]² with P_a(∂A) = 0. Now clearly

    ν_n := n{ 1 − F( U₁(na), U₂(na) ) } P_{n,a}

is a measure on ℝ²₊ \ [0, a]², not depending on a, and such that for all Borel sets
A ⊂ ℝ²₊ \ [0, a]² with ν(∂A) = 0,

    lim_{n→∞} ν_n(A) = ν(A)

with

    ν := −log G₀(a, a) P_a .

Note that since a > 0 is arbitrary, ν_n(A) and ν(A) are defined for all Borel sets A
with

    inf_{(x,y)∈A} max(x, y) > 0    (6.1.14)

and that for x, y with max(x, y) > 0,

    ν_n{ (s, t) ∈ ℝ²₊ : s > x or t > y } = n{ 1 − F( U₁(nx), U₂(ny) ) } ,

    ν{ (s, t) ∈ ℝ²₊ : s > x or t > y } = −log G₀(x, y) .

Finally, for all Borel sets A such that (6.1.14) holds and ν(∂A) = 0 we have lim_{n→∞} ν_n(A) = ν(A).
We formulate these results in the following theorem.
Theorem 6.1.5 Let F and G₀ be probability distribution functions for which (6.1.11)
holds, i.e., for x, y > 0 with 0 < G₀(x, y) < 1,

    lim_{n→∞} n{ 1 − F( U₁(nx), U₂(ny) ) } = −log G₀(x, y) ,

where U_i(1/(1 − x)) is the inverse function of the ith marginal distribution, i = 1, 2.
Then there are set functions ν, ν₁, ν₂, ... defined for all Borel sets A ⊂ ℝ²₊ with

    inf_{(x,y)∈A} max(x, y) > 0

such that:
1.
    ν_n{ (s, t) ∈ ℝ²₊ : s > x or t > y } = n{ 1 − F( U₁(nx), U₂(ny) ) } ,

    ν{ (s, t) ∈ ℝ²₊ : s > x or t > y } = −log G₀(x, y) ;

2. for all a > 0 the set functions ν, ν₁, ν₂, ... are finite measures on ℝ²₊ \ [0, a]² ;
3. for each Borel set A ⊂ ℝ²₊ with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{n→∞} ν_n(A) = ν(A) .
Remark 6.1.6 One can consider this result in the framework of convergence of measures
on a metric space as follows. Consider the space

    ℝ²₊* := ℝ²₊ \ {(0, 0)} .

One can write this space as a product space by using the transformation

    (x, y) ↦ ( max(x, y) , (x, y)/max(x, y) ) .

Then

    ℝ²₊* = (0, ∞) × Q ,

where

    Q := { (s, t) ∈ ℝ²₊ : s, t ≥ 0 , max(s, t) = 1 } .

Next we extend the space ℝ²₊* to

    ℝ̄²₊* := (0, ∞] × Q

(although there is never any mass at infinity) and we change the Euclidean metric on
(0, ∞] to the metric ρ(x, y) := |1/x − 1/y|. With this change (0, ∞] is a complete
separable metric space (CSMS) and so is (0, ∞] × Q. The set functions ν, ν₁, ν₂, ...
are boundedly finite measures on this CSMS, i.e., ν(A), ν_i(A) < ∞ for each bounded
Borel set, for i = 1, 2, .... Moreover, ν_n(A) → ν(A) for each bounded Borel set A
with ν(∂A) = 0, i.e., ν_n converges weakly to ν on this CSMS. For details about this
type of convergence see the appendix in Daley and Vere-Jones (1988). Compare also
the somewhat analogous results of Theorem 9.3.1 in an infinite-dimensional setting.
Definition 6.1.7 The measure ν from Theorem 6.1.5 is sometimes called the exponent measure of the extreme value distribution G₀, since

    G₀(x, y) = exp(-ν(A_{x,y}))                                              (6.1.15)

with

    A_{x,y} := {(s, t) ∈ ℝ²₊ : s > x or t > y} .                             (6.1.16)
Remark 6.1.8 Relation (6.1.15) does not hold for all Borel sets Ax,y.
The characterizing property of the exponent measure is the following homogeneity
relation.
Theorem 6.1.9 For any Borel set A ⊂ ℝ²₊ with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0, and any a > 0,

    ν(aA) = a⁻¹ ν(A) ,                                                       (6.1.17)

where aA is the set obtained by multiplying all elements of A by a.
Proof. Taking t_n = na for some a > 0 in (6.1.13) we obtain

    lim_{n→∞} n {1 - F(U₁(nax), U₂(nay))} = -a⁻¹ log G₀(x, y) .

On the other hand, by direct application of (6.1.11),

    lim_{n→∞} n {1 - F(U₁(nax), U₂(nay))} = -log G₀(ax, ay) .                (6.1.18)

214

6 Basic Theory

Hence

    -a⁻¹ log G₀(x, y) = -log G₀(ax, ay) ,                                    (6.1.19)

and the statement of the theorem holds for all sets A_{x,y} defined by (6.1.16). It is then clear that this relation must also hold for the generated σ-field.

Remark 6.1.10 Relation (6.1.17) implies that ν(A) is finite for all sets A with positive distance from the origin, but ν is not bounded. Note in particular that

    G₀(ax, ay) = G₀^{1/a}(x, y) ,    for a, x, y > 0 .                       (6.1.20)
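As a small numerical illustration (our own sketch, not part of the original text), one can check (6.1.17) and (6.1.20) for a concrete simple max-stable distribution; we take G₀(x, y) = exp(-(x⁻² + y⁻²)^{1/2}), which appears below as Example 6.1.17, and use ν(A_{x,y}) = -log G₀(x, y) together with aA_{x,y} = A_{ax,ay}:

```python
import math

def G0(x, y):
    # Simple max-stable distribution of Example 6.1.17 below:
    # G0(x, y) = exp(-(x^-2 + y^-2)^(1/2)), x, y > 0.
    return math.exp(-math.sqrt(x ** -2 + y ** -2))

def nu_Axy(x, y):
    # Exponent measure of A_{x,y} = {(s,t): s > x or t > y}, cf. (6.1.15).
    return -math.log(G0(x, y))

for a in (0.5, 2.0, 7.0):
    for x, y in ((1.0, 1.0), (0.3, 2.5)):
        # (6.1.20): G0(ax, ay) = G0(x, y)^(1/a)
        assert abs(G0(a * x, a * y) - G0(x, y) ** (1 / a)) < 1e-12
        # (6.1.17): nu(a A_{x,y}) = nu(A_{ax,ay}) = a^-1 nu(A_{x,y})
        assert abs(nu_Axy(a * x, a * y) - nu_Axy(x, y) / a) < 1e-12
print("homogeneity checks passed")
```

Both identities hold exactly for this family, so the tolerances only absorb floating-point rounding.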

A nice intuitive background for the role of the exponent measure is provided by
the following theorem. The proof is very similar to the proof of Theorem 2.1.2 and
is omitted.
Theorem 6.1.11 Let (X₁, Y₁), (X₂, Y₂), ... be i.i.d. random vectors with distribution function F. Suppose (6.1.1) holds with a_n, b_n, c_n, and d_n as in (6.1.2)-(6.1.5), i.e.,

    lim_{n→∞} n P( (1 + γ₁ (X₁ - b_n)/a_n)^{1/γ₁} > x  or  (1 + γ₂ (Y₁ - d_n)/c_n)^{1/γ₂} > y ) = -log G₀(x, y) ,

that is, more generally, for each Borel set A ⊂ ℝ²₊* with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0 (for the definition of ℝ²₊* see Remark 6.1.6),

    lim_{n→∞} n P( ( (1 + γ₁ (X₁ - b_n)/a_n)^{1/γ₁}, (1 + γ₂ (Y₁ - d_n)/c_n)^{1/γ₂} ) ∈ A ) = ν(A) .

Define the point process N_n as follows: for each Borel set B ⊂ ℝ₊ × ℝ²₊*,

    N_n(B) := Σ_{i=1}^∞ 1_B ( i/n, ( (1 + γ₁ (X_i - b_n)/a_n)^{1/γ₁}, (1 + γ₂ (Y_i - d_n)/c_n)^{1/γ₂} ) ) .

Define also a Poisson point process N on the same space with mean measure λ × ν, with λ Lebesgue measure and ν the measure defined in Theorem 6.1.5. Then N_n converges in distribution to N, i.e., for Borel sets B₁, B₂, ..., B_r ⊂ ℝ₊ × ℝ²₊* with (λ × ν)(∂B_i) = 0, i = 1, ..., r,

    (N_n(B₁), ..., N_n(B_r)) →d (N(B₁), ..., N(B_r)) .

This theorem opens the way to estimating the measure ν by just counting the number of observations in certain sets, as we shall see later on (Sections 7.2 and 8.2).
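To illustrate (a hypothetical simulation of ours, with all parameter choices arbitrary), take V and W independent with standard Fréchet distribution exp(-1/v); then, by Proposition 6.1.21(4) below, ν(A_{x,y}) = 1/x + 1/y, and n P(V > nx or W > ny) can be estimated by counting observations in the set A_{nx,ny}:

```python
import math
import random

random.seed(12345)

def frechet():
    # Standard Frechet: F(v) = exp(-1/v), simulated by inversion.
    return -1.0 / math.log(random.random())

n, m = 50, 200_000          # threshold level n, Monte Carlo sample size m
x, y = 1.0, 2.0
count = sum(1 for _ in range(m)
            if frechet() > n * x or frechet() > n * y)
estimate = n * count / m     # estimates n P(V > nx or W > ny)
limit = 1 / x + 1 / y        # nu(A_{x,y}) for independent coordinates
assert abs(estimate - limit) < 0.2
```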
6.1.4 The Spectral Measure
The homogeneity property (6.1.17) of the exponent measure ν suggests a coordinate transformation in order to capitalize on that. Recall ℝ²₊* from Remark 6.1.6. Take any one-to-one transformation ℝ²₊* → (0, ∞) × [0, c] for some c > 0,

    r = r(x, y),    d = d(x, y),

with the property that for all a, x, y > 0,

    r(ax, ay) = a r(x, y),    d(ax, ay) = d(x, y) .

One can think of r as a radius and d as a direction. Examples are

    r(x, y) = √(x² + y²),    d(x, y) = arctan(y/x) ;

    r(x, y) = x + y,    d(x, y) = x/(x + y) ;

and

    r(x, y) = x ∨ y,    d(x, y) = arctan(x/y) .                              (6.1.21)

It will turn out that the measure ν has a simple structure when expressed in the new coordinates.
Let us start with the first transformation. Define for constants r > 0 and θ ∈ [0, π/2] the set

    B_{r,θ} := {(x, y) ∈ ℝ²₊ : √(x² + y²) > r and arctan(y/x) ≤ θ} .
Clearly

    B_{r,θ} = r B_{1,θ} ,

and hence by (6.1.17),

    ν(B_{r,θ}) = r⁻¹ ν(B_{1,θ}) .                                            (6.1.22)

This relation means that after transformation to the new coordinates r(x, y) and d(x, y) the measure ν becomes a product measure. Set for 0 ≤ θ ≤ π/2,

    Ψ(θ) := ν(B_{1,θ}) .                                                     (6.1.23)

Clearly Ψ is the distribution function of a finite measure on [0, π/2]. This finite measure is called the spectral measure of the limit distribution G. The spectral measure determines the distribution function G in the following way. Write s = r cos θ, t = r sin θ. Take x, y > 0,

    -log G₀(x, y) = ν {(s, t) : s > x or t > y}
                  = ν {(s, t) : r cos θ > x or r sin θ > y}
                  = ν {(s, t) : r > (x/cos θ) ∧ (y/sin θ)} .                 (6.1.24)

We consider two subsets in order to evaluate (6.1.24). First the subset where x/cos θ ≤ y/sin θ. Then r > (x/cos θ) ∧ (y/sin θ) translates into r > x/cos θ. Hence by (6.1.22) and (6.1.23) the ν measure of this set is the integral

    ∫_{(cos θ)/x ≥ (sin θ)/y} ((cos θ)/x) Ψ(dθ) .

The integral over the other subset, namely where x/cos θ > y/sin θ, can be evaluated similarly and yields

    ∫_{(cos θ)/x < (sin θ)/y} ((sin θ)/y) Ψ(dθ) .

Combination of the two integrals gives that (6.1.24) equals

    ∫₀^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) .

The term "spectral" can be seen as an analogue to the light spectrum, which highlights the contribution of each color separately. Here the spectral measure highlights
the contribution of each direction separately. The terminology comes from corresponding results in the theory of partial sums rather than partial maxima, see, e.g.,
Breiman (1968), Section 11.6.
We have proved the direct statement of the following.
Proposition 6.1.12 For any extreme value distribution function G from (6.1.1) with (6.1.4) and (6.1.5) there exists a finite measure on the set [0, π/2], called spectral measure, with the property that if Ψ is the distribution function of this measure, then for x, y > 0,

    G( (x^{γ₁} - 1)/γ₁ , (y^{γ₂} - 1)/γ₂ ) = exp( -∫₀^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) ) ,   (6.1.25)

where γ₁ and γ₂ are the extreme value indices of the marginal distributions of G. Moreover, we have the side conditions

    ∫₀^{π/2} cos θ Ψ(dθ) = ∫₀^{π/2} sin θ Ψ(dθ) = 1 .                        (6.1.26)

Conversely, any finite measure represented by its distribution function Ψ gives rise to a limit distribution function G in (6.1.1) via (6.1.25) provided the side conditions (6.1.26) are fulfilled.
Proof. We have already proved the direct statement. The side conditions (6.1.26) stem from the fact that G₀(x, ∞) = G₀(∞, x) = exp(-1/x) for x > 0.
For the converse we first prove that G₀ defined by (6.1.25) is the distribution function of a probability measure.


Clearly exp( -( (cos θ)/x ∨ (sin θ)/y ) ) is the distribution function of the random vector (V cos θ, V sin θ), where V has the distribution function exp(-1/x), x > 0. Further, if F_i is the distribution function of the random vector (V_i, W_i), i = 1, 2, and (V₁, W₁) and (V₂, W₂) are independent, then F₁F₂ is the distribution function of (max(V₁, V₂), max(W₁, W₂)). Hence any product of distribution functions is a distribution function. It follows that

    exp( -Σ_{i=1}^n ( (cos θ_i)/x ∨ (sin θ_i)/y ) Ψ_i )                      (6.1.27)

is a distribution function for any 0 ≤ θ₁ < ⋯ < θ_n ≤ π/2 and Ψ_i > 0, i = 1, 2, ..., n. Now the expression on the right-hand side in (6.1.25) can be approximated by a sequence of type (6.1.27). This proves that G₀ is a distribution function.
Next we prove that G can serve as a limit distribution in (6.1.1). Note that for all x, y > 0 and n = 1, 2, ...,

    G₀ⁿ(nx, ny) = G₀(x, y) .

Hence for all x, y with 1 + γ₁x > 0, 1 + γ₂y > 0,

    Gⁿ( n^{γ₁} x + (n^{γ₁} - 1)/γ₁ , n^{γ₂} y + (n^{γ₂} - 1)/γ₂ )
        = G₀ⁿ( n (1 + γ₁x)^{1/γ₁} , n (1 + γ₂y)^{1/γ₂} )
        = G₀( (1 + γ₁x)^{1/γ₁} , (1 + γ₂y)^{1/γ₂} )
        = G(x, y) .                                                          (6.1.28)

Hence (6.1.8) holds with F = G, a_n = n^{γ₁}, c_n = n^{γ₂}, b_n = (n^{γ₁} - 1)/γ₁, d_n = (n^{γ₂} - 1)/γ₂. It follows that any distribution function G satisfying (6.1.25) can occur as a limit function in (6.1.1).
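The max-stability relation (6.1.28) is easy to verify numerically for a concrete family; the following sketch (ours, with arbitrarily chosen indices γ₁ = 0.5, γ₂ = -0.2) uses the G₀ of Example 6.1.17:

```python
import math

def G0(x, y):
    # A simple max-stable distribution (Example 6.1.17).
    return math.exp(-math.sqrt(x ** -2 + y ** - 2))

def G(x, y, g1=0.5, g2=-0.2):
    # Marginal transformation used in Proposition 6.1.12:
    # G(x, y) = G0((1 + g1 x)^(1/g1), (1 + g2 y)^(1/g2)).
    return G0((1 + g1 * x) ** (1 / g1), (1 + g2 * y) ** (1 / g2))

g1, g2 = 0.5, -0.2
for n in (2, 7, 100):
    an, cn = n ** g1, n ** g2
    bn, dn = (n ** g1 - 1) / g1, (n ** g2 - 1) / g2
    for x, y in ((0.3, 0.4), (1.0, 2.0)):
        # (6.1.28): G^n(a_n x + b_n, c_n y + d_n) = G(x, y)
        lhs = G(an * x + bn, cn * y + dn) ** n
        assert abs(lhs - G(x, y)) < 1e-10
```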

Definition 6.1.13 We call the class of limit distribution functions G in (6.1.1) the
class of max-stable distributions, as suggested by relation (6.1.28). Hence any extreme value distribution is max-stable and vice versa. The class of limit distribution functions G₀ in (6.1.10) is called the class of simple max-stable distributions, "simple" meaning that the marginal distributions are fixed as follows:

    G₀(x, ∞) = G₀(∞, x) = exp(-1/x) ,    x > 0 .
So far we have considered only the first transformation from (6.1.21). A similar
analysis of the other transformations yields the following result. The proof is left to
the reader.
Theorem 6.1.14 For each limit distribution G from (6.1.1), (6.1.4), and (6.1.5) there exist:

1. A finite measure (denoted by its distribution function Ψ) on [0, π/2] such that for x, y > 0,

    G( (x^{γ₁} - 1)/γ₁ , (y^{γ₂} - 1)/γ₂ ) = G₀(x, y)
        = exp( -∫₀^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) )                   (6.1.29)

with the side conditions

    ∫₀^{π/2} cos θ Ψ(dθ) = ∫₀^{π/2} sin θ Ψ(dθ) = 1 .

2. A probability distribution (denoted by its distribution function H) concentrated on [0, 1] with mean 1/2 such that for x, y > 0,

    G( (x^{γ₁} - 1)/γ₁ , (y^{γ₂} - 1)/γ₂ ) = G₀(x, y)
        = exp( -2 ∫₀¹ ( w/x ∨ (1 - w)/y ) H(dw) ) .                          (6.1.30)

3. A finite measure (denoted by its distribution function Φ) on [0, π/2] such that for x, y > 0,

    G( (x^{γ₁} - 1)/γ₁ , (y^{γ₂} - 1)/γ₂ ) = G₀(x, y)
        = exp( -∫₀^{π/2} ( (1 ∧ tan θ)/x ∨ (1 ∧ cot θ)/y ) Φ(dθ) )           (6.1.31)

with the side conditions

    ∫₀^{π/2} (1 ∧ tan θ) Φ(dθ) = ∫₀^{π/2} (1 ∧ cot θ) Φ(dθ) = 1 .

The parameters γ₁ and γ₂ are the extreme value indices of the marginal distributions of G.
Conversely, any finite measure represented by the distribution function Ψ, H, or Φ gives rise to a limit distribution function G in (6.1.1) via (6.1.29), (6.1.30), and (6.1.31), respectively, provided that the stated side conditions are fulfilled.
Corollary 6.1.15 Convergence in distribution for a sequence of simple max-stable distributions G₀ is equivalent to convergence in distribution of their spectral measures. In particular, the class of simple max-stable distributions is closed under weak convergence.

Representations (6.1.29)-(6.1.31) mean that the limit distributions in (6.1.1) are characterized by just the spectral measure and the extreme value indices of the marginal distributions.
Clearly the list of possible versions of the spectral measure can be extended ad infinitum. The spectral measures Ψ, H, and Φ are the most common ones. Clearly one can transform one into the other. Which one is more convenient depends on the particular situation at hand.
For example, the relation between H and Φ follows from

    -log G₀(x, y)
      = 2 ∫₀¹ ( w/x ∨ (1 - w)/y ) H(dw)
      = 2 ∫₀^{π/2} ( 1/(x(1 + cot θ)) ∨ (1/y)(1 - 1/(1 + cot θ)) ) H(d(1/(1 + cot θ)))
      = 2 ∫₀^{π/2} ( (sin θ)/x ∨ (cos θ)/y ) (sin θ + cos θ)⁻¹ H(d(1/(1 + cot θ)))
      = ∫₀^{π/2} ( (1 ∧ tan θ)/x ∨ (1 ∧ cot θ)/y ) 2 ( 1/(1 + cot θ) ∨ (1 - 1/(1 + cot θ)) ) H(d(1/(1 + cot θ))) .

That is,

    Φ(θ) = 2 ∫₀^{1/(1+cot θ)} ( w ∨ (1 - w) ) H(dw) .
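As a numerical sanity check of this relation (our own sketch), take H uniform on [0, 1], i.e., the case of Example 6.1.18 below, where -log G₀(x, y) = x⁻¹ + y⁻¹ - (x + y)⁻¹. Differentiating the change of variables above gives Φ the density 2 (sin θ ∨ cos θ)/(sin θ + cos θ)³, and representation (6.1.31) can then be verified by numerical integration:

```python
import math

def phi_density(t):
    # Density of Phi when H is uniform on [0, 1], obtained by
    # differentiating Phi(theta) = 2 * integral of (w v (1-w)) H(dw)
    # with w = 1/(1 + cot(theta)).
    s, c = math.sin(t), math.cos(t)
    return 2 * max(s, c) / (s + c) ** 3

def minus_log_G0(x, y, steps=100_000):
    # Midpoint-rule evaluation of (6.1.31).
    h = (math.pi / 2) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        integrand = max(min(1, math.tan(t)) / x,
                        min(1, 1 / math.tan(t)) / y)
        total += integrand * phi_density(t) * h
    return total

for x, y in ((1.0, 1.0), (0.5, 2.0)):
    closed_form = 1 / x + 1 / y - 1 / (x + y)   # Example 6.1.18 (Sibuya)
    assert abs(minus_log_G0(x, y) - closed_form) < 1e-4
```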

Remark 6.1.16 We discuss two extreme cases of spectral measures. For simplicity we formulate only the results for H. We consider

    G( (x₁^{γ₁} - 1)/γ₁ , ... , (x_d^{γ_d} - 1)/γ_d )
      = exp( -d ∫ ( w₁/x₁ ∨ ⋯ ∨ w_d/x_d ) H(dw) ) ,

where H is the distribution function of a probability measure on

    { w = (w₁, ..., w_d) : w₁ + ⋯ + w_d = 1, w_i ≥ 0, i = 1, 2, ..., d }     (6.1.32)

with

    ∫ w_i H(dw) = d⁻¹    for i = 1, 2, ..., d .

1. Let the spectral measure be concentrated at the point (1/d, 1/d, ..., 1/d) with mass 1. If (X₁, X₂, ..., X_d) is a random vector with distribution function G( (x₁^{γ₁} - 1)/γ₁ , ... , (x_d^{γ_d} - 1)/γ_d ), then X₁ = X₂ = ⋯ = X_d a.s.
2. Let the spectral measure be concentrated at the extreme points of the set (6.1.32), i.e., the d points (1, 0, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, 0, ..., 0, 1), with masses 1/d. If (X₁, X₂, ..., X_d) is a random vector with distribution function G( (x₁^{γ₁} - 1)/γ₁ , ... , (x_d^{γ_d} - 1)/γ_d ), then X₁, X₂, ..., X_d are independent.
Let us consider some examples of limit distributions and their spectral measures. It is useful to note first that the transformation theory for integrals implies that if, for example, the spectral measure has density Ψ′, then -log G₀(x, y) has a density q(x, y), say. For instance, for a Borel set A ⊂ ℝ²₊ with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0, and with r = √(x² + y²) and θ = arctan(y/x),

    ν(A) = ∫∫_A q(x, y) dx dy = ∫∫ q(r cos θ, r sin θ) r dr dθ ;

hence

    Ψ′(θ) = r³ q(r cos θ, r sin θ) .

Similarly for the other spectral measures (cf. Exercises 6.6 and 6.7).
Example 6.1.17 (Geffroy (1958)) Take Ψ(θ) = θ for 0 ≤ θ ≤ π/2. Then the side conditions are fulfilled and

    G₀(x, y) = exp{ -(x⁻² + y⁻²)^{1/2} } ,    x > 0, y > 0 .

Generalization: for 0 < α ≤ 1,

    G₀(x, y) = exp{ -(x^{-1/α} + y^{-1/α})^{α} } ,    x > 0, y > 0 .

Note that α = 1 corresponds to independence of the coordinates, and α = 0 (defined as a limit) corresponds to full dependence.
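A direct numerical check of this example (our own sketch): evaluate the spectral representation (6.1.29) with Ψ(dθ) = dθ by the midpoint rule and compare with the closed form:

```python
import math

def minus_log_G0(x, y, steps=100_000):
    # Midpoint-rule evaluation of the spectral representation (6.1.29)
    # with Psi(d theta) = d theta (Example 6.1.17).
    h = (math.pi / 2) / steps
    return sum(max(math.cos((i + 0.5) * h) / x,
                   math.sin((i + 0.5) * h) / y) * h
               for i in range(steps))

for x, y in ((1.0, 1.0), (0.4, 3.0)):
    closed_form = math.sqrt(x ** -2 + y ** -2)
    assert abs(minus_log_G0(x, y) - closed_form) < 1e-6

# Side conditions (6.1.26): with Psi(d theta) = d theta, the integrals
# of cos and sin over [0, pi/2] are both exactly 1.
```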
Example 6.1.18 (Sibuya (1960)) Take H(w) = w for 0 ≤ w ≤ 1. Then the side conditions are fulfilled and

    G₀(x, y) = exp{ -(x⁻¹ + y⁻¹ - (x + y)⁻¹) } ,    x > 0, y > 0 .

Generalization: for 0 ≤ λ ≤ 1,

    G₀(x, y) = exp{ -(x⁻¹ + y⁻¹ - λ(x + y)⁻¹) } ,    x > 0, y > 0 .

Note that λ = 0 corresponds to independence of the coordinates.
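This example can be verified by exact integration (our own sketch): in (6.1.30) with H uniform, the two linear pieces of the integrand meet at w* = x/(x + y):

```python
def minus_log_G0(x, y):
    # 2 * integral_0^1 max(w/x, (1-w)/y) dw with H uniform (Example 6.1.18).
    # For w <= w* = x/(x+y) the maximum is (1-w)/y, otherwise it is w/x.
    ws = x / (x + y)
    left = (ws - ws ** 2 / 2) / y          # integral of (1-w)/y on [0, w*]
    right = (1 - ws ** 2) / (2 * x)        # integral of w/x on [w*, 1]
    return 2 * (left + right)

for x, y in ((1.0, 1.0), (0.5, 2.0), (3.0, 0.7)):
    assert abs(minus_log_G0(x, y) - (1 / x + 1 / y - 1 / (x + y))) < 1e-12
```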


Example 6.1.19 Take Φ(θ) = (π/4 + 2⁻¹ log 2)⁻¹ θ for 0 ≤ θ ≤ π/2. Then the side conditions are fulfilled and

    (π/4 + 2⁻¹ log 2) ( -log G₀(x, y) )
      = (1/(x ∧ y)) ( π/4 + log( √2 (x ∨ y) / √(x² + y²) ) ) + (1/(x ∨ y)) arctan( (x ∧ y)/(x ∨ y) ) .

To see this note that

    (π/4 + 2⁻¹ log 2) ( -log G₀(x, y) )
      = ∫₀^{π/2} ( (1 ∧ tan θ)/x ∨ (1 ∧ cot θ)/y ) dθ
      = ∫_{arctan(x/y)}^{π/2} ((1 ∧ tan θ)/x) dθ + ∫₀^{arctan(x/y)} ((1 ∧ cot θ)/y) dθ .

Now we have for example

    ∫_{arctan(x/y)}^{π/2} ((1 ∧ tan θ)/x) dθ
      = (1/x) ∫_{(π/4) ∧ arctan(x/y)}^{π/4} tan θ dθ + (1/x) ∫_{(π/4) ∨ arctan(x/y)}^{π/2} dθ .

Finally note that ∫₀^z tan θ dθ = -log cos z, for 0 ≤ z < π/2.
Example 6.1.20 This example is based on the normal distribution: for c > 0,

    -log G₀(x, y) = E( (1/x) ∨ (1/y) e^{cN - c²/2} ) ,

with N a standard normal random variable. Then

    G₀(x, y) = exp{ -( (1/x) F( c/2 + (1/c) log(y/x) ) + (1/y) F( c/2 + (1/c) log(x/y) ) ) } ,

where F is the standard normal distribution function. Again, when c → ∞ we have independence and when c ↓ 0 we have full dependence. This distribution function can be obtained in several other ways (Eddy and Gale (1981), Hüsler and Reiss (1989), and de Haan and Pereira (2005)).
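The two formulas in this example can be checked against each other numerically (our own sketch; the helper names are ours, and the standard normal distribution function is computed from the error function):

```python
import math

def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def minus_log_G0_expectation(x, y, c, steps=50_000, cut=8.0):
    # E( (1/x) v (1/y) e^{cN - c^2/2} ) by midpoint integration over N.
    h = 2 * cut / steps
    total = 0.0
    for i in range(steps):
        n = -cut + (i + 0.5) * h
        dens = math.exp(-n * n / 2) / math.sqrt(2 * math.pi)
        total += max(1 / x, math.exp(c * n - c * c / 2) / y) * dens * h
    return total

def minus_log_G0_closed(x, y, c):
    return (std_normal_cdf(c / 2 + math.log(y / x) / c) / x
            + std_normal_cdf(c / 2 + math.log(x / y) / c) / y)

for c in (0.5, 1.0, 2.0):
    for x, y in ((1.0, 1.0), (0.5, 2.0)):
        a = minus_log_G0_expectation(x, y, c)
        b = minus_log_G0_closed(x, y, c)
        assert abs(a - b) < 1e-6
```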
6.1.5 The Sets Q_c and the Functions L, χ, and A
Finally, we discuss a few other ways to characterize the max-stable distributions. Since the dependence structure is quite general, one could describe the dependence using copulas. If F is the distribution function of the random vector (X, Y), the copula C associated with F is a distribution function that satisfies F(x, y) = C(F₁(x), F₂(y)) with F₁(x) := F(x, ∞) and F₂(y) := F(∞, y). It contains complete information about the joint distribution of F apart from the marginal distributions (for more details see Nelsen (1998) and Joe (1997)).

Define for 0 < x, y < 1,

    C(x, y) := G₀(-1/log x, -1/log y) .

Then C is a copula and relation (6.1.20) translates into the following: for 0 < x, y < 1, a > 0,

    C(x^a, y^a) = C^a(x, y) .

Since this relation is not very tractable for analysis, it is usual to consider instead the function L defined by

    L(x, y) := -log G₀(1/x, 1/y)

for x, y ≥ 0. We can express the function L in terms of the exponent measure ν (cf. Section 6.1.3) as

    L(x, y) = ν {(s, t) ∈ ℝ²₊ : s > 1/x or t > 1/y} .                        (6.1.33)

Proposition 6.1.21 (Properties of the function L)
1. Homogeneity of order 1: L(ax, ay) = a L(x, y), for all a, x, y > 0.
2. L(x, 0) = L(0, x) = x, for all x ≥ 0.
3. x ∨ y ≤ L(x, y) ≤ x + y, for all x, y ≥ 0.
4. Let (X, Y) be a random vector with distribution function G₀. If X and Y are independent, then L(x, y) = x + y, for x, y ≥ 0. If on the other hand X and Y are completely positive dependent, i.e., X = Y a.s., then L(x, y) = max(x, y) for x, y ≥ 0.
5. L is continuous.
6. L(x, y) is a convex function: L(λ(x₁, y₁) + (1 - λ)(x₂, y₂)) ≤ λ L(x₁, y₁) + (1 - λ) L(x₂, y₂), for all x₁, y₁, x₂, y₂ ≥ 0 and λ ∈ [0, 1].

Proof. (1) Direct consequence of Theorem 6.1.9.
(2) We have

    L(x, 0) = ν {(s, t) ∈ ℝ²₊ : s > 1/x} = -log G₀(1/x, ∞) = x ,

for all x ≥ 0. Similarly it follows for L(0, x).
(3) We have

    L(x, y) = ν {(s, t) ∈ ℝ²₊ : s > 1/x or t > 1/y}
            ≤ ν {(s, t) ∈ ℝ²₊ : s > 1/x} + ν {(s, t) ∈ ℝ²₊ : t > 1/y}
            = L(x, 0) + L(0, y) = x + y ,

for all x, y ≥ 0. On the other hand,

    L(x, y) ≥ ν {(s, t) ∈ ℝ²₊ : s > 1/x} ∨ ν {(s, t) ∈ ℝ²₊ : t > 1/y}
            = L(x, 0) ∨ L(0, y) = x ∨ y ,

for all x, y ≥ 0.
(4) If X and Y are independent, G₀(1/x, 1/y) = G₀(1/x, ∞) G₀(∞, 1/y) = e^{-x} e^{-y}. Hence L(x, y) = -log G₀(1/x, 1/y) = x + y. If X = Y a.s., G₀(x, y) = P(X ≤ x, Y ≤ y) = P(X ≤ min(x, y)) = exp(-1/min(x, y)). Hence L(x, y) = -log G₀(1/x, 1/y) = max(x, y).
(5) The statement is an immediate consequence of the continuity of G₀, which follows by Lebesgue's theorem on dominated convergence using, for example, representation (6.1.29).
(6) The function (ax) ∨ (by) is convex for x, y ≥ 0, provided a, b ≥ 0. Also, positive linear combinations of such functions are convex. Hence

    L(x, y) = -log G₀(1/x, 1/y) = ∫₀^{π/2} ( (x cos θ) ∨ (y sin θ) ) Ψ(dθ)

is convex.
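These properties can be illustrated on a concrete family (a sketch of ours): the function L(x, y) = (x^{1/α} + y^{1/α})^α, which is -log G₀(1/x, 1/y) for the G₀ of Example 6.1.17:

```python
def L(x, y, alpha=0.5):
    # Logistic stable tail dependence function,
    # L(x, y) = -log G0(1/x, 1/y) for G0 of Example 6.1.17.
    if x == 0 and y == 0:
        return 0.0
    return (x ** (1 / alpha) + y ** (1 / alpha)) ** alpha

for x, y in ((1.0, 1.0), (0.2, 3.0), (2.0, 0.0)):
    # Property 1: homogeneity of order 1.
    assert abs(L(2 * x, 2 * y) - 2 * L(x, y)) < 1e-12
    # Property 3: max(x, y) <= L(x, y) <= x + y.
    assert max(x, y) - 1e-12 <= L(x, y) <= x + y + 1e-12
# Property 2: L(x, 0) = x.
assert abs(L(2.0, 0.0) - 2.0) < 1e-12
# Property 6: convexity along a segment (midpoint inequality).
a, b = (0.1, 2.0), (3.0, 0.5)
mid = L((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)
assert mid <= (L(*a) + L(*b)) / 2 + 1e-12
```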

The function L leads to a characterization of the limit distribution in the following way. For c > 0 define the level sets Q_c by

    Q_c := {(x, y) ∈ ℝ²₊ : L(x, y) ≤ c} .

The sets Q_c have the following properties:
1. Q_c is a closed convex set.
2. The points (0, 0), (c, 0), and (0, c) are extreme points.
3. Q_c = c Q₁.
The convexity of the level set Q₁ is characteristic for a limit distribution, as the following theorem shows.

Theorem 6.1.22 For any simple max-stable distribution G₀ define the set Q₁ by

    Q₁ := {(x, y) ∈ ℝ²₊ : -log G₀(1/x, 1/y) ≤ 1} .                           (6.1.34)

The set Q₁ is closed convex and the points (0, 0), (0, 1), and (1, 0) are vertices. Conversely, any closed convex set Q₁ with vertices (0, 0), (0, 1), and (1, 0) gives rise to a limit distribution G₀ for which (6.1.34) holds. The mapping is one-to-one.
Proof. Let G₀ be a simple max-stable distribution. The convexity of Q₁ is an immediate consequence of the convexity of the function L (cf. Proposition 6.1.21(6)). The statement about the vertices follows from the side conditions for Ψ.
Conversely, let Q₁ satisfy the stated properties. The closed convex set Q₁ can be approximated from below and above by sequences of sets Q_L^(n) and Q_U^(n) that satisfy the properties and have a polygonal boundary, i.e., a boundary on ℝ²₊ satisfying

    Σ_{i=1}^n max(a_i x, b_i y) = 1

with a_i ≥ 0, b_i ≥ 0, i = 1, 2, ..., n, being constants satisfying Σ_{i=1}^n a_i = Σ_{i=1}^n b_i = 1. Then we have

    Q_L^(n) ⊂ Q₁ ⊂ Q_U^(n)                                                   (6.1.35)

and

    Q_U^(n) \ Q_L^(n) ↓ ∅ ,    n → ∞ .                                       (6.1.36)

Note that the sets Q_L^(n) and Q_U^(n) can be written as follows:

    Q_L^(n) = { (x, y) ∈ ℝ²₊ : Σ_{i=1}^{m(n)} ((A_i x) ∨ (B_i y)) ≤ 1 } ,

    Q_U^(n) = { (x, y) ∈ ℝ²₊ : Σ_{i=1}^{r(n)} ((C_i x) ∨ (D_i y)) ≤ 1 } ,

for some sequences m(n), r(n) and positive constants A_i, B_i, C_i, D_i satisfying

    Σ_{i=1}^{m(n)} A_i = Σ_{i=1}^{m(n)} B_i = Σ_{i=1}^{r(n)} C_i = Σ_{i=1}^{r(n)} D_i = 1 .

Define distribution functions G_L^(n) and G_U^(n) by

    G_L^(n)(x, y) := exp( -Σ_{i=1}^{m(n)} ( (A_i/x) ∨ (B_i/y) ) ) ,
    G_U^(n)(x, y) := exp( -Σ_{i=1}^{r(n)} ( (C_i/x) ∨ (D_i/y) ) ) .

Clearly G_L^(n) and G_U^(n) are simple max-stable distributions and there exist discrete spectral measures Ψ_L^(n) and Ψ_U^(n) with

    G_L^(n)(x, y) = exp( -∫₀^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ_L^(n)(dθ) ) ,
    G_U^(n)(x, y) = exp( -∫₀^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ_U^(n)(dθ) ) .

The inclusions (6.1.35) and relation (6.1.36) imply

    0 ≤ Σ_{i=1}^{m(n)} ((A_i x) ∨ (B_i y)) - Σ_{i=1}^{r(n)} ((C_i x) ∨ (D_i y)) → 0 ,    n → ∞ ,

for x, y ≥ 0. It follows that

    0 ≤ G_U^(n)(x, y) - G_L^(n)(x, y) → 0 ,    n → ∞ ,


for x, y > 0, and hence there is a distribution function, G₀ say, such that for x, y > 0,

    lim_{n→∞} G_L^(n)(x, y) = lim_{n→∞} G_U^(n)(x, y) = G₀(x, y) .

By Corollary 6.1.15 we have also

    lim_{n→∞} Ψ_L^(n)(θ) = lim_{n→∞} Ψ_U^(n)(θ) = Ψ(θ)

for 0 ≤ θ ≤ π/2. Clearly for

    Q* := { (x, y) ∈ ℝ²₊ : ∫₀^{π/2} ( (x cos θ) ∨ (y sin θ) ) Ψ(dθ) ≤ 1 }

we have

    Q_L^(n) ⊂ Q* ⊂ Q_U^(n)

for all n. It follows that Q* = Q₁. Hence

    Q₁ = { (x, y) ∈ ℝ²₊ : -log G₀(1/x, 1/y) ≤ 1 }

with

    G₀(x, y) := exp( -∫₀^{π/2} ( (cos θ)/x ∨ (sin θ)/y ) Ψ(dθ) ) .
Since the convexity of the level set Q₁ is typical for a limit distribution, this property can be used to check whether the tail of a given distribution function resembles a limit distribution. Details will be given later.
A function related to the function L (or the measure ν) is the function R:

    R(x, y) := x + y - L(x, y) = ν {(s, t) ∈ ℝ²₊ : s > 1/x and t > 1/y} .

Note that the function R is the distribution function of a measure.
Finally, we review two other ways of characterizing the limit distribution G₀ in the two-dimensional context.
Sibuya (1960), see also Geffroy (1958), introduced for t > 0,

    χ(t) := -log G₀(1/t, 1) + log G₀(1/t, ∞) + log G₀(∞, 1)
          = L(t, 1) - L(t, 0) - L(0, 1) = -R(t, 1) .                         (6.1.37)

By the homogeneity of the function -log G₀, the function χ determines the function G₀. The determining properties of the function χ are as follows:
1. χ is convex,
2. ((-t) ∨ (-1)) ≤ χ(t) ≤ 0 for t > 0.
Pickands (1981) introduced for 0 ≤ t ≤ 1,

    A(t) := -log G₀( 1/(1 - t), 1/t ) = L(1 - t, t) .                        (6.1.38)

By the homogeneity of the function -log G₀, the function A determines the function G₀. The determining properties of the function A are the following:
1. A is convex,
2. A(0) = A(1) = 1,
3. ((1 - t) ∨ t) ≤ A(t) ≤ 1.
Any function A satisfying properties (1)-(3) leads to a unique limit function G₀. The convexity of A can be proved using

    A(t) = 2 ∫₀¹ ( w(1 - t) ∨ ((1 - w)t) ) H(dw)

with H as in Theorem 6.1.14: in case H has a density, break the integral into two integrals according to w(1 - t) ≥ (1 - w)t or w(1 - t) < (1 - w)t and differentiate. If H does not have a density, one needs to approximate the measure H.
For the characterization of multivariate max-stable distributions one can use the distribution function of the spectral measure or any of the functions χ and A. Since the functions χ and A are more complicated objects (convex functions rather than monotone functions), we concentrate on the use of the spectral measure, which also has a simple intuitive meaning. Moreover, the generalization to higher-dimensional spaces is straightforward in that case.
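For the family of Example 6.1.17 one gets A(t) = ((1 - t)^{1/α} + t^{1/α})^α, and properties (1)-(3) are easy to check numerically (our own sketch):

```python
def A(t, alpha=0.5):
    # Pickands dependence function of the logistic model:
    # A(t) = L(1 - t, t) with L(x, y) = (x^(1/a) + y^(1/a))^a.
    return ((1 - t) ** (1 / alpha) + t ** (1 / alpha)) ** alpha

assert abs(A(0.0) - 1.0) < 1e-12 and abs(A(1.0) - 1.0) < 1e-12  # property 2
grid = [i / 100 for i in range(101)]
for t in grid:
    # Property 3: max(1 - t, t) <= A(t) <= 1.
    assert max(1 - t, t) - 1e-12 <= A(t) <= 1 + 1e-12
for t1, t2 in zip(grid, grid[2:]):
    # Property 1 (convexity), via the midpoint inequality on the grid.
    assert A((t1 + t2) / 2) <= (A(t1) + A(t2)) / 2 + 1e-12
```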

6.2 Domains of Attraction; Asymptotic Independence


As stated in Section 6.1, the class of limiting distributions in (6.1.1) is characterized by three objects: the extreme value indices γ₁ and γ₂ of the marginal distributions and the spectral measure that governs the dependence structure. This is reflected in the domain of attraction conditions, which concern the marginal distributions and, separately, the dependence structure.
Let G : ℝ² → ℝ₊ be a max-stable distribution function. A distribution function F is said to be in its (max-)domain of attraction (notation F ∈ D(G)) if for sequences of constants a_n, c_n > 0 and b_n, d_n real,

    lim_{n→∞} Fⁿ(a_n x + b_n, c_n y + d_n) = G(x, y)

for all x, y ∈ ℝ. This is the same as convergence in distribution since any max-stable distribution is continuous.
Theorem 6.2.1 Let G be a max-stable distribution. Let the marginal distribution functions be exp(-(1 + γᵢx)^{-1/γᵢ}) for i = 1, 2, and let Ψ, H, or Φ be its spectral measure according to the representations of Theorem 6.1.14.
1. If the distribution function F of the random vector (X, Y) with continuous marginal distribution functions F₁ and F₂ is in the domain of attraction of G, then the following equivalent conditions are fulfilled:


(a) With U_i := (1/(1 - F_i))^←, i = 1, 2, for x, y > 0,

    lim_{t→∞} (1 - F(U₁(tx), U₂(ty))) / (1 - F(U₁(t), U₂(t))) = S(x, y) ,    (6.2.1)

with S(x, y) := log G( (x^{γ₁} - 1)/γ₁ , (y^{γ₂} - 1)/γ₂ ) / log G(0, 0).
(b) (Via the circle) For all r > 1 and all θ ∈ [0, π/2] that are continuity points of Ψ,

    lim_{t→∞} P( V² + W² > t²r² and W/V ≤ tan θ | V² + W² > t² ) = r⁻¹ Ψ(θ)/Ψ(π/2) ,   (6.2.2)

where V := 1/(1 - F₁(X)) and W := 1/(1 - F₂(Y)).
(c) (Via the triangle) For all r > 1 and all s ∈ [0, 1] that are continuity points of H,

    lim_{t→∞} P( V + W > tr and V/(V + W) ≤ s | V + W > t ) = r⁻¹ H(s) ,     (6.2.3)

where V := 1/(1 - F₁(X)) and W := 1/(1 - F₂(Y)).
(d) (Via the square) For all r > 1 and all θ ∈ [0, π/2] that are continuity points of Φ,

    lim_{t→∞} P( V ∨ W > tr and V/W ≤ tan θ | V ∨ W > t ) = r⁻¹ Φ(θ)/Φ(π/2) ,   (6.2.4)

where V := 1/(1 - F₁(X)) and W := 1/(1 - F₂(Y)).
2. Conversely, if the continuous marginal distribution functions F_i are in the domain of attraction of exp(-(1 + γᵢx)^{-1/γᵢ}), for i = 1, 2, and any limit relation (6.2.1)-(6.2.4) holds for some positive function S or some bounded distribution function Ψ, H, or Φ, then F is in the domain of attraction of G.
Proof. The direct statements follow immediately from the results of Section 6.1.
For the converse statement assume for example (6.2.4). For any a, t > 0 we define a probability measure P_{a,t} on ℝ²₊ \ [0, a]² by

    P_{a,t}(B) := P( (V, W) ∈ tB | V ∨ W > ta )

for Borel sets B ⊂ ℝ²₊ \ [0, a]². Note that by (6.2.4),

    lim_{t→∞} P_{a,t}(A) = P_a(A) ,


where Pa is a probability measure on M+ \ [0, a]2 and A a /^-continuity set of the


form
{(JC, y) : x V y > r and x/y < tan#}
with r > a and 0 < 0 < n/2. The finite unions of sets of this form constitute a family
that is closed under finite intersections and such that each open set in R+ \ [0, a]2 is
a countable union of sets in the family. It follows (Billingsley (1968), Theorem 2.2)
that
lim Pa,t(B) = Pa(B)
/-OO

for Pa-continuity Borel sets B in R+ \ [0, a]2. Since this is true for all a > 0, in
particular (6.2.1) holds. Hence the statements (la)-(ld) are equivalent. We proceed
with statement (la).
Since the function S is homogeneous of order 1, the statement implies that the
function 1 F(U\(t), U2{t)) is regularly varying with index 1. Hence there exists
a sequence an > 0, an -> oo as n -> oo, with
lim n{\ - F(Ui(an), U2(an))} = -logG(0,0) .
n-+oo

It then follows from (6.2.1) that


, y-

lim n {(1 - F(Ui(anx), U2(any))} = - l o g G


n^oo
\
and hence
lim Fn (Ui(anx), U2(any)) = G
n-oo
\
In particular, the marginal distribution converges:
lim Fn(Ui(anx),oo)
n-+oo

Y\

Y2

,
Y\

Y2

Xj

= G\-^-,oo)=exp()
\
Yl
J

Since the distribution function F(U\(x), oo) is in fact 1 1/JC, x > 1, we have also
lim Fn (Ui(nx), oo) = exp ( ) .
n-*oo

xj

It follows that lim n ^oo an/n = 1, i.e.,


lim Fn (Ui(nx), U2(ny)) = G
n-*oo

/XY\ _ 1 yYl _ 1 \

,
\

Yl

Y2

(6.2.5)

We now proceed as in the proof of Theorem 6.1.1. The convergence of the marginal
distributions implies, for an,cn > 0,
,.
Ui(nx)-Ui(n)
lim
n-+oo
v

hm

n->oo

jc^-l

an

U2(ny)-U2(n)
cn

Yl

yn

- 1
Yl

(o.z.o)

6.2 Domains of Attraction; Asymptotic Independence

229

Combining (6.2.5) and (6.2.6) as in the proof of Theorem 6.1.1 we get


lim Fn (anx + Ux(n), cny + U2(n)) = G(JC, y) .

n-+oo

Remark 6.2.2 In fact if the marginal distributions are in some domains of attraction, if for all x, y > 0,

    lim_{t→∞} (1 - F(U₁(tx), U₂(ty))) / (1 - F(U₁(t), U₂(t)))

exists and is positive, and if the regularly varying function 1 - F(U₁(t), U₂(t)) has index -1, then F is in the domain of attraction of some max-stable distribution.
A particular case is the domain of attraction of a max-stable distribution with independent components, i.e., one that is the product of its marginal distributions. A random vector (X₁, X₂, ..., X_d) whose distribution is in the domain of attraction of such a max-stable distribution is said to have the property of asymptotic independence. A simple criterion for this to happen is given in the next theorem.
Theorem 6.2.3 Let F : ℝ^d → ℝ₊ be a probability distribution function. Suppose that its marginal distribution functions F_i : ℝ → ℝ₊ satisfy

    lim_{n→∞} F_iⁿ( a_n^(i) x + b_n^(i) ) = exp( -(1 + γᵢ x)^{-1/γᵢ} )

for all x for which 1 + γᵢx > 0 and where a_n^(i) > 0 and b_n^(i) are sequences of real constants, i = 1, 2, ..., d. Let (X₁, X₂, ..., X_d) be a random vector with distribution function F. If

    lim_{t→∞} P( X_i > U_i(t), X_j > U_j(t) ) / P( X_i > U_i(t) ) = 0        (6.2.7)

for all 1 ≤ i < j ≤ d, then

    lim_{n→∞} Fⁿ( a_n^(1) x₁ + b_n^(1) , ... , a_n^(d) x_d + b_n^(d) ) = exp( -Σ_{i=1}^d (1 + γᵢ xᵢ)^{-1/γᵢ} )

for 1 + γᵢxᵢ > 0, i = 1, 2, ..., d. Hence the components of (X₁, X₂, ..., X_d) are asymptotically independent.

Remark 6.2.4 It is clear that conversely, asymptotic independence entails (6.2.7).


Remark 6.2.5 The result means in particular that pairwise asymptotic independence
implies joint asymptotic independence.
Proof (of Theorem 6.2.3). Relation (6.2.7) means that for the exponent measure ν (Section 6.1.3) we have

    ν { (s₁, s₂, ..., s_d) ∈ ℝ^d₊ : s_i > 0 and s_j > 0 } = 0 .

Since this is true for all pairs (i, j), the exponent measure must be concentrated on the lines

    l_i = { (s₁, s₂, ..., s_d) ∈ ℝ^d₊ : s_i > 0 and s_j = 0 for j ≠ i } .

This is the same as saying that the spectral measure is concentrated on the extreme points, i.e., the limit distribution has independent components (cf. Remark 6.1.16).
Example 6.2.6 (Sibuya (1960)) Consider the random vector (X, Y), normally distributed with mean zero, variances one, and correlation coefficient ρ < 1. We shall prove asymptotic independence in this case, i.e.,

    lim_{n→∞} n P(X > b_n, Y > b_n) = 0

with b_n chosen in such a way that

    lim_{n→∞} n P(X > b_n) = lim_{n→∞} n (1 - F(b_n)) = 1

(cf. Example 1.1.7). Note that

    n P(X > b_n, Y > b_n) ≤ n P( (X + Y)/2 > b_n ) .

Now, (X + Y)/2 has a normal distribution with variance (1 + ρ)/2. If ρ = -1 the result is immediate. If |ρ| < 1,

    lim_{n→∞} n P( (X + Y)/2 > b_n ) = lim_{n→∞} n P( X > √(2/(1 + ρ)) b_n ) ,

and this limit is zero by Corollary 5.4.2.
Note that if ρ = 1 the limit is one.
In the two-dimensional situation, when we have asymptotic independence, a quite natural submodel can be defined. Since the model appears most naturally in a statistical context, we postpone the discussion to Sections 7.5 and 7.6.
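The bound used in this example can be made quantitative without simulation (our own sketch, with ρ = 1/2): using b_n from Example 1.1.7, n P(X > b_n) stays near 1 while the upper bound n P(X > √(2/(1 + ρ)) b_n) for n P(X > b_n, Y > b_n) decreases to 0:

```python
import math

def tail(z):
    # 1 - standard normal cdf, via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))

rho = 0.5
marginal, bound = [], []
for n in (1e2, 1e4, 1e6, 1e8):
    bn = math.sqrt(2 * math.log(n) - math.log(math.log(n))
                   - math.log(4 * math.pi))
    marginal.append(n * tail(bn))                          # stays near 1
    bound.append(n * tail(math.sqrt(2 / (1 + rho)) * bn))  # decreases to 0

assert all(0.9 < u < 1.3 for u in marginal)
assert all(b2 < b1 for b1, b2 in zip(bound, bound[1:]))
assert bound[-1] < 0.01
```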

Exercises
6.1. Let A₁, A₂ be positive random variables with EA_i = 1 for i = 1, 2. Prove that

    G₀(x, y) := exp( -E( (A₁/x) ∨ (A₂/y) ) )

is a distribution function that is simple max-stable.


6.2. Let F_n be two-dimensional normal distribution functions with means zero, variances one, and covariances ρ_n. Define a_n := (2 log n)^{-1/2} and b_n := (2 log n - log log n - log(4π))^{1/2}. Then we have (Example 1.1.7) lim_{n→∞} n (1 - F(a_n x + b_n)) = e^{-x}, for x ∈ ℝ, where F is the standard normal distribution function. Take ρ_n such that a_n²/(1 - ρ_n) → λ > 0, n → ∞. Prove that n (∂²/(∂x ∂y)) (1 - F_n(a_n x + b_n, a_n y + b_n)) converges to 2⁻¹ log(λ/(4π)) - (4λ)⁻¹ - λ 4⁻¹ (x - y)² - 2⁻¹ (x + y) for x, y ∈ ℝ. Conclude that

    lim_{n→∞} n (1 - F_n(a_n x + b_n, a_n y + b_n)) = -log G₀(x, y)

with G₀ from Theorem 6.1.1. Cf. Example 6.1.20.
6.3. Prove that if (X, Y) are random variables with distribution function F with continuous marginals then the following are equivalent:
(a) F is in the domain of attraction of some max-stable distribution G.
(b) For any Borel set A ⊂ ℝ²₊ with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{n→∞} n P( ( (1 + γ₁ (X - b_n)/a_n)^{1/γ₁} , (1 + γ₂ (Y - d_n)/c_n)^{1/γ₂} ) ∈ A ) = ν(A) ,

where the sequences a_n, c_n > 0, b_n, d_n ∈ ℝ are chosen so that G(x, ∞) is as in (6.1.4), G(∞, y) is as in (6.1.5), and ν is the exponent measure defined in Section 6.1.3.
(c) For any Borel set A ⊂ ℝ²₊ with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{n→∞} n P( ( 1/(1 - F₁(X)) , 1/(1 - F₂(Y)) ) ∈ nA ) = ν(A) ,

where F_i : ℝ → ℝ₊, i = 1, 2, are the marginal distribution functions of F and satisfy, for some sequences a_n, c_n > 0, b_n, d_n ∈ ℝ, lim_{n→∞} F₁ⁿ(a_n x + b_n) = exp(-(1 + γ₁x)^{-1/γ₁}) and lim_{n→∞} F₂ⁿ(c_n x + d_n) = exp(-(1 + γ₂x)^{-1/γ₂}) for all x for which 1 + γᵢx > 0, i = 1, 2.
(d) For any Borel set A ⊂ ℝ²₊ with inf_{(x,y)∈A} max(x, y) > 0 and ν(∂A) = 0,

    lim_{t↓0} t⁻¹ P( (1 - F₁(X), 1 - F₂(Y)) ∈ t A⁻¹ ) = ν(A) ,

where A⁻¹ := {(1/x, 1/y) : (x, y) ∈ A} and F_i, i = 1, 2, are as before.
6.4. Prove Theorem 6.1.14.
6.5. A distribution function F in the d-dimensional space is called max-infinitely divisible if for all n there is a distribution function F_n with F_nⁿ = F, i.e., for each n the random vector can be written as the maximum of n independent and identically distributed random vectors. Using the method of Section 6.1.3 prove that F is max-infinitely divisible if and only if

    -log F(x₁, x₂, ..., x_d) = ν { (s₁, s₂, ..., s_d) : s_i > x_i for at least one i with i = 1, 2, ..., d }

for all (x₁, x₂, ..., x_d) with 0 < F(x₁, x₂, ..., x_d) < 1, where ν is a measure (not necessarily homogeneous).
6.6. With H′ the density of the spectral measure H and q(x, y) the density of -log G₀(x, y), verify that for r = x + y and θ = x/(x + y),

    H′(θ) = 2⁻¹ r³ q(θr, r(1 - θ)) .

6.7. With Φ′ the density of the spectral measure Φ and q(x, y) the density of -log G₀(x, y), verify that for r = max(x, y) and θ = arctan(x/y),

    Φ′(θ) = r³ q(r tan θ, r) / cos² θ ,    0 < θ < π/4 ,
    Φ′(θ) = r³ q(r, r/tan θ) / sin² θ ,    π/4 < θ < π/2 .

6.8. If (X, Y) is a random vector with some simple max-stable distribution function, then L(x, y) = x + y for all x, y ≥ 0 if and only if X and Y are independent.
6.9. Discuss properties of the function R (cf. Proposition 6.1.21).
6.10. Prove that either R(x, y) is positive for all x, y > 0 or R(x, y) = 0 for all x, y > 0.
6.11. Let V₁, V₂, …, V_d be independent and identically distributed random variables with distribution function exp(−1/x), x > 0. Let {r_{ij}}_{i,j=1}^d be a matrix with positive entries. Show that the random vector (⋁_{j=1}^d r_{1j}V_j, …, ⋁_{j=1}^d r_{dj}V_j) has a simple max-stable distribution. Find the distribution function. Show that any two-dimensional simple max-stable distribution function can be obtained as a limit of elements in this class.
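The construction in Exercise 6.11 is easy to probe numerically. The sketch below is our own illustration (the function names and the example matrix are ours, not the book's): it simulates the max-linear vector and compares its empirical distribution function with the candidate max-stable form exp(−Σ_j max_i r_{ij}/z_i), which follows from P(V_j ≤ v) = exp(−1/v).

```python
import numpy as np

rng = np.random.default_rng(0)

def frechet(size):
    # V has distribution function exp(-1/x), x > 0: invert u = exp(-1/x).
    return -1.0 / np.log(rng.uniform(size=size))

def max_linear(R, n):
    # n samples of ( max_j R[0,j]*V_j, ..., max_j R[d-1,j]*V_j ),
    # one fresh vector (V_1, ..., V_d) per sample
    d = R.shape[1]
    V = frechet((n, d))
    return np.max(R[None, :, :] * V[:, None, :], axis=2)

def candidate_cdf(R, z):
    # F(z) = exp( - sum_j max_i R[i,j]/z_i ), obtained by computing
    # P(V_j <= min_i z_i / R[i,j]) for each j and multiplying
    return np.exp(-np.sum(np.max(R / z[:, None], axis=0)))

R = np.array([[1.0, 2.0], [3.0, 1.0]])
z = np.array([4.0, 5.0])
Z = max_linear(R, 200_000)
print(np.mean(np.all(Z <= z, axis=1)), candidate_cdf(R, z))  # both approx exp(-1.1)
```

Raising `candidate_cdf` to a power t and rescaling z by t returns the same value, which is exactly the simple max-stability the exercise asks one to verify.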
6.12. Let X, Y be independent positive random variables with distribution function F. Suppose lim_{n→∞} n P(X > x a(n)) = x^{−α} for some α > 0 and all x > 0. Show that for λ₁, λ₂, ν₁, ν₂ positive, the random vector (λ₁X + λ₂Y, ν₁X + ν₂Y) is in the domain of attraction of the extreme value distribution

exp{ −((λ₁/x)^α ∨ (ν₁/y)^α) − ((λ₂/x)^α ∨ (ν₂/y)^α) },

the spectral measure of which is purely discrete, with two atoms.

Hint: Apply Theorem 6.1.5 to ν_n(·) := n P((X, Y) ∈ a(n)·).
6.13. Let X, Y be independent random variables with distribution function F. Suppose that lim_{n→∞} n P(−X < x a(n)) = x^α for some α > 0 and all x > 0. That is, F is in the domain of attraction of an extreme value distribution G_γ with γ = −1/α < 0. Show that as n → ∞, for λ₁, λ₂, ν₁, ν₂ positive and all x, y > 0,

lim_{n→∞} n² P( −(λ₁X + λ₂Y) < x a(n) or −(ν₁X + ν₂Y) < y a(n) ) = α² ∬_B s^{α−1} t^{α−1} ds dt,
6.2 Domains of Attraction; Asymptotic Independence


where B := {(s, t) ∈ ℝ₊² : λ₁s + λ₂t < x or ν₁s + ν₂t < y}. Conclude that the distribution of (λ₁X + λ₂Y, ν₁X + ν₂Y) is in the domain of attraction of an extreme value distribution with nondiscrete spectral measure. The marginal distributions have extreme value index −1/(2α).
Hint: Apply Theorem 6.1.5 to ν_n(·) := n² P((X, Y) ∈ a(n)·).

7
Estimation of the Dependence Structure

7.1 Introduction
In Chapter 6 we have seen that a multivariate extreme value distribution is characterized by the marginal extreme value indices plus a homogeneous exponent measure
or alternatively a spectral measure. In particular, there is no finite parametrization
for extreme value distributions. This suggests the use of nonparametric methods for
estimating the dependence structure, and in fact we are going to emphasize those
methods.
In Sections 7.2 and 7.3 we shall consider estimation of the exponent measure v
exemplified by the function L and the sets Qc of Section 6.1.5, as well as estimation
of the spectral measure introduced in Section 6.1.4.
Further, in Section 7.4 we shall discuss a simple coefficient that summarizes the amount of dependence between the components of the random vector.
Finally, in Sections 7.5–7.6, for the case of asymptotic independence of the components, we shall discuss a submodel that allows for a more precise analysis.

7.2 Estimation of the Function L and the Sets Qc


Recall that for any extreme value distribution function G for which the marginal distribution functions have the standard von Mises form we have defined the distribution function G₀ by (cf. Section 6.1.2)

G₀(x, y) := G( (x^{γ₁} − 1)/γ₁, (y^{γ₂} − 1)/γ₂ ),

where γ₁, γ₂ are the extreme value indices of the marginal distributions. The relation between G₀ and the exponent measure ν from Section 6.1.3 is (cf. Theorem 6.1.5)

G₀(x, y) = exp( −ν{ (s, t) ∈ ℝ₊² : s > x or t > y } ),   x, y > 0.


Next we defined the function L by (cf. Section 6.1.5)

L(x, y) := −log G₀(1/x, 1/y)

for x, y > 0. In fact, L is connected to the exponent measure ν as follows:

L(x, y) = ν{ (s, t) ∈ ℝ₊² : s > 1/x or t > 1/y }.


It is clear that L determines G₀ and ν. So in estimating the function L we in fact estimate the dependence structure of the extreme value distribution.
Since it is not realistic to assume that the available observations have been taken from the extreme value distribution itself, we assume instead that we have independent and identically distributed observations (X₁, Y₁), (X₂, Y₂), …, (X_n, Y_n) from a distribution function F that is in the domain of attraction of an extreme value distribution. Suppose that the marginal distribution functions of F are continuous. The domain of attraction condition can be expressed as (cf. Corollary 6.1.4, Section 6.1.2)

lim_{t→∞} t{ 1 − F(U₁(tx), U₂(ty)) } = −log G₀(x, y),

where t runs through the reals, i.e.,

lim_{t→∞} t{ 1 − F(U₁(t/x), U₂(t/y)) } = L(x, y).   (7.2.1)

We want to use relation (7.2.1) to obtain an estimator for L by replacing F by its empirical measure and also U₁ and U₂ by their empirical counterparts (we shall use the empirical left-continuous versions). The empirical counterpart of U₁(t/x) is X_{n−[nx/t]+1,n}. In particular, for U₁(t) and U₂(t), two basic quantities, we get X_{n−[n/t]+1,n} and Y_{n−[n/t]+1,n}. Since this vector plays an anchor role we want to call them X_{n−k+1,n} and Y_{n−k+1,n} respectively. That means that we read (7.2.1) as

(n/k){ 1 − F(U₁(n/(kx)), U₂(n/(ky))) } → L(x, y),   (7.2.2)

where k may depend on n but we need to have k = o(n). We shall see that in order to get consistency for our estimators we also need to assume k = k(n) → ∞, n → ∞. Replacing F by its empirical distribution function, U₁(n/(kx)) by X_{n−[kx]+1,n}, and U₂(n/(ky)) by Y_{n−[ky]+1,n} in the left-hand side of (7.2.2), we get

L̂(x, y) := (1/k) Σ_{i=1}^n 1{ Xᵢ > X_{n−[kx]+1,n} or Yᵢ > Y_{n−[ky]+1,n} }   (7.2.3)

= (1/k) Σ_{i=1}^n 1{ R(Xᵢ) > n − kx + 1 or R(Yᵢ) > n − ky + 1 },   (7.2.4)

where R(Xᵢ) is the rank of Xᵢ among (X₁, X₂, …, X_n), i = 1, …, n, i.e.,


R(Xᵢ) := Σ_{j=1}^n 1{ X_j ≤ Xᵢ },

and R(Yᵢ) is the rank of Yᵢ among (Y₁, Y₂, …, Y_n).
Indeed, this estimator is invariant under monotone transformations of the components of the random vector; hence it does not depend on the marginal distributions.
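For concreteness, the rank-based estimator (7.2.3)–(7.2.4) can be sketched in a few lines of Python. This is our own illustration, not part of the text; it assumes continuous data, so no ties among the observations:

```python
import numpy as np

def L_hat(X, Y, k, x, y):
    """Estimator (7.2.4): (1/k) * #{i : R(X_i) > n - kx + 1 or R(Y_i) > n - ky + 1}."""
    n = len(X)
    RX = np.argsort(np.argsort(X)) + 1   # ranks 1..n (distinct values assumed)
    RY = np.argsort(np.argsort(Y)) + 1
    return np.sum((RX > n - k * x + 1) | (RY > n - k * y + 1)) / k

# sanity checks against the two boundary cases of the theory:
rng = np.random.default_rng(1)
X = rng.normal(size=20_000)
Y = rng.normal(size=20_000)          # independent components: L(x, y) = x + y
print(L_hat(X, Y, 200, 1.0, 1.0))    # approx 2
print(L_hat(X, X, 200, 1.0, 1.0))    # complete dependence: approx max(1, 1) = 1
```

Because only ranks enter, the statistic is invariant under monotone transformations of the marginals, exactly as claimed above.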
We will establish consistency and asymptotic normality for L̂. We start with consistency.
Theorem 7.2.1 Let (X₁, Y₁), (X₂, Y₂), … be i.i.d. random vectors with distribution function F. Suppose F is in the domain of attraction of an extreme value distribution G. Define

L(x, y) := −log G( (x^{−γ₁} − 1)/γ₁, (y^{−γ₂} − 1)/γ₂ )

for x, y > 0, where γ₁, γ₂ are the marginal extreme value indices. Let G be such that L(x, 0) = L(0, x) = x (this means that the marginal distribution functions of G are exactly exp(−(1 + γᵢx)^{−1/γᵢ}) for i = 1, 2). Then for T > 0, as n → ∞, k = k(n) → ∞, k/n → 0,

sup_{0≤x,y≤T} |L̂(x, y) − L(x, y)| →ᴾ 0.

Proof. First we show that it is sufficient to prove pointwise convergence. Fix ε > 0. Select

(0, 0) = (x₀, y₀), (x₁, y₁), …, (x_r, y_r) ∈ [0, T] × [0, T], (x_{r+1}, y_{r+1}) = (T, T),

such that for i = 0, 1, 2, …, r,

0 ≤ L(x_{i+1}, y_{i+1}) − L(xᵢ, yᵢ) ≤ ε/2.

This is possible since L is bounded on finite rectangles, continuous, and monotone (cf. Proposition 6.1.21). Suppose

L̂(xᵢ, yᵢ) →ᴾ L(xᵢ, yᵢ)

for i = 0, 1, 2, …, r + 1. Then as n → ∞,

P( sup_{0≤i≤r+1} |L̂(xᵢ, yᵢ) − L(xᵢ, yᵢ)| > ε/2 ) → 0.   (7.2.5)

By the monotonicity of L and L̂,

P( sup_{(0,0)≤(x,y)≤(T,T)} |L̂(x, y) − L(x, y)| > ε )


is at most the left-hand side of (7.2.5).


Note that the result is quite general: it holds for any pair of functions L̂ and L with L bounded on finite rectangles, continuous, and monotone, and L̂ monotone.
Next we prove pointwise convergence. Consider fixed x, y > 0. Since L̂(x, y) is invariant under monotone transformations of the components of the vector, we can write, with Uᵢ := 1 − F₁(Xᵢ) and Wᵢ := 1 − F₂(Yᵢ), for i = 1, 2, …, n,

L̂(x, y) = (1/k) Σ_{i=1}^n 1{ Uᵢ ≤ U_{[kx],n} or Wᵢ ≤ W_{[ky],n} }.

We deal with this in two steps. First we consider

V_{n,k}(x, y) := (1/k) Σ_{i=1}^n 1{ Uᵢ < kx/n or Wᵢ < ky/n }.   (7.2.6)

The characteristic function of V_{n,k}(x, y) is

{ (1 − p_{n,k}) + p_{n,k} e^{it/k} }ⁿ = { 1 − p_{n,k}(1 − e^{it/k}) }ⁿ = { 1 − (1/n)·n p_{n,k}(1 − e^{it/k}) }ⁿ

with p_{n,k} := P(U₁ < kx/n or W₁ < ky/n). We know by (7.2.2) that n p_{n,k}/k → L(x, y), n → ∞. It follows that the characteristic function converges to exp(itL(x, y)), i.e.,

V_{n,k}(x, y) →ᴾ L(x, y).

Again by continuity and monotonicity the convergence is locally uniform. Next note that

L̂(x, y) = V_{n,k}( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} )

(the random objects U_{[kx],n}, W_{[ky],n}, and V_{n,k} are dependent, but this is not relevant for the present consistency proof). Since

(n/k) U_{[kx],n} →ᴾ x   and   (n/k) W_{[ky],n} →ᴾ y

(cf. Lemma 2.4.11), we have proved

L̂(x, y) →ᴾ L(x, y).   □

Next we move to the more complicated matter of proving asymptotic normality for L̂.
For the formulation of the theorem it is useful to introduce a measure μ that is closely related to the measure ν from Section 6.1.3 (cf. Remark 6.1.6) as follows: for x, y > 0,

μ{ (s, t) ∈ [0, ∞]² \ {(∞, ∞)} : s ≤ x or t ≤ y }
:= ν{ (s, t) ∈ [0, ∞]² \ {(0, 0)} : s > 1/x or t > 1/y }.   (7.2.7)

So the two measures are the same modulo the marginal transformations x ↦ 1/x and y ↦ 1/y.
Let D([0, T] × [0, T]) be the space of functions on [0, T] × [0, T] that are right-continuous and have left-hand limits.
Theorem 7.2.2 Let (X₁, Y₁), (X₂, Y₂), … be i.i.d. random vectors with distribution function F. Suppose F is in the domain of attraction of some extreme value distribution G. Suppose also that the marginal distribution functions F₁ and F₂ are continuous. Let the marginal distributions of G be standard, i.e., the function L defined by

L(x, y) := −log G( (x^{−γ₁} − 1)/γ₁, (y^{−γ₂} − 1)/γ₂ )

(with γ₁, γ₂ the marginal extreme value indices) satisfies L(x, 0) = L(0, x) = x.

Suppose that for some α > 0 and for all x, y > 0,

t{ 1 − F(U₁(t/x), U₂(t/y)) } = L(x, y) + O(t^{−α}),   (7.2.8)

t → ∞, where Uᵢ := (1/(1 − Fᵢ))←, i = 1, 2, holds uniformly on the set

{ x² + y² = 1, x ≥ 0, y ≥ 0 }.

Suppose further that the function L has continuous first-order partial derivatives

L₁(x, y) := ∂L(x, y)/∂x   and   L₂(x, y) := ∂L(x, y)/∂y

for x, y > 0. Then for k = k(n) → ∞, k(n) = o(n^{2α/(1+2α)}), as n → ∞,

√k ( L̂(x, y) − L(x, y) ) →ᵈ B(x, y)

in D([0, T] × [0, T]), for every T > 0, where

B(x, y) = W(x, y) − L₁(x, y) W(x, 0) − L₂(x, y) W(0, y)

and W is a continuous mean-zero Gaussian process with covariance structure

E W(x₁, y₁) W(x₂, y₂) = μ( R(x₁, y₁) ∩ R(x₂, y₂) )

with

R(x, y) := { (u, v) ∈ ℝ₊² : 0 ≤ u ≤ x or 0 ≤ v ≤ y }.
It is useful to prove a similar statement for V_{n,k} (see the proof of Theorem 7.2.1) before giving the proof of the theorem.


Proposition 7.2.3 Let (X₁, Y₁), (X₂, Y₂), … be i.i.d. random vectors with distribution function F. Suppose that for x, y > 0,

lim_{t→∞} t{ 1 − F(U₁(t/x), U₂(t/y)) } = L(x, y).

Then, provided k = k(n) → ∞, k/n → 0 as n → ∞,

√k ( V_{n,k}(x, y) − (n/k){ 1 − F(U₁(n/(kx)), U₂(n/(ky))) } ) →ᵈ W(x, y)

in D([0, T] × [0, T]), for every T > 0.


Proof. We prove convergence of finite-dimensional distributions plus tightness (cf. Billingsley (1968), Theorem 15.1). For ease of writing we take T = 1. The changes in the proof for T ≠ 1 are obvious. For the convergence of finite-dimensional distributions it is sufficient to prove (Cramér–Wold device) that

Σ_{r=1}^d t_r W_n(x_r, y_r) →ᵈ Σ_{r=1}^d t_r W(x_r, y_r)

for d = 1, 2, …, all real numbers t₁, …, t_d, and all (x₁, y₁), …, (x_d, y_d). This can be done conveniently by applying Lyapunov's form of the central limit theorem (Chung (1974), Theorem 7.1.2).
In order to establish tightness we define subrectangles

I_{ij} := [i/m, (i+1)/m] × [j/m, (j+1)/m]

for i, j = 0, 1, 2, …, m − 1 and define

W_n(x, y) := √k ( V_{n,k}(x, y) − (n/k){ 1 − F(U₁(n/(kx)), U₂(n/(ky))) } )

for 0 ≤ x, y ≤ 1.

The main tool is an inequality from Einmahl (1987), which for our purposes can be written as follows. Define for a rectangle S ⊂ [0, 1]²,

Ṽ_{n,k}(S) := (1/k) Σ_{i=1}^n 1{ (n/k)(1 − F₁(Xᵢ), 1 − F₂(Yᵢ)) ∈ S }

and

v_{n,k}(S) := (n/k) P( (n/k)(1 − F₁(X), 1 − F₂(Y)) ∈ S ).
Einmahl's inequality: Let R be a rectangle in [0, 1]² with

0 < (n/k) P( (n/k)(1 − F₁(X), 1 − F₂(Y)) ∈ R ) ≤ 1/2.

Then there exists a constant C > 0 such that

P( sup_{S⊂R} √k |Ṽ_{n,k}(S) − v_{n,k}(S)| ≥ λ ) ≤ C exp( −(λ²/(2 v_{n,k}(R))) ψ( λ/(√k v_{n,k}(R)) ) )

for λ > 0. The function ψ satisfies the following conditions: ψ(x) is continuous and decreasing, xψ(x) is increasing, and ψ(0) = 1. In particular, this implies that ψ(x) > 0 for x ≥ 0.
We shall apply the inequality to the rectangles

J_{ij} := [i/m, (i+1)/m] × [0, 1]   and   K_{ij} := [0, 1] × [j/m, (j+1)/m],

i, j = 0, 1, 2, …, m − 1.
First note that if for some x = (x₁, x₂) and y = (y₁, y₂) with |x − y| ≤ δ (consider the Euclidean norm) we have |W_n(x) − W_n(y)| ≥ 4ε, then there exist i, j such that x and y are in I_{ij} with m = ⌈√2/δ⌉ and

|W_n(x) − W_n(i/m, j/m)| ≥ 2ε   or   |W_n(y) − W_n(i/m, j/m)| ≥ 2ε.
Hence

P( sup_{|x−y|≤δ/2} |W_n(x) − W_n(y)| ≥ 4ε )

≤ P( max_{i,j=0,1,…,m−1} sup_{x∈I_{ij}} |W_n(x) − W_n(i/m, j/m)| ≥ 2ε )

≤ Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} P( sup_{x∈I_{ij}} |W_n(x) − W_n(i/m, j/m)| ≥ 2ε )

≤ Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} { P( sup_{x∈I_{ij}} |W_n(x) − W_n(x₁, j/m)| ≥ ε ) + P( sup_{x∈I_{ij}} |W_n(x₁, j/m) − W_n(i/m, j/m)| ≥ ε ) }

≤ Σ_{i=0}^{m−1} Σ_{j=0}^{m−1} { C exp( −(ε²/(2 v_{n,k}(J_{ij}))) ψ( ε/(√k v_{n,k}(J_{ij})) ) ) + C exp( −(ε²/(2 v_{n,k}(K_{ij}))) ψ( ε/(√k v_{n,k}(K_{ij})) ) ) },   (7.2.9)

where for the last inequality we apply Einmahl's inequality with R replaced by J_{ij} and K_{ij}. Note that W_n(x₁, 0) = W_n(0, x₂) = 0 for x₁, x₂ ≥ 0.
Next note that

v_{n,k}(J_{ij}) ≤ (n/k) P( (n/k)(1 − F₁(X)) ∈ [i/m, (i+1)/m] ) → 1/m

and

v_{n,k}(K_{ij}) ≤ (n/k) P( (n/k)(1 − F₂(Y)) ∈ [j/m, (j+1)/m] ) → 1/m.

Hence, for n large, by the monotonicity of xψ(x), expression (7.2.9) is at most

2C m² exp( −(mε²/4) ψ( mε/(2√k) ) ).

Since ψ(0) = 1, this clearly converges to zero as k → ∞, m → ∞, m = o(√k).   □
Corollary 7.2.4 If moreover (7.2.8) holds, k(n) → ∞, and k(n) = o(n^{2α/(1+2α)}), then

√k ( V_{n,k}(x, y) − L(x, y) ) →ᵈ W(x, y)

in D([0, T] × [0, T]), for every T > 0.
Proof. Invoking a Skorohod construction we can start from

sup_{0≤x,y≤T} | √k ( V_{n,k}(x, y) − (n/k){ 1 − F(U₁(n/(kx)), U₂(n/(ky))) } ) − W(x, y) | → 0   a.s.

Then

sup_{0≤x,y≤T} | √k ( V_{n,k}(x, y) − L(x, y) ) − W(x, y) |
≤ sup_{0≤x,y≤T} | √k ( V_{n,k}(x, y) − (n/k){ 1 − F(U₁(n/(kx)), U₂(n/(ky))) } ) − W(x, y) |
+ sup_{0≤x,y≤T} √k | (n/k){ 1 − F(U₁(n/(kx)), U₂(n/(ky))) } − L(x, y) |.

In view of the condition k(n) = o(n^{2α/(1+2α)}) it is now sufficient to prove that

sup_{0<x²+y²≤T} t^α | t{ 1 − F(U₁(t/x), U₂(t/y)) } − L(x, y) |   (7.2.10)

is bounded as t → ∞. We shall prove this for T = 1. Now, writing |x| for the Euclidean norm of (x, y) and using the homogeneity L(x, y) = |x| L(x/|x|, y/|x|),

t^α | t{ 1 − F(U₁(t/x), U₂(t/y)) } − L(x, y) |
= |x|^{1+α} (t/|x|)^α | (t/|x|){ 1 − F( U₁((t/|x|)/(x/|x|)), U₂((t/|x|)/(y/|x|)) ) } − L(x/|x|, y/|x|) |.

This expression remains bounded uniformly for |x| ≤ 1 as t → ∞, since |x|^{1+α} ≤ 1 and since by (7.2.8) the second factor remains bounded uniformly for |x| ≤ 1.   □

Proof (of Theorem 7.2.2). Once again it is sufficient to prove the result for T = 1. Again we invoke a Skorohod construction (but keep the same notation) and we start from

sup_{0≤x,y≤1} | √k ( V_{n,k}(x, y) − L(x, y) ) − W(x, y) | → 0   a.s.,

which implies by setting y = 0,

sup_{0≤x≤1} | √k ( V_{n,k}(x, 0) − x ) − W(x, 0) | → 0   a.s.   (7.2.11)

with

V_{n,k}(x, 0) = (1/k) Σ_{i=1}^n 1{ 1 − F₁(Xᵢ) < kx/n }.

Let Uᵢ := 1 − F₁(Xᵢ), i = 1, 2, …, n. The function V_{n,k}(x, 0) is a nondecreasing function of x. Its inverse function is (n/k)U_{[kx],n}, with U_{[kx],n} the [kx]th order statistic from U₁, …, U_n. Vervaat's lemma (Appendix A) allows us to invert relation (7.2.11), and we get

sup_{0≤x≤1} | √k ( (n/k)U_{[kx],n} − x ) + W(x, 0) | → 0   a.s.   (7.2.12)

Similarly we get

sup_{0≤y≤1} | √k ( (n/k)W_{[ky],n} − y ) + W(0, y) | → 0   a.s.   (7.2.13)

with Wᵢ := 1 − F₂(Yᵢ), i = 1, 2, …, n.

Since we have uniform convergence in

√k ( V_{n,k}(x, y) − L(x, y) ) − W(x, y) → 0,

and since by (7.2.12) and (7.2.13)

sup_{0≤x≤1} | (n/k)U_{[kx],n} − x | → 0   a.s.   and   sup_{0≤y≤1} | (n/k)W_{[ky],n} − y | → 0   a.s.,   (7.2.14)

we have

√k { V_{n,k}( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) − L( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) } − W( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) → 0

uniformly. We consider the three terms separately. First note that

V_{n,k}( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) = L̂(x, y).

Further, (7.2.14) implies by the continuity of W that

sup_{0≤x,y≤1} | W( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) − W(x, y) | → 0   a.s.

Finally, relations (7.2.12) and (7.2.13) imply, when combined with Cramér's delta method and the differentiability conditions for L, that

sup_{0≤x,y≤1} | √k ( L( (n/k)U_{[kx],n}, (n/k)W_{[ky],n} ) − L(x, y) ) + L₁(x, y)W(x, 0) + L₂(x, y)W(0, y) | → 0   a.s.

The proof is complete.   □


Next we turn to estimating the set Q₁ introduced in Section 6.1.5. The set determines the distribution G₀, as we have seen in that section. Since the convexity of the set Q₁ is characteristic for G₀ being a simple max-stable distribution, looking at how close the estimated Q-set is to a convex set can provide a heuristic way of checking whether the distribution F is in some domain of attraction, as we shall see later on. The estimator of the set Q₁ is

Q̂₁ := { (x, y) ∈ ℝ₊² : L̂(x, y) ≤ 1 }

with L̂ from (7.2.4).

Clearly the set includes the points (1, 0) and (0, 1). If one draws the points (n − R(Xᵢ) + 1, n − R(Yᵢ) + 1), i = 1, …, n, in the plane and one draws horizontal and vertical lines through these points, then the boundary of Q̂₁ follows these lines. One can start for example from the point (1, 0) and follow the vertical line until one hits the first horizontal line. Then one goes either up or to the left depending on whether the point on the horizontal line is to the left or to the right respectively. This way the number of points in the L-shaped area to the left of and below the graph is kept constant. The estimation procedure is naturally extended to estimate any Q_c curve:

Q̂_c := { (x, y) : L̂(x, y) ≤ c }.
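The sublevel set Q̂_c can also be traced numerically. The Python sketch below is our own illustration (a plain grid search rather than the exact staircase construction described above, and it assumes no ties): for each grid value x it records the largest grid value y with L̂(x, y) ≤ c.

```python
import numpy as np

def Qc_boundary(X, Y, k, c, grid=60):
    # grid approximation of the boundary of {(x, y): L_hat(x, y) <= c}
    n = len(X)
    RX = np.argsort(np.argsort(X)) + 1   # ranks, no ties assumed
    RY = np.argsort(np.argsort(Y)) + 1
    def L_hat(x, y):
        return np.sum((RX > n - k * x + 1) | (RY > n - k * y + 1)) / k
    xs = np.linspace(0.0, 2.0 * c, grid)
    ys = np.empty(grid)
    for a, x in enumerate(xs):
        inside = [y for y in xs if L_hat(x, y) <= c]
        ys[a] = max(inside) if inside else 0.0
    return xs, ys

rng = np.random.default_rng(2)
X = rng.normal(size=2_000)
Y = rng.normal(size=2_000)
xs, ys = Qc_boundary(X, Y, k=100, c=1.0)
# ys is nonincreasing, and the curve runs from near (0, c) down to near (c, 0)
```

Since L̂ is nondecreasing in each argument, the traced boundary is automatically a nonincreasing staircase; how close it comes to a convex curve is exactly the heuristic diagnostic described in the text.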

Fig. 7.1. Estimated Q-curves: k = 7, 14, 21, 28, 35, 42.


The resulting graph for a particular data set is shown in Figure 7.1. The estimated curves are for different levels of c = k/n, corresponding to k = 7, 14, 21, 28, 35, 42. The data set consists of 828 observations of wave height (HmO) and still water level (SWL) (these observations are illustrated in Figure 8.1). The characterizing properties of the Q-curve, convexity and equality of shape for different levels of c, seem to hold for this data set, giving some confidence in the extreme value model. Also, even for small values of c the curve seems to differ significantly from a straight line, giving an indication of dependence between (high values of) the variables.
By estimating the Q-curve in this way, one does not make use of the extreme value conditions. An alternative way of estimating Q is via the homogeneity of L. Let {ρ(θ)}_{0≤θ≤π/2} be the polar representation of the boundary of the set Q₁ between the points (1, 0) and (0, 1),

L( ρ(θ) cos θ, ρ(θ) sin θ ) = 1,   0 ≤ θ ≤ π/2,

and since L is homogeneous of degree 1,

ρ(θ) = 1 / L(cos θ, sin θ).   (7.2.15)

We can estimate the set Q₁ by estimating ρ(θ), 0 ≤ θ ≤ π/2. A natural estimator for ρ(θ) is

ρ̂(θ) := 1 / L̂(cos θ, sin θ).
In Figure 7.2 we find the estimate of Q₁ via ρ̂(θ). Again the concavity of the Q-curve seems to hold, and there is some indication of dependence between (high values of) the variables.
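Relation (7.2.15) is easy to explore for the two benchmark dependence structures of this chapter. The sketch below is ours (the plug-in version ρ̂ is obtained by passing L̂ instead of the true L); it evaluates the polar boundary for the independent case L(x, y) = x + y and the completely dependent case L(x, y) = max(x, y).

```python
import numpy as np

def rho(theta, L):
    # polar representation (7.2.15) of the boundary of Q1;
    # pass the estimator L_hat for the empirical version rho_hat
    return 1.0 / L(np.cos(theta), np.sin(theta))

L_indep = lambda x, y: x + y             # asymptotic independence
L_dep = lambda x, y: np.maximum(x, y)    # complete dependence

print(rho(np.pi / 4, L_indep))  # 1/sqrt(2): straight-line boundary x + y = 1
print(rho(np.pi / 4, L_dep))    # sqrt(2): square boundary passing through (1, 1)
```

Any estimated ρ̂ curve lies between these two extremes, which is one way to read off the strength of tail dependence from Figure 7.2.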
The asymptotic normality of ρ̂(θ) follows straightforwardly from Theorem 7.2.2.
Corollary 7.2.5 Let (X₁, Y₁), (X₂, Y₂), … be i.i.d. random vectors with distribution function F. Suppose F is in the domain of attraction of an extreme value distribution G with standard marginals. Suppose that for some α > 0 and for all x, y > 0 the relation


Fig. 7.2. Estimated Q-curve from ρ̂(θ) with k = 42.

t{ 1 − F(U₁(t/x), U₂(t/y)) } = L(x, y) + O(t^{−α}),

t → ∞, where Uᵢ := (1/(1 − Fᵢ))← and Fᵢ is the ith marginal distribution function, supposed continuous, i = 1, 2, holds uniformly on the set

{ x² + y² = 1 : x ≥ 0, y ≥ 0 }.

Suppose further that the function L has continuous first-order partial derivatives L₁(x, y) := ∂L(x, y)/∂x and L₂(x, y) := ∂L(x, y)/∂y for x, y > 0. Then for k = k(n) → ∞, k(n) = o(n^{2α/(1+2α)}), as n → ∞,

√k ( ρ̂(θ)/ρ(θ) − 1 ) →ᵈ −ρ(θ) B(cos θ, sin θ)

in D([0, π/2]), with B the stochastic process defined in Theorem 7.2.2.
Finally, we remark that (for example) in the three-dimensional space one can estimate L by

L̂(x, y, z) := (1/k) Σ_{i=1}^n 1{ R(Xᵢ) > n − kx + 1 or R(Yᵢ) > n − ky + 1 or R(Zᵢ) > n − kz + 1 }

and the Q-curve by

Q̂₁ := { (x, y, z) ∈ ℝ₊³ : L̂(x, y, z) ≤ 1 }.

A sample graph of the Q-surface in ℝ₊³ is in Figure 7.3. The variables involved are wave height (HmO) and still water level (SWL) as before, and wave period (Tpb) measured in seconds. The picture indicates no asymptotic independence, since for asymptotic independence one expects a flat convex surface.

Fig. 7.3. Trivariate Q-surfaces: contours shown correspond to k = 54, 27, 14.

7.3 Estimation of the Spectral Measure (and L)


In Section 7.2 we were concerned with estimating the extreme value distribution G₀ via estimation of the function L(x, y) := −log G₀(1/x, 1/y), x, y > 0. However, in general Ĝ₀ := exp(−L̂(1/x, 1/y)) itself is not an extreme value distribution, since it is not guaranteed that L̂ satisfies the homogeneity property that is valid for the function L:

L(ax, ay) = aL(x, y)

for a, x, y > 0. Whether this is a problem depends on the application. However, it is useful to develop an estimator for G₀ that itself is an extreme value distribution. This can be done via Theorem 6.1.14 (e.g., via (6.1.31)), which states that any finite measure satisfying the side conditions, represented by the distribution function Φ (the spectral measure), gives rise to an extreme value distribution G₀ via (6.1.31). Hence now we focus on the estimation of the spectral measure, and in order to do so we have to go back to the origin of this measure. We discuss only the spectral measure of Theorem 6.1.14(3) and not the other two, since asymptotic normality has been proved so far only for the third form of the spectral measure.


Adapting the arguments in the beginning of Section 6.1.4 for this specific case, we consider the sets

D_{r,θ} := { (x, y) ∈ ℝ₊² : x ∨ y > r and x/y ≤ tan θ }

for some r > 0 and θ ∈ [0, π/2]. Then

Φ(θ) := r ν(D_{r,θ}) = ν(D_{1,θ}),   (7.3.1)

where ν is the exponent measure of Section 6.1.3. Since it is easier in this context to work with the uniform distribution as the basic distribution rather than with the distribution function 1 − 1/x, x > 1, we reformulate (7.3.1) in terms of the measure μ, defined (as in Section 7.2) by

μ{ (s, t) ∈ [0, ∞]² \ {(∞, ∞)} : s ≤ x or t ≤ y }
:= ν{ (s, t) ∈ [0, ∞]² \ {(0, 0)} : s > 1/x or t > 1/y }.

Clearly the two measures are the same modulo the marginal transformations x ↦ 1/x, y ↦ 1/y. Then

Φ(θ) = μ(E_{1,θ})

with

E_{q,θ} := { (x, y) ∈ [0, ∞]² \ {(∞, ∞)} : x ∧ y ≤ q and y/x ≤ tan θ }   (7.3.2)

for some q > 0 and θ ∈ [0, π/2].


Now assume for the moment just for the sake of simplifying this explanation that
the marginal distributions of F, F\, and F2, are continuous. Then from the proof of
Theorem 6.1.9,
lim t P(\-

t^oo

Fi(X) < - or 1 - F2(Y) < -)


t

t /

= fji{(s, t) e [0, oo] 2 \ {(00, 00)} : s < x or t < y};


hence
1
1-F?(Y)
^
lim t P I (1 - Fi(X)) A (1 - F2(Y)) < - and
- ^ < tan^
(<
= n(Ehe) = <t>(0)
(7.3.3)
for all continuity points 0 of <t>, where (X, Y) is a random vector with distribution
function F.
Now suppose that we have independent and identically distributed random vectors
(X\, Y\), (Z2, F 2 ) , . . . , (Xn, Yn) with distribution function F. In order to transform
the left-hand side of (7.3.3) into an estimator of $ we are going to replace the measure
P by its empirical counterpart and we replace the measures symbolized by F\ and F2
by their empirical counterparts. But before doing so we need to choose t from (7.3.3)


in relation to the sample size n. Since we want to deal with the tail of the distribution only, the choice t = n/k imposes itself, with k = k(n), k → ∞, and k/n → 0.
Next we replace the measure P by its empirical counterpart. Then the left-hand side of (7.3.3) becomes

(n/k) (1/n) Σ_{i=1}^n 1{ (1 − F₁(Xᵢ)) ∧ (1 − F₂(Yᵢ)) ≤ k/n and 1 − F₂(Yᵢ) ≤ (1 − F₁(Xᵢ)) tan θ }.

Further, we replace 1 − F₁(x) by its empirical counterpart, the left-continuous version of the empirical distribution function,

1 − F₁⁽ⁿ⁾(x) := (1/n) Σ_{j=1}^n 1{ X_j ≥ x }.   (7.3.4)

Then 1 − F₁(Xᵢ) should be replaced by

1 − F₁⁽ⁿ⁾(Xᵢ) = (1/n) Σ_{j=1}^n 1{ X_j ≥ Xᵢ } = (n + 1 − R(Xᵢ))/n,

where R(Xᵢ) is the rank of the ith observation Xᵢ, i = 1, …, n, among (X₁, X₂, …, X_n). Similarly we replace 1 − F₂(Yᵢ) by (n + 1 − R(Yᵢ))/n, where R(Yᵢ) is the rank of Yᵢ among (Y₁, Y₂, …, Y_n). Taking everything together we get the following estimator for Φ:

Φ̂(θ) := (1/k) Σ_{i=1}^n 1{ R(Xᵢ) ∨ R(Yᵢ) > n + 1 − k and n + 1 − R(Yᵢ) ≤ (n + 1 − R(Xᵢ)) tan θ }.   (7.3.5)

The estimator is nonparametric in that the statistic is invariant under monotone transformations of the marginals of the observations. So it does not depend on the marginal distributions.
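Estimator (7.3.5) is again a pure rank statistic. A minimal Python sketch (our own illustration, assuming no ties among the observations) reads:

```python
import numpy as np

def Phi_hat(X, Y, k, theta):
    """Spectral measure estimator (7.3.5) at an angle theta in [0, pi/2]."""
    n = len(X)
    RX = np.argsort(np.argsort(X)) + 1   # ranks 1..n, distinct values assumed
    RY = np.argsort(np.argsort(Y)) + 1
    extreme = np.maximum(RX, RY) > n + 1 - k
    angle = (n + 1 - RY) <= (n + 1 - RX) * np.tan(theta)
    return np.sum(extreme & angle) / k

rng = np.random.default_rng(3)
X = rng.normal(size=1_000)
Y = rng.normal(size=1_000)
# Phi_hat is nondecreasing in theta, equals 0 at theta = 0,
# and at pi/2 it counts all (roughly k) extreme points
print(Phi_hat(X, Y, 50, 0.0), Phi_hat(X, Y, 50, np.pi / 2))
```

Evaluating `Phi_hat` on a grid of angles yields the step function plotted as the solid line in Figure 7.4.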
In Figure 7.4 we find Φ̂(θ) for the 828 observations of wave height (HmO) and still water level (SWL). Note that there is some indication of dependence between the variables, since the angular coordinates are not clustered in the neighborhood of 0 and π/2.
Another way of displaying the estimator of the spectral measure makes use of its discrete character: it gives equal weight to a limited number of points. Hence we can just display on the line segment [0, π/2] the points

θᵢ := arctan( (n + 1 − R(Yᵢ)) / (n + 1 − R(Xᵢ)) )

for those observations (Xᵢ, Yᵢ) for which R(Xᵢ) ∨ R(Yᵢ) > n + 1 − k. This is done in Figure 7.4 too.
One can do this similarly in higher dimensions. For example, in ℝ³ one can display the intersections of the lines through the points



k=21

e 0.5

0.25

SWL

HmO

Fig. 7.4. Estimated spectral measure and angular coordinates θᵢ (shown as "+" signs); the solid line represents the corresponding distribution function Φ̂ scaled down from 39/27 to 1.
(n + 1 − R(Xᵢ), n + 1 − R(Yᵢ), n + 1 − R(Zᵢ))

and the origin with the plane { x, y, z ≥ 0 : x + y + z = 1 }. To display the intersection points on this triangle we do the following. Figure 7.5 shows the situation in which P is the point

( (n + 1 − R(Xᵢ)) / (3n + 3 − R(Xᵢ) − R(Yᵢ) − R(Zᵢ)),
  (n + 1 − R(Yᵢ)) / (3n + 3 − R(Xᵢ) − R(Yᵢ) − R(Zᵢ)),
  (n + 1 − R(Zᵢ)) / (3n + 3 − R(Xᵢ) − R(Yᵢ) − R(Zᵢ)) )

and O is the origin.


In order to find the distance DB consider the triangle OAB; see Figure 7.5. Note that

OF = (n + 1 − R(Yᵢ)) / (3n + 3 − R(Xᵢ) − R(Yᵢ) − R(Zᵢ)),
EF = (n + 1 − R(Xᵢ)) / (3n + 3 − R(Xᵢ) − R(Yᵢ) − R(Zᵢ)),

and hence

FG = (n + 1 − R(Xᵢ)) √2 / (3n + 3 − R(Xᵢ) − R(Yᵢ) − R(Zᵢ)).

Now notice that

EF/DB = OF/OB.

It follows that

DB = √2 (n + 1 − R(Xᵢ)) / (2n + 2 − R(Xᵢ) − R(Yᵢ)).

Similar relations hold for the lines connecting P with the other vertices, A and C.
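The projection just described is straightforward to code. The sketch below is ours (the labeling of the triangle vertices as the three coordinate axes is our assumption; the book's Figure 7.5 fixes its own labels): it computes the barycentric point P for one observation and checks that the distance DB derived above agrees with plain vector geometry, the edge of the triangle having length √2.

```python
import numpy as np

def bary_point(rx, ry, rz, n):
    # intersection of the ray through (n+1-rx, n+1-ry, n+1-rz)
    # with the plane x + y + z = 1
    w = np.array([n + 1 - rx, n + 1 - ry, n + 1 - rz], dtype=float)
    return w / w.sum()

def dist_DB(rx, ry, n):
    # the distance derived in the text
    return np.sqrt(2.0) * (n + 1 - rx) / (2 * n + 2 - rx - ry)

# consistency check, taking the vertices as A = e1, B = e2, C = e3 (our labels):
p = bary_point(10, 40, 70, 100)
foot = np.array([p[0], p[1], 0.0]) / (p[0] + p[1])   # ray from C through P meets edge AB here
print(np.linalg.norm(foot - np.array([0.0, 1.0, 0.0])), dist_DB(10, 40, 100))
```

Both printed numbers coincide: the formula for DB is exactly the distance from vertex B to the projection of P, from the opposite vertex, onto the edge AB.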

Fig. 7.5. The point P in the triangle and its projection in the plane.

Figure 7.6 displays the empirical spectral measure for the three-dimensional sea state data. Again the picture indicates no asymptotic independence, since for asymptotic independence one expects the points to be concentrated near the three vertices.


Fig. 7.6. Trivariate angular coordinates representing the spectral measure; scatter plots shown
correspond to k = 54, 27, 14.



Now recall the connection between the function L of Section 6.1.5 and Φ:

L(x, y) = ∫₀^{π/2} { (x(1 ∧ tan θ)) ∨ (y(1 ∧ cot θ)) } Φ(dθ)   (7.3.6)

for x, y > 0 (see Theorem 6.1.14(3)). After splitting the integration interval into several parts and applying partial integration, we obtain (cf. proof of Theorem 7.3.1 below) the alternative expression

L(x, y) = x Φ(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Φ(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ.   (7.3.7)

This leads to an alternative estimator of the function L:

L̂_Φ(x, y) := x Φ̂(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Φ̂(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ.   (7.3.8)

This estimator is somewhat more complicated than the one in Section 7.2. On the other hand, the present estimator has the advantage that it is homogeneous, i.e.,

L̂_Φ(ax, ay) = a L̂_Φ(x, y)

for a, x, y > 0, and therefore the function

Ĝ₀(x, y) := exp( −L̂_Φ(1/x, 1/y) )

is an estimator of the max-stable distribution function G₀ which itself is a max-stable distribution function.
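The integral in (7.3.7)–(7.3.8) is one-dimensional and easy to evaluate numerically. The Python sketch below is ours; it accepts any given (true or estimated) spectral distribution function Φ. As a check we use the independent case, whose spectral measure in this parametrization puts mass 1 at each of θ = 0 and θ = π/2 (so Φ(θ) = 1 on [0, π/2) and Φ(π/2) = 2), for which (7.3.7) must return L(x, y) = x + y.

```python
import numpy as np

def L_from_Phi(x, y, Phi, m=2001):
    """Evaluate (7.3.7)/(7.3.8):
    x*Phi(pi/2) + (x v y) * int_{pi/4}^{arctan(y/x)} Phi(t)*(1/sin^2 t ^ 1/cos^2 t) dt,
    with a signed trapezoidal rule (the upper limit may lie below pi/4)."""
    t = np.linspace(np.pi / 4, np.arctan2(y, x), m)
    f = Phi(t) * np.minimum(np.sin(t) ** -2, np.cos(t) ** -2)
    integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))
    return x * Phi(np.pi / 2) + max(x, y) * integral

# independence: Phi(t) = 1 for t < pi/2, Phi(pi/2) = 2
Phi_indep = lambda t: np.where(np.asarray(t) < np.pi / 2, 1.0, 2.0)
print(L_from_Phi(1.0, 2.0, Phi_indep))  # approx 1 + 2 = 3
print(L_from_Phi(2.0, 1.0, Phi_indep))  # approx 2 + 1 = 3
```

By construction the result is exactly homogeneous of degree 1 in (x, y), which is the whole point of the estimator L̂_Φ.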
We now proceed to prove consistency and asymptotic normality of both estimators.
Theorem 7.3.1 Let (X₁, Y₁), (X₂, Y₂), … be i.i.d. random vectors with continuous distribution function F. Let F₁(x) := F(x, ∞) and F₂(x) := F(∞, x), and let Uᵢ be the inverse of the function 1/(1 − Fᵢ), i = 1, 2. Suppose for x, y > 0,

lim_{t→∞} t{ 1 − F(U₁(t/x), U₂(t/y)) } = L(x, y).

Let k = k(n) be a sequence of integers such that k → ∞, k/n → 0, n → ∞. Then

Φ̂(θ) →ᴾ Φ(θ)   (7.3.9)

for θ = π/2 and each θ ∈ [0, π/2) that is a continuity point of Φ. Moreover,

L̂_Φ(x, y) →ᴾ L(x, y)   (7.3.10)

for x, y > 0.


Corollary 7.3.2 The statements of Theorem 7.3.1 imply the seemingly stronger statements

lim_{n→∞} P( λ(Φ̂, Φ) > ε ) = 0   (7.3.11)

for each ε > 0, where λ is the Lévy distance:

λ(Φ̂, Φ) := inf{ δ : Φ(θ − δ) − δ ≤ Φ̂(θ) ≤ Φ(θ + δ) + δ for all 0 ≤ θ ≤ π/2 },

and for all T > 0,

sup_{0≤x,y≤T} | L̂_Φ(x, y) − L(x, y) | →ᴾ 0.   (7.3.12)

Proof (of Theorem 7.3.1). Define the measures μ and μ̂ as follows: for a Borel set A in [0, ∞]² \ {(∞, ∞)},

μ(A) := lim_{n→∞} (n/k) P( (1 − F₁(X), 1 − F₂(Y)) ∈ (k/n) A ),   (7.3.13)

where (X, Y) is a random vector with distribution function F, and, as before Theorem 7.3.1,

μ̂(A) := (1/k) Σ_{i=1}^n 1{ (n + 1 − R(Xᵢ), n + 1 − R(Yᵢ)) ∈ kA }.   (7.3.14)

By Theorem 7.2.1, for all T > 0,

sup_{0≤x,y≤T} | μ̂( ([x, ∞] × [y, ∞])ᶜ ) − μ( ([x, ∞] × [y, ∞])ᶜ ) | →ᴾ 0.   (7.3.15)

We invoke a Skorohod construction and proceed (without changing notation) as if this convergence held almost surely. By subtracting two sets as in (7.3.15) we then find that for 0 ≤ x₁ < x₂ ≤ ∞, y ≥ 0,

μ̂( [x₁, x₂] × [y, ∞] ) → μ( [x₁, x₂] × [y, ∞] ).   (7.3.16)

By subtracting two sets as in (7.3.16) we get for 0 ≤ x₁ < x₂ ≤ ∞, y ≥ 0,

μ̂( [x₁, x₂] × [0, y] ) → μ( [x₁, x₂] × [0, y] ).   (7.3.17)

This is also true with x₂ = ∞, y = ∞. Let θ be a continuity point of Φ. Clearly for ε > 0 we can find two finite unions of sets as in (7.3.17), L_ε and U_ε, such that

L_ε ⊂ E_{1,θ} ⊂ U_ε

and

μ(U_ε) − ε ≤ μ(E_{1,θ}) ≤ μ(L_ε) + ε

with E_{1,θ} from (7.3.2). Also we have

μ̂(L_ε) ≤ Φ̂(θ) = μ̂(E_{1,θ}) ≤ μ̂(U_ε).

Since μ̂(L_ε) and μ̂(U_ε) are clearly weakly consistent estimators of μ(L_ε) and μ(U_ε) respectively, and since μ(U_ε) − μ(L_ε) → 0 as ε ↓ 0, we conclude that

Φ̂(θ) = μ̂(E_{1,θ}) →ᴾ μ(E_{1,θ}) = Φ(θ).
Finally we are going to prove the statement about L̂_Φ. Before doing so we first prove the alternative representation (7.3.7) of L in terms of Φ. First note that

L(x, y) = ∫₀^{π/4} ( (x tan θ) ∨ y ) Φ(dθ) + ∫_{π/4}^{π/2} ( x ∨ (y cot θ) ) Φ(dθ).

Let x ≤ y. Then the two integrals become

y ∫₀^{π/4} Φ(dθ) + y ∫_{π/4}^{arctan(y/x)} cot θ Φ(dθ) + x ∫_{arctan(y/x)}^{π/2} Φ(dθ).

Since cot θ = ∫_θ^{π/2} dq/sin²q, this equals

y ∫₀^{π/4} Φ(dθ) + y ∫_{π/4}^{arctan(y/x)} ( ∫_θ^{π/2} dq/sin²q ) Φ(dθ) + x ∫_{arctan(y/x)}^{π/2} Φ(dθ),

and by interchanging the integrals and some simplifications we get

x Φ(π/2) + y ∫_{π/4}^{arctan(y/x)} Φ(θ)/sin²θ dθ.

Similarly for x > y we get

x Φ(π/2) − x ∫_{arctan(y/x)}^{π/4} Φ(θ)/cos²θ dθ.

This gives the stated form. Similarly we have

L̂_Φ(x, y) = x Φ̂(π/2) + (x ∨ y) ∫_{π/4}^{arctan(y/x)} Φ̂(θ) ( 1/sin²θ ∧ 1/cos²θ ) dθ.   (7.3.18)

Since Φ̂(θ) →ᴾ Φ(θ) for θ = π/2 and all θ in some set S, where [0, π/2] ∩ Sᶜ is a countable set,

lim_{n→∞} ( L̂_Φ(x, y) − L(x, y) )
= x lim_{n→∞} ( Φ̂(π/2) − Φ(π/2) )
+ (x ∨ y) ∫_{π/4 ≤ θ ≤ arctan(y/x), θ ∈ S} lim_{n→∞} ( Φ̂(θ) − Φ(θ) ) ( 1/sin²θ ∧ 1/cos²θ ) dθ = 0.   □


Proof (of Corollary 7.3.2). We have already shown that (7.3.10) is sufficient for (7.3.12) (cf. proof of Theorem 7.2.1). Now we show that (7.3.9) is sufficient for (7.3.11). Fix ε > 0. Take 0 ≤ θ₀ < θ₁ < ⋯ < θ_r < θ_{r+1} = π/2 such that θᵢ is a continuity point of Φ for i = 1, 2, …, r and θ_{i+1} − θᵢ < ε for i = 0, 1, …, r. Then as n → ∞,

P( sup_{1≤i≤r} | Φ̂(θᵢ) − Φ(θᵢ) | > ε ) → 0.

For any θ ∈ [0, π/2] there exists θ_k such that θ ≤ θ_k ≤ θ + ε. Then if Φ(θ_k) ≥ Φ̂(θ_k) − ε, we have

Φ(θ + ε) ≥ Φ(θ_k) ≥ Φ̂(θ_k) − ε ≥ Φ̂(θ) − ε.

Similarly there exists θ_j such that θ − ε ≤ θ_j ≤ θ. Then if Φ(θ_j) ≤ Φ̂(θ_j) + ε,

Φ(θ − ε) ≤ Φ(θ_j) ≤ Φ̂(θ_j) + ε ≤ Φ̂(θ) + ε.

It follows that

P( Φ(θ − ε) ≤ Φ̂(θ) + ε and Φ(θ + ε) ≥ Φ̂(θ) − ε for all 0 ≤ θ ≤ π/2 )
≥ P( Φ(θᵢ) − ε ≤ Φ̂(θᵢ) ≤ Φ(θᵢ) + ε for i = 0, 1, …, r + 1 ) → 1

as n → ∞.   □

For the asymptotic normality of Φ̂ and L̂_Φ we need two conditions, both of which strengthen the domain of attraction condition

    lim_{t→∞} t P( 1 − F_1(X) ≤ x/t or 1 − F_2(Y) ≤ y/t ) = L(x, y)

considerably. Also we need to impose a further restriction on the growth of the sequence k(n).

Let δ ∈ {1, 1/2, 1/3, ...}, p = 0, 1, 2, ..., 1/δ − 1, and define I(p) := [pδ/tan θ, (p+1)δ/tan θ], θ ∈ [0, π/4]. Let A be the class containing all the following sets:
1. ⋃_{p=0}^{1/δ−1} {(x, y) : x ∈ I(p), 0 ≤ y ≤ x tan θ + C_p (x tan θ)^{1/16}}, for some θ ∈ [0, π/4] and C_0, C_1, ..., C_{1/δ−1} ∈ [−1, 1];
2. {(x, y) : y ≤ b}, for some b ≤ 2;
3. {(x, y) : x ≤ a}, {(x, y) : x ≤ M, y ≤ 2}, for some a ≤ M (later on M will be taken large);
4. {(x, y) : x ≥ 1/tan θ, y ≤ b}, for some θ ∈ [0, π/4] and b ≤ 2.
Next define A^s := {A^s : A ∈ A}, where for A ∈ A, A^s := {(x, y) : (y, x) ∈ A}. Finally define Ā := A ∪ A^s.


Condition 7.3.3 For all δ ∈ {1, 1/2, 1/3, ...} and M ≥ 1,

    sup_{A∈A′} | t^{−1} P{ (1 − F_1(X), 1 − F_2(Y)) ∈ tA } − μ(A) | → 0,

as t ↓ 0, with A′ := {A_1 ∩ A_2 : A_1, A_2 ∈ Ā}.


We also need uniform convergence over a second class of sets. Consider the class of sets C_1 = C_1(β) defined by

    C_1 := { C ∈ B([0,∞]² \ {(∞,∞)}) : C = {(x, y) : 0 ≤ y ≤ b(x)} for some nondecreasing function b with
            sup_{0<x≤2/tan θ} |b(x) − x tan θ| / (x tan θ)^{1/16} ≤ β for some θ ∈ [0, π/4],
            and b(x) = b(2/tan θ) for x ≥ 2/tan θ },

where B([0,∞]² \ {(∞,∞)}) denotes the class of Borel sets on [0,∞]² \ {(∞,∞)}. The class of sets C_2 = C_2(β) is like C_1 but with x and y interchanged.

Condition 7.3.4 For some β > 0,

    D(t) := sup_{C∈C_1∪C_2} | t^{−1} P{ (1 − F_1(X), 1 − F_2(Y)) ∈ tC } − μ(C) | → 0,  t ↓ 0.

Finally, we need a condition bounding the growth of the sequence k(n).

Condition 7.3.5 The sequence k = k(n) should be such that (for the same β)

    √k D(k/n) → 0,  n → ∞.

Theorem 7.3.6 Let (X, Y), (X_1, Y_1), (X_2, Y_2), ... be independent and identically distributed random vectors with continuous distribution function F. Let F_1(x) := F(x, ∞) and F_2(x) := F(∞, x). Suppose that for x, y > 0,

    lim_{t→∞} t P( 1 − F_1(X) ≤ x/t or 1 − F_2(Y) ≤ y/t ) = L(x, y)

and moreover, the uniform extensions Conditions 7.3.3 and 7.3.4 hold. Suppose that μ has a continuous density λ on [0, ∞)² \ {(0, 0)}. Let k = k(n) → ∞, n → ∞, and suppose Condition 7.3.5 holds for k. Then, as n → ∞,

    √k (Φ̂(θ) − Φ(θ)) →_d W_μ(E_{1,θ}) + Z(θ)

in D([0, π/2]), where W_μ is a Wiener process indexed by sets, that is, a centered Gaussian process with E W_μ(C_1) W_μ(C_2) = μ(C_1 ∩ C_2). Note that

    {W_μ(E_{1,θ}), θ ∈ [0, π/2]} =_d {W(Φ(θ)), θ ∈ [0, π/2]},

with W a standard Wiener process. The process {Z(θ)} is defined by

    Z(θ) := ∫_0^{1∧(1/tan θ)} λ(x, x tan θ) { W_1(x) tan θ − W_2(x tan θ) } dx

with W_1(x) := W_μ([0, x] × [0, ∞]) and W_2(y) := W_μ([0, ∞] × [0, y]). Note that W_1 and W_2 are also standard Wiener processes. Finally,

    √k (L̂_Φ(x, y) − L(x, y)) →_d Q(x, y)

in D([0, T] × [0, T]), for all T > 0, where

    Q(x, y) := x ( W_μ(E_{1,π/2}) + Z(π/2) )
      + (x ∨ y) ∫_{π/4 ∧ arctan(y/x)}^{π/4 ∨ arctan(y/x)} (1_{{θ>π/4}}/sin² θ − 1_{{θ<π/4}}/cos² θ) ( W_μ(E_{1,θ}) + Z(θ) ) dθ.

Proof. The proof is very intricate. We give here a sketch of the reasoning and refer to the paper Einmahl, de Haan, and Piterbarg (2001) for full details. We can write

    Φ̂(θ) = (n/k) P̂_n((k/n) C_θ),  with C_θ = E_{1,θ} and P̂_n(C) := (1/n) Σ_{i=1}^n 1_{{(Û_i, V̂_i) ∈ C}},

and

    Û_i := 1 − F_1^{(n)}(X_i),  V̂_i := 1 − F_2^{(n)}(Y_i),

i = 1, 2, ..., n, where F_1^{(n)} and F_2^{(n)} are the marginal empirical distribution functions (cf. (7.3.4)). Now it is important to notice that

    (n/k) P̂_n((k/n) C_θ) = (n/k) P_n((k/n) Ĉ_θ),

where Ĉ_θ ⊂ [0, ∞]² is a random perturbation of C_θ, determined by the empirical marginal distribution functions, and

    P_n(C) := (1/n) Σ_{i=1}^n 1_{{(U_i, V_i) ∈ C}},

where

    U_i := 1 − F_1(X_i),  V_i := 1 − F_2(Y_i),

i = 1, 2, ..., n. We now have

    √k (Φ̂(θ) − Φ(θ))
      = √k ( (n/k) P_n((k/n) Ĉ_θ) − (n/k) P((k/n) Ĉ_θ) )    (empirical measure term)
      + √k ( (n/k) P((k/n) Ĉ_θ) − μ(Ĉ_θ) )                  (bias term)
      + √k ( μ(Ĉ_θ) − μ(C_θ) )                               (random set term)
      =: V_1(θ) + r(θ) + V_2(θ),

θ ∈ [0, π/2]. The part sup_θ |r(θ)| is negligible by Conditions 7.3.4 and 7.3.5 and the well-known behavior of weighted tail empirical and quantile processes.
The part V_1(θ) deals with the set Ĉ_θ, which is a random perturbation of the set C_θ. The set C_θ is the union of a rectangle and a triangle and hence it is a nice set, but Ĉ_θ is not: it is not in a Vapnik-Chervonenkis class or even in a Donsker class. But in each segment of length δ the set Ĉ_θ can be majorized and minorized by two sets that are manageable using Conditions 7.3.3 and 7.3.4 and do not differ too much.
Finally, the part V_2(θ) also contributes to the limit result via the marginal empirical process. This part can be dealt with by writing it as an integral using the density of the measure μ.

7.4 A Dependence Coefficient


Theorem 7.3.6 allows one to accurately estimate the asymptotic dependence structure,
which can be very complicated. Nonetheless, it is sometimes useful to employ a simple
dependence measure that summarizes the dependence information albeit in a rather
crude way.
Consider the d-dimensional setting, i.e., consider a random vector (X_1, ..., X_d) with distribution function F in the domain of attraction of some extreme value distribution. Let K(t) := K_1(t) + ⋯ + K_d(t) with K_i(t) := 1_{{X_i > U_i(t)}}, where U_i := (1/(1 − F_i))^← with F_i := F(∞, ..., x, ..., ∞) (the argument x in the i-th coordinate) the marginal distributions, i = 1, ..., d. Define

    κ := lim_{t→∞} E( K(t) | K(t) ≥ 1 )
       = lim_{t→∞} Σ_{j=1}^d P(X_j > U_j(t)) / P( ⋃_{j=1}^d {X_j > U_j(t)} )
       = ( L(1,0,...,0) + L(0,1,...,0) + ⋯ + L(0,...,0,1) ) / L(1,1,...,1)
       = d / L(1,1,...,1).


One possible interpretation for this coefficient is that κ quantifies, on average, how many disasters will happen given that one disaster is sure to happen.
The case of asymptotic independence corresponds to κ = 1, and the case of full dependence corresponds to κ = d (cf. Proposition 6.1.21). So in order to make things somewhat reminiscent of the correlation coefficient in that the case of asymptotic independence corresponds to 0 and the case of full dependence to 1, we define the following dependence coefficient (Embrechts, de Haan, and Huang (2000)):

    H := (κ − 1)/(d − 1) = (d − L(1,1,...,1)) / ((d − 1) L(1,1,...,1)).

H = 0 is equivalent to asymptotic independence and H = 1 to full dependence.


In ℝ² it is somewhat usual to consider the dependence coefficient (Sibuya (1960))

    λ := lim_{t→∞} t P( X_1 > U_1(t), X_2 > U_2(t) ) = 2 − L(1,1) = R(1,1) = 2 (1 − A(1/2)).

Hence λ = H L(1,1), 0 ≤ λ ≤ 1, and λ = 0 corresponds to asymptotic independence and λ = 1 to full dependence in ℝ².
The straightforward generalization of λ to ℝ³ is

    λ = lim_{t→∞} t P( X_1 > U_1(t), X_2 > U_2(t), X_3 > U_3(t) ).
However, the extension of λ to higher dimensions does not share this property. Consider the random vector (Y_1, Y_1, Y_2) with Y_1, Y_2 independent and identically distributed with distribution function exp(−1/x), x > 0. Since the first two components are the same, the exponent measure must be concentrated on the set {(x_1, x_2, x_3) ∈ ℝ_+³ : x_1 = x_2}. On the other hand, the first and last components are independent, hence the exponent measure must concentrate on the set {(x_1, x_2, x_3) ∈ ℝ_+³ : x_1 = 0 or x_3 = 0}. Since also the second and last components are independent, the exponent measure must be concentrated on {(x_1, x_2, x_3) ∈ ℝ_+³ : x_2 = 0 or x_3 = 0}. Putting everything together we find that the exponent measure is concentrated on the intersection of these sets, that is, the lines {(x_1, x_2, x_3) ∈ ℝ_+³ : x_1 = x_2, x_3 = 0} and {(x_1, x_2, x_3) ∈ ℝ_+³ : x_1 = x_2 = 0}. Since clearly the exponent measure is not concentrated on the coordinate axes, there is no asymptotic independence. But

    lim_{t→∞} t P( X_1 > U_1(t), X_2 > U_2(t), X_3 > U_3(t) ) = lim_{t→∞} t P(Y_1 > t) P(Y_2 > t) = 0.
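The vanishing of this trivariate coefficient can be checked directly: the joint exceedance event equals {Y_1 > t, Y_2 > t}, whose probability factors. A minimal sketch (using the threshold t itself in place of U_i(t), which is asymptotically equivalent for this Fréchet margin):

```python
import math

def trivariate_coeff(t):
    # t * P(X1 > t, X2 > t, X3 > t) for (X1, X2, X3) = (Y1, Y1, Y2),
    # Y1, Y2 i.i.d. with distribution function exp(-1/x):
    # the event equals {Y1 > t, Y2 > t}, whose probability factors.
    p = 1.0 - math.exp(-1.0 / t)   # P(Y > t)
    return t * p * p
```

Since P(Y > t) ≈ 1/t, the returned value behaves like 1/t and tends to 0 even though the vector is not asymptotically independent.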

When dealing with observations from the domain of attraction of an extreme value distribution one can estimate H by

    Ĥ := (κ̂ − 1)/(d − 1) = (d − L̂(1,1,...,1)) / ((d − 1) L̂(1,1,...,1)),

with L̂ similarly as in Section 7.2, i.e.,

    L̂ := L̂(1,1,...,1) := (1/k) Σ_{i=1}^n 1_{{ X_i^{(1)} ≥ X_{n−k+1,n}^{(1)} or ... or X_i^{(d)} ≥ X_{n−k+1,n}^{(d)} }},

where (X_1^{(1)}, ..., X_1^{(d)}), ..., (X_n^{(1)}, ..., X_n^{(d)}) are independent and identically distributed observations from the distribution function F. Similarly as in Theorem 7.2.1 (cf. Exercise 7.1) we have that under the domain of attraction condition and k = k(n) → ∞, k/n → 0, n → ∞,

    Ĥ →_P H.    (7.4.1)
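For concreteness, the estimators L̂ and Ĥ can be sketched as follows (a minimal pure-Python illustration with hypothetical function names; the input is a list of d-dimensional observations):

```python
def L_hat(sample, k):
    """Estimate L(1,...,1): (1/k) #{i : X_i^(j) >= X_{n-k+1,n}^(j) for some j}."""
    n, d = len(sample), len(sample[0])
    # k-th largest order statistic in each coordinate, i.e. X_{n-k+1,n}^(j)
    thresholds = [sorted(x[j] for x in sample)[n - k] for j in range(d)]
    count = sum(1 for x in sample
                if any(x[j] >= thresholds[j] for j in range(d)))
    return count / k

def H_hat(sample, k):
    """Dependence coefficient estimate (d - L_hat)/((d - 1) * L_hat), cf. (7.4.1)."""
    d = len(sample[0])
    L = L_hat(sample, k)
    return (d - L) / ((d - 1) * L)
```

With fully dependent coordinates the exceedance sets coincide, so L̂ = 1 and Ĥ = 1; when the large values of the coordinates never coincide, L̂ = d and Ĥ = 0.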

Let W be a d-dimensional continuous Gaussian process with mean zero and covariance structure given by the natural extension from the two-dimensional case considered in Theorem 7.2.2. For simplicity of notation define W(1) := W(1, 1, ..., 1), W_{(i)} := W(0, ..., 1, ..., 0), where the 1 is in the x_i coordinate, and L^{(i,j)} := L(0, ..., 1, ..., 1, ..., 0), where the 1's are in the x_i and x_j coordinates. Then, under the conditions of Theorem 7.2.2, with the obvious extensions to d dimensions, we have

    √k (L̂ − L) →_d W(1) − Σ_{i=1}^d L_i(1) W_{(i)}

with

    Var(W(1)) = L,
    Var(W_{(i)}) = 1,  i = 1, ..., d,
    E W(1) W_{(i)} = 1,  i = 1, ..., d,
    E W_{(i)} W_{(j)} = 2 − L^{(i,j)},  i, j = 1, ..., d, i ≠ j,

and

    L_i(1) := (∂/∂x_i) L(x_1, ..., x_d) |_{(x_1,...,x_d)=(1,...,1)},  i = 1, ..., d

(cf. Exercise 7.3), i.e., W(1) − Σ_{i=1}^d L_i(1) W_{(i)} =_d σ_L N, where N is a standard normal random variable and

    σ_L² := L + Σ_{i=1}^d (L_i²(1) − 2 L_i(1)) + 2 Σ_{i=1}^d Σ_{j=i+1}^d L_i(1) L_j(1) (2 − L^{(i,j)}).

Hence, by Cramér's delta method,

    √k (Ĥ − H) →_d N(0, d² σ_L² / ((d − 1)² L⁴)).    (7.4.2)

In order to be able to apply this limit result for testing, one needs to estimate L_j(1) consistently, j = 1, ..., d. A consistent estimator is

    L̂_j(1) := k^{1/4} ( (1/k) Σ_{i=1}^n 1_{{ X_i^{(1)} ≥ X_{n−k+1,n}^{(1)} or ... or X_i^{(j)} ≥ X_{n−[k(1+k^{−1/4})]+1,n}^{(j)} or ... or X_i^{(d)} ≥ X_{n−k+1,n}^{(d)} }} − L̂ ).

For the proof note that

    k^{1/4} ( L̂(x_1, ..., x_d) − L(x_1, ..., x_d) ) →_P 0

locally uniformly and that, for example,

    k^{1/4} ( L(1 + k^{−1/4}, 1, ..., 1) − L ) → L_1(1).
As mentioned before, asymptotic independence is equivalent to H = 0. Hence one is tempted to use (7.4.2) to test for asymptotic independence. However, when H = 0 one has L = d, L_i(1) = 1, L^{(i,j)} = 2, i, j = 1, ..., d, i ≠ j, so that the asymptotic variance of √k (Ĥ − H) is zero and hence the result cannot be used to construct an asymptotic confidence interval.
In fact, in order to test for asymptotic independence, it is better to work with a
more refined model, which will be discussed in the next section.

7.5 Tail Probability Estimation and Asymptotic Independence: A Simple Case

In order to show the usefulness of a more refined model in case of asymptotic independence let us consider the following problem.
Suppose one has independent observations (X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n) with distribution function F and suppose that we are interested in estimating the probability

    1 − F(w, z),

where w > max_{1≤i≤n} X_i and z > max_{1≤i≤n} Y_i.
One may think, for example, of an athlete who wants to compete in the Olympic Games in two disciplines. Her past records in the two disciplines are the observations above. The values w and z are the thresholds that one has to reach, at least one of them, in order to qualify. The athlete has never reached the thresholds.
This problem is a simple multivariate version of the problem of tail estimation (Section 4.4). We want to consider here the simplest situation just for the sake of exposition. We assume that both marginal distributions of F are 1 − 1/x, x > 1. A much more general situation will be considered in Chapter 8. We want to look at the problem from an asymptotic point of view, hence with n → ∞, and assuming that F is in the domain of attraction of an extreme value distribution. Since the condition w > max_{1≤i≤n} X_i and z > max_{1≤i≤n} Y_i is an essential feature of the problem, we


want to preserve it in the asymptotic analysis. Hence we assume w = w_n → ∞ and z = z_n → ∞ and moreover that

    n (1 − F(w_n, z_n))    (7.5.1)

is bounded.
The aim is to estimate p* := 1 − F(w_n, z_n). We further assume for simplicity that w_n = c r_n and z_n = d r_n, for some positive sequence r_n → ∞ and c, d positive constants. The domain of attraction condition is

    lim_{t→∞} t (1 − F(tx, ty)) = −log G_0(x, y) = L(1/x, 1/y).    (7.5.2)

Hence

    p* = 1 − F(w_n, z_n) = 1 − F(c r_n, d r_n) ~ (1/r_n) L(1/c, 1/d).

This limit relation suggests that we estimate p* by

    p̂* := (1/r_n) L̂(1/c, 1/d) = (1/(r_n k)) Σ_{i=1}^n 1_{{ X_i > nc/k or Y_i > nd/k }}

with L̂ from (7.2.6). Indeed, we find by Proposition 7.2.3 that

    lim_{n→∞} p̂*/p* = lim_{n→∞} L̂(1/c, 1/d) / L(1/c, 1/d) = 1,    (7.5.3)

in probability.
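A direct sketch of p̂* for data with (approximately) standard Pareto margins follows; the function name is hypothetical, and `pairs` stands for the list of observations (X_i, Y_i):

```python
def p_star_hat(pairs, k, r_n, c, d):
    # p_star_hat = (1/(r_n * k)) * #{i : X_i > n c / k  or  Y_i > n d / k}
    n = len(pairs)
    count = sum(1 for x, y in pairs if x > n * c / k or y > n * d / k)
    return count / (r_n * k)
```

The count over the "or" event is the same count that defines L̂(1/c, 1/d), so this is just the plug-in version of the display above.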
This is straightforward. But let us now look at the problem of how to estimate

    p_n := P(X > w_n, Y > z_n) = P(X > c r_n, Y > d r_n),

where (X, Y) has distribution function F, with the same simplifications as before. Suppose moreover that the distribution function F is in the domain of attraction of an extreme value distribution with independent components. One can try to estimate p_n as before by

    (1/(r_n k)) Σ_{i=1}^n 1_{{ X_i > nc/k and Y_i > nd/k }}
      = (1/(r_n k)) Σ_{i=1}^n 1_{{ X_i > nc/k }} + (1/(r_n k)) Σ_{i=1}^n 1_{{ Y_i > nd/k }} − (1/(r_n k)) Σ_{i=1}^n 1_{{ X_i > nc/k or Y_i > nd/k }},

but this, multiplied by r_n, converges to c^{−1} + d^{−1} − (c^{−1} + d^{−1}) = 0. The problem is that in the case of asymptotic independence we know only that P(X > tc and Y > td) is of lower order than P(X > tc or Y > td), as t → ∞, but the theory does not say


anything about the asymptotic behavior of this probability itself. So it seems that in order to estimate p_n consistently we need a more refined model.
In fact, the condition of Theorem 7.2.2 on the asymptotic normality of L̂(x, y) gives a clue about where to look for further conditions. The condition is in our case

    t (1 − F(tx, ty)) = L(1/x, 1/y) + O(t^{−α})

as t → ∞ for some α > 0. A somewhat stronger form of this condition can serve as a second-order condition, quite similar to the one used in Section 2.3: there exists a function A, positive or negative, such that for all 0 < x, y ≤ ∞,

    lim_{t→∞} [ t (1 − F(tx, ty)) − L(1/x, 1/y) ] / A(t) = Q(x, y),    (7.5.4)

where Q is not identically zero.


In case of asymptotic independence this second-order condition takes a simple
form. Taking x = oo or y = oo in (7.5.4) we get

t{\ - F(tx, oo)) -x~l


t(l-F(oojy))-y-1
A(t)

Q(x, oo),

(7.5.5)

(oo,y).

(7.5.6)

Now note that


P (X > tx, Y > ty) = P(X >tx) + P{Y >ty)and that in the present case L

(1/JC,

P (X >txotY

> ty)

1/y) = l/x + 1/y. Then (7.5.4)-(7.5.6) imply

tP (X > tx,Y > ty)


= -Q(x, y) + G(*, oo) + G(oo, y) =: S(x, y)
A(t)

(7.5.7)

asf -> ooforO < JC, y < oo. Comparing this relation with (7.5.2), we see that P(X >
t or Y > t) is a regularly varying function of order 1 and P(X > t and Y > t) is
of lower order in case of asymptotic independence. In fact, P(X > t and Y > t) is a
regularly varying function of order p 1, where p < 0 is the index of the regularly
varying function \A\.
We now show that condition (7.5.7) allows us to estimate p_n consistently. It is common to write (7.5.7) as

    lim_{t→∞} P(X > tx, Y > ty) / P(X > t, Y > t) = S(x, y)    (7.5.8)

(we may normalize A so that S(1, 1) = 1). In particular, q(t) := P(X > t, Y > t) is a regularly varying function with index less than or equal to −1. In the original papers (Ledford and Tawn (1996, 1997, 1998)) the index is written as −1/η with η ≤ 1. Clearly if there is no asymptotic independence, A(t) can be taken constant and hence η = 1. Also S is the distribution function of a measure, say ρ, that is,

    S(x, y) = ρ{ (s, t) ∈ ℝ_+² : s > x, t > y }.

This suggests the estimator

    p̂_n := (n/(k r_n))^{1/η̂} (1/n) Σ_{i=1}^n 1_{{ X_i > nc/k, Y_i > nd/k }},

where η̂ is an estimator of η to be discussed later.
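A corresponding sketch of the joint-tail estimator follows (hypothetical names again; `eta_est` is a plug-in value for η̂):

```python
def p_joint_hat(pairs, k, r_n, c, d, eta_est):
    # p_hat_n = (n/(k r_n))**(1/eta_est) * (1/n) * #{i : X_i > nc/k and Y_i > nd/k}
    n = len(pairs)
    count = sum(1 for x, y in pairs if x > n * c / k and y > n * d / k)
    return (n / (k * r_n)) ** (1.0 / eta_est) * count / n
```

For eta_est = 1 (no asymptotic independence) this reduces to (1/(k r_n)) times the joint exceedance count, i.e. the naive estimator above; for eta_est < 1 the extrapolation factor shrinks the estimate, reflecting the faster decay of the joint tail.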


Similarly as in the proof of the consistency of L̂ (Theorem 7.2.1) one can prove that

    [ (1/n) Σ_{i=1}^n 1_{{ X_i > nc/k, Y_i > nd/k }} ] / ( q(n/k) S(c, d) ) →_P 1,

provided that

    lim_{n→∞} n q(n/k) = ∞.    (7.5.10)

This condition sets a lower bound for the sequence k = k(n). We now write

    p̂_n / p_n = [ (1/n) Σ_{i=1}^n 1_{{ X_i > nc/k, Y_i > nd/k }} / ( S(c, d) q(n/k) ) ]
                · (n/(k r_n))^{1/η̂ − 1/η}
                · [ q(n/k) (n/(k r_n))^{1/η} S(c, d) / P(X > w_n, Y > z_n) ].

By (7.5.8),

    P(X > w_n, Y > z_n) = P(X > r_n c, Y > r_n d) ~ q(r_n) S(c, d).

Hence we get p̂_n/p_n →_P 1 if

    lim_{n→∞} q(n/k) (n/(k r_n))^{1/η} / q(r_n) = 1    (7.5.11)

and

    (n/(k r_n))^{1/η̂ − 1/η} →_P 1.    (7.5.12)

Now (7.5.11) is implied by

    lim_{t→∞, x=x(t)→∞} q(t) x^{−1/η} / q(tx) = 1,

which can be achieved by imposing a second-order condition on the regular variation condition for q as in Appendix B, Remark B.3.15.
In order for (7.5.12) to be true we need an estimator η̂ that converges to η at a certain rate. So let us look now at the estimation of η.


7.6 Estimation of the Residual Dependence Index η


After the preparation in Section 7.5 we now define the residual dependence parameter η generally. Let F be a probability distribution function in the domain of attraction of an extreme value distribution; F_1(x) := F(x, ∞) and F_2(y) := F(∞, y) the marginal distribution functions, which are supposed to be continuous; and (X, Y) a random vector with distribution function F.
Suppose that for x, y > 0,

    lim_{t↓0} P( 1 − F_1(X) < tx, 1 − F_2(Y) < ty ) / P( 1 − F_1(X) < t, 1 − F_2(Y) < t ) =: S(x, y)    (7.6.1)

exists and is positive. Then P(1 − F_1(X) < t, 1 − F_2(Y) < t) is a regularly varying function with index 1/η, say, and as in Theorem 6.1.9, for a, x, y > 0,

    S(ax, ay) = a^{1/η} S(x, y).    (7.6.2)

The residual dependence index η ∈ (0, 1] was introduced by Ledford and Tawn (1996, 1997, 1998).
Note that the domain of attraction condition implies

    lim_{t↓0} t^{−1} P( 1 − F_1(X) < tx, 1 − F_2(Y) < ty )
      = lim_{t↓0} t^{−1} P( 1 − F_1(X) < tx ) + lim_{t↓0} t^{−1} P( 1 − F_2(Y) < ty )
        − lim_{t↓0} t^{−1} P( 1 − F_1(X) < tx or 1 − F_2(Y) < ty )
      = x + y − L(x, y).

This expression is zero for all x, y > 0 in the case of asymptotic independence and positive in all other cases (cf. Proposition 6.1.21). It follows that if there is no asymptotic independence, the index η in (7.6.2) has to be one. In other words, (7.6.1) and (7.6.2) with η < 1 imply asymptotic independence. On the other hand, η = 1 does not imply asymptotic independence.
This opens the possibility to devise a test for asymptotic independence in the framework of (7.6.1). The null hypothesis is η = 1, and the alternative one is η < 1. In order to carry out this test we need an estimator for η and that is what we discuss next.
Condition (7.6.1) implies

    lim_{t↓0} P( 1/((1 − F_1(X)) ∨ (1 − F_2(Y))) > x/t ) / P( 1/((1 − F_1(X)) ∨ (1 − F_2(Y))) > 1/t )
      = S(1/x, 1/x) = x^{−1/η} S(1, 1) = x^{−1/η}

for x > 0, i.e., the tail of the probability distribution of the random variable ((1 − F_1(X)) ∨ (1 − F_2(Y)))^{−1} is regularly varying with index −1/η. This suggests that we use a Hill-type estimator as in Section 3.2.



If F_1 and F_2 were known, we could use

    (1/k) Σ_{i=0}^{k−1} log V_{n−i,n} − log V_{n−k,n}

as an estimator, where {V_{i,n}} are the order statistics of the independent and identically distributed sequence V_i := 1/((1 − F_1(X_i)) ∨ (1 − F_2(Y_i))), i = 1, 2, ..., n.
Since F_1 and F_2 are not known, we replace them with their empirical counterparts F_1^{(n)} and F_2^{(n)} as defined in (7.3.4) (to prevent division by 0). This leads to the random variables

    T_i^{(n)} := 1 / ( (1 − F_1^{(n)}(X_i)) ∨ (1 − F_2^{(n)}(Y_i)) )
              = (n+1) / ( (n+1−R(X_i)) ∨ (n+1−R(Y_i)) )
              = (n+1) / ( n+1 − (R(X_i) ∧ R(Y_i)) ),

where R(X_i) is the rank of X_i among X_1, X_2, ..., X_n and R(Y_i) that of Y_i among Y_1, Y_2, ..., Y_n. The Hill-type estimator then becomes

    η̂ := (1/k) Σ_{i=0}^{k−1} log T^{(n)}_{n−i,n} − log T^{(n)}_{n−k,n},

where {T^{(n)}_{i,n}} are the order statistics of the non-independent and identically distributed sequence T_i^{(n)}, i = 1, 2, ..., n.
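A rank-based sketch of η̂ follows (standard library only; it assumes no ties among the observations, as the marginals are continuous):

```python
import bisect
import math

def eta_hat(pairs, k):
    """Hill-type estimator of eta, based on
    T_i = (n+1) / (n+1 - min(R(X_i), R(Y_i)))."""
    n = len(pairs)
    xs = sorted(p[0] for p in pairs)
    ys = sorted(p[1] for p in pairs)
    def rank(sorted_vals, v):
        # rank 1 = smallest observation; valid when there are no ties
        return bisect.bisect_left(sorted_vals, v) + 1
    T = sorted((n + 1) / (n + 1 - min(rank(xs, x), rank(ys, y)))
               for x, y in pairs)
    # (1/k) * sum of log T_{n-i,n}, i = 0..k-1, minus log T_{n-k,n}
    return sum(math.log(t) for t in T[n - k:]) / k - math.log(T[n - k - 1])
```

For fully dependent data (X_i = Y_i) the values T_i reproduce the pattern of a Pareto(1) sample, so η̂ is close to 1 (up to the usual finite-sample bias of the Hill estimator).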
Asymptotic normality can be proved under a refinement of condition (7.6.1).

Theorem 7.6.1 Let (X_1, Y_1), (X_2, Y_2), ... be i.i.d. random vectors with distribution function F. Suppose (7.6.1) and (7.6.2) hold for some η ∈ (0, 1]. We also assume that the following second-order refinement of (7.6.1) holds:

    lim_{t↓0} [ P(1 − F_1(X) < tx and 1 − F_2(Y) < ty) / P(1 − F_1(X) < t and 1 − F_2(Y) < t) − S(x, y) ] / q_1(t) =: Q(x, y)

exists for all x, y ≥ 0 with x + y > 0, where q_1 is some positive function and Q is neither a constant nor a multiple of S. Moreover, we assume that the convergence is uniform on {(x, y) ∈ ℝ_+² : x² + y² = 1} and that the function S has first-order partial derivatives S_x := ∂S(x, y)/∂x and S_y := ∂S(x, y)/∂y. Finally, we assume that

    lim_{t↓0} t^{−1} P( 1 − F_1(X) < t and 1 − F_2(Y) < t ) =: l

exists. Let q^← be the inverse of the function q(t) := P(1 − F_1(X) < t and 1 − F_2(Y) < t). For a sequence k = k(n) of integers with k → ∞, k/n → 0, and √k q_1(q^←(k/n)) → 0, n → ∞,

    √k (η̂ − η)

is asymptotically normal with mean zero and variance

    η² (1 − l) (1 − 2 l S_x(1,1) S_y(1,1)).

Proof. We provide a sketch of the proof. The original elaborate proof (Draisma, Drees, Ferreira, and de Haan (2004)) is beyond the scope of this book.
Similarly to Section 7.2 one obtains, with m := nq(k/n) and m → ∞,

    √m ( (1/m) Σ_{i=1}^n 1_{{ X_i ≥ X_{n−[kx]+1,n} and Y_i ≥ Y_{n−[ky]+1,n} }} − S(x, y) ) →_d W(x, y)

in D([0, T] × [0, T]), for every T > 0, where W is a zero-mean Gaussian process with, in case l = 0,

    E W(x_1, y_1) W(x_2, y_2) = S(x_1 ∧ x_2, y_1 ∧ y_2),

and in case l > 0,

    W(x, y) = (1/√l) ( W_1(x, 0) + W_1(0, y) − W_1(x, y) ) − √l S_x(x, y) W_1(x, 0) − √l S_y(x, y) W_1(0, y)

and

    E W_1(x_1, y_1) W_1(x_2, y_2) = x_1 ∧ x_2 + y_1 ∧ y_2 − l S(x_1, y_1) − l S(x_2, y_2) + l S(x_1 ∨ x_2, y_1 ∨ y_2).

Next note that

    Σ_{i=1}^n 1_{{ X_i ≥ X_{n−[kx]+1,n} and Y_i ≥ Y_{n−[kx]+1,n} }}
      = Σ_{i=1}^n 1_{{ 1 − F_1^{(n)}(X_i) ≤ 1 − F_1^{(n)}(X_{n−[kx]+1,n}) = [kx]/n and 1 − F_2^{(n)}(Y_i) ≤ 1 − F_2^{(n)}(Y_{n−[kx]+1,n}) = [kx]/n }}
      = Σ_{i=1}^n 1_{{ T_i^{(n)} ≥ n/[kx] }}.

Hence with

    S_m(x) := (1/m) Σ_{i=1}^n 1_{{ T_i^{(n)} ≥ n/[kx] }}

we obtain

    √m ( S_m(x) − x^{1/η} ) →_d W(x, x)

in D([0, T]), for every T > 0. This relation is somewhat similar to Theorem 5.1.4, which led to the asymptotic normality of the "usual" Hill estimator. For further details we refer to the mentioned paper.

Exercises
7.1. Prove Theorem 7.2.1 for the d-dimensional case, i.e., if (X_i^{(1)}, ..., X_i^{(d)}), i = 1, 2, ..., are independent and identically distributed random vectors with distribution function F in the domain of attraction of an extreme value distribution G, with

    L(x_1, x_2, ..., x_d) := −log G( (x_1^{−γ_1} − 1)/γ_1, (x_2^{−γ_2} − 1)/γ_2, ..., (x_d^{−γ_d} − 1)/γ_d )

for (x_1, x_2, ..., x_d) ∈ ℝ_+^d, where γ_1, γ_2, ..., γ_d are the marginal extreme value indices and L(x, 0, ..., 0) = L(0, x, ..., 0) = L(0, 0, ..., x) = x, then for T > 0, as n → ∞, k = k(n) → ∞, k/n → 0,

    sup_{0≤x_1,x_2,...,x_d≤T} | L̂(x_1, x_2, ..., x_d) − L(x_1, x_2, ..., x_d) | →_P 0.

7.2. Consider Example 5.5.3. Determine the dependence coefficient H of the random vector (X_1, X_{n+1}).
7.3. Prove Theorem 7.2.2 with the natural extension to d dimensions, i.e., that under the given conditions and for k = k(n) → ∞, k(n) = o(n^{2α/(1+2α)}), α > 0, as n → ∞,

    √k ( L̂(x_1, x_2, ..., x_d) − L(x_1, x_2, ..., x_d) ) →_d B(x_1, x_2, ..., x_d)

in D([0, T]^d), for every T > 0, and for (x_1, x_2, ..., x_d) ∈ ℝ_+^d, where

    B(x_1, x_2, ..., x_d) = W(x_1, x_2, ..., x_d) − L_1(x_1, x_2, ..., x_d) W(x_1, 0, ..., 0)
      − L_2(x_1, x_2, ..., x_d) W(0, x_2, 0, ..., 0) − ⋯ − L_d(x_1, x_2, ..., x_d) W(0, 0, ..., x_d)

and W is a continuous Gaussian process with mean zero and covariance structure

    E W(x_1, ..., x_d) W(x̄_1, ..., x̄_d) = μ( R(x_1, ..., x_d) ∩ R(x̄_1, ..., x̄_d) )

with

    R(x_1, ..., x_d) := { (u_1, ..., u_d) ∈ ℝ_+^d : 0 ≤ u_1 ≤ x_1 or ... or 0 ≤ u_d ≤ x_d }.


7.4. Show that under the conditions of Theorem 7.2.2 the proposed estimator of Sibuya's dependence coefficient λ (Section 7.4) satisfies

    √k (λ̂ − λ) →_d N( 0, L(1 − 2 L_1 L_2) + (L_1 + L_2)² + 2 (1 − L_1)(1 − L_2) − 2 ),

where L := L(1,1), L_1 := L_1(1,1), L_2 := L_2(1,1), and N(0, σ²) is a normal random variable with mean zero and variance σ².
7.5. Let F(x_1, x_2) be a probability distribution function in the domain of attraction of an extreme value distribution G(x_1, x_2), i.e., there are functions a_1, a_2 > 0, b_1, and b_2 such that for all x_1, x_2 for which 0 < G(x_1, x_2) < 1,

    lim_{t→∞} t { 1 − F( b_1(t) + x_1 a_1(t), b_2(t) + x_2 a_2(t) ) } = −log G(x_1, x_2) =: Φ(x_1, x_2).

Suppose that the following second-order condition holds: there exists a positive or negative function A with lim_{t→∞} A(t) = 0 and a function Ψ not a multiple of Φ such that for each (x_1, x_2) for which 0 < G(x_1, x_2) < 1,

    lim_{t→∞} [ t (1 − F( b_1(t) + x_1 a_1(t), b_2(t) + x_2 a_2(t) )) − Φ(x_1, x_2) ] / A(t) = Ψ(x_1, x_2),

locally uniformly for (x_1, x_2) ∈ (0, ∞] × (0, ∞]. Show that this second-order condition implies the second-order condition of Section 2.3 for the two marginal distributions. Show that the function A is regularly varying. Show that if the index of A is smaller than zero, condition (7.2.8) of Theorem 7.2.2 holds pointwise.
7.6. Show that if X and Y are independent, the residual dependence parameter η from Section 7.6 for (X, Y) is 1/2.

7.7. Let X, Y, U be independent random variables, where X and Y have distribution function 1 − 1/x, x > 1, and U has distribution function 1 − 1/(x^a log x), x ≥ e, for some a ∈ [1, 2). Show that the distribution function of the random vector (X ∨ U, Y ∨ U) is in the domain of attraction of an extreme value distribution with independent components and residual dependence index 1/a.

8 Estimation of the Probability of a Failure Set
8.1 Introduction
In this chapter we are going to deal with methods to solve the problem posed in a
graphical way in Chapter 6. The wave height (HmO) and still water level (SWL) have
been recorded during 828 storm events that are relevant for the Pettemer Zeewering.
Engineers of RIKZ (Institute for Coastal and Marine Management) have determined
failure conditions, that is, those combinations of HmO and SWL that result in overtopping the seawall, thus creating a dangerous situation. The set of those combinations
forms a failure set C.
Figure 8.1 displays the failure set
C = {(HmO, SWL) : 0.3HmO + SWL > 7.6} ,
as well as 828 independent and identically distributed observations of HmO and SWL.
This set is such that if an independent observation should fall into C, it would (could)
lead to a disaster. The problem is how to determine the probability that an independent
observation falls into this set.
In order to develop statistical methods to deal with this problem, we use the theory developed in Chapters 6 and 7. We start by assuming that there exist normalizing functions a_1, a_2 > 0 and b_1, b_2 real, and a distribution function G with nondegenerate marginals, such that for all continuity points (x, y) of G,
    lim_{t→∞} F^t( a_1(t)x + b_1(t), a_2(t)y + b_2(t) ) = G(x, y).    (8.1.1)

Moreover, we choose the functions a_1, a_2, b_1, b_2 such that

    G(x, ∞) = exp( −(1 + γ_1 x)^{−1/γ_1} ),  1 + γ_1 x > 0,    (8.1.2)

and

    G(∞, y) = exp( −(1 + γ_2 y)^{−1/γ_2} ),  1 + γ_2 y > 0,    (8.1.3)

where γ_1 and γ_2 are the marginal extreme value indices. Then from Section 6.1 (e.g., Theorem 6.1.11),

Fig. 8.1. Failure set C, boundary point (w_n, w_n) and observations. [axis: Wave height (m)]

    lim_{t→∞} t P( (1 + γ_1 (X − b_1(t))/a_1(t))^{1/γ_1} > x or (1 + γ_2 (Y − b_2(t))/a_2(t))^{1/γ_2} > y )
      = −log G( (x^{γ_1} − 1)/γ_1, (y^{γ_2} − 1)/γ_2 ).

Or more generally, with ν the exponent measure defined in Section 6.1.3,

    lim_{t→∞} t P( ( (1 + γ_1 (X − b_1(t))/a_1(t))^{1/γ_1}, (1 + γ_2 (Y − b_2(t))/a_2(t))^{1/γ_2} ) ∈ Q ) = ν(Q)    (8.1.4)

for all Borel sets Q ⊂ ℝ_+² with inf_{(x,y)∈Q} max(x, y) > 0 and ν(∂Q) = 0. Then, for any a > 0 we know that

    ν(aQ) = a^{−1} ν(Q),    (8.1.5)

where

    aQ := { (ax, ay) : (x, y) ∈ Q }
(cf. Theorem 6.1.9), and this property will be our main tool. Note that v is the approximate mean measure of the point process formed by the observations (Theorem
6.1.11). Hence in principle v(Q) can be estimated by just counting the number of
observations in the set Q.
Now recall that we want to estimate P((X, Y) ∈ C). Clearly there is no observation in the failure set. In fact, the observations are all some distance away from the failure set. There has been no dangerous situation around the dike during the observation period. This suggests that in a first approximation, P(C) < 1/n. This particular feature is essential for extreme value problems and we want to capture this


in our approach, based on a limit situation in which the number of observations grows
to infinity.
We have n independent observations (X_1, Y_1), (X_2, Y_2), ..., (X_n, Y_n) with common distribution function F and we have a failure set C with P(C) < 1/n. This means that if we assume n → ∞, in order to preserve the extreme value situation, we have to assume that the failure set is not fixed but depends on n: C = C_n with P(C_n) → 0, n → ∞.
Now we write the probability we want to estimate in terms of the transformed variables:

    p_n := P( (X, Y) ∈ C_n ) = P( ( (1 + γ_1 (X − b_1)/a_1)^{1/γ_1}, (1 + γ_2 (Y − b_2)/a_2)^{1/γ_2} ) ∈ Q_n )    (8.1.6)

with

    Q_n := { ( (1 + γ_1 (x − b_1)/a_1)^{1/γ_1}, (1 + γ_2 (y − b_2)/a_2)^{1/γ_2} ) : (x, y) ∈ C_n }.    (8.1.7)

Since the set Q_n, like the set C_n, does not contain any observations, we divide the set Q_n by a large positive constant c_n such that Q_n/c_n contains a small portion of the observations. This way we can estimate ν(Q_n/c_n) and hence ν(Q_n) = ν(Q_n/c_n)/c_n.
Summing up, the procedure involves the following steps:
1. Marginal transformations

    (X_i, Y_i) ↦ ( (1 + γ_1 (X_i − b_1)/a_1)^{1/γ_1}, (1 + γ_2 (Y_i − b_2)/a_2)^{1/γ_2} ),    (8.1.8)

i = 1, ..., n, in order to transform the marginal distributions approximately to a Pareto distribution with distribution function 1 − 1/x, x > 1. Note that after this transformation the probabilities can be approximated by the measure ν as in (8.1.4).
2. Use the homogeneity property of the measure ν, (8.1.5), in order to pull the transformed failure set to the observations. Next estimate ν by its empirical measure.
The two steps for the given data set are indicated in Figures 8.2 and 8.3.
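The two steps can be sketched as follows (a minimal illustration with hypothetical names; `params` holds plug-in values of γ_1, γ_2, a_1, a_2, b_1, b_2 at t = n/k, and `in_S` is a membership test for the pulled set S):

```python
def to_pareto(point, g1, g2, a1, a2, b1, b2):
    # Step 1: marginal transformation (8.1.8) to roughly standard Pareto margins
    x, y = point
    return ((1 + g1 * (x - b1) / a1) ** (1 / g1),
            (1 + g2 * (y - b2) / a2) ** (1 / g2))

def failure_prob_hat(points, in_S, c_n, params):
    # Step 2: with T_i the transformed points, nu_hat(S) = (1/k) #{i : T_i in S},
    # and by homogeneity (8.1.5) p_hat_n = (k/(n c_n)) nu_hat(S),
    # which simplifies to #{i : T_i in S} / (n c_n); k enters only through
    # the plug-in values in params, which are taken at t = n/k.
    n = len(points)
    count = sum(1 for p in points if in_S(to_pareto(p, *params)))
    return count / (n * c_n)
```

With γ = 1, a = 1, b = 0 the transformation is simply (x, y) ↦ (1 + x, 1 + y), which makes the counting easy to check by hand.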
Next we write analytically (although not yet in a formal way) the reasoning developed above. Let k be an intermediate sequence, i.e., k = k(n) → ∞, k/n → 0, n → ∞. If the failure set C_n can be written as

    C_n = { ( b_1(n/k) + a_1(n/k) (x^{γ_1} − 1)/γ_1, b_2(n/k) + a_2(n/k) (y^{γ_2} − 1)/γ_2 ) : (x, y) ∈ c_n S },    (8.1.9)

Fig. 8.2. Transformed: failure set (8.1.7) (area above the curved line), boundary point (q_n, r_n) and data set (8.1.8). [axis: Transformed wave height]

Fig. 8.3. Transformed data set (8.1.8), boundary point (s_1, s_2) := (q_n/c_n, r_n/c_n) and pulled set S (area above the curved line) from (8.1.9). [axis: Transformed wave height]
where c_n is a positive sequence (generally c_n → ∞, n → ∞) and S is a fixed open set of ℝ², then the marginal transformations (8.1.8) applied to C_n give the set c_n S (called Q_n above). Note that n/k is playing the role of the running variable t considered before. Then, for some fixed open Borel set S ⊂ ℝ_+² with inf_{(x,y)∈S} max(x, y) > 0 and ν(∂S) = 0 we can write (8.1.6) as
    p_n = P( ( (1 + γ_1 (X − b_1(n/k))/a_1(n/k))^{1/γ_1}, (1 + γ_2 (Y − b_2(n/k))/a_2(n/k))^{1/γ_2} ) ∈ c_n S ).    (8.1.10)

This, by (8.1.4), is approximately equal to

    (k/n) ν(c_n S) = (k/(n c_n)) ν(S),    (8.1.11)

where the last equality follows from (8.1.5). This leads to the estimator (defined in more detail below; cf. Theorems 8.2.1 and 8.3.1)

    p̂_n := (k/(n c_n)) ν̂(Ŝ).

Note that S is not known since γ_1, γ_2, a_1, a_2, b_1, b_2 are not known.
Up to this point we have dealt with cn as if it were known. That is, it has played
a similar role to that of the intermediate sequence k = k(n) in the univariate estimation. This way it is to be chosen (under certain bounds) by the statistician. An
alternative way to deal with cn is to incorporate it in the problem itself, and consequently to estimate it along with the other unknown quantities. We shall discuss these
two approaches in two separate subsections.
We add some comments at this point. In the above discussion we assumed ν(S) positive, and this will be the case considered in the next section. In fact this is the case if the random variables X and Y are not asymptotically independent or S contains (at least part of) the axes

    {(x, y) : x > 0 and y = 0} ∪ {(x, y) : x = 0 and y > 0}.    (8.1.12)
The notion of asymptotic independence was first introduced in Section 6.2. Recall that a random vector (X, Y) is said to be asymptotically independent if its distribution function is in the domain of attraction of some extreme value distribution with independent components, i.e., the limiting distribution is the product of its marginals.
In this case we know that the exponent measure from Section 6.1.3 is concentrated on the positive axes given in (8.1.12). In terms of the spectral representation discussed in Section 6.1.4, recall that for 0 ≤ θ_1 < θ_2 ≤ π/2 either for all 0 < r_1 < r_2 < ∞ the set

    { (x, y) : r_1 < √(x² + y²) ≤ r_2, θ_1 ≤ arctan(y/x) ≤ θ_2 }

has positive μ-mass or for no choice of 0 < r_1 < r_2 < ∞ it has positive μ-mass, depending on whether the spectral measure has positive or zero mass in [θ_1, θ_2]. This means that ν(S) is always positive as long as we do not have asymptotic independence. But ν(S) can be positive even under asymptotic independence, e.g., in case S ⊃ (x_1, x_2) × [0, y) for some 0 < x_1 < x_2, y > 0.
Note that the proposed transformations of the failure set Cₙ that lead to the set S are such that certain features of the original set Cₙ are preserved after the transformation to S. For instance, if Cₙ ⊂ [x, ∞) × [y, ∞) then S will also satisfy this, for possibly some other x, y; or if Cₙ ⊃ (x₁, x₂) × (−∞, y) then S ⊃ (x₁, x₂) × [0, y), for possibly some other x₁, x₂, y.
The case ν(S) = 0 is discussed in Section 8.3. Clearly ν(S) = 0 under asymptotic independence if S is contained in a set of the form (x, ∞) × (y, ∞), for some x, y > 0. The procedure is quite similar to that for ν(S) > 0, but additionally it involves the residual independence index η introduced in Section 7.6. For testing asymptotic independence we refer to Section 7.6.

8 Estimation of the Probability of a Failure Set

For simplicity we consider only the two-dimensional case. The generalization to higher dimensions is straightforward. For a similar result in function space see Section 10.5.
Though the procedure is developed with the same choice of the intermediate sequence k for both marginals, it can be adapted for different intermediate sequences. Hence in practice one can use different k's in the estimation of the marginals.

8.2 Failure Set with Positive Exponent Measure


Recall that on the basis of independent and identically distributed random vectors (X₁, Y₁), …, (Xₙ, Yₙ) from F we want to estimate pₙ = P((X, Y) ∈ Cₙ), for some given failure set. In this section we assume that there exists some boundary point (vₙ, wₙ) of Cₙ such that

Cₙ ⊂ {(x, y) : x > vₙ or y > wₙ}     (8.2.1)

for all n. Note that this is a rather weak assumption. For instance, in Figure 8.1 we took the diagonal point (wₙ, wₙ) for (vₙ, wₙ).
Now let us see what happens to this point after the marginal transformations (8.1.8), with t replaced by n/k, where k = k(n) → ∞, k/n → 0, n → ∞. They are illustrated in Figure 8.2. Define

qₙ := ( 1 + γ₁ (vₙ − b₁(n/k))/a₁(n/k) )^{1/γ₁} ,  rₙ := ( 1 + γ₂ (wₙ − b₂(n/k))/a₂(n/k) )^{1/γ₂} ,     (8.2.2)

and assume that limₙ→∞ qₙ/rₙ exists and is finite; this avoids the predominance of one marginal over the other, so that the problem does not become a univariate one in the limit.
8.2.1 First Approach: cₙ Known

In the next theorem we state the necessary conditions for the consistency of p̂ₙ. We opted for a long theorem, which in turn is mostly self-contained in all its conditions and definitions.
Theorem 8.2.1 Let (X₁, Y₁), …, (Xₙ, Yₙ) be an i.i.d. sample from F. Suppose F is in the domain of attraction of an extreme value distribution with normalizing functions aᵢ > 0, bᵢ real, marginal extreme value indices γᵢ, i = 1, 2, and exponent measure ν (cf. (8.1.1)–(8.1.4)).
Consider some estimators γ̂ᵢ, âᵢ(n/k) > 0, b̂ᵢ(n/k) such that for some sequence k = k(n) → ∞, k/n → 0, n → ∞,

√k ( γ̂ᵢ − γᵢ, âᵢ(n/k)/aᵢ(n/k) − 1, (b̂ᵢ(n/k) − bᵢ(n/k))/aᵢ(n/k) ) = (O_P(1), O_P(1), O_P(1)) ,     (8.2.3)

i = 1, 2.
Suppose the failure set Cₙ is an open set for which (8.2.1) holds. Suppose further that Cₙ can be written as

Cₙ = { ( a₁(n/k) ((cₙx)^{γ₁} − 1)/γ₁ + b₁(n/k), a₂(n/k) ((cₙy)^{γ₂} − 1)/γ₂ + b₂(n/k) ) : (x, y) ∈ S } ,     (8.2.4)

where S is an open Borel set in R²₊ with ν(∂S) = 0 and ν(S) > 0, and cₙ a sequence of positive numbers with cₙ → ∞, n → ∞.
Finally, suppose 0 < qₙ/rₙ < ∞ (our conditions imply that qₙ/rₙ does not depend on n),

limₙ→∞ w_{γ₁∧γ₂}(cₙ)/√k = 0 ,     (8.2.5)

where

w_γ(t) := t^{−γ} ∫₁ᵗ s^{γ−1} log s ds ,  t ≥ 1

(the function w_γ(t) has been defined in Theorem 4.4.1), and that (8.1.4) holds with Q replaced by cₙS, i.e.,

limₙ→∞ (n/k) P( ( (1 + γ₁ (X − b₁(n/k))/a₁(n/k))^{1/γ₁}, (1 + γ₂ (Y − b₂(n/k))/a₂(n/k))^{1/γ₂} ) ∈ cₙS ) / ν(cₙS) = 1 .     (8.2.6)

Then, if

p̂ₙ := (k/(n cₙ)) ν̂(Ŝ) ,     (8.2.7)

where

Ŝ := { (1/cₙ) ( (1 + γ̂₁ (x − b̂₁(n/k))/â₁(n/k))^{1/γ̂₁}, (1 + γ̂₂ (y − b̂₂(n/k))/â₂(n/k))^{1/γ̂₂} ) : (x, y) ∈ Cₙ } ,

we have

p̂ₙ/pₙ →P 1 .     (8.2.8)


We postpone the proof to Section 8.2.3. We want to add some comments at this
point.
The estimation of γᵢ, aᵢ(n/k), and bᵢ(n/k), i = 1, 2, is known from univariate extreme value statistics (cf. Chapters 3 and 4). For instance one can use the moment-type estimators of Sections 3.5 and 4.2:

Mₙ^{(j)} := (1/k) Σᵢ₌₁ᵏ ( log X_{n−i+1,n} − log X_{n−k,n} )^j ,  j = 1, 2 ,

γ̂₁ := Mₙ^{(1)} + γ̂₁⁻ ,  γ̂₁⁻ := 1 − (1/2) ( 1 − (Mₙ^{(1)})² / Mₙ^{(2)} )^{−1} ,

b̂₁(n/k) := X_{n−k,n} ,  â₁(n/k) := X_{n−k,n} Mₙ^{(1)} (1 − γ̂₁⁻) ,

where for γ̂₂, â₂(n/k), and b̂₂(n/k) one replaces X by Y in the previous formulas.
Then under the second-order regular variation condition (cf. Definition 2.3.1) for both marginals with auxiliary functions Aᵢ, i = 1, 2, and provided k = k(n) → ∞, k/n → 0, √k Aᵢ(n/k) = O(1), i = 1, 2, as n → ∞, the O_P-property for the individual terms in (8.2.3) follows from Sections 2.2, 3.5, and 4.2. Then they are also jointly O_P(1).
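The moment-type estimators above can be sketched in code as follows (a sketch assuming positive observations; the function name is our own):

```python
import numpy as np

def moment_estimators(x, k):
    """Moment-type estimators of (gamma, a(n/k), b(n/k)) for one marginal,
    following the M_n^{(j)} statistics above (assumes positive data)."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = len(xs)
    t = xs[n - k - 1]                       # X_{n-k,n}
    logs = np.log(xs[n - k:]) - np.log(t)   # log-excesses of the k largest points
    m1 = logs.mean()                        # M_n^{(1)}, the Hill estimator
    m2 = (logs ** 2).mean()                 # M_n^{(2)}
    gamma_neg = 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)   # estimates min(gamma, 0)
    gamma = m1 + gamma_neg                  # moment estimator of gamma
    a = t * m1 * (1.0 - gamma_neg)          # scale estimator a(n/k)
    return gamma, a, t                      # location estimator b(n/k) := X_{n-k,n}
```

Applied to a sample with an exact Pareto tail, the estimate of γ concentrates around the true index as k grows.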
Remark 8.2.2 Note that p̂ₙ and Ŝ may not be defined if 1 + γ̂₁ (Xᵢ − b̂₁(n/k))/â₁(n/k) < 0 for some Xᵢ, and similarly with the second component. However, when checking the proofs one sees that when n → ∞, the probability that this happens tends to zero.
Remark 8.2.3 Note that the relation between k = k(n) and cₙ may restrict the range of possible values of the marginal extreme value indices. For γ₁ ∧ γ₂ < 0, condition (8.2.5) implies

limₙ→∞ cₙ^{−(γ₁∧γ₂)}/√k = limₙ→∞ k^{−1/2−(γ₁∧γ₂)} (k/cₙ)^{γ₁∧γ₂} = 0 .

For instance, if we want to allow k/cₙ = O(1), we must have k^{−1/2−(γ₁∧γ₂)} → 0, which is true only if γ₁ ∧ γ₂ > −1/2.
8.2.2 Alternative Approach: Estimate cₙ

Define, for some r > 0,

cₙ := √(qₙ² + rₙ²)/r ,     (8.2.9)

where qₙ and rₙ are as in (8.2.2). According to (8.1.9) the point (s₁, s₂) := (qₙ/cₙ, rₙ/cₙ) is on the boundary of S. Moreover, from (8.2.9) we have s₁² + s₂² = (qₙ/cₙ)² + (rₙ/cₙ)² = r², that is, (s₁, s₂) is on a circle of radius r and hence close enough to the observations (cf. Figure 8.3).
Let x₁* and x₂* be the right endpoints of the marginal distributions. Note that vₙ ↑ x₁* and wₙ ↑ x₂* imply qₙ → ∞ and rₙ → ∞ respectively, under the domain of attraction condition.
Corollary 8.2.4 Under the conditions of Theorem 8.2.1, with γ₁ ∧ γ₂ > −1/2, define the estimators

q̂ₙ := ( 1 + γ̂₁ (vₙ − b̂₁(n/k))/â₁(n/k) )^{1/γ̂₁} ,     (8.2.10)

r̂ₙ := ( 1 + γ̂₂ (wₙ − b̂₂(n/k))/â₂(n/k) )^{1/γ̂₂} ,     (8.2.11)

ĉₙ := √(q̂ₙ² + r̂ₙ²)/r     (8.2.12)

for some r > 0 (to be chosen by the statistician), and

p̂ₙ* := (k/(n ĉₙ)) ν̂(Ŝ*) ,     (8.2.13)

with

Ŝ* := { (1/ĉₙ) ( (1 + γ̂₁ (x − b̂₁(n/k))/â₁(n/k))^{1/γ̂₁}, (1 + γ̂₂ (y − b̂₂(n/k))/â₂(n/k))^{1/γ̂₂} ) : (x, y) ∈ Cₙ } .     (8.2.14)

Then

p̂ₙ*/pₙ →P 1 .
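The estimators (8.2.10)–(8.2.12) amount to transforming the boundary point marginally and dividing the length of the resulting vector by the chosen radius r. A sketch (illustrative names):

```python
import numpy as np

def estimate_c_n(v_n, w_n, marg1, marg2, r=50.0):
    """Estimate c_n from a boundary point (v_n, w_n) of C_n, cf. (8.2.10)-(8.2.12);
    marg1, marg2 are tuples (gamma, a, b) of estimated marginal quantities."""
    def t(z, gamma, a, b):
        if gamma != 0.0:
            return max(1.0 + gamma * (z - b) / a, 0.0) ** (1.0 / gamma)
        return np.exp((z - b) / a)
    q_hat = t(v_n, *marg1)                          # (8.2.10)
    r_hat = t(w_n, *marg2)                          # (8.2.11)
    c_hat = np.sqrt(q_hat ** 2 + r_hat ** 2) / r    # (8.2.12)
    return c_hat, q_hat, r_hat
```

By construction the transformed boundary point (q̂ₙ/ĉₙ, r̂ₙ/ĉₙ) lies on the circle of radius r.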
Remark 8.2.5 Under much more stringent conditions it can be proved that in case ν(S) > 0, a suitably normalized version of p̂ₙ/pₙ − 1 is asymptotically normal (de Haan and Sinha (1999)).


8.2.3 Proofs
We give some intermediate results from which the consistency of p̂ₙ and p̂ₙ* will follow.


Lemma 8.2.6 Let fₙ(x) and gₙ(x) be strictly increasing continuous functions for all n, with limₙ→∞ fₙ(x) = x and limₙ→∞ gₙ(x) = x for x > 0. For an open set O, let

Oₙ := {(fₙ(x), gₙ(y)) : (x, y) ∈ O} .

Then

1_{Oₙ}(x, y) := 1_{(x,y)∈Oₙ} → 1_O(x, y) := 1_{(x,y)∈O}

for (x, y) ∈ O.

Proof. Take (x, y) ∈ O and ε > 0 such that the square (x − ε, x + ε) × (y − ε, y + ε) is contained in O. For n ≥ n₀ we have |fₙ⁻(x) − x| < ε and |gₙ⁻(y) − y| < ε. Hence (fₙ⁻(x), gₙ⁻(y)) ∈ O for n ≥ n₀. It follows that

1_O(fₙ⁻(x), gₙ⁻(y)) → 1

for (x, y) ∈ O. Now

(x, y) ∈ Oₙ ⟺ (fₙ⁻(x), gₙ⁻(y)) ∈ O ⟺ 1_O(fₙ⁻(x), gₙ⁻(y)) = 1 .

Hence the conclusion. □

Proposition 8.2.7 Let (X₁, Y₁), (X₂, Y₂), … be an i.i.d. sample from F. Suppose F is in the domain of attraction of an extreme value distribution with normalizing functions aᵢ > 0, bᵢ real, marginal extreme value indices γᵢ, i = 1, 2, and exponent measure ν. Let S be an open Borel set in R²₊ with inf_{(x,y)∈S} max(x, y) > 0, ν(∂S) = 0, and ν(S) > 0. For the random variable

ν̃(S) := (1/k) Σᵢ₌₁ⁿ 1{ ( (1 + γ₁ (Xᵢ − b₁(n/k))/a₁(n/k))^{1/γ₁}, (1 + γ₂ (Yᵢ − b₂(n/k))/a₂(n/k))^{1/γ₂} ) ∈ S } ,

with k satisfying k = k(n) → ∞, k/n → 0, n → ∞, we have

ν̃(S) →P ν(S) .
Proof. By the domain of attraction condition (8.1.4) we easily obtain

limₙ→∞ E e^{it ν̃(S)} = e^{it ν(S)}

for all t. □

Next we define

ν̂(S) := (1/k) Σᵢ₌₁ⁿ 1{ ( (1 + γ̂₁ (Xᵢ − b̂₁(n/k))/â₁(n/k))^{1/γ̂₁}, (1 + γ̂₂ (Yᵢ − b̂₂(n/k))/â₂(n/k))^{1/γ̂₂} ) ∈ S } .     (8.2.15)

Note that

ν̂(S) = νₙ(Sₙ) ,

with νₙ denoting ν̃ viewed as a random measure, νₙ(B) := ν̃(B),


where

Sₙ := {(fₙ(x), gₙ(y)) : (x, y) ∈ S} ,     (8.2.16)

fₙ(x) := ( 1 + γ₁ ( â₁(n/k) (x^{γ̂₁} − 1)/γ̂₁ + b̂₁(n/k) − b₁(n/k) ) / a₁(n/k) )^{1/γ₁} ,

gₙ(y) := ( 1 + γ₂ ( â₂(n/k) (y^{γ̂₂} − 1)/γ̂₂ + b̂₂(n/k) − b₂(n/k) ) / a₂(n/k) )^{1/γ₂} ;

that is, fₙ and gₙ are the true marginal transformations composed with the inverses of the estimated ones.

Proposition 8.2.8 Assume the conditions of Proposition 8.2.7 with k satisfying k = k(n) → ∞, k/n → 0, n → ∞, and take ν̂(S) as in (8.2.15). Then

ν̂(S) →P ν(S) .
Proof. Under the domain of attraction condition,

( ν̃(S), γ̂₁, γ̂₂, â₁(n/k)/a₁(n/k), â₂(n/k)/a₂(n/k), (b̂₁(n/k) − b₁(n/k))/a₁(n/k), (b̂₂(n/k) − b₂(n/k))/a₂(n/k) )

converges, in probability, to (ν(S), γ₁, γ₂, 1, 1, 0, 0). Next invoke a Skorohod construction so that we may pretend that this relation holds almost surely. Let Sₙ be as in (8.2.16). By Lemma 8.2.6 we have

1_{Sₙ}(x, y) → 1_S(x, y)     (8.2.17)

for (x, y) ∈ S. Note that the given conditions for Cₙ imply that there exist s₁, s₂ > 0, (s₁, s₂) ∈ ∂S, such that x > s₁ or y > s₂ for all (x, y) ∈ S. It follows that

S ⊂ {(x, y) : x > s₁ or y > s₂} =: D .

Define Dₙ as

Dₙ := {(fₙ(x), gₙ(y)) : (x, y) ∈ D} .

Since fₙ(s₁) → s₁ and gₙ(s₂) → s₂, for n ≥ n₀,

Dₙ ⊂ D_ε := {(x − ε, y − ε) : (x, y) ∈ D} .

Hence for n ≥ n₀,

1_{Sₙ}(x, y) ≤ 1_{Dₙ}(x, y) ≤ 1_{D_ε}(x, y)     (8.2.18)

for all (x, y).


Define the measure v* by
00

v*:=^2"wvn
with the convention vo = v. Let /i n be the density of vn with respect to v*. We know
that

282

8 Estimation of the Probability of a Failure Set


vn(S) = f hndv*-+

f h0 dv* = v(S)

a.s.

By Lemma 8.2.6, Proposition 8.2.7, (8.2.17), (8.2.18), and Pratt's (1960) lemma
(summarizing: if gn -> g pointwise, \gn\ < fn for all n, and f fn -> / / for some
functions /, gn, / , and g, then / gn-^ f g) we have
0(5) = vn(Sn) = J \Snhn dv* -> j lsh0 dv* = v(S) .
It follows that

v(S)-^v(S).

Proposition 8.2.9 Let (X₁, Y₁), (X₂, Y₂), … be an i.i.d. sample from F. Suppose F is in the domain of attraction of an extreme value distribution with normalizing functions aᵢ > 0, bᵢ real, and marginal extreme value indices γᵢ, i = 1, 2. Suppose (8.2.3) holds for some sequence k = k(n) → ∞, k/n → 0, n → ∞. Redefine fₙ(x) and gₙ(x) as

fₙ(x) := cₙ⁻¹ ( 1 + γ₁ ( â₁(n/k) ((cₙx)^{γ̂₁} − 1)/γ̂₁ + b̂₁(n/k) − b₁(n/k) ) / a₁(n/k) )^{1/γ₁} ,

gₙ(x) := cₙ⁻¹ ( 1 + γ₂ ( â₂(n/k) ((cₙx)^{γ̂₂} − 1)/γ̂₂ + b̂₂(n/k) − b₂(n/k) ) / a₂(n/k) )^{1/γ₂} ,

with cₙ → ∞ as n → ∞, and

limₙ→∞ w_{γ₁∧γ₂}(cₙ)/√k = 0 .     (8.2.19)

Then

fₙ(x) → x  and  gₙ(x) → x ,

in probability, for all x > 0.


Proof. We prove the first of the two statements. Invoke the Skorohod construction, so that (8.2.3) holds almost surely. Next we consider the cases γ₁ ≠ 0 and γ₁ = 0 separately.
Start with γ₁ ≠ 0. Then from (8.2.3),

fₙ(x) = cₙ⁻¹ { (cₙx)^{γ̂₁} (1 + O(1/√k)) + O(1/√k) }^{1/γ₁}
     = x^{γ̂₁/γ₁} cₙ^{(γ̂₁−γ₁)/γ₁} { 1 + O(1/√k) + O((cₙx)^{−γ̂₁}/√k) }^{1/γ₁} .

Now we deal with the factors separately, and for that we use condition (8.2.19). Note that (8.2.19) implies w_{γᵢ}(cₙ)/√k → 0, i = 1, 2. Since

w_γ(t) ~ γ⁻¹ log t (γ > 0),  w₀(t) = (log² t)/2,  w_γ(t) ~ t^{−γ}/γ² (γ < 0),  t → ∞,

regardless of whether γ₁ > 0 or γ₁ < 0 we have cₙ^{−γ₁}/√k → 0 and (log cₙ)/√k → 0. Hence x^{γ̂₁/γ₁} → x, cₙ^{(γ̂₁−γ₁)/γ₁} = exp{ ((γ̂₁ − γ₁)/γ₁) log cₙ } → 1, and the factor in braces tends to 1. The result follows for γ₁ ≠ 0.
Next consider γ₁ = 0. Then

fₙ(x) = cₙ⁻¹ exp( ( â₁(n/k) ((cₙx)^{γ̂₁} − 1)/γ̂₁ + b̂₁(n/k) − b₁(n/k) ) / a₁(n/k) ) .

We first prove that limₙ→∞ fₙ⁻(x) = x, and we write

fₙ⁻(x) = cₙ⁻¹ ( 1 + γ̂₁ ( a₁(n/k) log(cₙx) + b₁(n/k) − b̂₁(n/k) ) / â₁(n/k) )^{1/γ̂₁}
       = x exp{ (1/γ̂₁) log( 1 + γ̂₁ ( a₁(n/k) log(cₙx) + b₁(n/k) − b̂₁(n/k) ) / â₁(n/k) ) − log(cₙx) } .

Now note that

| ((cₙx)^{γ̂₁} − 1)/γ̂₁ − log(cₙx) | ≤ |γ̂₁| (cₙx)^{|γ̂₁|} log²(cₙx) = ( √k |γ̂₁| ) x^{|γ̂₁|} cₙ^{|γ̂₁|} log²(cₙx)/√k .

Then use (8.2.3) with the Skorohod construction and assumption (8.2.19), which for γ = 0 implies log²(cₙ)/√k → 0, n → ∞. Hence limₙ→∞ fₙ⁻(x) = x, and hence also limₙ→∞ fₙ(x) = x. □

Proposition 8.2.10 Assume the conditions of Theorem 8.2.1. Let ν̂(Ŝ) be as in (8.2.15) with S replaced by Ŝ. Then

ν̂(Ŝ) →P ν(S) .

Proof. From Proposition 8.2.8 we know that ν̂(S) →P ν(S). Invoke a Skorohod construction, so that

ν̂(S) − ν(S) = o(1)  a.s.

and

√k ( γ̂ᵢ − γᵢ, âᵢ(n/k)/aᵢ(n/k) − 1, (b̂ᵢ(n/k) − bᵢ(n/k))/aᵢ(n/k) ) = (O(1), O(1), O(1))  a.s.

for i = 1, 2. Then from Lemma 8.2.6 and Proposition 8.2.9 we have

1_{Ŝₙ}(x, y) → 1_S(x, y)

for (x, y) ∈ S, where Ŝₙ is defined as Sₙ in (8.2.16) with fₙ and gₙ from Proposition 8.2.9. The rest of the proof is like that of Proposition 8.2.8. □

Proof (of Theorem 8.2.1). By (8.2.6) and (8.1.11),

pₙ = (k/(n cₙ)) ν(S) (1 + o(1)) ,

as n → ∞. Therefore from Proposition 8.2.10,

p̂ₙ/pₙ = ( (k/(n cₙ)) ν̂(Ŝ) ) / ( (k/(n cₙ)) ν(S) (1 + o(1)) ) = ( ν̂(Ŝ)/ν(S) ) (1 + o(1)) →P 1 . □

Proof (of Corollary 8.2.4). It is enough to prove that ĉₙ/cₙ →P 1, n → ∞. For this, note that

q̂ₙ/cₙ = fₙ⁻(qₙ/cₙ) ,

with fₙ as in Proposition 8.2.9. Then from Proposition 8.2.9,

q̂ₙ/qₙ →P 1 .

Similarly r̂ₙ/rₙ →P 1. The result follows. □

Remark 8.2.11 The estimation of qₙ and rₙ is practically the same as tail estimation as discussed in Section 4.4. Since the conditions of Corollary 4.4.5 are satisfied (note that qₙ → ∞ corresponds to dₙ → ∞ there), one could alternatively invoke this result to prove the consistency of ĉₙ.


8.3 Failure Set Contained in an Upper Quadrant; Asymptotically Independent Components
Let (X, Y) be a random vector with distribution function F and suppose that F satisfies
the domain of attraction condition (8.1.1). Under this condition, a particular case is
that in which the limiting distribution is the product of its marginal distributions.
Recall from Section 6.2 that a random vector (X, Y) whose distribution is in the
domain of attraction of such a max-stable distribution was defined to be asymptotically
independent.
Let us start with the failure set as an upper quadrant. From (8.1.1) one gets

lim_{t→∞} t P( (X − b₁(t))/a₁(t) > x or (Y − b₂(t))/a₂(t) > y ) = −log G(x, y) ,

and hence

lim_{t→∞} t P( (X − b₁(t))/a₁(t) > x and (Y − b₂(t))/a₂(t) > y ) = log G(x, y) − log G(x, ∞) − log G(∞, y) ,

and in case of asymptotic independence the right-hand side is identically zero. More generally, if Q is any Borel set contained in [u, ∞) × [v, ∞), with u, v > 0 and ν(∂Q) = 0, then under asymptotic independence of (X, Y),

lim_{t→∞} t P( ( (1 + γ₁ (X − b₁(t))/a₁(t))^{1/γ₁}, (1 + γ₂ (Y − b₂(t))/a₂(t))^{1/γ₂} ) ∈ Q ) = 0 .

This gives too little information on the probability of the set Q.
To estimate

pₙ = P((X, Y) ∈ Cₙ)

when

Cₙ ⊂ [vₙ, ∞) × [wₙ, ∞) ,

we propose the following refinement of (8.1.4), which will lead to a new limit measure ν: for x, y > 0 and some functions a₁, a₂, r positive and b₁, b₂ real, r(t) → ∞,

lim_{t→∞} r(t) P( (1 + γ₁ (X − b₁(t))/a₁(t))^{1/γ₁} > x and (1 + γ₂ (Y − b₂(t))/a₂(t))^{1/γ₂} > y )     (8.3.1)

exists, and it is positive and finite.
Then, similarly as in Section 6.1.3, one can define the measure ν as follows: for any Borel set Q in R²₊ with inf_{(x,y)∈Q} max(x, y) > 0 and ν(∂Q) = 0, let

ν(Q) := lim_{t→∞} r(t) P( ( (1 + γ₁ (X − b₁(t))/a₁(t))^{1/γ₁}, (1 + γ₂ (Y − b₂(t))/a₂(t))^{1/γ₂} ) ∈ Q ) .     (8.3.2)

Moreover, it follows that the function r is regularly varying with index greater than or equal to 1. Using the notation introduced in Section 7.6, the index of the regularly varying function r is 1/η, where η ∈ (0, 1] is the residual independence index. Also, as in the proof of Theorem 6.1.9, it follows that ν is homogeneous of order −1/η, i.e.,

ν(aQ) = a^{−1/η} ν(Q) ,     (8.3.3)

for any a > 0, where aQ is the set obtained by multiplying all elements of Q by a.
Note that (8.3.2) is valid also if X and Y are not asymptotically independent. In this case r(t) = t, η = 1, and ν = μ, the exponent measure of Section 6.1.3.
We are now ready to proceed with the estimation of pₙ, which closely follows the reasoning developed in the previous section. Using again (8.1.8),

pₙ = P((X, Y) ∈ Cₙ) = P( ( (1 + γ₁ (X − b₁(n/k))/a₁(n/k))^{1/γ₁}, (1 + γ₂ (Y − b₂(n/k))/a₂(n/k))^{1/γ₂} ) ∈ cₙS ) ,

which, now by (8.3.2), is approximately equal to (cf. (8.1.9))

ν(cₙS)/r(n/k) = ν(S) / ( cₙ^{1/η} r(n/k) ) ,

where the last equality follows from (8.3.3). Comparing with the previous section, apart from estimating S and ν, we now have to deal with the parameter η. But this was the subject of Section 7.6, from which we know how to estimate η.
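Although the details are in Section 7.6, a common recipe for estimating η in this spirit transforms both marginals to the standard Pareto scale by ranks, takes the componentwise minimum, and applies the Hill estimator to its k largest values. The sketch below follows that recipe; it is not necessarily the exact estimator of Section 7.6:

```python
import numpy as np

def eta_hat(X, Y, k):
    """Sketch of a residual-independence-index estimator: rank-transform both
    marginals to standard Pareto scale, take the componentwise minimum T_i,
    and apply the Hill estimator to the k largest T_i."""
    n = len(X)
    rx = np.argsort(np.argsort(X)) + 1      # ranks 1..n
    ry = np.argsort(np.argsort(Y)) + 1
    px = n / (n + 1.0 - rx)                 # approx 1/(1 - F1(X_i))
    py = n / (n + 1.0 - ry)
    T = np.sort(np.minimum(px, py))
    return np.mean(np.log(T[n - k:]) - np.log(T[n - k - 1]))   # Hill estimator
```

For independent components the estimate concentrates near 1/2, and for completely dependent components near 1, in line with the interpretation of η.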
In the next theorem we state the necessary conditions for the consistency of p̂ₙ. As in the previous section we opted for a long theorem that is mostly self-contained in all its conditions and definitions. The proof is left to the reader (Exercises 8.1–8.3).

Theorem 8.3.1 Let (X₁, Y₁), …, (Xₙ, Yₙ) be i.i.d. random vectors with distribution function F, satisfying (8.3.1) for some positive function r, with marginal extreme value indices γᵢ and normalizing functions aᵢ > 0, bᵢ real, i = 1, 2.
Consider some estimators γ̂ᵢ, âᵢ(n/k) > 0, b̂ᵢ(n/k), i = 1, 2, and η̂ such that for some sequence k = k(n) with k/n → 0 and r(n/k)/n → 0 (this implies k → ∞), n → ∞,

√k ( γ̂ᵢ − γᵢ, âᵢ(n/k)/aᵢ(n/k) − 1, (b̂ᵢ(n/k) − bᵢ(n/k))/aᵢ(n/k) ) = (O_P(1), O_P(1), O_P(1)) ,     (8.3.4)

i = 1, 2, and

√k (η̂ − η) = O_P(1) .     (8.3.5)

Suppose Cₙ is an open set and that there exists some boundary point (vₙ, wₙ) of Cₙ such that

Cₙ ⊂ [vₙ, ∞) × [wₙ, ∞)

for all n, and that Cₙ can be written as

Cₙ = { (x, y) : ( (1 + γ₁ (x − b₁(n/k))/a₁(n/k))^{1/γ₁}, (1 + γ₂ (y − b₂(n/k))/a₂(n/k))^{1/γ₂} ) ∈ cₙS } ,     (8.3.6)

where S is an open Borel set in [0, ∞)² with ν(∂S) = 0 and ν(S) > 0, and cₙ a sequence of positive numbers with cₙ → ∞, n → ∞.
Finally, suppose 0 < qₙ/rₙ < ∞ with qₙ and rₙ as in (8.2.2) (our conditions imply that qₙ/rₙ does not depend on n),

limₙ→∞ w_{γ₁∧γ₂}(cₙ)/√k = 0 ,     (8.3.7)

limₙ→∞ (log qₙ)/√k = 0 ,     (8.3.8)

and that (8.3.2) holds with Q replaced by cₙS, i.e.,

limₙ→∞ r(n/k) P( ( (1 + γ₁ (X − b₁(n/k))/a₁(n/k))^{1/γ₁}, (1 + γ₂ (Y − b₂(n/k))/a₂(n/k))^{1/γ₂} ) ∈ cₙS ) / ν(cₙS) = 1 .     (8.3.9)

Then, with

p̂ₙ := cₙ^{−1/η̂} (1/n) Σᵢ₌₁ⁿ 1{ ( (1 + γ̂₁ (Xᵢ − b̂₁(n/k))/â₁(n/k))^{1/γ̂₁}, (1 + γ̂₂ (Yᵢ − b̂₂(n/k))/â₂(n/k))^{1/γ̂₂} ) ∈ Ŝ } ,     (8.3.10)

where

Ŝ := { (1/cₙ) ( (1 + γ̂₁ (x − b̂₁(n/k))/â₁(n/k))^{1/γ̂₁}, (1 + γ̂₂ (y − b̂₂(n/k))/â₂(n/k))^{1/γ̂₂} ) : (x, y) ∈ Cₙ } ,

we have

p̂ₙ/pₙ →P 1 .

Remark 8.3.2 The sequence cₙ can be estimated as in Section 8.2.2.



Fig. 8.4. Diagram of estimates of y with moment estimators.

8.4 Sea Level Case Study


As described at the beginning of this chapter, we have 828 independent and identically
distributed observations of the wave height (HmO) and the still water level (SWL).
These are illustrated in Figure 8.1, as well as the failure set
Cn = {(HmO, SWL) : 0.3HmO + SWL > 7.6} .
The first step in the estimation of pₙ = P((X, Y) ∈ Cₙ) is the estimation of the marginals. For that see Sections 3.7.3 and 4.6.2. In Figure 8.4 we show the diagram of estimates of γ, i.e., the estimates against the number of upper order statistics k, for both samples. We use the moment-type estimators given after Theorem 8.2.1. According to earlier results (Section 3.7.3) we take zero for the extreme value index of SWL, γ_SWL = 0. We show results for the window 40 ≤ k ≤ 110, which seems quite reasonable. In Table 8.1 we give point estimates for k = 100, which were the ones used for Figures 8.2 and 8.3.
As discussed at the end of Section 8.1, the shape of the failure set may determine
the method to estimate pn. Since in our case the set S contains part of both axes,
we are in the conditions of Section 8.2. In Figure 8.5 we show transformed sets S
Table 8.1. Point estimates for k = 100.

                     HmO       SWL
γ̂                   −0.22      0.00
â(n/k)               1.14      0.30
b̂(n/k)               4.10      1.34
q̂ₙ (vₙ = 5.85)       6.85
r̂ₙ (wₙ = 5.85)                 2879640
ĉₙ (r = 50)              57592.80
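The tabulated values can be checked against (8.2.10)–(8.2.12). With the rounded inputs above, the computation reproduces q̂ₙ and the order of magnitude of r̂ₙ and ĉₙ; exact agreement is not expected, since the table was computed from unrounded estimates:

```python
import numpy as np

# Table 8.1 point estimates (k = 100); boundary point (v_n, w_n) = (5.85, 5.85)
gamma1, a1, b1 = -0.22, 1.14, 4.10   # HmO marginal
a2, b2 = 0.30, 1.34                  # SWL marginal, gamma2 = 0

q_n = (1.0 + gamma1 * (5.85 - b1) / a1) ** (1.0 / gamma1)   # (8.2.10)
r_n = np.exp((5.85 - b2) / a2)                              # (8.2.11), gamma = 0 case
c_n = np.sqrt(q_n ** 2 + r_n ** 2) / 50.0                   # (8.2.12), radius r = 50
```

Note how r̂ₙ dwarfs q̂ₙ, so ĉₙ is essentially r̂ₙ/50; the heavier standardized SWL coordinate dominates the length of the transformed boundary point.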


Fig. 8.5. Transformed data set and transformed failure sets (area above the curved line).
for r = 10, 50, 100 (we use the approach of Section 8.2.2, where, recall, r denotes the radius of the circle to which the boundary point (ŝ₁, ŝ₂) belongs, illustrated in the picture by the diamond point). One sees that there is quite a difference in the number of points belonging to each set Ŝ, but the effect of this on the estimates of p̂ₙ is quite negligible. In Figure 8.6 one finds the diagram of estimates of p̂ₙ over the window 40 ≤ k ≤ 110 and for r = 10, 50, 100.

Fig. 8.6. Diagram of estimates of p̂ₙ for r = 10, 50, 100.

Exercises

In the next three exercises one gradually proves Theorem 8.3.1.

8.1. Let (X₁, Y₁), …, (Xₙ, Yₙ) be i.i.d. random vectors with distribution function F, satisfying (8.3.1) for some positive function r and limit measure ν, with normalizing functions aᵢ > 0, bᵢ real, and marginal extreme value indices γᵢ, i = 1, 2. Let S be an open Borel set in R²₊ with inf_{(x,y)∈S} max(x, y) > 0, ν(∂S) = 0, and ν(S) > 0. Introduce the random variable

ν̃(S) := (r(n/k)/n) Σᵢ₌₁ⁿ 1{ ( (1 + γ₁ (Xᵢ − b₁(n/k))/a₁(n/k))^{1/γ₁}, (1 + γ₂ (Yᵢ − b₂(n/k))/a₂(n/k))^{1/γ₂} ) ∈ S } ,

with k satisfying k = k(n), k/n → 0, and r(n/k)/n → 0 (this implies k → ∞), as n → ∞. Under the stated conditions prove that

ν̃(S) →P ν(S) .

Hint: See Proposition 8.2.7.

8.2. Let (X₁, Y₁), …, (Xₙ, Yₙ) be i.i.d. random vectors with distribution function F, satisfying (8.3.1) for some positive function r and limit measure ν, with normalizing functions aᵢ > 0, bᵢ real, and marginal extreme value indices γᵢ, i = 1, 2. Suppose (8.3.4) holds for some sequence k = k(n), k/n → 0, r(n/k)/n → 0 (this implies k → ∞), n → ∞. Let S be an open Borel set in R²₊ with inf_{(x,y)∈S} max(x, y) > 0, ν(∂S) = 0, and ν(S) > 0. Define ν̂(S) as ν̃(S) with γᵢ, aᵢ(n/k), bᵢ(n/k) replaced by their estimators. Prove that

ν̂(S) →P ν(S) .

Hint: See Proposition 8.2.8.

8.3. Prove Theorem 8.3.1.

8.4. Under the conditions of Theorem 8.3.1 and with ĉₙ as in (8.2.10)–(8.2.12), prove that the corresponding estimator p̂ₙ* (with cₙ replaced by ĉₙ) satisfies p̂ₙ*/pₙ →P 1.
Part III

Observations That Are Stochastic Processes

9
Basic Theory in C[0,1]

9.1 Introduction: An Example


Infinite-dimensional extreme value theory is not just a theoretical extension of the
theory to a more abstract context. It serves to solve concrete problems. We start with
a motivating example.
The two northern provinces of the Netherlands, Friesland and Groningen, lie
almost completely below sea level. Since there are no natural coast defenses like
sand dunes, the area is protected against inundations by a long dike. Since there is
no subdivision of the area by dikes, a breach in the dike at any place could lead to
flooding of the entire area. This leads to the following mathematical problem.
Suppose we have a deterministic function / defined on [0,1] (representing the
top of the dike). Suppose we have independent and identically distributed random
functions Xi, X 2 , . . . defined on [0, 1] (representing observations of high-tide water
levels monitored along the coast). The question is, how can we estimate
P(X(s) < f(s) for all s ∈ [0, 1])
on the basis of n observed independent realizations of the process X (n large)?
Now, a typical feature of this kind of problem is that none of the observed processes
X comes even close to the boundary / , that is, during the observation period there
has not been any flooding. This means that we have to extrapolate the distribution of
X far into the tail. Since nonparametric methods cannot be used, we resort to limit
theory; that is, we imagine that n -> 00, but in doing so we wish to keep the essential
feature that the observations are far from the boundary. This leads to the assumption
that / is not a fixed function when n -> 00 but that in fact / depends on n and
moves to the upper boundary of the distribution of X when n -> 00. So, as in the
finite-dimensional case, in order to answer this question, we need a limit theory for
the pointwise maximum of independent and identically distributed random functions,
and this is the subject of the present chapter. In fact, this theory of infinite-dimensional
extremes is quite analogous to the corresponding theory in finite-dimensional space
explained in Chapter 6.
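To see the pointwise maximum in action, one can simulate a toy model (an assumption of ours, not from the text) in which each observed process is a random multiple of a fixed continuous shape function; the pointwise maximum of n copies, divided by n, is then again a process of the same form:

```python
import numpy as np

def pointwise_max_simulation(n=1000, m=2000, grid=50, seed=0):
    """Toy model: X_i(s) = Z_i * psi(s) with Z_i standard Frechet and a fixed
    continuous psi > 0.  Returns the Monte Carlo estimate of
    P(normalized pointwise maximum <= psi at s = 0), which should be exp(-1)."""
    rng = np.random.default_rng(seed)
    s = np.linspace(0.0, 1.0, grid)
    psi = 1.0 + 0.5 * np.sin(2 * np.pi * s)        # deterministic shape function
    Z = 1.0 / -np.log(rng.uniform(size=(m, n)))    # standard Frechet variables
    M = Z.max(axis=1) / n                           # m replicates of (max Z_i)/n
    Y = M[:, None] * psi[None, :]                   # normalized pointwise maxima
    # marginal check: Y(s) <= psi(s) iff M <= 1, an event of probability e^{-1}
    return np.mean(Y[:, 0] <= psi[0])
```

Here (max_{i≤n} Zᵢ)/n is again exactly standard Fréchet, so the normalized pointwise maximum has the same law as one observation of the process — a first, very simple instance of max-stability.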


Let X, X₁, X₂, … be independent and identically distributed stochastic processes on [0, 1] with continuous sample paths, i.e., belonging to C[0, 1], the space of continuous functions f on [0, 1] equipped with the supremum norm |f|∞ = sup_{s∈[0,1]} |f(s)|.
We consider extreme value theory in C[0, 1]. That is, we assume that there exist continuous functions aₛ(n) positive and bₛ(n) real, n = 1, 2, …, such that the sequence of stochastic processes

{ max_{i≤n} (Xᵢ(s) − bₛ(n)) / aₛ(n) }_{s∈[0,1]}

converges weakly (or in distribution) in C[0, 1] to a stochastic process {Y(s)}_{s∈[0,1]} with nondegenerate marginals, i.e., Y(s) is nondegenerate for all s ∈ [0, 1]. Hence we take the maximum pointwise and consider convergence of the resulting stochastic process.
The parameter set [0, 1] has been chosen for convenience. Most results hold for any compact subset of a Euclidean space.
For convenience we shall sometimes refer to the parameter t as "time" and to the parameter s as "space."
We want to investigate the structure of the possible limit processes Y, and for each of them we want to characterize the domain of attraction.
We want to investigate the structure of the possible limit processes Y and for each
of them we want to characterize the domain of attraction.

9.2 The Limit Distribution; Standardization


As in the finite-dimensional case the problem becomes more tractable if we first transform to processes with standard marginal distributions. For that, assume that the marginal distribution functions Fₛ(x) := P(X(s) ≤ x) are continuous in x. Then in order to transform to standard marginals note that the convergence in C[0, 1] entails

Fₛⁿ(aₛ(n)x + bₛ(n)) → P(Y(s) ≤ x)     (9.2.1)

as n → ∞, uniformly for s ∈ [0, 1] and locally uniformly in x. In particular, note that it means that the distribution functions of X(s) and Y(s) are continuous in s. We can choose the functions aₛ(n) and bₛ(n) in such a way that P(Y(s) ≤ x) is an extreme value distribution in the von Mises form (cf. Section 1.1.3). Then

P(Y(s) ≤ x) = exp( −(1 + γ(s)x)^{−1/γ(s)} )     (9.2.2)

for s ∈ [0, 1] and all x with 1 + γ(s)x > 0, where γ is a continuous function.
From (9.2.1) and (9.2.2) we get

limₙ→∞ n { 1 − Fₛ(aₛ(n)x + bₛ(n)) } = (1 + γ(s)x)^{−1/γ(s)}     (9.2.3)

uniformly for s ∈ [0, 1] and locally uniformly for x with 1 + γ(s)x > 0. Since convergence of a sequence of monotone functions is equivalent to convergence of their inverses (Lemma 1.1.1), (9.2.3) is equivalent to

limₙ→∞ (Uₛ(nu) − bₛ(n)) / aₛ(n) = (u^{γ(s)} − 1) / γ(s)

uniformly for s ∈ [0, 1] and locally uniformly for u ∈ (0, ∞), where Uₛ is the left-continuous inverse of 1/(1 − Fₛ) for s ∈ [0, 1].
We have

{ max_{i≤n} (Xᵢ(s) − bₛ(n)) / aₛ(n) }_{s∈[0,1]} →d {Y(s)}_{s∈[0,1]}     (9.2.4)

in C[0, 1] and (9.2.3). By combining the two and using the uniformity in both statements we get

{ max_{i≤n} n⁻¹ (1 − Fₛ(Xᵢ(s)))⁻¹ }_{s∈[0,1]} →d { (1 + γ(s) Y(s))^{1/γ(s)} }_{s∈[0,1]}

in C[0, 1]. We have proved the following result.


Theorem 9.2.1 Let X, X₁, X₂, … be i.i.d. stochastic processes in C[0, 1]. Let aₛ(n) positive and bₛ(n) real be continuous functions, {Y(s)}_{s∈[0,1]} a stochastic process in C[0, 1],

Fₛ(x) := P(X(s) ≤ x)

continuous in x, and Uₛ the left-continuous inverse of 1/(1 − Fₛ). The following statements are equivalent:

1.
{ max_{i≤n} (Xᵢ(s) − bₛ(n)) / aₛ(n) }_{s∈[0,1]} →d {Y(s)}_{s∈[0,1]}
in C[0, 1], where aₛ(n) and bₛ(n) are chosen such that −log P(Y(s) ≤ x) = (1 + γ(s)x)^{−1/γ(s)} for all x with 1 + γ(s)x > 0.

2.
{ max_{i≤n} n⁻¹ (1 − Fₛ(Xᵢ(s)))⁻¹ }_{s∈[0,1]} →d {Z(s)}_{s∈[0,1]}     (9.2.5)
in C[0, 1], and for all u ∈ (0, ∞),
limₙ→∞ (Uₛ(nu) − bₛ(n)) / aₛ(n) = (u^{γ(s)} − 1) / γ(s)     (9.2.6)
uniformly for s ∈ [0, 1].

The relation between Y and Z is {Z(s)}_{s∈[0,1]} =d {(1 + γ(s) Y(s))^{1/γ(s)}}_{s∈[0,1]}.

Remark 9.2.2 Relation (9.2.6) means that the function Uₛ(t) is extended regularly varying with an extra parameter (see Section B.4). We call the continuous function γ = γ(s) the index function.
Since relation (9.2.6) is not difficult to handle (cf. Section B.4 in Appendix B), this theorem reduces our problem to studying the limit relation

{ n⁻¹ max_{i≤n} ξᵢ(s) }_{s∈[0,1]} →d {η(s)}_{s∈[0,1]}     (9.2.7)

in C[0, 1] for independent and identically distributed stochastic processes ξ₁, ξ₂, … in

C⁺[0, 1] := {f ∈ C[0, 1] : f ≥ 0} ,

where, for s ∈ [0, 1],

P(η(s) ≤ 1) = e⁻¹ .     (9.2.8)

Theorem 9.2.3 Suppose η₁, η₂, … are i.i.d. copies of the process η from (9.2.7). Then for all positive integers k,

k⁻¹ ⋁ⱼ₌₁ᵏ ηⱼ =d η .     (9.2.9)

Proof. Let k and n be positive integers. We write

(nk)⁻¹ ⋁ᵢ₌₁^{nk} ξᵢ = k⁻¹ ⋁ⱼ₌₁ᵏ ( n⁻¹ ⋁ᵣ₌₁ⁿ ξ_{(j−1)n+r} ) ,

where the inner maxima are independent and all have the distribution of n⁻¹ ⋁ᵢ₌₁ⁿ ξᵢ. Now keep k fixed and let n tend to infinity. Then the left-hand side tends to η in distribution and the right-hand side tends to k⁻¹ ⋁ⱼ₌₁ᵏ ηⱼ in distribution, where the ηⱼ are independent and have the same distribution as η. □

Definition 9.2.4 A stochastic process η on C⁺[0, 1] with nondegenerate marginals is called simple max-stable if (9.2.9) holds and P(η(s) ≤ 1) = e⁻¹ for all s ∈ [0, 1] (i.e., it has standard Fréchet marginal distributions).

Hence the class of limit processes of Theorem 9.2.1(2) is the same as the class of simple max-stable processes.
More generally we have the following:

Definition 9.2.5 A stochastic process Y on C[0, 1] with nondegenerate marginals is max-stable if there are continuous functions aₛ(n) positive and bₛ(n) real such that if Y₁, Y₂, … are i.i.d. copies of Y, then

{ ⋁ᵢ₌₁ⁿ (Yᵢ(s) − bₛ(n)) / aₛ(n) }_{s∈[0,1]} =d {Y(s)}_{s∈[0,1]}

for n = 1, 2, ….
Hence the class of max-stable processes coincides with the class of limit processes in (9.2.4).

9.3 The Exponent Measure


Next we take (9.2.7) as our point of departure. So let ξ, ξ₁, ξ₂, … be independent and identically distributed stochastic processes in C⁺[0, 1] for which (9.2.7) and (9.2.8) hold. It is useful at this point to introduce a sequence of measures νₙ, n = 1, 2, 3, …. For Borel sets A ⊂ C⁺[0, 1] define the measures νₙ by

νₙ(A) := n P(n⁻¹ξ ∈ A) .     (9.3.1)
We shall prove that, as in the finite-dimensional case, the sequence of measures νₙ converges in a certain sense. In order to do so, we want to show first that the sequence νₙ is relatively compact in a suitable complete separable metric space (CSMS, cf. Daley and Vere-Jones (1988)). Note that C⁺[0, 1] is not a CSMS. Hence we need to extend the space C⁺[0, 1] and we also need to change the metric.
Since the transformation

f ↦ (|f|∞, f/|f|∞)

is one-to-one on C⁺[0, 1], we can write

C⁺[0, 1] = (0, ∞) × C₁⁺[0, 1] ,

where

C₁⁺[0, 1] := {f ∈ C[0, 1] : f ≥ 0, |f|∞ = 1} .     (9.3.2)

Next we enlarge the space (0, ∞) × C₁⁺[0, 1] to (0, ∞] × C₁⁺[0, 1]. The space C₁⁺[0, 1] equipped with the supremum norm is a CSMS, and we turn (0, ∞] into a CSMS by introducing the metric ρ(x, y) := |1/x − 1/y|. Hence finally we introduce

C̄ρ⁺[0, 1] := (0, ∞] × C₁⁺[0, 1] ,     (9.3.3)

with the lower index ρ meaning that the space (0, ∞] is equipped with the metric ρ, and we consider νₙ as a measure on C̄ρ⁺[0, 1] for each n. Despite the introduction of the new normed space, for convenience we still use the notation |f|∞ for sup_{0≤s≤1} f(s).
Theorem 9.3.1 Let ξ, ξ₁, ξ₂, … be i.i.d. stochastic processes in C⁺[0, 1]. If

{ n⁻¹ ⋁ᵢ₌₁ⁿ ξᵢ(s) }_{s∈[0,1]} →d {η(s)}_{s∈[0,1]}

in C⁺[0, 1], then

νₙ →d ν

in C̄ρ⁺[0, 1], where νₙ(A) := n P(n⁻¹ξ ∈ A) for n = 1, 2, ….
Equivalently, for every Borel set A in {f ∈ C[0, 1] : f ≥ 0} such that inf{|f|∞ : f ∈ A} > 0 and ν(∂A) = 0, we have

limₙ→∞ νₙ(A) = ν(A) .

The relation between the probability distribution of η and the measure ν is that, for m = 1, 2, …,

P(η ∈ A_{K,x}) = exp( −ν(A_{K,x}ᶜ) )

with, for K = (K₁, …, Kₘ) compact sets in [0, 1] and x = (x₁, …, xₘ) positive,

A_{K,x} := { f ∈ C⁺[0, 1] : f(s) ≤ xⱼ for s ∈ Kⱼ, j = 1, 2, …, m } .

Later on we shall need a refinement of the result.


Corollary 9.3.2 The conditions of the theorem imply

lim_{t→∞} t P(t⁻¹ξ ∈ A) = ν(A) < ∞ ,

where t runs through the reals, for each Borel set A in {f ∈ C[0, 1] : f ≥ 0} such that inf{|f|∞ : f ∈ A} > 0 and ν(∂A) = 0.

Proof. Let m = 1, 2, 3, …, let K₁, K₂, …, Kₘ be compact sets, and x₁, x₂, …, xₘ positive numbers. Define

B := { f : f(s) ≤ xⱼ for s ∈ Kⱼ, j = 1, 2, …, m }ᶜ .     (9.3.4)

For t ≥ 1,

((⌊t⌋ + 1)/⌊t⌋) ⌊t⌋ P(⌊t⌋⁻¹ ξ ∈ B) ≥ t P(t⁻¹ ξ ∈ B) ≥ (⌊t⌋/(⌊t⌋ + 1)) (⌊t⌋ + 1) P((⌊t⌋ + 1)⁻¹ ξ ∈ B) ,

and both the right- and left-hand sides converge to ν(B), for all sets B of the form (9.3.4), when t → ∞. Since the measure ν is determined by its values on sets of that form, the proof is complete. □

Remark 9.3.3 Note (Daley and Vere-Jones (1988), A.2.6) that the conclusion of the theorem amounts to νₙ →d ν in the "weak hash" topology (w#), or equivalently to weak convergence in any subspace of the form {f : |f|∞ > a}, a > 0.

For the proof of the theorem we need two lemmas.

Lemma 9.3.4 Let η be a simple max-stable process on [0, 1]. Then

P(|η|∞ ≤ x) = exp(−c/x) ,  x > 0 ,

with c a positive constant.


Proof With rj\, 772,... independent and identically distributed copies of rj we have
1

V= ~V W
for all n. Hence
P(\r)\oo<x)

The result follows.

Pn{\n\00<nx).


The most important step in the proof of Theorem 9.3.1 is the following result.

Lemma 9.3.5 Under the conditions of Theorem 9.3.1, for each $\varepsilon > 0$ the sequence of measures $\{\nu_{n,\varepsilon}\}$ defined by

$$ \nu_{n,\varepsilon}(A) := \nu_n\{f \in A : |f|_\infty > \varepsilon\} $$

for each Borel set $A$ of $C_0^+[0,1]$ is relatively compact.
Proof. We need to prove two things. First, that the sequence

$$ \nu_{n,\varepsilon}\big(C_0^+[0,1]\big) = \nu_n\{f \in C^+[0,1] : |f|_\infty > \varepsilon\} $$

is bounded. This follows from Lemma 9.3.4, since

$$ \nu_{n,\varepsilon}\big(C_0^+[0,1]\big) = nP\big(n^{-1}|\xi|_\infty > \varepsilon\big) \to -\log P(|\eta|_\infty \le \varepsilon) = \frac{c}{\varepsilon}. \tag{9.3.5} $$

Second, that the sequence $\{\nu_{n,\varepsilon}\}$ is tight. Note that since $\nu_{n,\varepsilon}(C_0^+[0,1])$ has a finite limit as $n \to \infty$, we can check tightness for the sequence $\{\nu_{n,\varepsilon}\}$ as if it were a sequence of probability measures. According to Billingsley (1968), Theorem 15.3, this is equivalent to the following:

1. For each positive $\beta$ there exists an $\alpha > 0$ such that

$$ \nu_{n,\varepsilon}(S_\alpha) \le \beta $$

for all $n$, where

$$ S_\alpha := \{f \in C^+[0,1] : |f|_\infty > \alpha\} $$

for each $\alpha > 0$.

2. For each positive $\alpha$ and $\beta$, there exist a $\delta$, $0 < \delta < 1$, and an integer $n_0$ such that

(a) $\nu_{n,\varepsilon}\{f : \omega''_f(\delta) \ge \alpha\} \le \beta$ for $n \ge n_0$, with

$$ \omega''_f(\delta) := \sup_{\substack{s_1 \le s \le s_2 \\ s_2 - s_1 \le \delta}} \min\big(|f(s) - f(s_1)|,\ |f(s_2) - f(s)|\big); $$

(b) $\nu_{n,\varepsilon}\big\{f : \sup_{0\le s,t\le\delta} |f(s) - f(t)| \ge \alpha\big\} \le \beta$ for $n \ge n_0$;

(c) $\nu_{n,\varepsilon}\big\{f : \sup_{1-\delta\le s,t\le1} |f(s) - f(t)| \ge \alpha\big\} \le \beta$ for $n \ge n_0$.

Now (1) follows from the first part of the proof. Next we prove (2a); the other parts are similar. Relation (9.2.7) implies convergence in distribution, hence tightness, of $\{M_n \vee (\alpha/2)\}_{n=1}^\infty$ with $M_n := n^{-1}\bigvee_{i=1}^n \xi_i$. Consequently, for any $\beta^* > 0$,

$$ P\big(\omega''_{M_n\vee(\alpha/2)}(\delta) \ge \alpha/2\big) \le \beta^* $$

for $n \ge n_0^*$. Define

$$ Q_{n,\alpha} := \big(M_n \vee (\alpha/2)\big)\, 1_{\{|\xi_i|_\infty > n\alpha/2 \text{ for some } i,\ |\xi_j|_\infty \le n\alpha/2 \text{ for } j \ne i\}}. $$

Since $Q_{n,\alpha}$ is either $0$ or $M_n \vee (\alpha/2)$, we have

$$ P\big(\omega''_{Q_{n,\alpha}}(\delta) \ge \alpha/2\big) \le P\big(\omega''_{M_n\vee(\alpha/2)}(\delta) \ge \alpha/2\big) \le \beta^* $$

for $n \ge n_0^*$. Hence by the definition of $Q_{n,\alpha}$,

$$ n\,P^{n-1}\Big(|\xi|_\infty \le \frac{n\alpha}{2}\Big)\, P\Big(\omega''_{(\xi/n)\vee(\alpha/2)}(\delta) \ge \frac{\alpha}{2}\Big) = P\Big(\omega''_{Q_{n,\alpha}}(\delta) \ge \frac{\alpha}{2}\Big) \le \beta^* \tag{9.3.6} $$

for $n \ge n_0^*$. Now

$$ P^{n-1}\Big(|\xi|_\infty \le \frac{n\alpha}{2}\Big) = P^{(n-1)/n}\Big(|M_n|_\infty \le \frac{\alpha}{2}\Big) \to P\Big(|\eta|_\infty \le \frac{\alpha}{2}\Big) =: d > 0. \tag{9.3.7} $$

Hence by (9.3.6) and (9.3.7),

$$ nP\Big(\omega''_{(\xi/n)\vee(\alpha/2)}(\delta) \ge \frac{\alpha}{2}\Big) \le \frac{2\beta^*}{d} =: \beta $$

for $n \ge n_0$. Since

$$ \omega''_f(\delta) \le \omega''_{f\vee(\alpha/2)}(\delta) + \frac{\alpha}{2}, $$

we obtain

$$ nP\big(\omega''_{\xi/n}(\delta) \ge \alpha\big) \le \beta, $$

i.e.,

$$ \nu_{n,\varepsilon}\{f : \omega''_f(\delta) \ge \alpha\} \le \beta. $$

Proof (of Theorem 9.3.1). Note that since we have convergence in $C^+[0,1]$, for $m = 1,2,\ldots$, $K_1, K_2, \ldots, K_m$ compact sets in $[0,1]$, and positive $x_1, x_2, \ldots, x_m$,

$$ \lim_{n\to\infty} \nu_n\{f : f(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\}^c $$
$$ = \lim_{n\to\infty} n\big(1 - P(n^{-1}\xi(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m)\big) $$
$$ = \lim_{n\to\infty} -n\log P\big(n^{-1}\xi(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\big) $$
$$ = \lim_{n\to\infty} -\log P\Big(n^{-1}\bigvee_{i=1}^n \xi_i(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\Big) $$
$$ = -\log P\big(\eta(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\big). $$

Now there is exactly one measure, say $\nu$, satisfying, for each choice of $m$, $K_1, \ldots, K_m$, $x_1, \ldots, x_m$,

$$ \nu\{f : f(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\}^c = -\log P\big(\eta(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\big). $$

Since for any $\varepsilon > 0$ the sequence $\{\nu_{n,\varepsilon}\}$ is relatively compact by Lemma 9.3.5, every convergent subsequence has this same limit.

Definition 9.3.6 We call the measure $\nu$ the exponent measure of the simple max-stable process. This is analogous to the exponent measure in finite-dimensional space (Section 6.1.3).
The characterizing property of the exponent measure is the following homogeneity relation.

Theorem 9.3.7 For any Borel set $A$ in $\{f \in C[0,1] : f \ge 0\}$ such that $\inf\{|f|_\infty : f \in A\} > 0$ and $\nu(\partial A) = 0$, and any $a > 0$,

$$ \nu(aA) = a^{-1}\nu(A), \tag{9.3.8} $$

where the set $aA$ is obtained by multiplying all elements of $A$ by $a$.

Proof. On the one hand, from Corollary 9.3.2, for any $a > 0$,

$$ \lim_{t\to\infty} ta\,P\big(t^{-1}\xi \in aA\big) = \lim_{t\to\infty} ta\,P\big((ta)^{-1}\xi \in A\big) = \nu(A). $$

But the left-hand side also converges to $a\,\nu(aA)$.

Remark 9.3.8 Hence by Theorems 9.3.1 and 9.3.7, for any compact subset $K$ of $[0,1]$ and each $x > 0$,

$$ P\Big(\sup_{s\in K} \eta(s) \le x\Big) = \exp\big(-\nu\{f : f(s) > x \text{ for some } s \in K\}\big) = \exp\Big(-\frac1x\,\nu\{f : f(s) > 1 \text{ for some } s \in K\}\Big), $$

i.e., $\sup_{s\in K} \eta(s)$ has an extreme value distribution.

302

9 Basic Theory in C[0, 1]

As in the finite-dimensional case, a nice intuitive background for the role of the exponent measure is provided by the following theorem.

Theorem 9.3.9 Assume the conditions of Theorem 9.3.1. Define the random measures $N_n$ on $C_0^+[0,1]$ as follows: for any Borel set $A$ with $\nu(\partial A) = 0$,

$$ N_n(A) := \sum_{i=1}^n 1_{\{n^{-1}\xi_i \in A\}}. $$

Let $N$ be a Poisson process on $C_0^+[0,1]$ with mean measure $\nu$. Then $N_n$ converges in distribution to $N$, i.e., for $m = 1,2,\ldots$ and Borel sets $A_j$ with $\nu(\partial A_j) = 0$ for $j = 1,2,\ldots,m$,

$$ \big(N_n(A_1), N_n(A_2), \ldots, N_n(A_m)\big) \xrightarrow{d} \big(N(A_1), N(A_2), \ldots, N(A_m)\big). $$

Proof. Without loss of generality we consider the sets $A_j$, $j = 1,2,\ldots,m$, disjoint. Let $\lambda_1, \lambda_2, \ldots, \lambda_m \ge 0$. It is sufficient to prove that the Laplace transform of the left-hand side,

$$ E\exp\Big(-\sum_{j=1}^m \lambda_j N_n(A_j)\Big) = E\exp\Big(-\sum_{j=1}^m \sum_{i=1}^n \lambda_j 1_{\{n^{-1}\xi_i \in A_j\}}\Big) = \Big(1 + \sum_{j=1}^m P\big(n^{-1}\xi \in A_j\big)\big(e^{-\lambda_j} - 1\big)\Big)^n, \tag{9.3.9} $$

converges to that of the right-hand side,

$$ E\exp\Big(-\sum_{j=1}^m \lambda_j N(A_j)\Big) = \exp\Big(\sum_{j=1}^m \nu(A_j)\big(e^{-\lambda_j} - 1\big)\Big). $$

The convergence follows from the conclusion of Theorem 9.3.1.

9.4 The Spectral Measure

Recall that

$$ C_1^+[0,1] := \{f \in C[0,1] : f \ge 0,\ |f|_\infty = 1\}. $$

In Section 9.3 we proved that the exponent measure $\nu$ satisfies a homogeneity property: for $a > 0$ and any Borel set $A$ in $C_0^+[0,1] = (0,\infty] \times C_1^+[0,1]$,

$$ \nu(aA) = a^{-1}\nu(A). \tag{9.4.1} $$


As we did in Proposition 6.1.12 for the finite-dimensional case, we now apply a polar coordinate-type transformation $f \to (|f|_\infty, f/|f|_\infty)$ that leads to a spectral measure.

Let $A$ be a Borel set in $C_1^+[0,1]$. For $r > 0$ define the Borel set $B_{r,A} \subset C_0^+[0,1]$ by

$$ B_{r,A} := (r,\infty] \times A. $$

Clearly

$$ B_{r,A} = rB_{1,A}; $$

hence by (9.4.1)

$$ \nu(B_{r,A}) = r^{-1}\nu(B_{1,A}). $$

This relation means that after the transformation $f \to (|f|_\infty, f/|f|_\infty)$ the measure $\nu$ becomes a product measure. Define the measure $\rho$ on $C_1^+[0,1]$ by

$$ \rho(A) := \nu(B_{1,A}) \tag{9.4.2} $$

for each Borel set $A$ in $C_1^+[0,1]$. This finite measure is called the spectral measure of the limiting process $\eta$ in the relation

$$ n^{-1}\bigvee_{i=1}^n \xi_i \xrightarrow{d} \eta. $$

Theorem 9.4.1 (Giné, Hahn, and Vatan (1990)) Suppose $\xi, \xi_1, \xi_2, \ldots$ are i.i.d. stochastic processes in $C^+[0,1]$,

$$ n^{-1}\bigvee_{i=1}^n \xi_i \xrightarrow{d} \eta \tag{9.4.3} $$

in $C^+[0,1]$, and $P(\eta(s) \le 1) = e^{-1}$ for $s \in [0,1]$, i.e., $\eta$ is simple max-stable in $C^+[0,1]$. Then there exists a finite measure $\rho$ on $C_1^+[0,1]$ with

$$ \int_{C_1^+[0,1]} f(s)\, d\rho(f) = 1 \tag{9.4.4} $$

for all $s \in [0,1]$ such that for $m = 1,2,\ldots$, $K_1, K_2, \ldots, K_m$ compact sets in $[0,1]$, and $x_1, x_2, \ldots, x_m > 0$,

$$ -\log P\big(\eta(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\big) = \int_{C_1^+[0,1]} \max_{1\le j\le m}\Big(x_j^{-1}\sup_{s\in K_j} g(s)\Big)\, d\rho(g). \tag{9.4.5} $$

Conversely, any finite measure $\rho$ on $C_1^+[0,1]$ satisfying (9.4.4) gives rise to a simple max-stable stochastic process in $C^+[0,1]$. The connection is given by (9.4.5) (note that even the finite-dimensional distributions determine the distribution of a process in $C^+[0,1]$; cf. Billingsley (1968), p. 20).


Proof. Let $\eta$ be simple max-stable in $C^+[0,1]$. We have already obtained the measure $\rho$ in (9.4.2). Next we prove (9.4.5) for this measure $\rho$. We proceed as in the proof of Theorem 9.3.1. On the one hand, as in the mentioned proof,

$$ \lim_{n\to\infty} \nu_n\{f : f(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\}^c = \lim_{n\to\infty} nP\big(\{\xi(s) \le nx_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\}^c\big) $$
$$ = -\log P\big(\eta(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\big), $$

and on the other hand,

$$ \lim_{n\to\infty} \nu_n\{f : f(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\}^c = \nu\{f : f(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\}^c $$
$$ = \nu\Big\{f : |f|_\infty\,\frac{f(s)}{|f|_\infty} > x_j \text{ for some } s \in K_j, \text{ some } j = 1,2,\ldots,m\Big\} $$
$$ = \nu\Big\{f : |f|_\infty > \min_{1\le j\le m}\, x_j \Big/ \sup_{s\in K_j}\frac{f(s)}{|f|_\infty}\Big\} $$
$$ = \int_{C_1^+[0,1]} \Big(\min_{1\le j\le m} \frac{x_j}{\sup_{s\in K_j} g(s)}\Big)^{-1} d\rho(g) = \int_{C_1^+[0,1]} \max_{1\le j\le m}\Big(x_j^{-1}\sup_{s\in K_j} g(s)\Big)\, d\rho(g). $$

For (9.4.4) note that $P(\eta(s) \le 1) = e^{-1}$, $s \in [0,1]$. Hence for each $s \in [0,1]$,

$$ 1 = -\log P(\eta(s) \le 1) = \nu\{f : f(s) > 1\} = \nu\Big\{f : |f|_\infty > \Big(\frac{f(s)}{|f|_\infty}\Big)^{-1}\Big\} = \int_{C_1^+[0,1]} \frac{g(s)}{1}\, d\rho(g) = \int_{C_1^+[0,1]} g(s)\, d\rho(g). $$

For the converse statement of the theorem assume that $\rho$ is a finite measure on $C_1^+[0,1]$ satisfying (9.4.4). The measure $\nu$ on $C_0^+[0,1]$ is defined by

$$ \nu\{f : |f|_\infty > r \text{ and } f/|f|_\infty \in A\} = r^{-1}\rho(A) $$

for $r > 0$ and $A$ a Borel set in $C_1^+[0,1]$. Let $N$ be a Poisson point process on $C_0^+[0,1]$ with mean measure $\nu$ (cf. Theorem 9.3.9). Let

$$ \zeta_1, \zeta_2, \zeta_3, \ldots $$

be a realization of the point process. Define

$$ \eta := \bigvee_{i=1}^\infty \zeta_i. \tag{9.4.6} $$


We claim that $\eta$ is a simple max-stable process in $C^+[0,1]$.

First we show that the process $\eta$ is finite:

$$ P(|\eta|_\infty \le x) = P\big(N \text{ has no points in the set } \{f : |f|_\infty > x\}\big) = \exp\big(-\nu\{f : |f|_\infty > x\}\big) = \exp\big(-x^{-1}\rho(C_1^+[0,1])\big). $$

Next we check the distribution of $\eta$: for $m = 1,2,\ldots$, $K_1, K_2, \ldots, K_m$ compact sets in $[0,1]$, $x_1, x_2, \ldots, x_m > 0$,

$$ P\big(\eta(s) \le x_j \text{ for } s \in K_j,\ j = 1,2,\ldots,m\big) $$
$$ = P\big(\text{the graph of every } f \in N \text{ avoids the set } K_j \times (x_j,\infty],\ j = 1,2,\ldots,m\big) $$
$$ = \exp\Big(-\int_{C_1^+[0,1]} \max_{1\le j\le m}\Big(x_j^{-1}\sup_{s\in K_j} g(s)\Big)\, d\rho(g)\Big). $$

Then we check that the process $\eta$ is max-stable. Take $k$ independent copies of the process as defined by (9.4.6). Then

$$ \bigvee_{j=1}^k \eta_j = \bigvee_{j=1}^k \bigvee_{i=1}^\infty \zeta_{i,j}, $$

and it is clear that this process has the same structure as the process $\eta$ except that the measure $\nu$ is replaced by $k\nu$. On the other hand we write

$$ k\eta = \bigvee_{i=1}^\infty k\zeta_i, $$

and again the process has the same structure as the process $\eta$ except that for a Borel set $A \subset C_0^+[0,1]$ the mean measure is now

$$ \nu\{f : kf \in A\} = \nu\big(k^{-1}A\big) = k\nu(A) $$

by Theorem 9.3.7. Since the two processes $\bigvee_{j=1}^k \eta_j$ and $k\eta$ have the same distribution, the process $\eta$ is max-stable.

Finally, we prove that $\eta$ is in $C^+[0,1]$, that is, we prove that $\eta$ has continuous sample paths and that $P(\eta > 0) = 1$. In order to prove continuity we show that with probability one, for each $s_0 \in [0,1]$, (1) $\liminf_{s\to s_0} \eta(s) \ge \eta(s_0)$; (2) $\limsup_{s\to s_0} \eta(s) \le \eta(s_0)$.

1. Take any realization $\zeta_1, \zeta_2, \zeta_3, \ldots \in C_0^+[0,1]$. Since $\eta := \bigvee_{i=1}^\infty \zeta_i$, for each $\varepsilon > 0$ there is a $\zeta_i$ such that $\zeta_i(s_0) > \eta(s_0) - \varepsilon$. Since $\zeta_i$ is continuous, $\lim_{s\to s_0} \zeta_i(s) = \zeta_i(s_0)$. Hence

$$ \liminf_{s\to s_0} \eta(s) \ge \lim_{s\to s_0} \zeta_i(s) > \eta(s_0) - \varepsilon. $$

2. Define

$$ A_I^x := \{f \in C^+[0,1] : f(s) \le x \text{ for } s \in I\}^c, $$

where $x > 0$ and $I$ is a closed interval in $[0,1]$. Then

$$ P\big(N(A_I^x) < \infty\big) = 1 $$

since $\nu(A_I^x) < \infty$. Then also

$$ P\big(N(A_I^x) < \infty \text{ for all } x > 0 \text{ rational and } I \in \mathcal I_Q\big) = 1, \tag{9.4.7} $$

where $\mathcal I_Q$ is the set of closed intervals in $[0,1]$ with rational endpoints.

Now let us take a realization of the point process satisfying the statement in (9.4.7). Suppose that for some $s_0 \in [0,1]$ and real $y > 0$,

$$ \eta(s_0) = \bigvee_{i=1}^\infty \zeta_i(s_0) < y. $$

It is sufficient to prove that $\limsup_{s\to s_0} \bigvee_{i=1}^\infty \zeta_i(s) \le y$.

First we note that this implies

$$ N\big(A_{\{s_0\}}^y\big) = 0. \tag{9.4.8} $$

Next take a monotone sequence of intervals $I_n \in \mathcal I_Q$ such that $\bigcap_{n=1}^\infty I_n = \{s_0\}$. Then

$$ A_{I_n}^y \downarrow A_{\{s_0\}}^y, \qquad n \to \infty. $$

Hence, since $N$ is a measure, $N(A_{I_n}^y) \downarrow N(A_{\{s_0\}}^y) = 0$. It follows that $N(A_{I_n}^y) = 0$ for $n \ge n_0$ and hence $\sup_{s\in I_n} \bigvee_{i=1}^\infty \zeta_i(s) \le y$ for $n \ge n_0$. In particular, $\limsup_{s\to s_0} \bigvee_{i=1}^\infty \zeta_i(s) \le y$. This proves the continuity.

The last statement we need to prove is

$$ P(\eta > 0) = 1. $$

We have

$$ \eta \stackrel{d}{=} \frac1n \bigvee_{i=1}^n \eta_i $$

with $\eta, \eta_1, \eta_2, \ldots, \eta_n$ independent and identically distributed. Note that for $s \in [0,1]$ we have $n^{-1}\bigvee_{i=1}^n \eta_i(s) = 0$ if and only if $\eta_i(s) = 0$ for $i = 1,2,\ldots,n$.

Define $A := \{s \in [0,1] : \eta(s) = 0\}$ and $A_i := \{s \in [0,1] : \eta_i(s) = 0\}$, $i = 1,2,\ldots,n$. We have

$$ P(A \ne \emptyset) = P\big(\cap_{i=1}^n A_i \ne \emptyset\big) \le P\big(\cap_{i=1}^n \{A_i \ne \emptyset\}\big) = P^n(A \ne \emptyset) $$

for all $n$. Hence either $P(A \ne \emptyset) = 1$, i.e., there is some $s$ with $\eta(s) = 0$, which is impossible since $P(\eta(s) \le x) = \exp(-1/x)$ for $x > 0$, or $P(A \ne \emptyset) = 0$, and that is what we want to prove.


In the course of the proof we established the following representation.

Corollary 9.4.2 Let $\eta$ be simple max-stable in $C^+[0,1]$. Then

$$ \eta \stackrel{d}{=} \bigvee_{i=1}^\infty \zeta_i, $$

where $\zeta_i = Z_i\pi_i$ and the $(Z_i, \pi_i)$ form a realization of a Poisson point process on $(0,\infty] \times C_1^+[0,1]$ with mean measure $\nu$ satisfying $d\nu = (dr/r^2) \times d\rho$.

Conversely, every stochastic process with the given representation is simple max-stable in $C^+[0,1]$.
Example 9.4.3 Consider a Poisson point process on $\mathbb R^2 \setminus \{(0,0)\}$ with mean measure $(x^2 + y^2)^{-3/2}\, dx\, dy$. Let $\{(X_i, Y_i)\}$ be an enumeration of the points of the point process. Note that there are only finitely many points outside the unit circle. We show that the simple max-stable process $\{2^{-1}\bigvee_{i=1}^\infty (X_i\cos\theta + Y_i\sin\theta)\}_{0\le\theta\le2\pi}$ has the representation of Corollary 9.4.2. With $x = r\cos\varphi$ and $y = r\sin\varphi$ we have $(x^2 + y^2)^{-3/2}\, dx\, dy = r^{-2}\, dr\, d\varphi$. Write $X_i = R_i\cos\Phi_i$ and $Y_i = R_i\sin\Phi_i$. Note that for each $\theta$ the half-plane $\{(x,y) : x\cos\theta + y\sin\theta > 0\}$ contains infinitely many points of the point process. Hence for $0 \le \theta \le 2\pi$,

$$ \frac12 \bigvee_{i=1}^\infty \big(X_i\cos\theta + Y_i\sin\theta\big) = \bigvee_{i=1}^\infty \frac{R_i}{2}\big((\cos\Phi_i\cos\theta + \sin\Phi_i\sin\theta) \vee 0\big). $$
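This construction can be sketched numerically (an illustrative sketch, not part of the text): integrating the intensity $r^{-2}\,dr\,d\varphi$ over the angle gives radial intensity $2\pi r^{-2}\,dr$, so one may take $R_i = 2\pi/\Gamma_i$ with $\Gamma_i$ the arrival times of a unit-rate Poisson process and $\Phi_i$ uniform on $[0,2\pi)$; in a finite truncation the "$\vee\,0$" guards the terms that would otherwise be negative.

```python
import math
import random

def polar_poisson_points(n_points, rng=random):
    """Points of a Poisson process with mean measure (x^2+y^2)^{-3/2} dx dy,
    in polar form (R_i, Phi_i), largest radii first: R_i = 2*pi/Gamma_i with
    Gamma_i unit-rate Poisson arrival times, Phi_i uniform on [0, 2*pi)."""
    pts, gamma = [], 0.0
    for _ in range(n_points):
        gamma += rng.expovariate(1.0)
        pts.append((2.0 * math.pi / gamma, rng.uniform(0.0, 2.0 * math.pi)))
    return pts

def eta(theta, pts):
    """eta(theta) = 2^{-1} max_i (X_i cos(theta) + Y_i sin(theta))
    = max_i (R_i/2) * (cos(Phi_i - theta) v 0) over the truncated point set."""
    return max(0.5 * r * max(math.cos(phi - theta), 0.0) for r, phi in pts)
```

With the angle-addition identity $\cos\Phi_i\cos\theta + \sin\Phi_i\sin\theta = \cos(\Phi_i - \theta)$, each marginal $\eta(\theta)$ is standard Fréchet, since $\int_0^{2\pi}\frac{1}{2}(\cos\psi \vee 0)\,d\psi = 1$.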

Corollary 9.4.4 With probability one there exists a finite collection $\zeta_1, \ldots, \zeta_k$ (hence $k$ is random) from Corollary 9.4.2 such that

$$ \eta(s) = \bigvee_{i=1}^k \zeta_i(s) $$

for all $s \in [0,1]$.

Proof. Excluding a null set we can assume that $\eta(s) > 0$ for $s \in [0,1]$ and that for each $\varepsilon > 0$ only finitely many $Z_i$ (of Corollary 9.4.2) are larger than $\varepsilon$. The result follows.

Corollary 9.4.2 leads to the following simple representation.

Corollary 9.4.5 All simple max-stable processes in $C^+[0,1]$ can be generated in the following way. Consider a Poisson point process on $(0,\infty]$ with mean measure $r^{-2}\,dr$. Let $\{Z_i\}_{i=1}^\infty$ be a realization of this point process. Further consider i.i.d. stochastic processes $V, V_1, V_2, \ldots$ in $C^+[0,1]$ with $EV(s) = 1$ for all $s \in [0,1]$ and $E\sup_{0\le s\le1} V(s) < \infty$. Let the point process and the sequence $V, V_1, V_2, \ldots$ be independent. Then

$$ \eta \stackrel{d}{=} \bigvee_{i=1}^\infty Z_i V_i. $$

Conversely, each process with this representation is simple max-stable.

One can take the stochastic process $V$ such that

$$ \sup_{0\le s\le 1} V(s) = c \quad \text{a.s.} \tag{9.4.9} $$

with $c$ some positive constant.
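The construction of Corollary 9.4.5 is directly simulable. In the sketch below (illustrative only) the Poisson points are realized as $Z_i = 1/\Gamma_i$, with $\Gamma_i$ the arrival times of a unit-rate Poisson process (under the map $r \mapsto 1/r$ these have intensity $r^{-2}\,dr$), and the spectral processes are the hypothetical choice $V(s) = 1 + U\cos(2\pi s)$ with $U$ uniform on $(-1,1)$, which satisfies $EV(s) = 1$ and $0 \le V \le 2$:

```python
import math
import random

def sample_simple_max_stable(s_grid, n_points=100, rng=random):
    """One (truncated) realization of eta(s) = max_i Z_i * V_i(s) on a grid.

    Z_i = 1/Gamma_i realizes the Poisson process with mean measure r^{-2} dr;
    V_i(s) = 1 + U_i*cos(2*pi*s), U_i ~ Uniform(-1,1), is a hypothetical
    spectral process chosen only because E V(s) = 1 and sup V <= 2.
    """
    gamma = 0.0
    eta = [0.0] * len(s_grid)
    for _ in range(n_points):
        gamma += rng.expovariate(1.0)   # arrival times of a unit-rate Poisson process
        z = 1.0 / gamma                 # transformed points have intensity r^{-2} dr
        u = rng.uniform(-1.0, 1.0)
        for k, s in enumerate(s_grid):
            v = 1.0 + u * math.cos(2.0 * math.pi * s)
            eta[k] = max(eta[k], z * v)
    return eta
```

Since $EV(s) = 1$, each marginal is standard Fréchet, $P(\eta(s) \le x) = \exp(-1/x)$; truncating at 100 Poisson points is essentially harmless because $Z_i \approx 1/i$ decays quickly.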


Example 9.4.6 A nice example of a simple max-stable process has already been given by Brown and Resnick (1977): for the independent and identically distributed processes $\{V_i\}_{i=1}^\infty$ of Corollary 9.4.5 take

$$ \{V_i(s)\}_{s\in\mathbb R} := \big\{e^{W_i(s) - |s|/2}\big\}_{s\in\mathbb R}, $$

where the $W_i$ are independent Brownian motions, i.e.,

$$ \{\eta(s)\}_{s\in\mathbb R} := \Big\{\bigvee_{i=1}^\infty Z_i\, e^{W_i(s) - |s|/2}\Big\}_{s\in\mathbb R}, $$

where $\{Z_i\}_{i=1}^\infty$ is a realization of a Poisson point process on $(0,\infty]$ with mean measure $dr/r^2$ and independent of $\{W_i\}_{i=1}^\infty$. The process $\eta$ is stationary (cf. Section 9.8 below).
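A minimal simulation sketch of the Brown–Resnick process (illustrative, restricted to a grid $0 \le s_1 < s_2 < \cdots$ so that each $W_i$ can be sampled through independent Gaussian increments):

```python
import math
import random

def brown_resnick_path(s_grid, n_points=50, rng=random):
    """One (truncated) realization of eta(s) = max_i Z_i * exp(W_i(s) - s/2)
    on an increasing grid of s-values >= 0.

    Z_i = 1/Gamma_i realizes the Poisson points with intensity r^{-2} dr;
    each Brownian motion W_i is built by summing Gaussian increments.
    """
    eta = [0.0] * len(s_grid)
    gamma = 0.0
    for _ in range(n_points):
        gamma += rng.expovariate(1.0)
        z = 1.0 / gamma
        w, prev_s = 0.0, 0.0                            # W_i(0) = 0
        for k, s in enumerate(s_grid):
            w += rng.gauss(0.0, math.sqrt(s - prev_s))  # Brownian increment
            prev_s = s
            eta[k] = max(eta[k], z * math.exp(w - s / 2.0))
    return eta
```

Because $E\,e^{W(s) - s/2} = 1$ for $s \ge 0$, the marginals are again standard Fréchet, which gives a quick sanity check on the output.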
For the proof of Corollary 9.4.5 we use the following result.

Lemma 9.4.7 Suppose $P$ is a Poisson point process on the product space $S_1 \times S_2$ with $S_1$ and $S_2$ metric spaces and the intensity measure is $\nu = \nu_1 \times \nu_2$, where $\nu_1$ is not bounded and $\nu_2$ is a probability measure. The process can be generated in the following way: let $\{U_i\}$ be an enumeration of the points of a Poisson point process on $S_1$ with intensity measure $\nu_1$, and let $V_1, V_2, \ldots$ be independent and identically distributed random elements of $S_2$ with probability distribution $\nu_2$. Then the counting measure $N$ defined by

$$ N(A_1 \times A_2) := \sum_{i=1}^\infty 1_{\{(U_i, V_i) \in A_1 \times A_2\}} $$

for Borel sets $A_1 \subset S_1$, $A_2 \subset S_2$, has the same distribution as the point process $P$.

Proof. We need to prove that the numbers of points of the set $\{(U_i, V_i)\}_{i=1}^\infty$ in two disjoint Borel sets are independent (which is trivial) and that the number of points $N(A_1 \times A_2)$ in a Borel set $A_1 \times A_2$, with $A_1 \subset S_1$ and $A_2 \subset S_2$, has a Poisson distribution with mean $\nu_1(A_1)\nu_2(A_2)$. Now

$$ P\big(N(A_1 \times A_2) = k\big) = \sum_{m=k}^\infty P\big(N(A_1 \times A_2) = k \mid \text{the number of points in } A_1 = m\big)\, \frac{(\nu_1(A_1))^m e^{-\nu_1(A_1)}}{m!} $$
$$ = \sum_{m=k}^\infty \binom{m}{k} \big(\nu_2(A_2)\big)^k \big(1 - \nu_2(A_2)\big)^{m-k}\, \frac{(\nu_1(A_1))^m e^{-\nu_1(A_1)}}{m!} $$
$$ = \frac{\big(\nu_1(A_1)\nu_2(A_2)\big)^k}{k!}\, e^{-\nu_1(A_1)} \sum_{m=k}^\infty \frac{\big(\nu_1(A_1)(1 - \nu_2(A_2))\big)^{m-k}}{(m-k)!} = \frac{\big(\nu_1(A_1)\nu_2(A_2)\big)^k}{k!}\, e^{-\nu_1(A_1)\nu_2(A_2)}. $$
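The marking construction of Lemma 9.4.7 (attach i.i.d. marks to the points of a Poisson process on $S_1$) can be sketched as follows; the concrete choices $S_1 = (0,\infty)$ with intensity $r^{-2}\,dr$ and uniform marks on $[0,1]$ are illustrative, not part of the lemma:

```python
import random

def poisson_r2_points(n_points, rng=random):
    """Points of a Poisson process on (0, oo) with intensity r^{-2} dr,
    largest first: Z_i = 1/Gamma_i with Gamma_i unit-rate arrival times."""
    points, gamma = [], 0.0
    for _ in range(n_points):
        gamma += rng.expovariate(1.0)
        points.append(1.0 / gamma)
    return points

def marked_poisson(nu1_points, mark_sampler, rng=random):
    """Attach an i.i.d. mark to each point of a Poisson process on S1;
    by Lemma 9.4.7 the pairs form a Poisson process on S1 x S2 with
    product intensity nu_1 x nu_2."""
    return [(u, mark_sampler(rng)) for u in nu1_points]
```

For instance, with $A_1 = (1,\infty)$ (so $\nu_1(A_1) = \int_1^\infty r^{-2}dr = 1$) and marks below $1/2$ (probability $1/2$), the count $N(A_1 \times A_2)$ should be Poisson with mean $1/2$.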

Proof (of Corollary 9.4.5). In order to establish the representation we start from the result of Corollary 9.4.2. Let $\{(Z_i, \pi_i)\}_{i=1}^\infty$ be an enumeration of the points of a Poisson point process on $(0,\infty] \times C_1^+[0,1]$ with mean measure

$$ \rho\big(C_1^+[0,1]\big)\,\frac{dr}{r^2} \times \frac{d\rho}{\rho(C_1^+[0,1])}. $$

Then the Poisson point process represented by $\{Z_i\pi_i\}_{i=1}^\infty$ has the same distribution as that represented by $\{\zeta_i\}_{i=1}^\infty$ of Corollary 9.4.2.

Next define for $i = 1,2,\ldots$,

$$ \tilde Z_i := \frac{Z_i}{\rho(C_1^+[0,1])}, \qquad \tilde\pi_i := \pi_i\,\rho\big(C_1^+[0,1]\big). $$

Then $\{(\tilde Z_i, \tilde\pi_i)\}_{i=1}^\infty$ represents a Poisson point process on

$$ (0,\infty] \times \big\{f \in C^+[0,1] : |f|_\infty = \rho(C_1^+[0,1])\big\}. $$

We now argue that its intensity measure is $r^{-2}dr \times dQ$ with $Q$ a probability measure. The intensity measure of the first component is, for a Borel set $A$ of $(0,\infty]$,

$$ \rho\big(C_1^+[0,1]\big) \int_{\{z\,:\,z/\rho(C_1^+[0,1]) \in A\}} \frac{dz}{z^2} = \int_A \frac{dz}{z^2}. $$

The intensity measure of the second component is

$$ Q(\cdot) := \frac{\rho\big\{f : f \ge 0,\ |f|_\infty = 1,\ f\rho(C_1^+[0,1]) \in \cdot\big\}}{\rho(C_1^+[0,1])}, $$

which is a probability measure.

Moreover, for a random element $V$ with probability distribution $Q$ we have $EV(s) = 1$ for $s \in [0,1]$ by (9.4.4). Hence we have the stated representation with $V$ satisfying (9.4.9), by Lemma 9.4.7.


In order to prove that conversely the stated construction represents a simple max-stable process, just follow the steps of this proof backwards.

It remains to prove that for the converse the requirement $\sup_{0\le s\le1} V(s) = c$ a.s. can be relaxed to $E\sup_{0\le s\le1} V(s) < \infty$. Note that the former is used to ensure the finiteness of the process $\eta$. But this also follows from the following weaker assumption: we consider now a probability measure $Q$ on the space

$$ C^* := \{f \in C[0,1] : f \ge 0,\ |f|_\infty > 0\} $$

with the property

$$ \int |f|_\infty\, dQ(f) < \infty. $$

Then

$$ P\Big(\sup_{0\le s\le1} \eta(s) \le x\Big) = P\big(Z_i\pi_i(s) \le x \text{ for } 0 \le s \le 1,\ i = 1,2,\ldots\big) = \exp\Big(-\frac1x\int |f|_\infty\, dQ(f)\Big) > 0. $$

Hence the process $\eta$ is bounded.

Remark 9.4.8 Note that in the finite-dimensional situation of Part II an analogous result holds.
Corollary 9.4.9 Under the conditions of Theorem 9.4.1, for any positive continuous function $f$,

$$ -\log P\big(\eta(s) \le f(s) \text{ for } 0 \le s \le 1\big) = \int_{C_1^+[0,1]} |g/f|_\infty\, d\rho(g). $$

The proof of Corollary 9.4.9 is left to the reader (cf. Giné, Hahn, and Vatan (1990)).
Combining the results of Theorems 9.2.1 and 9.4.1, we get the following characterization of max-stable processes in $C[0,1]$.

Theorem 9.4.10 For each limit process $\{Y(s)\}_{s\in[0,1]}$ in (9.2.4) that satisfies

$$ P(Y(s) \le x) = \exp\big(-(1 + \gamma(s)x)^{-1/\gamma(s)}\big) $$

for $s \in [0,1]$, there exist a continuous function $\gamma$ and a finite measure $\rho$ on $C_1^+[0,1]$, satisfying (9.4.4) of Theorem 9.4.1, such that with $\eta$ from Theorem 9.4.1,

$$ \{Y(s)\}_{s\in[0,1]} \stackrel{d}{=} \Big\{\frac{\eta^{\gamma(s)}(s) - 1}{\gamma(s)}\Big\}_{s\in[0,1]}. \tag{9.4.10} $$

Conversely, any pair $(\gamma, \rho)$, with $\gamma$ a continuous function and $\rho$ a finite measure on $C_1^+[0,1]$ satisfying (9.4.4) of Theorem 9.4.1, gives rise to a max-stable process via (9.4.10).

9.5 Domain of Attraction


Once again we consider the limit relation

$$ \Big\{\max_{1\le i\le n} \frac{X_i(s) - b_s(n)}{a_s(n)}\Big\}_{s\in[0,1]} \xrightarrow{d} \{Y(s)\}_{s\in[0,1]} \tag{9.5.1} $$

in $C[0,1]$. Define

$$ U_s(x) := F_s^{\leftarrow}\Big(1 - \frac1x\Big) $$

for $x > 1$, $s \in [0,1]$. Theorem 9.2.1 states that if (9.5.1) holds with proper choices of $a_s(n)$ positive and $b_s(n)$ real, then

$$ \lim_{n\to\infty} \frac{U_s(nu) - b_s(n)}{a_s(n)} = \frac{u^{\gamma(s)} - 1}{\gamma(s)} $$

uniformly for $s \in [0,1]$ and locally uniformly for $u \in (0,\infty)$. Moreover, the processes

$$ \xi_i(s) := \frac{1}{1 - F_s(X_i(s))}, $$

$s \in [0,1]$, satisfy

$$ \Big\{\frac1n \bigvee_{i=1}^n \xi_i(s)\Big\}_{s\in[0,1]} \xrightarrow{d} \big\{(1 + \gamma(s)Y(s))^{1/\gamma(s)}\big\}_{s\in[0,1]} =: \{\eta(s)\}_{s\in[0,1]} $$

in $C^+[0,1]$, where according to Theorem 9.4.1 the probability distribution of $\eta$ is characterized by a spectral measure $\rho$. This means that any limit process $Y$ in (9.5.1) is characterized by a continuous function $\gamma$ and a finite spectral measure $\rho$. We call the function $\gamma$ the index function. This situation is quite similar to the finite-dimensional case (Chapter 6).

We shall now establish domain of attraction conditions; that is, for each choice of $\gamma$ and $\rho$ we shall find necessary and sufficient conditions on the distribution of $X$ such that (9.5.1) holds with a limit process $Y$ characterized by these $\gamma$ and $\rho$.
Theorem 9.5.1 Suppose $X_1, X_2, \ldots$ are i.i.d. random elements of $C[0,1]$. Let $\{Y(s)\}_{s\in[0,1]}$ be a max-stable stochastic process in $C[0,1]$ with index function $\gamma$ and spectral measure $\rho$ on $C_1^+[0,1]$. Define

$$ U_s(x) := F_s^{\leftarrow}\Big(1 - \frac1x\Big) $$

for $x > 1$, $s \in [0,1]$. The following statements are equivalent.

1.

$$ \Big\{\max_{1\le i\le n} \frac{X_i(s) - b_s(n)}{a_s(n)}\Big\}_{s\in[0,1]} \xrightarrow{d} \{Y(s)\}_{s\in[0,1]} $$

in $C[0,1]$, where $a_s(n)$ positive and $b_s(n)$ real are chosen such that

$$ -\log P(Y(s) \le x) = (1 + \gamma(s)x)^{-1/\gamma(s)} \quad \text{for all } x \text{ with } 1 + \gamma(s)x > 0. $$

2.

$$ \lim_{n\to\infty} \frac{U_s(nu) - b_s(n)}{a_s(n)} = \frac{u^{\gamma(s)} - 1}{\gamma(s)} \tag{9.5.2} $$

uniformly for $s \in [0,1]$, and one of the following equivalent conditions holds (and then all of them are true) with $\xi_i(s) := 1/(1 - F_s(X_i(s)))$, $i = 1,2,\ldots$, $s \in [0,1]$:

(a) $n^{-1}\bigvee_{i=1}^n \xi_i \xrightarrow{d} \eta$ in $C^+[0,1]$ with $\eta(s) := (1 + \gamma(s)Y(s))^{1/\gamma(s)}$ for $s \in [0,1]$.

(b) For each Borel set $A$ in $\{f \in C[0,1] : f \ge 0\}$ such that $\inf\{|f|_\infty : f \in A\} > 0$ and $\nu(\partial A) = 0$,

$$ \lim_{t\to\infty} tP\big(t^{-1}\xi \in A\big) = \nu(A) \tag{9.5.3} $$

(with $t$ running through the reals), where

$$ \nu(A) = \int\!\!\int_{\{rg \in A\}} \frac{dr}{r^2}\, d\rho(g). \tag{9.5.4} $$

(c) For each $r > 0$ and each Borel set $B \subset C_1^+[0,1]$ (defined in (9.3.2)) with $\rho(\partial B) = 0$,

$$ \lim_{t\to\infty} tP\big(|\xi|_\infty > tr\big) = r^{-1}\rho\big(C_1^+[0,1]\big) \tag{9.5.5} $$

and

$$ \lim_{t\to\infty} P\Big(\frac{\xi}{|\xi|_\infty} \in B \,\Big|\, |\xi|_\infty > t\Big) = \frac{\rho(B)}{\rho(C_1^+[0,1])}. \tag{9.5.6} $$

Proof. We have already proved in Theorem 9.2.1 that (1) is equivalent to (9.5.2) and (2a). It remains to prove that (2a), (2b), and (2c) are equivalent.

We start with the equivalence of (2a) and (2b). The direct statement has been proved in Theorem 9.3.1 and Corollary 9.3.2. The proof that (2b) implies (2a) consists in a rearrangement of the equalities in the proof of Theorem 9.3.1. For (9.5.4) note that with $r := |f|_\infty$ and $g := f/|f|_\infty$,

$$ \nu(A) = \nu\Big\{f : |f|_\infty \frac{f}{|f|_\infty} \in A\Big\} = \nu\{f : rg \in A\}, $$

where

$$ \nu\{f : r > r_0 \text{ and } g \in B\} = r_0^{-1}\rho(B) $$

for $B \subset C_1^+[0,1]$.

Next we prove that (2b) is equivalent to (2c). Take $A$ in (9.5.3) to be

$$ \{f : |f|_\infty > r,\ f/|f|_\infty \in B\}, \tag{9.5.7} $$

where $r > 0$ and $B$ is a Borel set in $C_1^+[0,1]$. By taking $B = C_1^+[0,1]$ we get for $r > 0$,

$$ \lim_{t\to\infty} tP\big(|\xi|_\infty > tr\big) = r^{-1}\rho\big(C_1^+[0,1]\big), \tag{9.5.8} $$

which is (9.5.5). For general $B$ and $r = 1$ we have

$$ \lim_{t\to\infty} tP\big(|\xi|_\infty > t \text{ and } \xi/|\xi|_\infty \in B\big) = \nu\{f : |f|_\infty > 1,\ f/|f|_\infty \in B\} = \rho(B). $$

Combining with (9.5.8) gives (9.5.6).

Next assume (9.5.5) and (9.5.6). It suffices to prove convergence for a family of sets that is a convergence-determining class for convergence in $C$. According to Theorem 2.2, p. 14, of Billingsley (1968) this is the case if the family is closed under the formation of finite intersections and if each open set is a finite or countable union of elements of the family. Clearly it is sufficient to prove that any open sphere in $C$,

$$ S_{f_0,\varepsilon} := \{f : |f - f_0|_\infty < \varepsilon\}, $$

with $f_0 \in C_0^+[0,1]$ and $\varepsilon > 0$, is a countable union of elements of the family. We use the family

$$ \Big\{f : p < |f|_\infty < q \text{ and } a_i < \frac{f(s)}{|f|_\infty} < b_i \le 1 \text{ for } s_i \le s \le s_{i+1},\ i = 1,2,\ldots,m\Big\}, \tag{9.5.9} $$

where $s_1, s_2, \ldots, s_{m+1}$ are rationals such that $0 = s_1 < s_2 < \cdots < s_{m+1} = 1$, and $p$, $q$, $a_1, \ldots, a_m$, $b_1, \ldots, b_m$ are positive rational numbers.

Take $s_1, s_2, \ldots, s_{m+1}$ such that for all $i = 1,2,\ldots,m$,

$$ \sup_{s_i\le s\le s_{i+1}} f_0(s) - \inf_{s_i\le s\le s_{i+1}} f_0(s) < 2\varepsilon. $$

Now for any function $f$ with $|f - f_0|_\infty < \varepsilon$ there are rationals $a_i$, $b_i$ such that for $i = 1,2,\ldots,m$,

$$ \frac{\sup_{s_i\le s\le s_{i+1}} f_0(s) - 3\varepsilon}{|f|_\infty} < a_i < b_i \le \frac{\inf_{s_i\le s\le s_{i+1}} f_0(s) + 3\varepsilon}{|f|_\infty}, $$

and on $s_i \le s \le s_{i+1}$,

$$ a_i < \frac{f(s)}{|f|_\infty} < b_i, $$

and rationals $p$, $q$ with $|f_0|_\infty - \varepsilon < p < q < |f_0|_\infty + \varepsilon$ such that $p < |f|_\infty < q$.
Remark 9.5.2 It is easy to see that (9.5.2) can be extended to

$$ \lim_{t\to\infty} \frac{U_s(tu) - b_s([t])}{a_s([t])} = \frac{u^{\gamma(s)} - 1}{\gamma(s)} $$

uniformly, where $t$ runs through the reals.


9.6 Spectral Representation and Stationarity


This section is a continuation of Section 9.4, i.e., we study max-stable processes, not their domains of attraction. Our point of departure is Corollary 9.4.5: a simple max-stable process $\eta$ in $C^+[0,1]$ can be written as

$$ \eta \stackrel{d}{=} \bigvee_{i=1}^\infty Z_i V_i, \tag{9.6.1} $$

where $\{Z_i\}$ is an enumeration of the points of a Poisson point process on $(0,\infty]$ with mean measure $dr/r^2$ and $V, V_1, V_2, \ldots$ are independent and identically distributed nonnegative stochastic processes in $C^+[0,1]$ with $EV(s) = 1$ for all $s \in [0,1]$ and $\sup_{0\le s\le1} V(s) = c$ a.s., where $c$ is a positive constant. The point process and $V, V_1, V_2, \ldots$ are independent.

Let $Q$ be the probability distribution of the process $V$.
9.6.1 Spectral Representation
In order to make this representation more analytical, we use Theorem 3.2 of Billingsley (1971), which says that for each probability measure on a metric space $S$ with its Borel sets, there is a random element of $S$, defined on the unit interval (that is, the unit interval with its Borel sets and Lebesgue measure $\lambda$ as the probability measure), with the same probability distribution. Let

$$ C_c^+[0,1] := \{f \in C[0,1] : f \ge 0,\ |f|_\infty = c\} $$

for some $c > 0$. It follows that there is a measurable mapping $h : [0,1] \to C_c^+[0,1]$ such that for each Borel set $A$ of $C_c^+[0,1]$,

$$ Q(A) = \lambda\big(\{t \in [0,1] : h(t) \in A\}\big). \tag{9.6.2} $$

We are going to use the mapping $h$ to build an alternative version of (9.6.1). Note that with $d\nu_1 := (dr/r^2) \times d\lambda$ ($\lambda$ Lebesgue measure on $[0,1]$), $A_1$ a Borel set of $(0,\infty]$, and $A$ a Borel set of $C_c^+[0,1]$,

$$ \nu_1\big(\{(z,t) \in (0,\infty] \times [0,1] : (z, h(t)) \in A_1 \times A\}\big) = Q(A)\int_{A_1} \frac{dr}{r^2}. $$

Hence if $\{(Z_i, T_i)\}_{i=1}^\infty$ is a realization of a Poisson point process on $(0,\infty] \times [0,1]$ with mean measure $d\nu_1 := (dr/r^2) \times d\lambda$, then

$$ \{(Z_i, h(T_i))\}_{i=1}^\infty \tag{9.6.3} $$

is a realization of a Poisson point process on $(0,\infty] \times C_c^+[0,1]$ with mean measure $d\nu := (dr/r^2) \times dQ$, where $Q$ is the probability measure of $h(T)$ on $C_c^+[0,1]$. It follows that

$$ \eta \stackrel{d}{=} \bigvee_{i=1}^\infty Z_i\, h(T_i). \tag{9.6.4} $$


Now note that $h$ is a mapping from $[0,1]$ into $C_c^+[0,1]$. Hence for each $t \in [0,1]$ the mapping provides us with a continuous function, $f_s(t)$ say, with $f_s(t) \in [0,\infty)$, $\int_0^1 f_s(t)\, dt = 1$ for $0 \le s \le 1$, and $\sup_{0\le s\le1} f_s(t) = c$ for all $t \in [0,1]$.

This leads to the following result.

Theorem 9.6.1 (Resnick and Roy (1991)) Let $\{(Z_i, T_i)\}_{i=1}^\infty$ be a realization of a Poisson point process on $(0,\infty] \times [0,1]$ with mean measure $(dr/r^2) \times d\lambda$ ($\lambda$ Lebesgue measure). If the process $\eta$ is simple max-stable in $C^+[0,1]$, then there is a family of functions $f_s(t)$ with

1. for each $t \in [0,1]$ we have a nonnegative continuous function $f_s(t) : [0,1] \to [0,\infty)$;
2. for each $s \in [0,1]$,

$$ \int_0^1 f_s(t)\, dt = 1; \tag{9.6.5} $$

3.

$$ \int_0^1 \sup_{0\le s\le1} f_s(t)\, dt < \infty, $$

such that

$$ \{\eta(s)\}_{s\in[0,1]} \stackrel{d}{=} \Big\{\bigvee_{i=1}^\infty Z_i f_s(T_i)\Big\}_{s\in[0,1]}. \tag{9.6.6} $$

Conversely, every process of the form exhibited on the right-hand side of (9.6.6) with the stated conditions is a simple max-stable process in $C^+[0,1]$.
Remark 9.6.2 The family of functions $\{f_s\}$ is called a family of spectral functions of the simple max-stable process. Note that the spectral functions are by no means unique.

Remark 9.6.3 By defining $f_s^*(u) := H'(u) f_s(H(u))$ for $u \in \mathbb R$, where $H$ is a probability distribution function and $H'$ its density, one can take the spectral functions in $L_1(\mathbb R)$ rather than $L_1([0,1])$.

Remark 9.6.4 There is also a weaker form of this theorem, where for the process $\eta$ a.s. continuity is replaced by continuity in probability and for the functions $f_s(t)$ continuity is replaced by continuity in measure: $\lambda\{t : |f_{s_n}(t) - f_s(t)| > \varepsilon\} \to 0$ as $n \to \infty$ for each $\varepsilon > 0$ when $s_n \to s$ (de Haan (1984)).

Remark 9.6.5 It is not difficult to see that it is not essential that the max-stable process be defined on $[0,1]$. One can take any compact set in a Euclidean space.
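As an illustration of Theorem 9.6.1, the sketch below uses the hypothetical spectral family $f_s(t) = 1 + \cos(2\pi s)\cos(2\pi t)$, which is nonnegative, continuous in $s$, and integrates to $1$ in $t$ for every $s$, so that each marginal of $\eta(s) = \bigvee_i Z_i f_s(T_i)$ is standard Fréchet (the family is an assumption made only for this example):

```python
import math
import random

def f(s, t):
    """Hypothetical spectral family: f_s(t) >= 0, continuous in s,
    and int_0^1 f_s(t) dt = 1 for every s."""
    return 1.0 + math.cos(2.0 * math.pi * s) * math.cos(2.0 * math.pi * t)

def eta_via_spectral(s_grid, n_points=200, rng=random):
    """eta(s) = max_i Z_i f_s(T_i), truncated at n_points Poisson points,
    with Z_i = 1/Gamma_i (intensity r^{-2} dr) and T_i uniform on [0,1]."""
    eta = [0.0] * len(s_grid)
    gamma = 0.0
    for _ in range(n_points):
        gamma += rng.expovariate(1.0)
        z = 1.0 / gamma
        t = rng.random()              # T_i uniform: intensity d(lambda)
        for k, s in enumerate(s_grid):
            eta[k] = max(eta[k], z * f(s, t))
    return eta
```

Condition (9.6.5) can be checked numerically for this family, which is the cheapest sanity test of a proposed spectral family.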
9.6.2 Stationarity
In this subsection we consider stochastic processes defined on the whole real line rather than on the unit interval as in the previous sections. We do this mainly in view of applications and of some examples to be considered in Sections 9.7 and 9.8. Since the proofs of the results in this section are quite lengthy, we refer to the original papers for some key points.


Definition 9.6.6 A stochastic process $\eta$ on $C^+(\mathbb R)$ with nondegenerate marginals is called simple max-stable if for $\eta_1, \eta_2, \ldots$, i.i.d. copies of the process $\eta$,

$$ \frac1n \bigvee_{i=1}^n \eta_i \stackrel{d}{=} \eta, \qquad n = 1,2,\ldots, $$

and $P(\eta(s) \le 1) = e^{-1}$ for all $s \in \mathbb R$.


We start by reproving Theorem 9.6.1 in the present setting.

Theorem 9.6.7 Let $\{(Z_i, T_i)\}_{i=1}^\infty$ be a realization of a Poisson point process on $(0,\infty] \times [0,1]$ with mean measure $(dr/r^2) \times d\lambda$ ($\lambda$ Lebesgue measure). If $\eta$ is a simple max-stable process in $C^+(\mathbb R)$, then there exists a family of functions $f_s(t)$ ($s \in \mathbb R$, $t \in [0,1]$) with

1. for each $t \in [0,1]$ we have a nonnegative continuous function $f_s(t) : \mathbb R \to [0,\infty)$;
2. for each $s \in \mathbb R$,

$$ \int_0^1 f_s(t)\, dt = 1; \tag{9.6.7} $$

3. for each compact interval $I \subset \mathbb R$,

$$ \int_0^1 \sup_{s\in I} f_s(t)\, dt < \infty, $$

such that

$$ \{\eta(s)\}_{s\in\mathbb R} \stackrel{d}{=} \Big\{\bigvee_{i=1}^\infty Z_i f_s(T_i)\Big\}_{s\in\mathbb R}. \tag{9.6.8} $$

Conversely, every process of the form exhibited on the right-hand side of (9.6.8) with the stated conditions is a simple max-stable process in $C^+(\mathbb R)$.

Remark 9.6.8 The family of functions $\{f_s\}$ is called a family of spectral functions of the simple max-stable process.
Proof (of Theorem 9.6.7). The proof is semi-constructive. First consider an infinite sequence of positive random variables $W := (Y_1, Y_2, \ldots)$. We assume that this sequence is simple max-stable, i.e., for $W_1, W_2, \ldots$ independent and identically distributed copies of the sequence $W$ and all $k$,

$$ \frac1k \bigvee_{i=1}^k W_i \stackrel{d}{=} W. $$

Moreover we assume that $P(Y_i \le 1) = e^{-1}$, $i \ge 1$.

We extend the line of reasoning of Chapter 6 (finite-dimensional extremes) to this situation. The process $W$ induces a probability measure on the infinite product space $S := \mathbb R_+ \times \mathbb R_+ \times \cdots$. Since for any $n \ge 1$ and $y_1, y_2, \ldots, y_n > 0$,

$$ P^k\{Y_1 \le ky_1, \ldots, Y_n \le ky_n\} = P\{Y_1 \le y_1, \ldots, Y_n \le y_n\} \tag{9.6.9} $$

for all $k = 1,2,\ldots$, we find (similarly to the reasoning in Section 6.1.3)

$$ -\log P\{Y_1 \le y_1, \ldots, Y_n \le y_n\} = \lim_{k\to\infty} -k\log P\{Y_1 \le ky_1, \ldots, Y_n \le ky_n\} = \lim_{k\to\infty} kP\big\{(Y_1 \le ky_1, \ldots, Y_n \le ky_n)^c\big\}. $$

Take $a_1, a_2, \ldots$ positive such that $\sum_{i=1}^\infty a_i^{-1/2} < \infty$. Then

$$ \sum_{i=1}^\infty P\big(Y_i > a_i^{1/2}\big) = \sum_{i=1}^\infty \big(1 - e^{-a_i^{-1/2}}\big) \le \sum_{i=1}^\infty a_i^{-1/2} < \infty. $$

It follows that $\sup_{i\ge1} Y_i/a_i < \infty$ almost surely. For any $n = 1,2,\ldots$ the random variable $\sup_{1\le i\le n} Y_i/a_i$ has a Fréchet distribution by the results of Chapter 6, i.e., there exist positive constants $b_1, b_2, \ldots$ such that $P(\sup_{1\le i\le n} Y_i/a_i \le x) = \exp(-b_n/x)$ for $x > 0$. Hence $b := \lim_{n\to\infty} b_n$ exists in $(0,\infty)$ and

$$ P\Big(\sup_{i\ge1} \frac{Y_i}{a_i} \le x\Big) = e^{-b/x}, \qquad x > 0. $$

Next we introduce a set function $\nu$ on $S$. For any $y_1, y_2, \ldots, y_n > 0$,

$$ \nu\big\{\big((x_1, x_2, \ldots) : x_i \le y_i \text{ for } i = 1,\ldots,n\big)^c\big\} := -\log P\{Y_1 \le y_1, \ldots, Y_n \le y_n\}. $$

Next for any $\varepsilon > 0$ we determine a consistent family of finite-dimensional probability distributions $\nu_\varepsilon$ on the set

$$ S_{\varepsilon,a} := \big\{\big((x_1, x_2, \ldots) : x_i \le \varepsilon a_i \text{ for } i = 1,2,\ldots\big)^c\big\}, $$

with $a := (a_1, a_2, \ldots)$, as follows: for $y_i \ge \varepsilon a_i$ for $i = 1,\ldots,n$,

$$ \nu_\varepsilon\big\{\big((x_1, x_2, \ldots) : x_i \le y_i \text{ for } i = 1,\ldots,n\big)^c\big\} := \frac{-\log P\{Y_1 \le y_1, \ldots, Y_n \le y_n\}}{-\log P\{Y_1 \le \varepsilon a_1, Y_2 \le \varepsilon a_2, \ldots\}}. $$

By Kolmogorov's existence theorem this defines a probability measure $\nu_\varepsilon$ on $S_{\varepsilon,a}$ consistent with the finite-dimensional distributions. Hence the set function $\nu$ can be extended in a unique way to a measure on the set $S_{\varepsilon,a}$. Now $\varepsilon$ is arbitrary; hence in fact the measure $\nu$ is defined on the whole of $S$. Since for $y_1, y_2, \ldots, y_n > 0$ and $k = 1,2,\ldots$ we have

$$ k\nu\big\{k\big([0,y_1] \times \cdots \times [0,y_n]\big)^c\big\} = \nu\big\{\big([0,y_1] \times \cdots \times [0,y_n]\big)^c\big\}, $$

we have for any Borel set $B$ in $S$

$$ k\nu(kB) = \nu(B) \qquad \text{for } k = 1,2,\ldots. \tag{9.6.10} $$

As in the finite-dimensional case, $k$ may in fact be any positive number. Moreover, for $\varepsilon > 0$,

$$ \nu\big\{\big((x_1, x_2, \ldots) : x_i \le \varepsilon a_i \text{ for } i = 1,2,\ldots\big)^c\big\} < \infty \tag{9.6.11} $$

(i.e., the measure $\nu$ is finite outside a neighborhood of the origin).
Next, as in Section 6.1.4, we move toward a spectral measure. Using the transformation $L$, with the $a_i$'s as before,

$$ w := \sup_{i\ge1}\big(x_i/a_i\big), \qquad z_k := \begin{cases} x_k/w, & w > 0,\\ 0, & w = 0,\end{cases} \qquad k = 1,2,\ldots $$

(mapping from $S$ into $S$), we get for $c > 0$, $n = 1,2,\ldots$, $u_i > 0$ ($i = 1,2,\ldots,n$), using (9.6.10), that

$$ c\,\nu\{(x_1, x_2, \ldots) : w > c,\ z_1 \le u_1, \ldots, z_n \le u_n\} $$
$$ = \nu\{(c^{-1}x_1, c^{-1}x_2, \ldots) : w > c,\ z_1 \le u_1, \ldots, z_n \le u_n\} $$
$$ = \nu\{(x_1, x_2, \ldots) : w > 1,\ z_1 \le u_1, \ldots, z_n \le u_n\} $$
$$ \le \nu\Big\{(x_1, x_2, \ldots) : \sup_{i\ge1}\Big(\frac{x_i}{a_i}\Big) > 1\Big\} < \infty $$

by (9.6.11). This gives

$$ \nu\{(x_1, x_2, \ldots) : w > c,\ z_1 \le u_1, \ldots, z_n \le u_n\} = c^{-1}\,\nu\{(x_1, x_2, \ldots) : w > 1,\ z_1 \le u_1, \ldots, z_n \le u_n\}, $$

i.e., the transformed measure satisfies $d(\nu \circ L^{\leftarrow}) = (dr/r^2) \times d\mu$ on $[0,\infty) \times S$ for some measure $\mu$ on $S$. Note that

$$ \mu(S) = \nu\Big\{(x_1, x_2, \ldots) : \sup_{i\ge1}\Big(\frac{x_i}{a_i}\Big) > 1\Big\} = \nu\big\{\big((x_1, x_2, \ldots) : x_i \le a_i \text{ for } i = 1,2,\ldots\big)^c\big\} < \infty $$

by (9.6.11). Hence $\mu$ is a finite measure.


From the definition of ν we can write, for n ≥ 1, y_1, y_2, ..., y_n > 0,

P{Y_1 ≤ y_1, ..., Y_n ≤ y_n}
= exp (−ν {((x_1, x_2, ...) : x_i ≤ y_i for i = 1, ..., n)^c})
= exp (−ν {((w, z_1, z_2, ...) : z_i w ≤ y_i for i = 1, ..., n)^c})
= exp (−ν {(w, z_1, z_2, ...) : min_{1≤i≤n} (y_i/z_i) < w})
= exp (−∫∫_{r > min_{1≤i≤n}(y_i/z_i)} (dr/r²) μ(d(z_1, z_2, ...)))
= exp (−∫ max_{1≤i≤n} (z_i/y_i) μ(d(z_1, z_2, ...))).

9.6 Spectral Representation and Stationarity

Note that since ν is homogeneous (cf. (9.6.10)), if we multiply all the a_i's in (9.6.11) by a constant, we can transform μ into a probability measure.
Next we apply Theorem 3.2 of Billingsley (1971): for each probability measure on a metric space S with its Borel sets there is a random element of S defined on the unit interval ([0, 1] with its Borel sets and Lebesgue measure λ as the probability measure) with the same probability distribution. It follows that there are non-negative functions f_n defined on [0, 1] such that

P{Y_1 ≤ y_1, ..., Y_n ≤ y_n} = exp (−∫ max_{1≤i≤n} (z_i/y_i) μ(d(z_1, z_2, ...))) = exp (−∫_0^1 max_{1≤i≤n} (f_i(t)/y_i) dt).
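Two extreme cases (our illustration, not part of the text) indicate the scope of this representation: constant spectral functions give the completely dependent process, and spectral functions with disjoint supports give independent coordinates.

```latex
% f_i \equiv 1 for all i gives complete dependence:
\[
P\{Y_1\le y_1,\dots,Y_n\le y_n\}
  =\exp\Bigl(-\int_0^1\max_{1\le i\le n}y_i^{-1}\,dt\Bigr)
  =e^{-1/\min_{1\le i\le n}y_i},
\]
% so Y_1=Y_2=\cdots almost surely.  For disjoint A_i\subset[0,1] and
% f_i=\mathbf{1}_{A_i}/\lambda(A_i) (so that \int_0^1 f_i(t)\,dt=1):
\[
P\{Y_1\le y_1,\dots,Y_n\le y_n\}
  =\exp\Bigl(-\sum_{i=1}^{n}y_i^{-1}\Bigr)
  =\prod_{i=1}^{n}e^{-1/y_i},
\]
% the independent case.
```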

Next we give a representation of the process (Y_1, Y_2, ...) using the Poisson point process of the statement of the theorem: the process is equal in distribution to

(⋁_{i=1}^∞ Z_i f_1(T_i), ⋁_{i=1}^∞ Z_i f_2(T_i), ...),

since

P {⋁_{i=1}^∞ Z_i f_1(T_i) ≤ x_1, ..., ⋁_{i=1}^∞ Z_i f_n(T_i) ≤ x_n}
= P {Z_i f_j(T_i) ≤ x_j for j = 1, ..., n; i = 1, 2, ...}
= P {Z_i ≤ inf_{1≤j≤n} (x_j / f_j(T_i)) for i = 1, 2, ...}
= exp (−∫∫_{r > min_{1≤j≤n} x_j/f_j(t)} r^{−2} dr dt)
= exp (−∫_0^1 max_{1≤j≤n} (f_j(t)/x_j) dt).
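In particular (a check of ours, not part of the text), the one-dimensional marginals of this representation are standard Fréchet, as required:

```latex
\[
P\Bigl\{\bigvee_{i=1}^{\infty} Z_i f_j(T_i)\le x\Bigr\}
  =\exp\Bigl(-\int_0^1 \frac{f_j(t)}{x}\,dt\Bigr)
  =e^{-1/x},\qquad x>0,
\]
% since each spectral function satisfies \int_0^1 f_j(t)\,dt=1.
```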

Now let us consider the process η. By the results obtained so far we have a spectral representation for the process {η(r_n)}_{n=1}^∞, where r_1, r_2, ... is an enumeration of the rationals of ℝ: with some abuse of notation we can write

η(r_n) = ⋁_{i=1}^∞ Z_i f_{r_n}(T_i),   n = 1, 2, ....

The next step, finding a similar representation of the process η(s) for real s, is done by using continuity. The process η has continuous sample paths, hence in particular it is continuous in probability.


We use without proof the auxiliary result: a sequence of random variables {⋁_{i=1}^∞ Z_i f_n(T_i)}_{n=1}^∞ with f_n the spectral functions converges in probability as n → ∞ if and only if the sequence f_n converges in Lebesgue measure. This gives representation (9.6.8) of the process η(s) for real s.
The final step, proving the continuity of f_s(t) for almost each t and the convergence of the integrals in 3., is provided by Theorem 3.2 of Resnick and Roy (1991): for a compact interval I in ℝ, the process {η(s)}_{s∈I} has continuous sample paths if and only if the family f_s(t) of spectral functions is continuous in s for almost all t and if moreover

∫_0^1 sup_{s∈I} f_s(t) dt < ∞.

The proof of the converse statement of the theorem is easy.
Next we turn to the issue of stationarity.
Definition 9.6.9 A mapping Φ from L₁⁺ (the non-negative integrable functions on [0, 1]) to L₁⁺ is called a piston if for h ∈ L₁⁺,

Φ(h(t)) = r(t) h(H(t)),

with H a one-to-one measurable mapping from [0, 1] to [0, 1] and r a positive measurable function, such that for every h ∈ L₁⁺,

∫_0^1 Φ(h(t)) dt = ∫_0^1 h(t) dt.
Theorem 9.6.10 Let {(Z_i, T_i)}_{i=1}^∞ be a realization of a Poisson process on (0, ∞] × [0, 1] with mean measure (dr/r²) × dλ (λ Lebesgue measure).
If the stochastic process {η(s)}_{s∈ℝ} is simple max-stable, strictly stationary and continuous a.s., then there is a function h in L₁⁺ with ∫_0^1 h(t) dt = 1 and a continuous group of pistons {Φ_s}_{s∈ℝ} (continuous, i.e., Φ_{s_n}(h(t)) → Φ_s(h(t)) as s_n → s for almost all t ∈ [0, 1]) with

∫_0^1 sup_{s∈I} Φ_s(h(t)) dt < ∞

for each compact interval I, such that

{η(s)}_{s∈ℝ} is equal in distribution to {⋁_{i=1}^∞ Z_i Φ_s(h(T_i))}_{s∈ℝ}.   (9.6.12)

Conversely, every stochastic process of the form exhibited on the right-hand side of (9.6.12), with the stated conditions, is simple max-stable, strictly stationary and a.s. continuous.


Proof. We apply de Haan and Pickands (1986). It says that if η is a simple max-stable process on ℝ, the representation of this theorem holds with "η has continuous sample paths" replaced by "η is continuous in probability" and with the statement "Φ_{s_n}(f(u)) → Φ_s(f(u)) for almost all u ∈ ℝ" replaced by "∫_0^1 |Φ_{s_n}(f(u)) − Φ_s(f(u))| du → 0."
Next, replacing convergence in probability with a.s. convergence on the one hand and replacing convergence in L₁-norm by a.s. convergence on the other hand can be done locally, i.e., for each compact interval. Theorem 3.2 of Resnick and Roy (1991) again justifies such replacement (cf. the proof of Theorem 9.6.7).
Conversely, consider a process with the given representation. For s_1, s_2 ∈ ℝ we have, with f_s(t) := Φ_s(h(t)),

−log P (η(s_1) ≤ x_1, η(s_2) ≤ x_2)
= −log P (Z_i f_{s_1}(T_i) ≤ x_1 and Z_i f_{s_2}(T_i) ≤ x_2 for all i)
= −log P {max_i max (Z_i f_{s_1}(T_i)/x_1, Z_i f_{s_2}(T_i)/x_2) ≤ 1}
= ∫∫_{r max(f_{s_1}(t)/x_1, f_{s_2}(t)/x_2) > 1} (dr/r²) dt
= ∫_0^1 max (f_{s_1}(t)/x_1, f_{s_2}(t)/x_2) dt.

Hence it is sufficient to prove that for s ∈ ℝ

∫_0^1 max (f_{s_1+s}(t)/x_1, f_{s_2+s}(t)/x_2) dt = ∫_0^1 max (f_{s_1}(t)/x_1, f_{s_2}(t)/x_2) dt.   (9.6.13)

Now f_s(t) = Φ_s(h(t)) for s ∈ ℝ. Hence, since the Φ_s form a group, the left-hand side of (9.6.13) is

∫_0^1 max (Φ_s(f_{s_1}(t))/x_1, Φ_s(f_{s_2}(t))/x_2) dt
= ∫_0^1 Φ_s (max (f_{s_1}(t)/x_1, f_{s_2}(t)/x_2)) dt
= ∫_0^1 max (f_{s_1}(t)/x_1, f_{s_2}(t)/x_2) dt

by assumption. In a similar way one deals with the higher-dimensional marginal distributions.

9.7 Special Cases


An interesting special case of Theorem 9.6.10 occurs when Φ_s is a shift: Φ_s(h(u)) := h(u − s). Here h is a probability density function. Hence examples of stationary simple max-stable processes can be constructed using some well-known probability densities:
1. The exponential model: take for β > 0,
h(u) := (β/2) exp(−β|u|).
2. The normal model: take for β > 0,
h(u) := β(2π)^{−1/2} exp(−β²u²/2).
3. The Student-t model: take for β > 0 and ν a positive integer,
h(u) := (Γ((ν+1)/2) / (Γ(ν/2)√π)) β (1 + β²u²)^{−(ν+1)/2}.
Note that all conditions, in particular condition (3) of Theorem 9.6.1, are fulfilled since the densities are continuous and unimodal.
In all three cases the parameter β has been introduced in order to control the amount of spatial dependence: note that when β increases, the amount of dependence between the values at two fixed sites decreases.
The two-dimensional marginal distributions can be calculated explicitly in all three cases (and also for their two-dimensional analogues where s is a vector in ℝ²; see de Haan and Pereira (2006)).
1. For the exponential model:

−log P (η(0) ≤ x, η(s) ≤ y) =
  1/y,   0 < y ≤ x e^{−β|s|},
  1/x + 1/y − e^{−β|s|/2}/√(xy),   x e^{−β|s|} < y < x e^{β|s|},
  1/x,   y ≥ x e^{β|s|};

hence the spectral measure is concentrated on a proper subinterval of [0, π/2] and has two atoms.
2. For the normal model:

−log P (η(0) ≤ x, η(s) ≤ y) = (1/x) Φ (|s|β/2 + (1/(|s|β)) log(y/x)) + (1/y) Φ (|s|β/2 + (1/(|s|β)) log(x/y)).

Compare with Example 9.4.6.

3. For the Student-t model:

−log P (η(0) ≤ x, η(s) ≤ y) =
  1/y,   0 < y ≤ x L^{−(ν+1)/2},
  (1/x) P_1(β, s, z) + (1/y)(1 − P_2(β, s, z)),   x L^{−(ν+1)/2} < y < x,
  (2/x) P (T_{ν,1} ≤ β|s|/2),   x = y,
  (1/x)(1 − P_1(β, s, z)) + (1/y) P_2(β, s, z),   x < y < x L^{(ν+1)/2},
  1/x,   y ≥ x L^{(ν+1)/2},

where

L² := 1 + β²s²/2 + β|s| √(1 + β²s²/4),

P_1(β, s, z) := P (|T_{ν,1} − β|s|/(1 − z)| ≤ β √(s²z/(1 − z)² − 1/β²)),

P_2(β, s, z) := P (|T_{ν,1} − β|s|z/(1 − z)| ≤ β √(s²z/(1 − z)² − 1/β²)),

T_{ν,1} is a random variable with a Student-t distribution with ν degrees of freedom and scale parameter one, and z := (x/y)^{2/(ν+1)}. These results can be used for constructing an estimator for the dependence parameter β.
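As a numerical illustration (ours, not from the book), with shift pistons it is convenient to realize the process through Poisson points {(Z_i, T_i)} on (0, ∞] × ℝ with mean measure (dr/r²) × dt, so that η(s) = ⋁_i Z_i h(T_i − s). The sketch below simulates the exponential model on a window [−M, M] (window size M, truncation level N and β = 1 are illustrative choices) and checks that the marginal of η(0) is approximately standard Fréchet.

```python
import numpy as np

# Illustrative simulation (not from the book) of the exponential model
# eta(s) = max_i Z_i h(T_i - s), h(u) = (beta/2) exp(-beta |u|).
# Restricted to T in [-M, M], the Poisson points of mean measure
# (dr/r^2) x dt can be realized as Z_i = 2M / Gamma_i (Gamma_i the
# cumulative sums of standard exponentials) with T_i uniform on [-M, M].
rng = np.random.default_rng(0)
beta, M, N, reps = 1.0, 20.0, 1500, 2000

def h(u):
    # the exponential (Laplace) density of the model
    return 0.5 * beta * np.exp(-beta * np.abs(u))

def eta_at_zero():
    gamma = np.cumsum(rng.exponential(size=N))  # Poisson arrival times
    z = 2.0 * M / gamma                          # point heights
    t = rng.uniform(-M, M, size=N)               # point locations
    return np.max(z * h(t))                      # moving maximum at s = 0

sims = np.array([eta_at_zero() for _ in range(reps)])
# the marginal should be standard Frechet: P(eta(0) <= 1) = e^{-1}
print(round(np.mean(sims <= 1.0), 2), round(np.exp(-1.0), 2))
```

Since excluded points have height at most 2M/Γ_N, the truncation cannot affect the event {η(0) ≤ 1} here, so the Monte Carlo estimate should match e^{−1} up to sampling error.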

9.8 Two Examples


Let us go back to Example 9.4.6 of Section 9.4:

{η(s)}_{s∈ℝ} := {⋁_{i=1}^∞ Z_i e^{W_i(s) − |s|/2}}_{s∈ℝ},   (9.8.1)

where {Z_i}_{i=1}^∞ is a realization of a Poisson point process on (0, ∞] with mean measure dr/r² and, independently, {W_i}_{i=1}^∞ is a sequence of independent Brownian motions.
The process η is stationary on ℝ. For the proof it is sufficient to prove that all marginal distributions are stationary. We shall show this for the two-dimensional distributions. Let 0 ≤ s_1 < s_2 and write u := s_2 − s_1. Then for x, y ∈ ℝ,

−log P (η(s_1) ≤ e^x, η(s_2) ≤ e^y)
= E max (e^{W(s_1) − s_1/2 − x}, e^{W(s_2) − s_2/2 − y})
= E [e^{W(s_1) − s_1/2} max (e^{−x}, e^{W(s_2) − W(s_1) − (s_2 − s_1)/2 − y})]
= E max (e^{−x}, e^{W(s_2) − W(s_1) − (s_2 − s_1)/2 − y})
= e^{−x} P (W(s_2) − W(s_1) − (s_2 − s_1)/2 ≤ y − x) + e^{−y} (1/√(2π)) ∫_{t√u − u/2 > y − x} e^{t√u − u/2} e^{−t²/2} dt
= e^{−x} Φ (√u/2 + (y − x)/√u) + e^{−y} (1/√(2π)) ∫_{t − √u > −√u/2 + (y − x)/√u} e^{−(t − √u)²/2} dt
= e^{−x} Φ (√u/2 + (y − x)/√u) + e^{−y} Φ (√u/2 + (x − y)/√u),

with Φ the standard normal distribution function. Clearly the distribution depends on s_1 and s_2 only through u = s_2 − s_1. The reasoning is similar when s_1 < s_2 ≤ 0.
Finally consider the case s_1 < 0 < s_2:

−log P (η(s_1) ≤ e^x, η(s_2) ≤ e^y) = E max (e^{W(s_1) + s_1/2 − x}, e^{W(s_2) − s_2/2 − y}).

Denote the distribution function of e^{W(s_1) + s_1/2 − x} by F_1 and the distribution function of e^{W(s_2) − s_2/2 − y} by F_2. Then the expectation is

∫_0^∞ t d(F_1(t) F_2(t))
= ∫_0^∞ t (F_1'(t) ∫_0^t F_2'(w) dw + F_2'(t) ∫_0^t F_1'(w) dw) dt
= ∫_0^∞ ∫_u^∞ t F_1'(t) dt F_2'(u) du + ∫_0^∞ ∫_u^∞ t F_2'(t) dt F_1'(u) du.

Note that

F_1'(t) = (1/(t√|s_1|)) φ (log t/√|s_1| + √|s_1|/2 + x/√|s_1|)

and

F_2'(t) = (1/(t√s_2)) φ (log t/√s_2 + √s_2/2 + y/√s_2),

with φ = Φ'. Hence

∫_u^∞ t F_1'(t) dt = e^{−x} Φ (√|s_1|/2 − (x + log u)/√|s_1|) = e^{−x} P (e^{W(s_1) − s_1/2 − x} > u)

and

∫_0^∞ ∫_u^∞ t F_1'(t) dt F_2'(u) du
= e^{−x} ∫_0^∞ P (e^{W(s_1) − s_1/2 − x} > u) dP (e^{W(s_2) − s_2/2 − y} ≤ u)
= e^{−x} P (W(s_2) − W(s_1) < (s_2 − s_1)/2 + y − x)
= e^{−x} Φ (√(s_2 − s_1)/2 + (y − x)/√(s_2 − s_1)).

Similarly

∫_0^∞ ∫_u^∞ t F_2'(t) dt F_1'(u) du = e^{−y} Φ (√(s_2 − s_1)/2 + (x − y)/√(s_2 − s_1)).

Hence for s_1 < 0 < s_2,

−log P (η(s_1) ≤ e^x, η(s_2) ≤ e^y) = e^{−x} Φ (√(s_2 − s_1)/2 + (y − x)/√(s_2 − s_1)) + e^{−y} Φ (√(s_2 − s_1)/2 + (x − y)/√(s_2 − s_1)),

which depends on s_1 and s_2 only through s_2 − s_1.


We now exhibit a stochastic process in the domain of attraction of this simple
max-stable process.


Example 9.8.1 Let Y be a random variable with distribution function 1 − 1/x, x > 1. Let W be Brownian motion independent of Y. Consider the process

{ξ(s)}_{s∈ℝ} := {Y e^{W(s) − |s|/2}}_{s∈ℝ}.   (9.8.2)

We claim that this process is in the domain of attraction of the process in (9.8.1). Consider independent and identically distributed copies of the process, {Y_i e^{W_i(s) − |s|/2}}_{s∈ℝ} for i = 1, 2, .... Now consider the point process consisting of the points

{(Y_i/n, e^{W_i(·) − |·|/2})}_{i=1}^n.   (9.8.3)

These are elements of (0, ∞] × C⁺(ℝ). We already know from Theorem 2.1.2 that the point process constructed from the points {Y_i/n}_{i=1}^n converges in distribution to the point process constructed from the points {Z_i}_{i=1}^∞ of (9.8.1). Since the second component is independent and does not change, the point process (9.8.3) converges in distribution to the point process constructed from the points {(Z_i, e^{W_i(·) − |·|/2})}_{i=1}^∞. Then the point process constructed from the points {n^{−1} Y_i e^{W_i(·) − |·|/2}}_{i=1}^n converges to the Poisson point process constructed from the points {Z_i e^{W_i(·) − |·|/2}}_{i=1}^∞. The points are continuous functions. Since {sup_{i≤n} n^{−1} Y_i e^{W_i(s) − |s|/2}}_{s∈ℝ} is a continuous functional of the point process, we have indeed

{sup_{i≤n} n^{−1} Y_i e^{W_i(s) − |s|/2}}_{s∈ℝ} converges in distribution to {η(s)}_{s∈ℝ}

in C⁺(ℝ).
The process (9.8.2) has an interesting property. For a > 0 the distribution of {ξ(s)/a}_{s∈ℝ} given ξ(0) > a is the same as the distribution of {ξ(s)}_{s∈ℝ}. This property, which we call excursion stability, is analogous to the defining property for the (generalized) Pareto distribution; see Exercise 3.1.
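The excursion-stability property can be verified directly (our check, not in the text): since ξ(0) = Y and W is independent of Y,

```latex
\[
P\Bigl(\frac{\xi(s)}{a}>x\,\Big|\,\xi(0)>a\Bigr)
 =P\Bigl(\frac{Y}{a}\,e^{W(s)-|s|/2}>x\,\Big|\,Y>a\Bigr)
 =P\bigl(Y e^{W(s)-|s|/2}>x\bigr)
 =P(\xi(s)>x),
\]
```

because Y/a given Y > a is again standard Pareto and independent of W; the same argument applies to all finite-dimensional distributions.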
Next we consider maxima of independent and identically distributed Ornstein-Uhlenbeck processes. We show that if we apply a suitable time transformation, the limit process is max-stable.
Example 9.8.2 Let {X(s)}_{s∈ℝ} be an Ornstein-Uhlenbeck process, i.e.,

X(s) = ∫_{−∞}^s e^{−(s−u)/2} dW(u)

for all s ∈ ℝ, with W Brownian motion on (−∞, ∞), i.e., two independent Brownian motions starting at 0 and going off in the two directions of time. Since for s ≠ t the random vector (X(s), X(t)) is multivariate normal with correlation coefficient less than one, Example 6.2.6 tells us that relation (9.5.1) cannot hold for any max-stable process in C[0, 1]: since Y has continuous sample paths, Y(s) and Y(t) cannot be independent. Hence we compress time in order to create more dependence, i.e., we consider the convergence of

{⋁_{i=1}^n b_n (X_i(s/b_n²) − b_n)}_{s∈[−s_0, s_0]}   (9.8.4)

in C[−s_0, s_0] for arbitrary s_0 > 0, where X_1, X_2, ... are independent and identically distributed copies of X and the b_n are the correct normalizing constants for the standard one-dimensional normal distribution, e.g., b_n = (2 log n − log log n − log(4π))^{1/2} (cf. Example 1.1.7). In order to show convergence we write

X(s) = e^{−s/2} (X(0) + ∫_0^s e^{u/2} dW(u));

hence

b_n (X(s/b_n²) − b_n) = e^{−s/(2b_n²)} (b_n(X(0) − b_n) + b_n ∫_0^{s/b_n²} e^{u/2} dW(u) + (1 − e^{s/(2b_n²)}) b_n²).

Note that uniformly for |s| ≤ s_0,

e^{−s/(2b_n²)} = 1 + O(b_n^{−2}).

Further, since e^{u/2} = 1 + O(b_n^{−2}) for |u| ≤ s_0/b_n²,

b_n ∫_0^{s/b_n²} e^{u/2} dW(u) = (1 + O(b_n^{−2})) b_n W(s/b_n²).

Finally, for |s| ≤ s_0,

(1 − e^{s/(2b_n²)}) b_n² = −s/2 + O(b_n^{−2}).

It follows that

b_n (X(s/b_n²) − b_n) = (1 + O(b_n^{−2})) (b_n(X(0) − b_n) + b_n W(s/b_n²) − s/2) + O(b_n^{−2}).

We write W_n*(s) := b_n W(s/b_n²). Then W_n* is also Brownian motion. We have


b_n (X(s/b_n²) − b_n) = (1 + O(b_n^{−2})) (b_n(X(0) − b_n) + W_n*(s) − s/2) + O(b_n^{−2}).

Hence the limit of (9.8.4) is the same as that of

{⋁_{i=1}^n (b_n(X_i(0) − b_n) + W_{n,i}*(s) − s/2)}_{s∈ℝ}.   (9.8.5)

The rest of the proof runs as in the previous example. One finds that the sequence of processes (9.8.5) converges weakly in C[−s_0, s_0], hence in C(ℝ), to

{⋁_{i=1}^∞ (log Z_i + W_i(s) − |s|/2)}_{s∈ℝ}.
Exercises

9.1. Show that the constant c in Lemma 9.3.4 is ρ(C̄ × [0, 1]), where ρ is the spectral measure of Section 9.4. Argue that this constant is an analogue of L(1, ..., 1), where L is the dependence function defined in Section 6.1.5.

9.2. In Section 7.4 (finite-dimensional extremes) a quantity κ has been introduced that quantifies the strength of dependence. It was shown that in d dimensions κ is d/(−log P(Z_1 ≤ 1, ..., Z_d ≤ 1)) = d/L(1, ..., 1), where (Z_1, ..., Z_d) is a random vector with distribution function G_0, where G_0 is from Theorem 6.1.1. Argue that 1/c could serve as an infinite-dimensional analogue of this coefficient, where c is the constant from Lemma 9.3.4. How would one estimate this quantity c?

9.3. Show that all marginal distributions of the process of Example 9.4.3 are indeed exp(−1/x), x > 0.

9.4. Check that the regular variation condition of Theorem 9.5.1(2c) implies the regular variation condition of Theorem 6.2.1(1) for all marginal distributions.

9.5. Consider the stochastic process defined by ξ(s) := Y V(s) for s ∈ ℝ, where Y has distribution function 1 − 1/x, x > 1, and V is a continuous stochastic process independent of Y satisfying EV(s) = 1 for all s and E sup_{a≤s≤b} V(s) < ∞ for a < b. Show that ξ is in the domain of attraction of a simple max-stable process that has the representation of Corollary 9.4.5 with the same auxiliary process V. Moreover, for a > 1 and V(0) = 1, {ξ(s)/a}_{s≥0} given ξ(0) > a has the same distribution as ξ. This property resembles a corresponding property for a generalized Pareto distribution in finite-dimensional space. Find that the one-dimensional marginal distributions equal, for each s > 0 with V(s) > 0 a.s.,

P(ξ(s) > x) = (1/x) ∫_0^x P(V(s) > u) du,   x > 0.

Note that they depend on s and do not follow a generalized Pareto distribution (cf. Section 3.1).

9.6. Consider independent and identically distributed random vectors (R, Φ), (R_1, Φ_1), (R_2, Φ_2), ..., where R and Φ are independent, P(R > r) = exp(−r²/2), and Φ has a uniform distribution over [0, 2π]. This means that (R cos Φ, R sin Φ) has a standard normal distribution. Prove that {⋁_{i=1}^n b_n(R_i cos(θ/b_n − Φ_i) − b_n)}_θ converges to {⋁_{i=1}^∞ (T_i + θZ_i − θ²/2)}_θ, where {T_i} is an enumeration of the points of a point process on ℝ with mean measure e^{−x} dx and the Z_i are independent and identically distributed random variables (Eddy and Gale (1981)).
Hint: Expand cos(θ/b_n − Φ_i) and proceed as in Example 9.8.2. Note that by Corollary 5.4.2,

⋁_{i=1}^n R_i/b_n → 1 in probability.

10
Estimation in C[0,1]

10.1 Introduction: An Example


In Section 9.1 we considered the following mathematical problem: given n independent and identically distributed random functions X, X_1, X_2, ..., X_n in C[0, 1] whose distribution is in the domain of attraction of an extreme value distribution in C[0, 1], estimate the probability

P (X(s) > f(s) for some s ∈ [0, 1]),

where f is a given continuous function.


An essential feature of the problem is that all the observed processes X_i, i = 1, 2, ..., n, are well below f. This means that in an asymptotic setting, i.e., with n → ∞, we are forced to assume that f is not constant but depends on n and moves to the tail of the distribution as n → ∞. In fact, analogous to the finite-dimensional case, we assume

f_n(s) = U_s ((n/k) c_n h(s)),   (10.1.1)

where U_s is the inverse of 1/(1 − F_s) with F_s(x) := P(X(s) ≤ x), k = k(n) → ∞, k/n → 0, n → ∞, c_n is a sequence of positive constants, and h is a fixed (but unknown) function.
Let

p_n := P (X(s) > f_n(s) for some s ∈ [0, 1]).

We show informally how to approximate p_n. As in the beginning of Section 9.5 define

ξ(s) := 1/(1 − F_s(X(s))).   (10.1.2)

We write

p_n = P (X(s) > f_n(s) for some s ∈ [0, 1])
= P (ξ(s) > (n/k) c_n h(s) for some s ∈ [0, 1])
= P ((k/n) ξ(s) > c_n h(s) for some s ∈ [0, 1]),

and by Corollary 9.3.2 this is approximately equal to

(k/n) ν {g ∈ C⁺[0, 1] : g(s) > c_n h(s) for some s ∈ [0, 1]},

which equals, by Theorem 9.3.7,

(k/(n c_n)) ν {g ∈ C⁺[0, 1] : g(s) > h(s) for some s ∈ [0, 1]}.

For the estimation of p_n we thus need an estimator for ν as well as for c_n and the function h from (10.1.1). This involves estimation of F_s in the tail, and for this we also need estimators for the index function γ(s), the scale a_s(n/k), and the location b_s(n/k). Our aim in Sections 10.2-10.4 is to develop estimators for those four quantities. We come back to the estimation of p_n in Section 10.5.

10.2 Estimation of the Exponent Measure: A Simple Case


For ease of exposition let us consider now the "simple" case. Suppose ξ_1, ξ_2, ξ_3, ... are independent and identically distributed stochastic processes in C⁺[0, 1], i.e., the processes are continuous and positive, and assume that

{(1/n) ⋁_{i≤n} ξ_i(s)}_{s∈[0,1]} → {η(s)}_{s∈[0,1]}

in C⁺[0, 1] with P(η(s) ≤ 1) = e^{−1} for 0 ≤ s ≤ 1 (i.e., standard Fréchet marginals). Then η is a simple max-stable process. Obviously in this case the exponent measure ν is the only unknown feature characterizing the process. We are going to develop an estimator for ν. As in the finite-dimensional case the estimator is based on a small fraction of the higher observations only. Define for k ≤ n the estimator ν̂_{n,k} as

ν̂_{n,k}(A) := (1/k) Σ_{i=1}^n 1{(k/n) ξ_i ∈ A}.

We claim that ν̂_{n,k} is a consistent estimator for ν if k = k(n) → ∞, k/n → 0, n → ∞.

Theorem 10.2.1 Let ξ, ξ_1, ξ_2, ξ_3, ... be i.i.d. stochastic processes in C⁺[0, 1]. If

{(1/n) ⋁_{i≤n} ξ_i(s)}_{s∈[0,1]} → {η(s)}_{s∈[0,1]}   (10.2.1)


in C⁺[0, 1], then for any c > 0, as k = k(n) → ∞, k(n)/n → 0, n → ∞,

ν̂_{n,k}|_{S_c} → ν|_{S_c} in probability,   (10.2.2)

where at both sides we consider the restrictions of the measures to the set

S_c := {f ∈ C⁺[0, 1] : |f|_∞ > c}

and convergence is in the space of finite measures on C⁺[0, 1]. The measure ν is the exponent measure of the process η (cf. Section 9.3).
Proof. According to Daley and Vere-Jones (1988), Theorem 9.1.VI, we need only to prove that the finite-dimensional marginal distributions converge, i.e., for any Borel ν-continuity sets E_1, E_2, ..., E_m ⊂ S_c,

(ν̂_{n,k}(E_1), ν̂_{n,k}(E_2), ..., ν̂_{n,k}(E_m)) → (ν(E_1), ν(E_2), ..., ν(E_m)) in probability.

Since the limit is not random, this is equivalent to the following: for any Borel ν-continuity set E ⊂ S_c,

ν̂_{n,k}(E) → ν(E) in probability.

Using characteristic functions, we see that this is equivalent to

(n/k) P ((k/n) ξ ∈ E) → ν(E),

which has been proved in Corollary 9.3.2.

A corollary of this theorem is the uniform convergence of the marginal tail empirical distribution functions as well as of the tail quantile functions. This will be useful later on.

Corollary 10.2.2 For each s let ξ_{1,n}(s) ≤ ξ_{2,n}(s) ≤ ··· ≤ ξ_{n,n}(s) be the order statistics of ξ_1(s), ξ_2(s), ..., ξ_n(s) and define

1 − G_{n,s}(x) := (1/k) Σ_{i=1}^n 1{(k/n) ξ_i(s) > x}.

Suppose the domain of attraction condition (10.2.1) holds. Then for any c > 0,

sup_{0≤s≤1, x≥c} |1 − G_{n,s}(x) − 1/x| → 0 in probability   (10.2.3)

and

sup_{0≤s≤1, x≥c} |(k/n) ξ_{n−[kx],n}(s) − 1/x| → 0 in probability.   (10.2.4)

Later on we shall also need

sup_{0≤s≤1, x≤c} |(1 − G_{n,s}(x))^{−1} − x| → 0 in probability   (10.2.5)

and

sup_{0≤s≤1, x≤c} |(k/n) ξ_{n−[k/x],n}(s) − x| → 0 in probability.   (10.2.6)

Proof. Fix c > 0. By changing the probability space and using a Skorohod construction, we can pretend that the result of Theorem 10.2.1 holds a.s., i.e.,

ν̂_{n,k}|_{S_c} → ν|_{S_c}   a.s.   (10.2.7)

This means convergence of finite random measures. A metric characterizing this type of convergence is given in Daley and Vere-Jones (1988), A.2.5:

d(ν, μ) := inf {ε > 0 : ν(F) ≤ μ(F^ε) + ε and μ(F) ≤ ν(F^ε) + ε for all closed sets F ⊂ C⁺[0, 1]},   (10.2.8)

where F^ε := {f ∈ C⁺[0, 1] : |f − g|_∞ < ε for some g ∈ F}.
Take 0 < ε < c/2 and take n so large that (from (10.2.7))

d (ν̂_{n,k}|_{S_c}, ν|_{S_c}) < ε   a.s.   (10.2.9)

For x > 0, 0 ≤ s ≤ 1, define the closed set

E_{x,s} := {f ∈ C⁺[0, 1] : f(s) ≥ x}.

Note that E_{x,s}^ε is in fact the same as E_{x−ε,s} and that ν(E_{x,s}) = 1/x. It follows from (10.2.9) that for x ≥ c, 0 ≤ s ≤ 1,

1 − G_{n,s}(x) = ν̂_{n,k} {f ∈ C⁺[0, 1] : f(s) > x} ≤ ν̂_{n,k}(E_{x,s}) ≤ ν(E_{x−ε,s}) + ε = 1/(x − ε) + ε

and

1 − G_{n,s}(x) ≥ ν̂_{n,k}(E_{x+ε,s}) ≥ ν(E_{x+2ε,s}) − ε = 1/(x + 2ε) − ε.

This proves that

sup_{0≤s≤1, x≥c} |1 − G_{n,s}(x) − 1/x| → 0   a.s.

as n → ∞; hence we have convergence in probability, i.e., (10.2.3).
Now clearly from

sup_{0≤s≤1, c≤x≤b} |1 − G_{n,s}(x) − 1/x| → 0   a.s.   (10.2.10)

it follows, since (k/n) ξ_{n−[kx],n}(s) is the inverse function of 1 − G_{n,s}(x), that for 0 < c < b < ∞,

sup_{0≤s≤1, c≤x≤b} |(k/n) ξ_{n−[kx],n}(s) − 1/x| → 0   a.s.   (10.2.11)

and hence by monotonicity

sup_{0≤s≤1, x≥c} |(k/n) ξ_{n−[kx],n}(s) − 1/x| → 0   a.s.

This proves (10.2.4) and (10.2.6). Next from (10.2.10) we get

sup_{0≤s≤1, c≤x≤b} |(1 − G_{n,s}(x))^{−1} − x| → 0   a.s.

and by monotonicity (in x) we obtain (10.2.5).
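A quick numerical illustration of (10.2.3) (ours, not the book's) at one fixed coordinate s: for i.i.d. standard Fréchet observations, the tail empirical function is uniformly close to 1/x on [c, ∞). The sample size n, the fraction k and the grid are illustrative choices.

```python
import numpy as np

# Tail empirical function of (10.2.3) at a fixed coordinate s:
# 1 - G_n(x) = (1/k) #{i <= n : (k/n) xi_i > x}, compared with 1/x.
rng = np.random.default_rng(1)
n, k, c = 200000, 2000, 0.5
xi = 1.0 / -np.log(rng.uniform(size=n))    # standard Frechet(1) sample
scaled = (k / n) * xi
xs = np.linspace(c, 20.0, 200)
emp = np.array([np.sum(scaled > x) / k for x in xs])
err = np.max(np.abs(emp - 1.0 / xs))       # uniform error on [c, 20]
print(round(err, 3))
```

With k large and k/n small the uniform error is of order 1/√(kc), in line with the corollary.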

10.3 Estimation of the Exponent Measure


Next we look at the general case. Let X, Xi, X 2 , . . . be independent and identically
distributed stochastic processes in C[0,1] with continuous marginal distribution functions and suppose that there are continuous functions as(n) > 0 and bs(n) such that
Xi(s)-bs(n)
max
i<n

as(n)

)se[0,l]

(10.3.1)

{Y(s)l *e[0,l]

in C[0,1]. From (10.3.1) we get that (cf. Theorem 9.2.1)


1
max
i<n n{\-

Fs(Xi(s))}\se[0l]

Ul + Y(s)ns))1^s)\
I

(10.3.2)

J5G[0,1]

i.e., (10.2.1) holds with


^(5):=

1
i-Fs(Xi(s))

and rj(s) := (1 + y(s)Y(s))l/y(s\


0 < s < 1. Hence, according to the results of the
previous section, we would be inclined to define as an estimator for v the quantity
1 n
Vn,k( ) = T ] C l{k/Ml-F.{Xi(.))})eAh
A :

i=l

336

10 Estimation in C[0, 1]

where A C C + [0,
[0, 1]. This is cconsistent for v. However, this is not a statistic since
1 Fs is unknown. Hence we replace 1 Fs by its empirical counterpart
1 - - FnAx)

7=1

This leads to the estimator

1 "
Kk(') := j t i L 1 ! ^ - / ^ . }
l
J

(10.3.3)

i=l

with
ti(s) :=

1n ^^l ^^)^^)}

l-Fn,s(Xi(s))

(10.3.4)

We know by Theorem 9.2.1 that for these processes (10.2.1) holds with rj(s) :=
(1 + y(s)Y(s))l/y(s\
s [0, 1]. This leads to a simpler way of writing (10.3.4):

"&(j)

:==

1.-1 r*

=1 ^

(10.3.5)

with Gn>iy as in Corollary 10.2.2. Hence we can analyze this estimator using the results
of Section 10.1.
Theorem 10.3.1 Let X, X_1, X_2, ... be i.i.d. stochastic processes in C[0, 1] and assume that their distribution is in the domain of attraction of a max-stable process in C[0, 1], i.e., (10.3.1) holds.
Let

1 − F_{n,s}(x) := (1/n) Σ_{j=1}^n 1{X_j(s) > x}

for n = 1, 2, ..., 0 ≤ s ≤ 1. Define

ξ̂_i(s) := 1/(1 − F_{n,s}(X_i(s)))

and

ν̂_{n,k}(A) := (1/k) Σ_{i=1}^n 1{(k/n) ξ̂_i ∈ A}.   (10.3.6)

Then, as k → ∞, k/n → 0, n → ∞, for all c > 0,

ν̂_{n,k}|_{S_c} → ν|_{S_c} in probability   (10.3.7)

in the space of finite measures on C⁺[0, 1], where on both sides we consider the restriction of the measure to the set

S_c := {f ∈ C⁺[0, 1] : |f|_∞ > c}.

Proof. We have only to prove that for a ν-continuity Borel set E ⊂ S_c,

ν̂_{n,k}(E) → ν(E)

in probability (cf. proof of Theorem 10.2.1). Write

E = (E ∩ S_{[c,b]}) ∪ (E ∩ S_b) =: E_1 ∪ E_2

with

S_{[c,b]} := {f ∈ C⁺[0, 1] : c < |f|_∞ ≤ b}.

Let (k/n)ξ ∈ E_1. Then (k/n)ξ(s) ≤ b for 0 ≤ s ≤ 1; hence by (10.2.5) of Corollary 10.2.2, for sufficiently large n,

|(1 − G_{n,s}((k/n) ξ(s)))^{−1} − (k/n) ξ(s)| < ε,

0 ≤ s ≤ 1; hence the function (1 − G_{n,·}((k/n) ξ(·)))^{−1} is in

E_1^ε := {f ∈ C⁺[0, 1] : |f − g|_∞ < ε for some g ∈ E_1}.

It follows that

ν_{n,k}(E_1) ≤ ν̂_{n,k}(E_1^ε).

Similarly one can prove, now using (10.2.6), that

ν̂_{n,k}(E_1) ≤ ν_{n,k}(E_1^ε).

We already know by Theorem 10.2.1 that ν_{n,k}(E_1) → ν(E_1) and ν_{n,k}(E_1^ε) → ν(E_1^ε) in probability. Also by the ν-continuity property of E_1 we have ν(E_1^ε) → ν(E_1) as ε ↓ 0. This proves

ν̂_{n,k}(E ∩ S_{[c,b]}) → ν(E ∩ S_{[c,b]}) in probability

for 0 < c < b < ∞. Next we consider ν̂_{n,k}(E_2). According to (10.3.5),

ν̂_{n,k}(E_2) = (1/k) Σ_{i=1}^n 1{(1 − G_{n,·}((k/n) ξ_i(·)))^{−1} ∈ E_2}
= (1/k) Σ_{i=1}^n 1{(k/n) ξ_i(·) = (1/(1 − G_{n,·}))^←(g(·)) with g ∈ E_2}
= ν_{n,k} {f : f(·) = (1/(1 − G_{n,·}))^←(g(·)) with g ∈ E_2}.   (10.3.8)

Hence

ν̂_{n,k}(E_2) ≤ ν̂_{n,k}(S_b) = ν_{n,k} {f : f(·) = (1/(1 − G_{n,·}))^←(g(·)) with |g|_∞ > b}.

Since g ∈ S_b we have g(s_0) > b for some s_0 ∈ [0, 1]. Hence

(1/(1 − G_{n,s_0}))^←(g(s_0)) ≥ (1/(1 − G_{n,s_0}))^←(b) = (k/n) ξ_{n−[k/b],n}(s_0) → b in probability

by (10.2.6). Hence

P {sup_{s∈[0,1]} (1/(1 − G_{n,s}))^←(g(s)) > b − ε} → 1,   n → ∞,

i.e., with probability tending to 1,

ν̂_{n,k}(E_2) ≤ ν_{n,k}(S_{b−ε}),

and by (10.2.2) the right-hand side tends to ν(S_{b−ε}), which by Theorem 9.3.7 equals C/(b − ε) for some positive constant C. By choosing b large enough this can be made smaller than ε. Then as n → ∞,

P (ν̂_{n,k}(E ∩ S_b) > ε) → 0.   (10.3.9)

The proof is completed by combining (10.3.8) and (10.3.9).
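In practice the estimator (10.3.3) is computed from ranks. The following sketch (our notation and conventions, not the book's) uses #{j : X_j(s) ≥ X_i(s)} in the empirical tail, a common convention that avoids division by zero at the sample maximum.

```python
import numpy as np

def scaled_rank_transform(X, k):
    # X has shape (n, m): n i.i.d. processes observed on an m-point grid.
    # Returns (k/n) * hat-xi_i(s) of (10.3.4), computed columnwise from
    # right-tail ranks; ties have probability zero for continuous marginals.
    n = X.shape[0]
    asc = X.argsort(axis=0).argsort(axis=0)   # ascending ranks 0 .. n-1
    n_geq = n - asc                           # #{j : X_j(s) >= X_i(s)}
    return k / n_geq

rng = np.random.default_rng(3)
n, k = 20000, 1000
X = rng.gumbel(size=(n, 3))     # any continuous marginals work
Z = scaled_rank_transform(X, k)
# hat-nu_{n,k} of the marginal set {f : f(s_0) > c}; the limit is 1/c
c = 2.0
est = np.sum(Z[:, 0] > c) / k
print(est)   # 0.499, i.e., close to 1/c = 0.5
```

For a marginal set the estimate is a pure function of the ranks, so the value above is deterministic: exactly 499 of the n observations have right-tail rank below k/c.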

10.4 Estimation of the Index Function, Scale and Location


Recall the domain of attraction condition. Let X\, X2,... be independent and identically distributed stochastic processes in C[0,1]. Suppose that there are continuous
functions as(n) > 0 and bs(n) such that
{max
[i<n

as(n)

}
-+{Y(s)}se[0,i]
J J[ o,i]

(10.4.1)

in C[0,1]. Then 7 is a random element in C[0, 1] satisfying (for a judicious choice


of scale function a and location function b)
P(Y(s) <x)=

exp j - ( l + y(s)xy1/y{s)\

(10.4.2)

for each s e [0,1], where y C[0,1]. The function y is called the index function.
In this section we develop estimators for / , the scale and the location. The estimators
will be based on the moment estimator of Section 3.5, but similar results should hold,
for example, for the maximum likelihood estimator.
Now, since in applications one does not need estimators of scale and location for
extreme order statistics, but rather for intermediate ones such as those of Section 2.4,
we shall specify what we want to estimate.


Define for t > 0, 0 ≤ s ≤ 1,

a_s(t) := a_s([t]),   b_s(t) := b_s([t]).   (10.4.3)

In (10.4.1) we used those with t = n. But as in the finite-dimensional situation (cf. Section 4.2 and Chapter 8) we need estimators of the quantities in (10.4.3) with t = n/k, where k = k(n) → ∞, k/n → 0, as n → ∞, i.e., we need estimators for a_s(n/k) and b_s(n/k). In fact, by Theorem 1.1.2 we can replace b_s(n/k) by U_s(n/k) with, for x > 1,

U_s(x) := F_s^←(1 − 1/x),   (10.4.4)

where F_s(x) := P(X(s) ≤ x).
Finally, in order to construct the moment estimator we need positive random variables; hence we assume

inf_{0≤s≤1} U_s(∞) > 0,   (10.4.5)

which can be achieved by a shift. Next we introduce the estimators. They are simple extensions of the ones used in the finite-dimensional case. Define the sample functions

M_n^{(j)}(s) := (1/k) Σ_{i=0}^{k−1} (log X_{n−i,n}(s) − log X_{n−k,n}(s))^j,   (10.4.6)

j = 1, 2, where X_{1,n}(s) ≤ X_{2,n}(s) ≤ ··· ≤ X_{n,n}(s) are the order statistics of X_1(s), X_2(s), ..., X_n(s). Next define

γ̂₊(s) := M_n^{(1)}(s),   (10.4.7)

γ̂₋(s) := 1 − (1/2) (1 − (M_n^{(1)}(s))² / M_n^{(2)}(s))^{−1},   (10.4.8)

γ̂(s) := γ̂₊(s) + γ̂₋(s),   (10.4.9)

â_s(n/k) := X_{n−k,n}(s) γ̂₊(s) (1 − γ̂₋(s)),   (10.4.10)

b̂_s(n/k) := X_{n−k,n}(s).   (10.4.11)
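At a fixed s the estimators (10.4.6)-(10.4.11) reduce to the one-dimensional moment estimator. The following sketch (function and variable names are ours) applies them to a standard Pareto sample, for which γ = 1, U_s(t) = t and a_s(t) = t, so all three estimates should be close to their targets.

```python
import numpy as np

def moment_estimators(x, k):
    # sample versions of (10.4.6)-(10.4.11) at a fixed s (scalar data)
    xs = np.sort(x)
    excess = np.log(xs[-k:]) - np.log(xs[-k - 1])   # top k log-excesses
    m1, m2 = np.mean(excess), np.mean(excess ** 2)
    g_plus = m1                                      # (10.4.7)
    g_minus = 1.0 - 0.5 / (1.0 - m1 ** 2 / m2)       # (10.4.8)
    gamma = g_plus + g_minus                         # (10.4.9)
    scale = xs[-k - 1] * g_plus * (1.0 - g_minus)    # (10.4.10)
    loc = xs[-k - 1]                                 # (10.4.11)
    return gamma, scale, loc

rng = np.random.default_rng(2)
n, k = 100000, 1000
x = rng.pareto(1.0, size=n) + 1.0    # 1 - F(x) = 1/x, so gamma = 1
gamma, scale, loc = moment_estimators(x, k)
print(round(gamma, 1))               # close to 1; loc and scale near n/k = 100
```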

10.4.1 Consistency
We have the following consistency result.
Theorem 10.4.1 Let X_1, X_2, ... be i.i.d. stochastic processes in C[0, 1] and assume that their distribution is in the domain of attraction of a max-stable process in C[0, 1], i.e., (10.4.1) holds. If k = k(n) → ∞, k/n → 0, n → ∞, then

sup_{0≤s≤1} |γ̂₊(s) − γ₊(s)| → 0 in probability, with γ₊(s) := γ(s) ∨ 0,   (10.4.12)

sup_{0≤s≤1} |γ̂₋(s) − γ₋(s)| → 0 in probability, with γ₋(s) := γ(s) ∧ 0,   (10.4.13)

sup_{0≤s≤1} |γ̂(s) − γ(s)| → 0 in probability,   (10.4.14)

sup_{0≤s≤1} |â_s(n/k)/a_s(n/k) − 1| → 0 in probability,   (10.4.15)

sup_{0≤s≤1} |(b̂_s(n/k) − U_s(n/k))/a_s(n/k)| → 0 in probability.   (10.4.16)
For the proof of Theorem 10.4.1 we need two technical lemmas. The first one has been taken from Appendix B.

Lemma 10.4.2 Suppose that the functions log a_s(t) and g_s(t) > 0 are locally bounded in 0 ≤ s ≤ 1, 0 < t < ∞, and for some γ ∈ C[0, 1] and all x > 0,

lim_{t→∞} (g_s(tx) − g_s(t)) / a_s(t) = (x^{γ(s)} − 1)/γ(s)   (10.4.17)

uniformly for 0 ≤ s ≤ 1. Then for x > 0,

lim_{t→∞} a_s(tx)/a_s(t) = x^{γ(s)}   (10.4.18)

uniformly for 0 ≤ s ≤ 1, and for any ε > 0 there exists t_0 > 0 such that for t ≥ t_0, tx ≥ t_0,

|a_s(tx)/a_s(t) − x^{γ(s)}| ≤ ε x^{γ(s)} max (x^ε, x^{−ε}),   (10.4.19)

or alternatively,

(1 − ε) x^{γ(s)} min (x^ε, x^{−ε}) < a_s(tx)/a_s(t) < (1 + ε) x^{γ(s)} max (x^ε, x^{−ε}),   (10.4.20)

and

|(g_s(tx) − g_s(t))/a_s(t) − (x^{γ(s)} − 1)/γ(s)| ≤ ε x^{γ(s)} max (x^ε, x^{−ε}).   (10.4.21)

Further,

lim_{t→∞} a_s(t)/g_s(t) = γ₊(s)   (10.4.22)

uniformly for 0 ≤ s ≤ 1 with γ₊(s) := max(γ(s), 0), and for any ε > 0 there exists t_0 such that for t, tx ≥ t_0,

|(log g_s(tx) − log g_s(t))/(a_s(t)/g_s(t)) − (x^{γ₋(s)} − 1)/γ₋(s)| ≤ ε x^{γ₋(s)} max (x^ε, x^{−ε}),   (10.4.23)

with γ₋(s) := min(γ(s), 0).


The second lemma is probabilistic in nature.
Lemma 10.4.3 Let i, &, .. be Ltd. stochastic processes in C + [0,1] and suppose

iV*w
i=l

Ms)},

(10.4.24)

5G[0,1]

) se[0,l]

in C + [0, 1]. Let t;\As) < ^2,n(s) < < ZnAs) be the order statistics of
fi(s), ?2(*) n(s). A/50 /ef /x and X be continuous functions defined on [0, 1]
with n < 1, X < 1, /x + X < 1. 77iew
1^

sup
0<s<l

i=0

(fn- t n(j)/gn-M(j)) W J " 1


fl(s)

1
-^ 0
1 - H(s)

(10.4.25)

and

sup

1 ^

0<^<l

i=0

- 1 (U-i,n(s)/Sn-kAs))US)

{U-iA^/Sn-kAs))^
//,($)

(1 _ ^s)

~ 1

Ms)
2 - /x(5) - X(J)
_ X(s)) (1 - /x(s)) (1 - X(s)) I

0.

(10.4.26)
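As a consistency check (ours, not part of the text), the limits in (10.4.25) and (10.4.26) are moments of a standard Pareto variable Q with P(Q > q) = 1/q, q ≥ 1, the distributional limit of the ratios ξ_{n−i,n}(s)/ξ_{n−k,n}(s):

```latex
\begin{aligned}
E\bigl(Q^{\mu}-1\bigr)
  &= \int_1^{\infty}\bigl(q^{\mu}-1\bigr)\,q^{-2}\,dq
   = \frac{\mu}{1-\mu},\\
E\bigl(Q^{\mu}-1\bigr)\bigl(Q^{\lambda}-1\bigr)
  &= \frac{1}{1-\mu-\lambda}-\frac{1}{1-\mu}-\frac{1}{1-\lambda}+1
   = \frac{\mu\lambda\,(2-\mu-\lambda)}{(1-\mu-\lambda)(1-\mu)(1-\lambda)}.
\end{aligned}
```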

Proof. We shall prove (10.4.25). The proof of (10.4.26) is similar. Observe that

(1/k) Σ_{i=0}^{k−1} ((ξ_{n−i,n}(s)/ξ_{n−k,n}(s))^{μ(s)} − 1) = μ(s) ∫_1^∞ (1 − G_{n,s}(x (k/n) ξ_{n−k,n}(s))) x^{μ(s)−1} dx.

By Corollary 10.2.2 ((10.2.3) and (10.2.4)) the difference between this expression and μ(s) ∫_1^∞ (1 − G_{n,s}(x)) x^{μ(s)−1} dx converges to 0. So we need only to prove

sup_{0≤s≤1} |∫_1^∞ (1 − G_{n,s}(x)) x^{μ(s)−1} dx − 1/(1 − μ(s))| → 0 in probability.   (10.4.27)

Let Y_i := sup_{s∈[0,1]} ξ_i(s), i = 1, 2, ..., n. These are independent and identically distributed random variables. By the continuous mapping theorem,

(1/n) ⋁_{i=1}^n Y_i = sup_{s∈[0,1]} (1/n) ⋁_{i=1}^n ξ_i(s) → sup_{s∈[0,1]} η(s) =: Y

in distribution, where by Lemma 9.3.4,

P(Y ≤ x) = exp(−c/x)

for some c > 0.
Let Y_{1,n} ≤ ··· ≤ Y_{n,n} be the order statistics of Y_i, i = 1, 2, ..., n, and

1 − F̃_n(x) := (1/k) Σ_{i=1}^n 1{(k/n) Y_i > x}.

We have for 0 ≤ s ≤ 1,

1 − G_{n,s}(x) ≤ 1 − F̃_n(x).

Hence by one-dimensional results,

∫_1^∞ (1 − F̃_n(x)) x^{μ(s)−1} dx → c ∫_1^∞ x^{μ(s)−2} dx = c/(1 − μ(s)).

Hence by Pratt's lemma (summarizing: if g_n → g pointwise, |g_n| ≤ f_n for all n, f_n → f pointwise, and ∫ f_n → ∫ f, then ∫ g_n → ∫ g; Pratt (1960)),

∫_1^∞ (1 − G_{n,s}(x)) x^{μ(s)−1} dx → ∫_1^∞ x^{μ(s)−2} dx = 1/(1 − μ(s))

uniformly in s. Hence we have proved (10.4.25).

Proof (of Theorem 10.4.1). Define for 0 < s < 1 and / = 1, 2 , . . . ,

& (5 > := i rL^

(10A28)

l-F 5 (X,(,s))
as in Section 10.3. Then, according to Theorem 9.2.1, for f i (5), ft fa). - t h e results
of Section 10.2 hold (the "simple" case).
We first prove (10.4.16). Note that by (10.4.28)

10.4 Estimation of the Index Function, Scale and Location


Xi(s) =

343

Us^i(s));

hence

bs (f) - Us (f)

X_*,(.) - Us (I)

Ml)

Us (I {|?-*,(*)}) - tf, (f)

*(f)

M!)

which by Corollary 10.2.2 (10.2.4) and Theorem 9.2.1 (note that as in Theorem 1.1.2
one sees that (9.2.6) holds with n replaced by t running through the reals) converges
to zero, in probability and uniformly in s. For the proof of the other statements of the
theorem we start with the following:
$$M_n^{(1)}(s)=\frac1k\sum_{i=0}^{k-1}\log X_{n-i,n}(s)-\log X_{n-k,n}(s)$$
and hence
$$\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}
=\frac1k\sum_{i=0}^{k-1}\frac{\log U_s\!\big(\xi_{n-k,n}(s)\,\{\xi_{n-i,n}(s)/\xi_{n-k,n}(s)\}\big)-\log U_s\big(\xi_{n-k,n}(s)\big)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}\,.$$

Since $\xi_{n-k,n}(s)\to\infty$ in probability, as $n\to\infty$, uniformly in $s$ by Corollary
10.2.2, we have with high probability by Lemma 10.4.2,
$$\left|\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}
-\frac1k\sum_{i=0}^{k-1}\frac{\big(\xi_{n-i,n}(s)/\xi_{n-k,n}(s)\big)^{\gamma_-(s)}-1}{\gamma_-(s)}\right|
\ \le\ \varepsilon\,\frac1k\sum_{i=0}^{k-1}\big(\xi_{n-i,n}(s)/\xi_{n-k,n}(s)\big)^{\gamma_-(s)+\varepsilon}\,.$$
Upon applying Lemma 10.4.3 we then get
$$\sup_{0\le s\le1}\left|\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}-\frac{1}{1-\gamma_-(s)}\right|\ \overset{P}{\to}\ 0\,. \qquad (10.4.29)$$

Similarly we get
$$\sup_{0\le s\le1}\left|\frac{M_n^{(2)}(s)}{\big(a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))\big)^2}-\frac{2}{(1-\gamma_-(s))(1-2\gamma_-(s))}\right|\ \overset{P}{\to}\ 0 \qquad (10.4.30)$$

and, using (10.4.22) of Lemma 10.4.2,
$$\sup_{0\le s\le1}\left|\frac{a_s(\xi_{n-k,n}(s))}{U_s(\xi_{n-k,n}(s))}-\gamma_+(s)\right|\ \overset{P}{\to}\ 0\,. \qquad (10.4.31)$$

Combining (10.4.29) and (10.4.31) gives (10.4.12). Combining (10.4.29), (10.4.30),


and (10.4.31) gives (10.4.13). Then (10.4.14) follows. For (10.4.15) note that

$$\frac{\hat a_s(\frac nk)}{a_s(\frac nk)}
=\frac{a_s\!\big(\frac nk\{\frac kn\,\xi_{n-k,n}(s)\}\big)}{a_s(\frac nk)}
\cdot\frac{M_n^{(1)}(s)\,\big(1-\hat\gamma_-(s)\big)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}$$
and $a_s(tx)/a_s(t)\to x^{\gamma(s)}$ uniformly by Theorem 9.2.1 and Lemma 10.4.2, relation
(10.4.18).
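The statistics $M_n^{(1)}(s)$, $M_n^{(2)}(s)$ and the resulting estimators can be computed directly from the $k$ upper order statistics. The following sketch is our own illustration, not code from the book: it treats one fixed $s$ (the one-dimensional case) and assumes the estimators of (10.4.7)–(10.4.9) have the standard moment-estimator form $\hat\gamma_+=M_n^{(1)}$, $\hat\gamma_-=1-\frac12\big(1-(M_n^{(1)})^2/M_n^{(2)}\big)^{-1}$, $\hat\gamma=\hat\gamma_++\hat\gamma_-$. It is checked on exact Pareto quantiles, for which $\hat\gamma$ should be close to the true $\gamma$.

```python
import math

def moment_estimators(x, k):
    """Moment-type estimators from the k upper order statistics of x
    (one fixed s, i.e. the one-dimensional case; a sketch, not the book's code)."""
    xs = sorted(x)                       # ascending order statistics
    n = len(xs)
    base = math.log(xs[n - k - 1])       # log X_{n-k,n}
    logs = [math.log(xs[n - 1 - i]) - base for i in range(k)]
    m1 = sum(logs) / k                   # M_n^{(1)}
    m2 = sum(v * v for v in logs) / k    # M_n^{(2)}
    gamma_plus = m1
    gamma_minus = 1.0 - 0.5 / (1.0 - m1 * m1 / m2)
    return gamma_plus, gamma_minus, gamma_plus + gamma_minus

# check on exact Pareto quantiles x_i = ((n+1)/i)^gamma with gamma = 0.5
n, k, gamma = 10000, 1000, 0.5
x = [((n + 1) / i) ** gamma for i in range(1, n + 1)]
gp, gm, g = moment_estimators(x, k)
```

For a heavy-tailed sample ($\gamma>0$) one expects $\hat\gamma_-\approx0$ and $\hat\gamma\approx\hat\gamma_+\approx\gamma$, which is what the deterministic Pareto input reproduces.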

10.4.2 Asymptotic Normality


Next we discuss the asymptotic normality of the estimators (10.4.7)-(10.4.11) in the
appropriate function space. Recall that for the consistency, the uniform convergence
result for the "simple" tail empirical distribution function was essential (cf. Corollary
10.2.2). Now we start with the corresponding asymptotic normality in the "simple"
case, which is basic for the later results. The proof of the result is rather technical and
follows from an entropy with bracketing central limit theorem for empirical processes
in van der Vaart and Wellner (1996) (Einmahl and Lin (2006)).
Theorem 10.4.4 Let $\xi,\xi_1,\xi_2,\xi_3,\dots$ be i.i.d. stochastic processes in $C^+[0,1]$. Suppose
$$\lim_{t\to\infty}t\,P\!\left(\frac\xi t\in A\right)=\nu(A)$$
for the appropriate Borel sets $A$ in $C^+[0,1]$. Assume also that the following smoothness condition holds: for all $0<\varepsilon<\frac12$
and $c>0$ there exists $K>0$ and for large enough $v$ there exists $\delta_0>0$ such
that for all $\delta\in(0,\delta_0]$,
$$\sup_{0\le s\le1}P\{\xi\notin E_{s,\delta}\}\ \le\ K(-\log\delta)^{-3}$$
and
$$\sup_{0\le s\le1}P\left\{\sup_{s\le u\le s+\delta}\xi(u)>v\right\}\ \le\ c\,(-\log\delta)^{-(2+2\varepsilon)/(1-2\varepsilon)}\,, \qquad (10.4.32)$$
where
$$E_{s,\delta}:=\left\{h\in C[0,1]:h\ge0,\ \left|\frac{h(u)}{h(s)}-1\right|\le\varepsilon\ \text{for all }u\in[s,s+\delta]\right\}\,. \qquad (10.4.33)$$

Then for a special construction,
$$\sup_{0\le s\le1,\ x\ge c}\ x\left|\sqrt k\left(\big(1-G_{n,s}(x)\big)-\frac1x\right)-W(C_{s,x})\right|\ \overset{P}{\to}\ 0$$
as $n\to\infty$, where $W$ is a zero-mean Gaussian process indexed by sets $C_{s,x}$ defined
for $0\le s\le1$ and $x>0$ by
$$C_{s,x}:=\{h\in C[0,1]:h(s)>x\}$$
and, with $0\le u\le1$,
$$E\,W(C_{u,x})\,W(C_{s,y})=\nu\big(C_{u,x}\cap C_{s,y}\big)\,,$$
where $\nu$ is the exponent measure of Section 10.3.

Remark 10.4.5 Note that $\nu\big(C_{s,1/y_1}\cap C_{s,1/y_2}\big)=\nu\big(C_{s,1/(y_1\wedge y_2)}\big)=y_1\wedge y_2$. Hence
for fixed $s$ the process $W(C_{s,1/x})$ is a standard Wiener process.

We also need the following result on the inverse empirical distribution function.

Corollary 10.4.6 Under the conditions of Theorem 10.4.4, for any function $a\in C[0,1]$, with a special construction,
$$\sup_{0\le s\le1}\left|\sqrt k\left(\left(\frac kn\,\xi_{n-k,n}(s)\right)^{a(s)}-1\right)-a(s)\,W(C_{s,1})\right|\ \overset{P}{\to}\ 0$$
as $n\to\infty$.
The asymptotic normality for our estimators is as follows.

Theorem 10.4.7 Let $X_1,X_2,X_3,\dots$ be i.i.d. stochastic processes in $C[0,1]$. Assume
(10.4.1)–(10.4.11). Moreover, adopt assumption (10.4.32) of Theorem 10.4.4 with
$\xi(s):=\{1-F_s(X(s))\}^{-1}$. Finally, we need a uniform second-order condition: for
some positive or negative function $A_s(t)$ defined for $0\le s\le1$ and $t>0$ and
satisfying $\sup_{0\le s\le1}|A_s(t)|\to0$ as $t\to\infty$,
$$\frac{\dfrac{\log U_s(tx)-\log U_s(t)}{a_s(t)/U_s(t)}-\dfrac{x^{\gamma_-(s)}-1}{\gamma_-(s)}}{A_s(t)}\ \to\ H_{\gamma_-(s),\rho(s)}(x) \qquad (10.4.34)$$
uniformly for $s\in[0,1]$ with $\rho\in C[0,1]$, $\rho(s)<0$, $0\le s\le1$, and
$$H_{\gamma_-(s),\rho(s)}(x)=\int_1^x y^{\gamma_-(s)-1}\int_1^y u^{\rho(s)-1}\,du\,dy\,. \qquad (10.4.35)$$
If, as $n\to\infty$,
$$\sqrt k\,\sup_{0\le s\le1}\left|A_s\!\left(\frac nk\right)\right|\to0
\qquad\text{and}\qquad
\sqrt k\,\sup_{0\le s\le1}\left|\frac{a_s(\frac nk)}{U_s(\frac nk)}-\gamma_+(s)\right|\to0\,, \qquad (10.4.36)$$
then we have
$$\sup_{0\le s\le1}\left|\sqrt k\,\big(\hat\gamma_+(s)-\gamma_+(s)\big)-\gamma_+(s)\,V(s)\right|\ \overset{P}{\to}\ 0\,, \qquad (10.4.37)$$
$$\sup_{0\le s\le1}\left|\sqrt k\,\big(\hat\gamma(s)-\gamma(s)\big)-G(s)\right|\ \overset{P}{\to}\ 0\,, \qquad (10.4.38)$$
$$\sup_{0\le s\le1}\left|\sqrt k\left(\frac{\hat a_s(\frac nk)}{a_s(\frac nk)}-1\right)-U(s)\right|\ \overset{P}{\to}\ 0\,, \qquad (10.4.39)$$
$$\sup_{0\le s\le1}\left|\sqrt k\,\frac{\hat b_s(\frac nk)-b_s(\frac nk)}{a_s(\frac nk)}-\Lambda(s)\right|\ \overset{P}{\to}\ 0\,, \qquad (10.4.40)$$
where $V$, $Q$, $G$, $U$, and $\Lambda$ are defined in terms of the process $W$ as follows:
$$V(s)=(1-\gamma_-(s))\int_1^\infty W(C_{s,x})\,x^{\gamma_-(s)-1}\,dx-W(C_{s,1})\,,$$
$$Q(s)=\frac{2}{\gamma_-(s)}\int_1^\infty W(C_{s,x})\big(x^{\gamma_-(s)}-1\big)\,x^{\gamma_-(s)-1}\,dx-\frac{2}{(1-\gamma_-(s))(1-2\gamma_-(s))}\,W(C_{s,1})\,,$$
$$G(s)=\big\{\gamma_+(s)-2(1-\gamma_-(s))^2(1-2\gamma_-(s))\big\}\,V(s)+\tfrac12(1-\gamma_-(s))^2(1-2\gamma_-(s))^2\,Q(s)\,,$$
$$U(s)=W(C_{s,1})\,,$$
$$\Lambda(s)=\gamma(s)\,W(C_{s,1})+(3-4\gamma_-(s))(1-\gamma_-(s))\,V(s)-\tfrac12(1-\gamma_-(s))(1-2\gamma_-(s))^2\,Q(s)\,,\qquad s\in[0,1]\,.$$

Proof. The proof is somewhat similar to that of Theorem 10.4.1. We sketch the line
of reasoning. Condition (10.4.34) implies that for any $\varepsilon>0$ there exists $t_0$ such that
for $t\ge t_0$, $x\ge1$, and $0\le s\le1$,
$$\left|\frac{\dfrac{\log U_s(tx)-\log U_s(t)}{a_s(t)/U_s(t)}-\dfrac{x^{\gamma_-(s)}-1}{\gamma_-(s)}}{A_s(t)}-H_{\gamma_-(s),\rho(s)}(x)\right|\ \le\ \varepsilon\left(1+x^{\gamma_-(s)+\varepsilon}\right)\,. \qquad (10.4.41)$$

We start with (10.4.39) and write
$$\frac{\hat a_s(\frac nk)}{a_s(\frac nk)}
=\left(\frac{U_s\!\big(\frac nk\{\frac kn\,\xi_{n-k,n}(s)\}\big)}{U_s(\frac nk)}\right)
\left(\frac{M_n^{(1)}(s)\,\big(1-\hat\gamma_-(s)\big)}{a_s(\frac nk)/U_s(\frac nk)}\right)\,. \qquad (10.4.42)$$
We consider the second factor of (10.4.42) first. Note that Corollary 10.4.6 implies
$$\sup_{0\le s\le1}\left|\frac kn\,\xi_{n-k,n}(s)-1\right|=O_P\!\left(\frac1{\sqrt k}\right)\,. \qquad (10.4.43)$$

Since (10.4.17) implies
$$\lim_{t\to\infty}\frac{U_s(tx)}{U_s(t)}=x^{\max(0,\gamma(s))}$$
locally uniformly for $x>0$, we have
$$\sup_{0\le s\le1}\left|\frac{U_s\!\big(\frac nk\{\frac kn\,\xi_{n-k,n}(s)\}\big)}{U_s(\frac nk)}-1\right|\ \overset{P}{\to}\ 0\,.$$
Hence the second factor of (10.4.42) tends to one in probability, uniformly in $s$.
For the first factor of (10.4.42) note that by (10.4.41) and (10.4.43),
$$\sqrt k\,\frac{\log U_s\!\big(\frac nk\{\frac kn\,\xi_{n-k,n}(s)\}\big)-\log U_s(\frac nk)}{a_s(\frac nk)/U_s(\frac nk)}
=\sqrt k\,\frac{\big(\frac kn\,\xi_{n-k,n}(s)\big)^{\gamma_-(s)}-1}{\gamma_-(s)}
+\sqrt k\,A_s\!\left(\frac nk\right)H_{\gamma_-(s),\rho(s)}\!\left(\frac kn\,\xi_{n-k,n}(s)\right)
+O_P(1)\,\sqrt k\,A_s\!\left(\frac nk\right)\left(1+\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{\gamma_-(s)+\varepsilon}\right)\,.$$
For the first term apply Corollary 10.4.6 with $a(s):=\gamma_-(s)$. Further, note that by
the boundedness of $H_{\gamma_-(s),\rho(s)}\big(\frac kn\,\xi_{n-k,n}(s)\big)$ and (10.4.36) the second and third
terms converge to zero in probability uniformly in $s$. Relation (10.4.39) follows.
For the other relations we use again (10.4.41). Applying this with
$$x:=\xi_{n-i,n}(s)/\xi_{n-k,n}(s)\qquad\text{and}\qquad t:=\xi_{n-k,n}(s)\,,$$
we get with
$$M_n^{(1)}(s):=\frac1k\sum_{i=0}^{k-1}\log X_{n-i,n}(s)-\log X_{n-k,n}(s)$$
as before that
$$\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}$$
has the same limit behavior as
$$\frac1k\sum_{i=0}^{k-1}\frac{\big(\xi_{n-i,n}(s)/\xi_{n-k,n}(s)\big)^{\gamma_-(s)}-1}{\gamma_-(s)}\,.$$
The latter equals (cf. the proof of Lemma 10.4.3)
$$\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{-\gamma_-(s)}\int_{(k/n)\xi_{n-k,n}(s)}^{\infty}\big(1-G_{n,s}(x)\big)\,x^{\gamma_-(s)-1}\,dx$$
with
$$1-G_{n,s}(x):=\frac1k\sum_{i=1}^{n}1_{\{(k/n)\xi_i(s)>x\}}$$
as before. Now we have expressed everything in terms of $\frac kn\,\xi_{n-k,n}(s)$ and $1-G_{n,s}(x)$, so that we can find the asymptotic behavior by Theorem 10.4.4 and Corollary
10.4.6. We still need some rearrangement:
$$\sqrt k\left(\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}-\frac{1}{1-\gamma_-(s)}\right)$$
has the same limit behavior as
$$\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{-\gamma_-(s)}\int_{(k/n)\xi_{n-k,n}(s)}^{\infty}\sqrt k\left(\big(1-G_{n,s}(x)\big)-\frac1x\right)x^{\gamma_-(s)-1}\,dx
+\sqrt k\left(\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{-\gamma_-(s)}\int_{(k/n)\xi_{n-k,n}(s)}^{\infty}x^{\gamma_-(s)-2}\,dx-\frac{1}{1-\gamma_-(s)}\right)\,.$$

Hence
$$\sqrt k\left(\frac{M_n^{(1)}(s)}{a_s(\xi_{n-k,n}(s))/U_s(\xi_{n-k,n}(s))}-\frac{1}{1-\gamma_-(s)}\right)
-\left(\int_1^\infty W(C_{s,x})\,x^{\gamma_-(s)-1}\,dx-\frac{W(C_{s,1})}{1-\gamma_-(s)}\right)$$
equals
$$\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{-\gamma_-(s)}\int_{(k/n)\xi_{n-k,n}(s)}^{\infty}\left(\sqrt k\Big(1-G_{n,s}(x)-\frac1x\Big)-W(C_{s,x})\right)x^{\gamma_-(s)-1}\,dx$$
$$+\left(\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{-\gamma_-(s)}-1\right)\int_{(k/n)\xi_{n-k,n}(s)}^{\infty}W(C_{s,x})\,x^{\gamma_-(s)-1}\,dx
-\int_1^{(k/n)\xi_{n-k,n}(s)}W(C_{s,x})\,x^{\gamma_-(s)-1}\,dx$$
$$+\frac{1}{1-\gamma_-(s)}\left(\sqrt k\left(\Big(\frac kn\,\xi_{n-k,n}(s)\Big)^{-1}-1\right)+W(C_{s,1})\right),$$
which by Theorem 10.4.4 and Corollary 10.4.6 converges to zero in probability,
uniformly in $s$. The proofs of the other parts of the theorem are similar. $\Box$

10.5 Estimation of the Probability of a Failure Set

We return to the problem sketched in Section 1.1.4. The coastal provinces Friesland
and Groningen of the Netherlands are protected against inundation by a long dike, and
a breach in the dike at one place could mean inundation of the whole area. Suppose
that we have $n$ independent observations of the water level along the dike during a
wind storm over a certain period. We want to estimate the probability
$$P\{X(s)>f(s)\ \text{for some }0\le s\le1\}\,,$$
where $\{X(s)\}_{s\in[0,1]}$ is the stochastic process representing the water level along the dike
during a wind storm, and the continuous function $f$ represents the top of the dike. Since
in modern times no flood has been recorded, all our observations $X_1,X_2,\dots,X_n$ are
well below $f$. So we have to estimate the probability of an event that might never
have taken place. Since this is an essential feature of the problem, we want to keep
it in the asymptotic analysis, where we imagine that the number of observations $n$
tends to infinity.

More generally, consider a failure set $C_n$ with $P(C_n)=O(1/n)$ and a sample
$X_1,X_2,\dots,X_n$ of elements of $C[0,1]$. We want to estimate $P(C_n)$.

Our conditions are quite similar to those in Chapter 8:
1. The domain of attraction condition (cf. Theorem 9.3.1): for each Borel set $A\subset\{f\in C[0,1]:f\ge0\}$ with $\nu(\partial A)=0$ and $\inf\{|f|_\infty:f\in A\}>0$,
$$\lim_{n\to\infty}\frac nk\,P(R_nX\in A)=\nu(A)$$
for some $k=k(n)\to\infty$, $k/n\to0$, where for $0\le s\le1$,
$$R_nX(s):=\left(1+\gamma(s)\,\frac{X(s)-b_s(\frac nk)}{a_s(\frac nk)}\right)^{1/\gamma(s)}\,.$$
The functions $a_s(n/k)>0$ and $b_s(n/k)$ are suitable continuous normalizing
functions.
2. Estimators $\hat\gamma(s)$, $\hat a_s(n/k)$, $\hat b_s(n/k)$ such that with some sequence $k=k(n)\to\infty$, $k(n)=o(n)$, $n\to\infty$,
$$\sup_{0\le s\le1}\left\{\left|\sqrt k\,\big(\hat\gamma(s)-\gamma(s)\big)\right|
\vee\left|\sqrt k\left(\frac{\hat a_s(\frac nk)}{a_s(\frac nk)}-1\right)\right|
\vee\left|\sqrt k\,\frac{\hat b_s(\frac nk)-b_s(\frac nk)}{a_s(\frac nk)}\right|\right\}=O_P(1)\,.$$
3. $C_n$ is open in $C[0,1]$ and there exists $h_n\in\partial C_n$ such that
$$f<h_n\ \Longrightarrow\ f\notin C_n\,.$$
4. (Stability property) We require
$$R_n(C_n)=c_nS\,,$$
where $S$ is a fixed set in $C[0,1]$ with
(a) $f\ge0$ for $f\in S$;
(b) $\nu(\partial S)=0$ and $\inf\{|f|_\infty:f\in S\}>0$;
(c)
$$c_n:=\sup_{0\le s\le1}\left(1+\gamma(s)\,\frac{h_n(s)-b_s(\frac nk)}{a_s(\frac nk)}\right)^{1/\gamma(s)}\,.$$
5. Sharpening of (1):
$$\frac{P\big(R_n(X)\in c_nS\big)}{\frac kn\,\nu(c_nS)}\ \overset{P}{\to}\ 1\,,\qquad n\to\infty\,.$$
6. With $\bar\gamma:=\inf_{0\le s\le1}\gamma(s)$:
$$c_n\ge1\qquad\text{and}\qquad\lim_{n\to\infty}\frac{\log c_n}{\sqrt k}=0\,.$$
Finally, we define the estimator $\hat p_n$ for $p_n:=P(C_n)$:
$$\hat p_n:=\frac{1}{n\,\hat c_n}\sum_{i=1}^{n}1_{\{\hat R_n(X_i)\in\hat S_n\}}\,,$$
where
$$\hat c_n:=\sup_{0\le s\le1}\left(1+\hat\gamma(s)\,\frac{h_n(s)-\hat b_s(\frac nk)}{\hat a_s(\frac nk)}\right)^{1/\hat\gamma(s)}\,,\qquad
\hat R_nf(s):=\left(1+\hat\gamma(s)\,\frac{f(s)-\hat b_s(\frac nk)}{\hat a_s(\frac nk)}\right)^{1/\hat\gamma(s)},\ \text{for }0\le s\le1\,,$$
and
$$\hat S_n:=\frac{\hat R_n(C_n)}{\hat c_n}\,.$$

Remark 10.5.1 Note that $\hat c_n$ is not defined if
$$1+\hat\gamma(s)\,\frac{h_n(s)-\hat b_s(\frac nk)}{\hat a_s(\frac nk)}\ \le\ 0$$
for some $s\in[0,1]$. However, when checking the proof, one sees that as $n\to\infty$, the
probability that this happens tends to zero.

Theorem 10.5.2 Under our conditions,
$$\frac{\hat p_n}{p_n}\ \overset{P}{\to}\ 1\,,\qquad n\to\infty\,,$$
provided $\nu(S)>0$.
The proof of Theorem 10.5.2 follows from three lemmas and four propositions.
The proofs are very similar to those in Chapter 8 and will be mostly omitted.
Lemma 10.5.3 Let $G_n$ be an increasing and invertible mapping: $C[0,1]\to C[0,1]$.
Suppose that $\lim_{n\to\infty}G_nf=f$ in $C[0,1]$ for all $f\in C[0,1]$. For an open set $O$
let
$$O_n:=\{G_nf:f\in O\}\,.$$
Then for all $f\in O$,
$$1_{O_n}(f):=1_{\{f\in O_n\}}\ \to\ 1_O(f):=1_{\{f\in O\}}\,.$$

Lemma 10.5.4 For all $x>0$,
$$\lim_{n\to\infty}\left(1+\big(\gamma(s)+o_1(1)\big)\,\frac{x^{\gamma(s)+o_2(1)}-1}{\gamma(s)+o_2(1)}\,\big(1+o_3(1)\big)+o_4(1)\right)^{1/(\gamma(s)+o_1(1))}=x\,,$$
uniformly for $0\le s\le1$, provided $\gamma$ is a continuous function on $[0,1]$ and the $o$-terms
tend to zero uniformly in $s$.
Lemma 10.5.5 For all $x>0$ and $c_n\to\infty$,
$$\lim_{n\to\infty}\frac1{c_n}\left(1+\gamma_n(s)\,\frac{(c_nx)^{\gamma(s)}-1}{\gamma(s)}\,\Big(1+O\big(\gamma_n(s)-\gamma(s)\big)\Big)\right)^{1/\gamma_n(s)}=x$$
uniformly for $0\le s\le1$, provided $\gamma_n$ and $\gamma$ are continuous functions,
$$\lim_{n\to\infty}\sup_{0\le s\le1}|\gamma_n(s)-\gamma(s)|=0\,,$$
and
$$\lim_{n\to\infty}\sup_{0\le s\le1}|\gamma_n(s)-\gamma(s)|\;c_n^{-\gamma(s)}\int_1^{c_n}s^{\gamma(s)-1}\log s\,ds=0\,.$$

Proposition 10.5.6 Let $S$ be an open Borel set in $\{f\in C[0,1]:f\ge0\}$ with
$\nu(S)>0$, $\nu(\partial S)=0$, and such that $\inf\{|f|_\infty:f\in S\}>0$. Suppose condition (1)
holds. Define
$$\nu_n(S):=\frac1k\sum_{i=1}^n1_{\{R_n(X_i)\in S\}}\,.$$
Then as $n\to\infty$,
$$\nu_n(S)\ \overset{P}{\to}\ \nu(S)\,.$$

Proposition 10.5.7 Assume the conditions of Proposition 10.5.6. Let $\hat\gamma(s)$, $\hat a_s(n/k)$,
$\hat b_s(n/k)$ be estimators such that
$$\sup_{0\le s\le1}|\hat\gamma(s)-\gamma(s)|\ \overset{P}{\to}\ 0\,,\qquad
\sup_{0\le s\le1}\left|\frac{\hat a_s(\frac nk)}{a_s(\frac nk)}-1\right|\ \overset{P}{\to}\ 0\,,\qquad
\sup_{0\le s\le1}\left|\frac{\hat b_s(\frac nk)-b_s(\frac nk)}{a_s(\frac nk)}\right|\ \overset{P}{\to}\ 0\,.$$
Define
$$\hat\nu_n(S):=\frac1k\sum_{i=1}^n1_{\{\hat R_n(X_i)\in S\}}\,.$$
Then, as $n\to\infty$,
$$\hat\nu_n(S)\ \overset{P}{\to}\ \nu(S)\,.$$
Proof. Invoke a Skorohod construction so that we may assume that by virtue of
Lemma 10.5.4,
$$\hat R_nR^{-1}f\ \to\ f\qquad\text{a.s.}$$
Write
$$c:=\inf\{|f|_\infty:f\in S\}>0\,.$$
For all $0<c_0<c$ and $f\in S$ there exists $s\in[0,1]$ such that $f(s)>c_0$.
Take $n_0$ such that for $n\ge n_0$,
$$\hat R_nR^{-1}c_0\ >\ \frac{c_0}2\,.$$
Then for each $f\in S$ there exists $s\in[0,1]$ such that
$$\hat R_nR^{-1}f(s)\ \ge\ \hat R_nR^{-1}c_0\ >\ \frac{c_0}2\,.$$
Hence
$$\big\{\hat R_nR^{-1}f:f\in S\big\}\ \subset\ \Big\{f:f<\hat R_nR^{-1}c_0\Big\}^c\ \subset\ \Big\{f:f<\frac{c_0}2\Big\}^c\ =:\ D\,.$$
Now
$$\nu_n(D)\ \to\ \nu(D)$$
by Proposition 10.5.6. Hence as in the proof of Proposition 8.2.8,
$$\hat\nu_n\big(\hat R_nR^{-1}(S)\big)\ \to\ \nu(S)$$
almost surely, and hence in probability. $\Box$


Proposition 10.5.8 Under the conditions of the theorem,
$$\frac{\hat c_n}{c_n}\ \overset{P}{\to}\ 1\,,\qquad n\to\infty\,.$$

Proof. For $0\le s\le1$, write
$$r_n(s):=\left(1+\gamma(s)\,\frac{h_n(s)-b_s(\frac nk)}{a_s(\frac nk)}\right)^{1/\gamma(s)}\,.$$
Then
$$\hat c_n=\sup_{0\le s\le1}\left(1+\hat\gamma(s)\,\frac{h_n(s)-\hat b_s(\frac nk)}{\hat a_s(\frac nk)}\right)^{1/\hat\gamma(s)}
=\sup_{0\le s\le1}r_n(s)\left\{\frac1{r_n(s)}\left(1+\hat\gamma(s)\,\frac{h_n(s)-\hat b_s(\frac nk)}{\hat a_s(\frac nk)}\right)^{1/\hat\gamma(s)}\right\}\,.$$
Hence, since the expression inside the curly brackets tends to one in probability
uniformly in $s$ by Lemma 10.5.5, we have
$$\frac{\hat c_n}{c_n}=\frac{\hat c_n}{\sup_{0\le s\le1}r_n(s)}\ \overset{P}{\to}\ 1\,. \qquad\Box$$
Proposition 10.5.9 Under the conditions of the theorem,
$$\hat\nu_n\left(\frac1{\hat c_n}\,\hat R_nR^{-1}(c_nS)\right)\ \overset{P}{\to}\ \nu(S)\,.$$

Part IV

Appendix

A
Skorohod Theorem and Vervaat's Lemma

One form of the Skorohod representation theorem is as follows.


Theorem A.0.1 Let $S$ be a complete and separable metric space. Suppose that the
sequence of probability measures $P_n$ on $S$ converges to the probability measure $P_0$
in distribution. Then one can find Borel measurable functions (random variables)
$X_0,X_1,\dots$ defined on the Lebesgue interval $([0,1],\mathcal B,\lambda)$, with $\mathcal B$ the Borel sets and
$\lambda$ Lebesgue measure, such that $X_n$ has probability distribution $P_n$ for $n=0,1,2,\dots$
and
$$\lim_{n\to\infty}X_n(\omega)=X_0(\omega)$$
for all $\omega\in[0,1]$. For a proof we refer to Billingsley (1971).


We combine this theorem with Vervaat's lemma and give an example of the
application of both.
Lemma A.0.2 (Vervaat (1972)) Suppose $y$ is a continuous function and $x_1,x_2,\dots$
are nondecreasing functions on some interval $[a,b]$. Further, we have a function $g$
on $[a,b]$ with a positive derivative $g'$. Let $\delta_n$ be a sequence of positive numbers such
that $\delta_n\to0$, $n\to\infty$, and
$$\lim_{n\to\infty}\frac{x_n(s)-g(s)}{\delta_n}=y(s) \qquad (A.0.1)$$
uniformly on $[a,b]$. Then
$$\lim_{n\to\infty}\frac{x_n^{\leftarrow}(s)-g^{\leftarrow}(s)}{\delta_n}=-\big(g^{\leftarrow}\big)'(s)\,y\big(g^{\leftarrow}(s)\big) \qquad (A.0.2)$$
uniformly on $[g(a),g(b)]$, where $g^{\leftarrow},x_1^{\leftarrow},x_2^{\leftarrow},\dots$ are the inverse functions (right- or left-continuous or defined in any way consistent with monotonicity).

Proof. We first prove the result for $g(s)=s$ for all $s$. Note that (A.0.1) implies that
$x_n(s)$ converges uniformly to $s$ and hence $x_n^{\leftarrow}(s)$ converges uniformly to $s$. Take any
sequence $s_n\to s_0\in(a,b)$. The local uniformity in (A.0.1) gives
$$\lim_{n\to\infty}\frac{x_n\big(x_n^{\leftarrow}(s_n)\pm\varepsilon_n\big)-\big(x_n^{\leftarrow}(s_n)\pm\varepsilon_n\big)}{\delta_n}
=\lim_{n\to\infty}y\big(x_n^{\leftarrow}(s_n)\pm\varepsilon_n\big)=y(s_0)$$
with $\varepsilon_n$ positive, $\varepsilon_n=o(\delta_n)$, and $0<\varepsilon_n\le\big(b-x_n^{\leftarrow}(s_n)\big)\wedge\big(x_n^{\leftarrow}(s_n)-a\big)$. Now
$$x_n\big(x_n^{\leftarrow}(s_n)-\varepsilon_n\big)\ \le\ s_n\ \le\ x_n\big(x_n^{\leftarrow}(s_n)+\varepsilon_n\big)\,,$$
and hence
$$\left|\frac{s_n-x_n^{\leftarrow}(s_n)}{\delta_n}-y(s_0)\right|
\ \le\ \max_{\pm}\left|\frac{x_n\big(x_n^{\leftarrow}(s_n)\pm\varepsilon_n\big)-\big(x_n^{\leftarrow}(s_n)\pm\varepsilon_n\big)}{\delta_n}-y(s_0)\right|+\frac{2\varepsilon_n}{\delta_n}\ \to\ 0\,.$$
Next note that from (A.0.1) and the equality
$$g^{\leftarrow}\big(x_n(s)\big)-s=g^{\leftarrow}\big(x_n(s)\big)-g^{\leftarrow}\big(g(s)\big)
=\big(x_n(s)-g(s)\big)\,\big(g^{\leftarrow}\big)'\Big(g(s)+\theta_n(s)\big(x_n(s)-g(s)\big)\Big)$$
for some $\theta_n(s)\in[0,1]$, we get
$$\frac{g^{\leftarrow}\big(x_n(s)\big)-s}{\delta_n}\ \to\ \big(g^{\leftarrow}\big)'\big(g(s)\big)\,y(s)$$
uniformly on $[a,b]$, which implies, by the case already proved (applied to $g^{\leftarrow}\circ x_n$),
$$\frac{\big(g^{\leftarrow}\circ x_n\big)^{\leftarrow}(u)-u}{\delta_n}\ \to\ -\big(g^{\leftarrow}\big)'\big(g(u)\big)\,y(u)$$
uniformly on $[a,b]$. Since $\big(g^{\leftarrow}\circ x_n\big)^{\leftarrow}=x_n^{\leftarrow}\circ g$, substituting $u=g^{\leftarrow}(s)$ gives (A.0.2). The result follows. $\Box$

An application of the combination of both results is the following.


Example A.0.3 Let $U_1,U_2,\dots$ be independent and identically uniformly distributed
random variables on $[0,1]$. The empirical distribution function $U_n$ is defined as
$$U_n(x):=\frac1n\sum_{i=1}^n1_{\{U_i\le x\}}\,.$$
From Billingsley (1968), Theorem 16.4, we have
$$\big\{\sqrt n\,(U_n(x)-x)\big\}_{0\le x\le1}\ \to\ \{B_0(x)\}_{0\le x\le1}$$
in $D[0,1]$-space, where $B_0$ is a Brownian bridge. Skorohod's theorem implies that
there are random processes $\{U_n^*\}$ and $\{B_0^*\}$, Brownian bridges, such that $U_n^*\overset{d}{=}U_n$
for all $n$ and
$$\lim_{n\to\infty}\sup_{0\le x\le1}\big|\sqrt n\,\big(U_n^*(x)-x\big)-B_0^*(x)\big|=0 \qquad (A.0.3)$$
with probability one.
Next consider the $n$th order statistics of $U_1,U_2,\dots,U_n$, indicated by $U_{1,n},U_{2,n},\dots,U_{n,n}$. The random function
$$\{U_{[nx],n}\}_{0\le x\le1}$$
is easily seen to satisfy
$$\sup_{0\le x\le1}\big|U_n\big(U_{[nx],n}\big)-x\big|=O\left(\frac1n\right);$$
hence
$$\sup_{0\le x\le1}\big|\sqrt n\,\big(U_{[nx],n}-U_n^{\leftarrow}(x)\big)\big|\ \to\ 0\,. \qquad (A.0.4)$$
Since $(U_n^*)^{\leftarrow}\overset{d}{=}U_n^{\leftarrow}$ for all $n$, we obtain from (A.0.3) via Vervaat's lemma
$$\lim_{n\to\infty}\sup_{0\le x\le1}\big|\sqrt n\,\big((U_n^*)^{\leftarrow}(x)-x\big)+B_0^*(x)\big|=0$$
with probability one, which implies
$$\big\{\sqrt n\,\big(U_n^{\leftarrow}(x)-x\big)\big\}_{0\le x\le1}\ \to\ \{-B_0(x)\}_{0\le x\le1}$$
in $D[0,1]$-space and finally via (A.0.4),
$$\big\{\sqrt n\,\big(U_{[nx],n}-x\big)\big\}_{0\le x\le1}\ \to\ \{-B_0(x)\}_{0\le x\le1}\overset{d}{=}\{B_0(x)\}_{0\le x\le1}$$
in $D[0,1]$-space.
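The deterministic ingredient of the example, $\sup_x|U_n(U_{[nx],n})-x|=O(1/n)$, can be verified directly: for a sample without ties the empirical distribution function evaluated at the $k$-th order statistic equals $k/n$. A small sketch (our own illustration, not the book's code):

```python
import bisect
import random

def ecdf_at(u_sorted, t):
    """Empirical distribution function U_n(t) = #{i : U_i <= t} / n."""
    return bisect.bisect_right(u_sorted, t) / len(u_sorted)

n = 500
rng = random.Random(42)
u = sorted(rng.random() for _ in range(n))   # order statistics U_{1,n} <= ... <= U_{n,n}

# evaluate U_n at U_{[nx],n} over a fine grid of x; the error is at most 1/n,
# since U_n(U_{k,n}) = k/n when the sample has no ties (an a.s. event)
grid = [j / 2000 for j in range(1, 2000)]
err = max(abs(ecdf_at(u, u[max(int(n * x), 1) - 1]) - x) for x in grid)
```

Here `err` stays below $1/n$ regardless of the sample, which is exactly the $O(1/n)$ bound used to pass from $U_n^{\leftarrow}$ to $U_{[nx],n}$.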

B
Regular Variation and Extensions

B.1 Regularly Varying (RV) Functions


Regular variation is an important tool in this book, mainly through the more recent developments. Since these cannot be found in any book, we need to discuss them in detail. But
then it seems more natural to start from the beginning and develop the main results
on regular variation from scratch. This, along with the newer developments, is the
topic of this appendix. The basic material has been taken from Geluk and de Haan
(1987) with kind permission of J.L. Geluk.

One way to think about regular variation is as a derivative at infinity. For a real
measurable function $g$ write the differential quotient
$$\frac{g(y+h)-g(y)}{h}\,, \qquad (B.1.1)$$
where $h\ne0$. Now, we do not take the limit $h\to0$ for fixed $y$ as usual, but we take
the limit $y\to\infty$ for fixed $h$. If this limit exists for all $h\ne0$, then it follows (Theorem
B.1.3 below) that the limit does not depend on $h$ and we can write (see Proposition
B.1.9(3)) $g(y)=g_0(y)+o(1)$, as $y\to\infty$, where $g_0$ is differentiable and
$$\lim_{y\to\infty}g_0'(y)=\lim_{y\to\infty}\frac{g(y+h)-g(y)}{h}\,. \qquad (B.1.2)$$
If the limit in (B.1.2) as $y\to\infty$ exists, the function $f:\mathbb R^+\to\mathbb R^+$ defined by
$f(t)=\exp g(\log t)$ satisfies
$$\lim_{t\to\infty}\frac{f(tx)}{f(t)}=x^\alpha \qquad (B.1.3)$$
for all $x\in\mathbb R^+$ for some $\alpha\in\mathbb R$. Then $f$ is called a regularly varying function.
In this appendix these functions are studied thoroughly. Moreover, we study the
more general class of functions $f:\mathbb R^+\to\mathbb R$ for which
$$\lim_{t\to\infty}\frac{f(tx)-b(t)}{a(t)} \qquad (B.1.4)$$
exists for all $x\in\mathbb R^+$, where $a>0$ and $b$ are suitably chosen auxiliary functions. The
results for functions satisfying (B.1.4) are surprisingly similar to those for functions
satisfying (B.1.3).
Definition B.1.1 A Lebesgue measurable function $f:\mathbb R^+\to\mathbb R$ that is eventually
positive is regularly varying (at infinity) if for some $\alpha\in\mathbb R$,
$$\lim_{t\to\infty}\frac{f(tx)}{f(t)}=x^\alpha\,,\qquad x>0\,. \qquad (B.1.5)$$
Notation: $f\in RV_\alpha$.

The number $\alpha$ in the above definition is called the index of regular variation. A
function satisfying (B.1.5) with $\alpha=0$ is called slowly varying.

Example B.1.2 For $\alpha,\beta\in\mathbb R$ the functions $x^\alpha$, $x^\alpha(\log x)^\beta$, $x^\alpha(\log\log x)^\beta$ are $RV_\alpha$.
The functions $2+\sin(\log\log x)$, $\exp\big((\log x)^\alpha\big)$ with $0<\alpha<1$, $x^{-1}\log\Gamma(x)$,
$\sum_{k\le x}1/k$, and $(\log x)^{\sin(\log\log x)}$ are slowly varying. The functions $2+\sin x$, $\exp[\log x]$
(with $[\,\cdot\,]$ the integer part), $2+\sin\log x$, and $x\exp\sin\log x$ are not regularly varying.
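Definition B.1.1 can be probed numerically: for $f\in RV_\alpha$ the ratio $f(tx)/f(t)$ settles near $x^\alpha$ for large $t$, while for a non-regularly-varying function such as $x\exp\sin\log x$ the ratio keeps oscillating with $t$. A small sketch (our own illustration; the specific $t$-values are chosen only for demonstration):

```python
import math

def ratio(f, t, x):
    """f(t x) / f(t), which for f in RV_alpha tends to x**alpha as t -> infinity."""
    return f(t * x) / f(t)

f_rv = lambda t: t ** 1.5 * math.log(t) ** 2             # in RV_{3/2}
f_osc = lambda t: t * math.exp(math.sin(math.log(t)))    # not regularly varying

x = 2.0
r = ratio(f_rv, 1e16, x)                 # approaches 2**1.5 = 2.828... slowly
ra = ratio(f_osc, math.exp(0.0), x)      # the oscillating example never settles:
rb = ratio(f_osc, math.exp(math.pi), x)  # the same ratio at another t is far away
```

The slowly varying factor $(\log t)^2$ makes the convergence visibly slow, which is typical: regular variation constrains only the limit, not the rate.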
Our next result shows that it is possible to weaken the conditions in Definition
B.1.1.
Theorem B.1.3 Suppose $f:\mathbb R^+\to\mathbb R$ is measurable, eventually positive, and
$$\lim_{t\to\infty}\frac{f(tx)}{f(t)} \qquad (B.1.6)$$
exists, and is finite and positive for all $x$ in a set of positive Lebesgue measure. Then
$f\in RV_\alpha$ for some $\alpha\in\mathbb R$.

Proof. Define $F(t):=\log f(e^t)$. Then $\lim_{t\to\infty}\big(F(t+x)-F(t)\big)$ exists for all $x$ in a
set $K$ of positive Lebesgue measure. Define $\phi:K\to\mathbb R$ by $\phi(x):=\lim_{t\to\infty}\{F(t+x)-F(t)\}$. By Steinhaus's theorem (cf. Hewitt and Stromberg (1969), p. 143) the
set $K-K:=\{x-y:x,y\in K\}$ contains a neighborhood of zero. Since $K$ is an
additive subgroup of $\mathbb R$, we have $K=\mathbb R$, and thus $\phi(x)$ is defined for all $x\in\mathbb R$ and
$$\phi(x+y)=\phi(x)+\phi(y) \qquad (B.1.7)$$
for all $x,y\in\mathbb R$.
It remains to solve equation (B.1.7) for measurable $\phi$: Consider the restriction of
$\phi$ to an interval $L\subset\mathbb R$. By Lusin's theorem (cf. Halmos (1950), p. 242) there exists
a compact set $M\subset L$ with positive Lebesgue measure $\lambda M$ such that the restriction
of $\phi$ to $M$ is continuous. Let $\varepsilon>0$ be arbitrary. Then there exists $\delta>0$ such that
$\phi(y)-\phi(x)\in(-\varepsilon,\varepsilon)$ whenever $x,y\in M$ and $|x-y|<\delta$ (since the restriction of $\phi$
to $M$ is uniformly continuous) and also such that $M-M$ contains the interval $(-\delta,\delta)$
(by Steinhaus's theorem). For each $s\in(-\delta,\delta)\subset M-M$ there exists $x_0\in M$ such
that also $x_0+s\in M$. Then $\phi(x+s)-\phi(x)=\phi(s)=\phi(x_0+s)-\phi(x_0)\in(-\varepsilon,\varepsilon)$
for all $x\in\mathbb R$; hence $\phi$ is uniformly continuous on $\mathbb R$. Since $\phi(n/m)=n\,\phi(1/m)=n\,\phi(1)/m$ for $n,m\in\mathbb Z$, $m\ne0$, we have by the continuity of $\phi$ that $\phi(x)=\phi(1)x$
for $x\in\mathbb R$. Now (B.1.5) follows. $\Box$

Theorem B.1.4 (Uniform convergence theorem) If $f\in RV_\alpha$, then relation (B.1.5)
holds uniformly for $x\in[a,b]$ with $0<a<b<\infty$.

Proof. Without loss of generality we may suppose $\alpha=0$ (if not, replace $f(t)$ by
$f(t)/t^\alpha$).
We define the function $F$ by $F(x):=\log f(e^x)$. It is sufficient to deduce a contradiction from the following assumption: Suppose there exist $\delta>0$ and sequences
$t_n\to\infty$, $x_n\to0$ as $n\to\infty$ such that
$$|F(t_n+x_n)-F(t_n)|\ \ge\ \delta\qquad\text{for}\quad n=1,2,\dots$$
For an arbitrary finite interval $J\subset\mathbb R$ we consider the sets
$$Y_{1,n}=\left\{y\in J:|F(t_n+y)-F(t_n)|\ge\frac\delta2\right\}$$
and
$$Y_{2,n}=\left\{y\in J:|F(t_n+x_n)-F(t_n+y)|\ge\frac\delta2\right\}\,.$$
The above sets are measurable for each $n$ and $Y_{1,n}\cup Y_{2,n}=J$; hence either $\lambda(Y_{1,n})\ge\lambda(J)/2$ or $\lambda(Y_{2,n})\ge\lambda(J)/2$ (or both), where $\lambda$ denotes Lebesgue measure.
Now we define
$$Z_n=\left\{z:|F(t_n+z)-F(t_n+x_n-z)|\ge\frac\delta2\,,\ x_n-z\in Y_{2,n}\right\}=\{z:x_n-z\in Y_{2,n}\}\,.$$
Then $\lambda(Z_n)=\lambda(Y_{2,n})$, and thus we have either $\lambda(Y_{1,n})\ge\lambda(J)/2$ infinitely often or
$\lambda(Z_n)\ge\lambda(J)/2$ infinitely often (or both).
Since all the $Y_{1,n}$'s are subsets of a fixed finite interval we have
$\lambda(\limsup_{n\to\infty}Y_{1,n})=\lim_{k\to\infty}\lambda\big(\bigcup_{n=k}^\infty Y_{1,n}\big)\ge\lambda(J)/2$, or a similar statement
for the $Z_n$'s (or both). This implies the existence of a real number $x_0$ contained
in infinitely many $Y_{1,n}$ or infinitely many $Z_n$, which contradicts the assumption
$\lim_{t\to\infty}F(t+x_0)-F(t)=0$. $\Box$

Theorem B.1.5 (Karamata's theorem) Suppose $f\in RV_\alpha$. There exists $t_0>0$ such
that $f(t)$ is positive and locally bounded for $t\ge t_0$. If $\alpha\ge-1$, then
$$\lim_{t\to\infty}\frac{t\,f(t)}{\int_{t_0}^tf(s)\,ds}=\alpha+1\,. \qquad (B.1.8)$$
If $\alpha<-1$, or $\alpha=-1$ and $\int_0^\infty f(s)\,ds<\infty$, then
$$\lim_{t\to\infty}\frac{t\,f(t)}{\int_t^\infty f(s)\,ds}=-\alpha-1\,. \qquad (B.1.9)$$
Conversely, if (B.1.8) holds with $-1<\alpha<\infty$, then $f\in RV_\alpha$; if (B.1.9) holds with
$-\infty<\alpha<-1$, then $f\in RV_\alpha$.

Proof. Suppose $f\in RV_\alpha$. By Theorem B.1.4, there exist $t_0$, $c$ such that $f(tx)/f(t)\le c$
for $t\ge t_0$, $x\in[1,2]$. Then for $t\in[2^nt_0,2^{n+1}t_0]$ we have
$$\frac{f(t)}{f(t_0)}=\frac{f(t)}{f(2^{-1}t)}\cdot\frac{f(2^{-1}t)}{f(2^{-2}t)}\cdots\frac{f(2^{-n}t)}{f(t_0)}\ \le\ c^{n+1}\,.$$
Hence $f(t)$ is locally bounded for $t\ge t_0$ and $\int_{t_0}^tf(s)\,ds<\infty$ for $t\ge t_0$.
In order to prove (B.1.8), we first show that $\int_{t_0}^\infty f(s)\,ds=\infty$ for $\alpha>-1$. Since
$f(2s)\ge2^{-1}f(s)$ for $s$ sufficiently large, we have for $n\ge n_0$,
$$\int_{2^n}^{2^{n+1}}f(s)\,ds=2\int_{2^{n-1}}^{2^n}f(2s)\,ds\ \ge\ \int_{2^{n-1}}^{2^n}f(s)\,ds\,.$$
Hence
$$\int_{2^{n_0}}^\infty f(s)\,ds=\sum_{n=n_0}^\infty\int_{2^n}^{2^{n+1}}f(s)\,ds\ \ge\ \sum_{n=n_0}^\infty\int_{2^{n_0}}^{2^{n_0+1}}f(s)\,ds\ =\ \infty\,.$$
Next we prove $F(t):=\int_{t_0}^tf(s)\,ds\in RV_{\alpha+1}$ for $\alpha>-1$. Fix $x>0$. For arbitrary
$\varepsilon>0$ there exists $t_1=t_1(\varepsilon)$ such that $f(xt)\le(1+\varepsilon)x^\alpha f(t)$ for $t\ge t_1$. Since
$\lim_{t\to\infty}F(t)=\infty$,
$$\frac{F(tx)}{F(t)}\sim\frac{\int_{t_1}^{tx}f(s)\,ds}{\int_{t_1}^{t}f(s)\,ds}=\frac{x\int_{t_1/x}^{t}f(xs)\,ds}{\int_{t_1}^{t}f(s)\,ds}\,,$$
as $t\to\infty$, and hence
$$\frac{F(tx)}{F(t)}\ \le\ (1+2\varepsilon)\,x^{\alpha+1} \qquad (B.1.10)$$
for $t$ sufficiently large. A similar lower inequality is easily derived and we obtain
$F\in RV_{\alpha+1}$ for $\alpha>-1$.
In case $\alpha=-1$ and $F(t)\to\infty$ the same proof applies. If $\alpha=-1$ and $F(t)$ has
a finite limit, obviously $F\in RV_0$.
Now for all $\alpha$,
$$\frac{F(tx)-F(t)}{t\,f(t)}=\int_1^x\frac{f(tu)}{f(t)}\,du\ \to\ \frac{x^{\alpha+1}-1}{\alpha+1}\,,\qquad t\to\infty\,, \qquad (B.1.11)$$
by the uniform convergence theorem (Theorem B.1.4). Since $F\in RV_{\alpha+1}$, (B.1.8)
follows. For the proof of (B.1.9) we first show the finiteness of the function $G$ defined
by
$$G(t):=\int_t^\infty f(s)\,ds\,.$$
Since in the case $\alpha<-1$ there exists $\delta>0$ such that $f(2s)\le2^{-1-\delta}f(s)$ for $s$
sufficiently large, we have, for $n_1$ sufficiently large,
$$\int_{2^{n_1}}^\infty f(s)\,ds=\sum_{n=n_1}^\infty\int_{2^n}^{2^{n+1}}f(s)\,ds\ \le\ \sum_{n=n_1}^\infty2^{-\delta(n-n_1)}\int_{2^{n_1}}^{2^{n_1+1}}f(s)\,ds\ <\ \infty\,.$$
The rest of the proof is analogous.


Conversely, suppose (B.1.8) holds. Define
$$b(t):=\frac{t\,f(t)}{F(t)}\,.$$
Without loss of generality we suppose $f(t)>0$, $t>0$. Integrating both sides of
$b(t)/t=f(t)/F(t)$ we obtain for some real $c_1$ and all $x>0$ (note that $\log F$ is
indeed an absolutely continuous function)
$$\int_1^x\frac{b(t)}t\,dt=\log F(x)+c_1 \qquad (B.1.13)$$
(since the derivatives of the two parts exist and are equal almost everywhere). Using
the definition of $b$ again we obtain from (B.1.13)
$$f(x)=c\,b(x)\exp\left(\int_1^x\frac{b(t)-1}t\,dt\right) \qquad (B.1.14)$$
for all $x>0$, with $c=e^{-c_1}>0$; hence for all $x,t>0$,
$$\frac{f(tx)}{f(t)}=\frac{b(tx)}{b(t)}\exp\left(\int_1^x\frac{b(ts)-1}s\,ds\right)\,.$$
Now for arbitrary $\varepsilon>0$ there is a $t_0$ such that $|b(ts)-\alpha-1|\le\varepsilon$ for $t\ge t_0$ and
$s\ge\min(1,x)$. Hence the function $f$ satisfies (B.1.5).
The last statement of the theorem ((B.1.9) implies that $f\in RV_\alpha$) can be proved
in a similar way. $\Box$
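The direct half of Karamata's theorem is easy to check numerically: for $f\in RV_\alpha$ with $\alpha\ge-1$, the ratio $t\,f(t)/\int_{t_0}^tf(s)\,ds$ approaches $\alpha+1$. The following sketch (our own illustration) uses $f(s)=\sqrt s\,\log s$, so $\alpha=\frac12$ and the limit is $\frac32$; the integral is computed by a trapezoid rule after the substitution $s=e^u$:

```python
import math

def f(s):
    return math.sqrt(s) * math.log(s)        # f in RV_{1/2}

def integral(g, a, b, steps=200000):
    """Trapezoid rule for the integral of g over [a, b] after s = exp(u)."""
    lo, hi = math.log(a), math.log(b)
    h = (hi - lo) / steps
    total = 0.5 * (g(a) * a + g(b) * b)      # ds = s du, so the integrand is g(s) s
    for i in range(1, steps):
        s = math.exp(lo + i * h)
        total += g(s) * s
    return total * h

t = 1e6
karamata_ratio = t * f(t) / integral(f, 1.0, t)   # approaches alpha + 1 = 1.5
```

The convergence is only logarithmic here (the error term is of order $1/\log t$), which again reflects the slowly varying factor $\log s$.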

Theorem B.1.6 (Representation theorem) If $f\in RV_\alpha$, there exist measurable
functions $a:\mathbb R^+\to\mathbb R$ and $c:\mathbb R^+\to\mathbb R$ with
$$\lim_{t\to\infty}c(t)=c_0\in(0,\infty)\qquad\text{and}\qquad\lim_{t\to\infty}a(t)=\alpha\,, \qquad (B.1.15)$$
and $t_0\in\mathbb R^+$ such that for $t\ge t_0$,
$$f(t)=c(t)\exp\left(\int_{t_0}^t\frac{a(s)}s\,ds\right)\,. \qquad (B.1.16)$$
Conversely, if (B.1.16) holds with $a$ and $c$ satisfying (B.1.15), then $f\in RV_\alpha$.

Proof. Suppose $f\in RV_\alpha$. The function $t^{-\alpha}f(t)$ is slowly varying and hence has a
representation as in (B.1.16) by (B.1.14). Then $f$ has such a representation with $a(s)$
replaced by $a(s)+\alpha$ and $c(t)$ replaced by $t_0^\alpha\,c(t)$. Now the result follows. Conversely,
one verifies directly that (B.1.5) follows from (B.1.16). $\Box$

Remark B.1.7 1. In formula (B.1.16) we may take $t_0\in[0,\infty)$ arbitrarily by changing the functions $c(t)$ and $a(t)$ suitably on the interval $[0,t_0]$.
2. The functions $a(t)$ and $c(t)$ (given in (B.1.16)) are not uniquely determined. It can
easily be seen that it is possible to choose $a(t)$ continuous: define
$$f_0(t):=\exp\left(\int_{t_0}^t\frac{a(v)}v\,dv\right)\qquad\text{and}\qquad b_0(t):=\frac{t\,f_0(t)}{\int_{t_0}^tf_0(s)\,ds}\,.$$
Since $f_0\in RV_\alpha$ we get (B.1.14) with $f$ and $b$ replaced by $f_0$ and $b_0$ respectively,
i.e.,
$$f(x)=c(x)\,\tilde c\,b_0(x)\exp\left(\int_1^x\frac{b_0(t)-1}t\,dt\right)$$
for all $x>0$ with $b_0(t)-1$ continuous. It is possible to put all the undesirable
behavior of the function $f$ into the function $c(t)$.
We are going to list a number of consequences of the above theorems. For that
we need the following definition.
Definition B.1.8 Suppose $f:(t_0,\infty)\to\mathbb R$ for some $t_0\ge-\infty$ is bounded
on intervals of the form $(t_0,a)$ with $a<\infty$ and $\lim_{t\to\infty}f(t)=\infty$. Since
$\lim_{t\to\infty}f(t)=\infty$, the set $\{y:f(y)>x\}$ is nonempty for all $x\in\mathbb R$. Hence
$-\infty\le\inf\{y:f(y)>x\}<\infty$ for $x\in\mathbb R$. Note that this infimum is nondecreasing
in $x$. Since $f$ is bounded on intervals of the form $(t_0,a)$, $\lim_{x\to\infty}\inf\{y:f(y)>x\}=\infty$. Hence there exists $x_0\in\mathbb R$ such that $\inf\{y:f(y)>x\}>-\infty$ for all
$x>x_0$. The generalized inverse function $\tilde f:(x_0,\infty)\to\mathbb R$ is defined by
$$\tilde f(x):=\inf\{y:f(y)>x\}\,.$$
In case $f$ is a nondecreasing function, $\tilde f$ is its inverse function. In that case we write
$f^{\leftarrow}$ instead of $\tilde f$.
Proposition B.1.9 (Properties of RV functions)
1. If $f\in RV_\alpha$, then $\log f(t)/\log t\to\alpha$, $t\to\infty$. This implies
$$\lim_{t\to\infty}f(t)=\begin{cases}0\,,&\alpha<0\,,\\ \infty\,,&\alpha>0\,.\end{cases}$$
2. If $f_1\in RV_{\alpha_1}$, $f_2\in RV_{\alpha_2}$, then $f_1+f_2\in RV_{\max(\alpha_1,\alpha_2)}$. If moreover
$\lim_{t\to\infty}f_2(t)=\infty$, then the composition $f_1\circ f_2\in RV_{\alpha_1\alpha_2}$.
3. If $f\in RV_\alpha$ with $\alpha>0$ ($\alpha<0$), then $f$ is asymptotically equivalent to a strictly
increasing (decreasing) differentiable function $g$ with derivative $g'\in RV_{\alpha-1}$ if
$\alpha>0$ and $-g'\in RV_{\alpha-1}$ if $\alpha<0$.
As a consequence of this: if $f\in RV_\alpha$ ($\alpha>0$) is bounded on finite intervals of
$\mathbb R^+$, then
$$\sup_{0\le x\le t}f(x)\sim f(t)\qquad(t\to\infty)\,. \qquad (B.1.17)$$
If $f\in RV_\alpha$ ($\alpha<0$), then
$$\inf_{0\le x\le t}f(x)\sim f(t)\qquad(t\to\infty)\,. \qquad (B.1.18)$$
4. If $f\in RV_\alpha$ is integrable on finite intervals of $\mathbb R^+$ and $\alpha>-1$, then $\int_0^tf(s)\,ds$
is regularly varying with exponent $\alpha+1$.
If $f\in RV_\alpha$ and $\alpha<-1$, then $\int_t^\infty f(s)\,ds$ exists for $t$ sufficiently large and is
regularly varying with exponent $\alpha+1$. The same is true for $\alpha=-1$ provided
$\int^\infty f(s)\,ds<\infty$.
5. (Potter, 1942) Suppose $f\in RV_\alpha$. If $\delta_1,\delta_2>0$ are arbitrary, there exists $t_0=t_0(\delta_1,\delta_2)$ such that for $t\ge t_0$, $tx\ge t_0$,
$$(1-\delta_1)\,x^\alpha\min\big(x^{\delta_2},x^{-\delta_2}\big)\ \le\ \frac{f(tx)}{f(t)}\ \le\ (1+\delta_1)\,x^\alpha\max\big(x^{\delta_2},x^{-\delta_2}\big)\,. \qquad (B.1.19)$$
Note that conversely, if $f$ satisfies the above property, then $f\in RV_\alpha$.
6. Suppose $f\in RV_\alpha$ is bounded on finite intervals of $\mathbb R^+$ and $\alpha>0$. For $\xi>0$
arbitrary there exist $c>0$ and $t_0$ such that for $t\ge t_0$ and $0<x\le\xi$,
$$\frac{f(tx)}{f(t)}\ \le\ c\,. \qquad (B.1.20)$$
7. If $f\in RV_\alpha$, $\alpha<0$, is bounded on finite intervals of $\mathbb R^+$ and $\delta,\xi>0$ are
arbitrary, there exist $c>0$ and $t_0$ such that for $t\ge t_0$ and $0<x\le\xi$,
$$\frac{f(tx)}{f(t)}\ \le\ c\,x^{\alpha-\delta}\,. \qquad (B.1.21)$$
8. If
$$f(t)=\exp\left(\int_1^t\frac{a(s)}s\,ds\right) \qquad (B.1.22)$$
with a continuous function $a(s)\to\alpha>0$, $s\to\infty$, then $f^{\leftarrow}\in RV_{1/\alpha}$, where
$f^{\leftarrow}$ is the inverse function of $f$.
9. Suppose $f\in RV_\alpha$, $\alpha>0$, is bounded on finite intervals of $\mathbb R^+$. Then $\tilde f\in RV_{1/\alpha}$.
(Formally, $f$ is defined only on a neighborhood of infinity; we can extend its
domain of definition by taking $f$ zero elsewhere.) In particular, if $f\in RV_\alpha$, $\alpha>0$, and $f$ is increasing, the inverse function $f^{\leftarrow}$ is in $RV_{1/\alpha}$.
10. If $f\in RV_\alpha$, $\alpha>0$, there exists an asymptotically unique function $h$ such that
$f(h(x))\sim h(f(x))\sim x$, $x\to\infty$. Moreover, $h\sim\tilde f$ if $f$ is bounded on finite
intervals of $\mathbb R^+$.
11. If $f\in RV_\alpha$, $\alpha\ge0$, and $f(t)=f(t_0)+\int_{t_0}^t\psi(s)\,ds$ for $t\ge t_0$ with $\psi$ monotone,
then
$$\lim_{t\to\infty}\frac{t\,\psi(t)}{f(t)}=\alpha\,.$$
Hence in case $\alpha>0$ we have $\psi\in RV_{\alpha-1}$.
Moreover, if $f\in RV_\alpha$, $\alpha<0$, and $f(t)=\int_t^\infty\psi(s)\,ds<\infty$ with $\psi$ nonincreasing, then $t\,\psi(t)/f(t)\to-\alpha$, as $t\to\infty$. Hence in case $\alpha<0$ we have
$\psi\in RV_{\alpha-1}$.

Proof. (1)–(5) Properties 1, 3, and 5 follow immediately from the representation
theorem (Theorem B.1.6). In order to prove regular variation of $|f'|$ in property 3
one also needs Remark B.1.7 following Theorem B.1.6. Properties 2 and 4 are easy
consequences of the uniform convergence theorem (Theorem B.1.4) and Theorem
B.1.5 respectively.
(6) Take $\xi>0$. By property 5 there exists $t'_\xi$ such that if $t\ge t'_\xi$,
$$\frac{f(tx)}{f(t)}\le2x^{\alpha+1}\qquad\text{for}\quad x\ge1\,.$$
Also, by property 3, if $t\ge t''_0$,
$$\frac{f(tx)}{f(t)}\le\frac{\sup_{u\le t}f(u)}{f(t)}\le2\qquad\text{for}\quad0<x\le1\,.$$
Hence if $t\ge t_0:=\max(t'_\xi,t''_0)$,
$$\frac{f(tx)}{f(t)}\le\max\big(2,2\xi^{\alpha+1}\big)\qquad\text{for}\quad0<x\le\xi\,.$$
(7) Apply property 6 above to the function $t^{-\alpha+\delta}f(t)$.
(8) Since $f(t)\to\infty$, $t\to\infty$, and $f$ is eventually strictly increasing and differentiable, there exists, for $x$ sufficiently large, a unique differentiable inverse function
$g(x)=f^{\leftarrow}(x)$, and
$$f(g(x))=g(f(x))=x\qquad\text{for}\quad x\ge x_0\,. \qquad (B.1.23)$$
Differentiating the second equality in (B.1.23) we get, using (B.1.22),
$$\frac{g'(f(x))\,f(x)}{g(f(x))}=\frac1{a(x)}\,. \qquad (B.1.24)$$
Since $f$ is continuous and $f(x)\to\infty$, $x\to\infty$, (B.1.24) implies
$$\frac{t\,g'(t)}{g(t)}\ \to\ \frac1\alpha\,,\qquad t\to\infty\,.$$
Application of Theorem B.1.5 gives $g'\in RV_{-1+1/\alpha}$; hence $g=f^{\leftarrow}\in RV_{1/\alpha}$ by
property 4 above.
(9) Suppose $f\in RV_\alpha$, $\alpha>0$. By Theorem B.1.6 and the remarks thereafter $f$
has the representation (B.1.16) with $t_0=1$ and $a$ continuous. For arbitrary $\varepsilon>0$ there
exists $x_0=x_0(\varepsilon)$ such that for $x\ge x_0$,
$$(c_0-\varepsilon)\,g(x)\ \le\ f(x)\ \le\ (c_0+\varepsilon)\,g(x)\,, \qquad (B.1.25)$$
where $g(x)=\exp\big(\int_1^xa(s)/s\,ds\big)$.
The inequality (B.1.25) implies
$$g^{\leftarrow}\left(\frac x{c_0-\varepsilon}\right)\ \ge\ \tilde f(x)\ \ge\ g^{\leftarrow}\left(\frac x{c_0+\varepsilon}\right) \qquad (B.1.26)$$
for $x$ sufficiently large. By property 8 above we have $g^{\leftarrow}\in RV_{1/\alpha}$. Hence $g^{\leftarrow}\big(x/(c_0\pm\varepsilon)\big)\sim(c_0\pm\varepsilon)^{-1/\alpha}g^{\leftarrow}(x)$. Since $\varepsilon>0$ is arbitrary, (B.1.26) implies $\tilde f\sim c_0^{-1/\alpha}\,g^{\leftarrow}\in RV_{1/\alpha}$.
(10) Without loss of generality we may and do suppose $f$ bounded on finite
intervals of $\mathbb R^+$. Then the proof of property 9 gives the existence of functions $g$ and
$g^{\leftarrow}$ such that $f(x)\sim g(x)$, $\tilde f(x)\sim g^{\leftarrow}(x)$, as $x\to\infty$, and $g(g^{\leftarrow}(x))=g^{\leftarrow}(g(x))=x$
for $x$ sufficiently large. This implies $x=g^{\leftarrow}(g(x))\sim g^{\leftarrow}(f(x))\sim\tilde f(f(x))$, as
$x\to\infty$, where the first asymptotic equivalence follows from $f(x)\sim g(x)$, $x\to\infty$,
$g^{\leftarrow}\in RV_{1/\alpha}$, and the uniform convergence theorem. The statement $f(\tilde f(x))\sim x$,
$x\to\infty$, follows similarly.
Suppose now
$$f(h_i(x))\ \sim\ h_i(f(x))\ \sim\ x\quad(x\to\infty)\qquad\text{for}\quad i=1,2\,.$$
Now $\lim_{n\to\infty}f(h_1(x_n))/f(h_2(x_n))=\lim_{n\to\infty}\big(h_1(x_n)/h_2(x_n)\big)^\alpha$ for any sequence
$x_n\to\infty$ by the uniform convergence theorem; hence $h_1(x)\sim h_2(x)$, $x\to\infty$.
(11) Suppose first that $\psi$ is nondecreasing and $f(t)=f(t_0)+\int_{t_0}^t\psi(s)\,ds$ for
$t\ge t_0$. Then for $a>1$ and $t\ge t_0$ we have
$$t(a-1)\,\psi(t)\ \le\ \int_t^{ta}\psi(v)\,dv\ =\ f(ta)-f(t)\,.$$
Since $f\in RV_\alpha$ we find that $\limsup_{t\to\infty}t\psi(t)/f(t)\le(a^\alpha-1)/(a-1)$ for all
$a>1$. Letting $a\to1$ we get
$$\limsup_{t\to\infty}\frac{t\,\psi(t)}{f(t)}\ \le\ \alpha\,.$$
Similar inequalities for $0<a<1$ lead to $\liminf_{t\to\infty}t\psi(t)/f(t)\ge\alpha$. The cases $\psi$
nonincreasing and $\alpha<0$ can be proved similarly. $\Box$

A stronger version of Proposition B. 1.9(5) is as follows.


Proposition B.1.10 (Drees (1998)) Iff
tois, 8) such that for f, tx > to,

f(tx)
fit)

e RVa, for each e, 8 > 0 there is a t0 =

<emax(jc a + < 5 ,jc a - 5 ).

Proof. Clearly it is sufficient to prove the statement for $\alpha = 0$. We apply Proposition B.1.9(5): for $t, tx \ge t_0$ and $\delta > \delta_2$,

$$e^{-\delta|\log x|}\left((1-\delta_1)e^{-\delta_2|\log x|} - 1\right) \le e^{-\delta|\log x|}\left(\frac{f(tx)}{f(t)} - 1\right) \le e^{-\delta|\log x|}\left((1+\delta_1)e^{\delta_2|\log x|} - 1\right) .$$

We prove that the left-hand side of the inequality tends to zero when $\delta_1, \delta_2 \to 0$, uniformly for $x > 0$. The proof for the right-hand side is similar. Let, as the $\delta$'s go to zero, $x$ be such that $\delta_2|\log x| \to 0$. Then the second factor goes to zero (note that the first factor is bounded). If $\delta_2|\log x| \to c \in (0, \infty]$, then $|\log x| \to \infty$ and hence the first factor goes to zero (and the second is bounded). $\square$
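The uniform inequality of Proposition B.1.10 lends itself to a quick numerical spot-check. The sketch below is an added illustration (not part of the original text; the helper `potter_bound_holds` is ours). It tests the bound for the slowly varying function $f(t) = \log t$, i.e. $\alpha = 0$, at one large value of $t$:

```python
import math

def potter_bound_holds(f, alpha, t, x, eps, delta):
    """Check |f(tx)/f(t) - x^alpha| <= eps * max(x^(alpha+delta), x^(alpha-delta))."""
    lhs = abs(f(t * x) / f(t) - x ** alpha)
    rhs = eps * max(x ** (alpha + delta), x ** (alpha - delta))
    return lhs <= rhs

# f(t) = log t is slowly varying (alpha = 0).
f = math.log
t = 1e8                      # play the role of "t, tx >= t0" with t0 large
checks = [potter_bound_holds(f, 0.0, t, x, eps=0.1, delta=0.5)
          for x in (0.01, 0.1, 0.5, 1.0, 2.0, 10.0, 100.0)]
print(all(checks))           # the bound holds at every sampled x
```

Shrinking $\varepsilon$ or $\delta$ forces a larger $t$, in line with the statement that $t_0$ depends on both parameters.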

Remark B.1.11 1. There is no analogue of property (3) in case $\alpha = 0$; even if $\lim_{t\to\infty} f(t) = \infty$ with $f \in RV_0$, then $f$ is not necessarily asymptotic to a nondecreasing function, as the following example (due to Karamata) shows. Define $f(x) := \exp\left(\int_1^x \varepsilon(s)/s\, ds\right)$, where

$$\varepsilon(s) = \begin{cases} 0\,, & 0 < s \le 1\,, \\ a_n\,, & (2n-1)! < s \le (2n)!\,,\ n = 1, 2, 3, \ldots, \\ -a_n/2\,, & (2n)! < s \le (2n+1)!\,,\ n = 1, 2, 3, \ldots, \end{cases}$$

where the sequence $a_n$ is such that $a_n \to 0$, $n \to \infty$, and $a_n \log n \to \infty$, $n \to \infty$. Then

$$0 \leftarrow \frac{f((2n+1)!)}{f((2n)!)} = \exp\left(\int_{(2n)!}^{(2n+1)!} \frac{\varepsilon(s)}{s}\, ds\right) = \exp\left(-\frac{a_n}{2}\log(2n+1)\right) , \qquad n \to \infty \,.$$

Hence (B.1.13) does not hold.

2. Using the representation theorem for regularly varying functions, it is possible to show that if $f$ is locally bounded and $f \in RV_0$, then the function $\sup_{0 < x \le t} f(x)$ is slowly varying.

3. Note that $\int_0^t f(s)\, ds \in RV_{\alpha+1}$ with $\alpha > -1$ does not imply $f \in RV_\alpha$. Example: $f(t) = \exp\left([\log t]\right)$, with $[\cdot]$ the integer part.
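The counterexample in item 3 can be made concrete numerically. This added sketch (not from the original text) shows that $f(t) = \exp([\log t])$ has no limit for $f(2t)/f(t)$, so it cannot be regularly varying:

```python
import math

def f(t):
    # f(t) = exp([log t]), where [.] denotes the integer part
    return math.exp(math.floor(math.log(t)))

# If f were regularly varying, f(2t)/f(t) would converge as t -> infinity.
# Along t = e^(n + 0.1) the ratio is 1; along t = e^(n + 0.5) it is e.
ratios_low  = [f(2 * math.e ** (n + 0.1)) / f(math.e ** (n + 0.1)) for n in range(10, 15)]
ratios_high = [f(2 * math.e ** (n + 0.5)) / f(math.e ** (n + 0.5)) for n in range(10, 15)]
print(ratios_low)   # all 1.0
print(ratios_high)  # all approximately e = 2.718...
```

Two subsequences with different limit ratios rule out regular variation, even though the integral of $f$ is regularly varying of index 1.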
The following result is a generalization of Theorem B.1.5 (the kernel function $k$ below is constant in Theorem B.1.5).

Theorem B.1.12 Let $f \in RV_\alpha$ and suppose $f$ is (Lebesgue) integrable on finite intervals of $\mathbb{R}_+$.

1. If $\alpha > -1$ and the function $k : \mathbb{R}_+ \to \mathbb{R}$ is bounded on $(0, 1)$, then

$$\lim_{t\to\infty} \int_0^1 k(s)\,\frac{f(ts)}{f(t)}\, ds = \int_0^1 k(s)s^\alpha\, ds \,. \qquad (B.1.27)$$

2. If $t^{\varepsilon+\alpha}k(t)$ is integrable on $(1, \infty)$ for some $\varepsilon > 0$, then $\int_1^\infty k(s)f(ts)\, ds < \infty$ for $t > 0$, and

$$\lim_{t\to\infty} \int_1^\infty k(s)\,\frac{f(ts)}{f(t)}\, ds = \int_1^\infty k(s)s^\alpha\, ds \,. \qquad (B.1.28)$$

Proof. (1) Note that for $0 < \varepsilon < \alpha + 1$ the function $t^{\alpha-\varepsilon}k(t)$ is integrable on $(0,1)$. Since there exist $c > 1$ and $t_0 > 0$ such that $f(tx)/f(t) \le c x^{\alpha-\varepsilon}$ for $tx \ge t_0$, $0 < x \le 1$, by Proposition B.1.9(5), we can apply Lebesgue's dominated convergence theorem to obtain

$$\int_{t_0/t}^1 k(s)\,\frac{f(ts)}{f(t)}\, ds \to \int_0^1 k(s)s^\alpha\, ds \,, \qquad t \to \infty \,.$$

Furthermore,

$$\left| \int_0^{t_0/t} k(s)\,\frac{f(ts)}{f(t)}\, ds \right| = \left| \frac{1}{t f(t)}\int_0^{t_0} k\!\left(\frac{s}{t}\right) f(s)\, ds \right| \to 0 \,, \qquad t \to \infty \,,$$

since $k$ is bounded and $tf(t) \to \infty$, as $t \to \infty$.

(2) The second statement is proved in a similar way. $\square$
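As a numerical illustration of (B.1.27) — an added sketch, not part of the original text — take $f(t) = \sqrt{t}\,\log t \in RV_{1/2}$ and $k \equiv 1$; the limit should be $\int_0^1 s^{1/2}\, ds = 2/3$, approached at the slow rate $O(1/\log t)$:

```python
import math

def kernel_integral(f, t, n=100_000):
    """Midpoint approximation of the integral of f(ts)/f(t) ds over (0, 1)."""
    h = 1.0 / n
    return sum(f(t * (i + 0.5) * h) / f(t) * h for i in range(n))

f = lambda u: math.sqrt(u) * math.log(u)   # regularly varying with index 1/2
approx = kernel_integral(f, t=1e12)
print(approx)   # exact value at this t: 2/3 - (4/9)/log(1e12), i.e. about 0.6506
```

The deviation from $2/3$ is $(4/9)/\log t$ here, which illustrates why the theorem is a limit statement rather than an identity.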

Remark B.1.13 It is easy to see that the conditions in Theorem B.1.12(1) can be replaced by: $f$ bounded on $(0, 1)$ and $t^{\alpha-\varepsilon}k(t)$ integrable on $(0, 1)$ for some $\varepsilon > 0$.
Remark B.1.14 De Bruijn (1959) noted that for any slowly varying function $L$ there exists an asymptotically unique slowly varying function $L^*$, called the conjugate slowly varying function, satisfying $L(x)L^*(xL(x)) \to 1$, $L^*(x)L(xL^*(x)) \to 1$, $x \to \infty$. Note that one can obtain $L^*$ as follows: define $h(x) := xL(x)$. Then $L^*(x) \sim h^\leftarrow(x)/x$, $x \to \infty$. In special cases one has $L^*(x) \sim 1/L(x)$, $x \to \infty$. Example: $L(x) \sim (\log x)^\alpha(\log\log x)^\beta$, $x \to \infty$, $\alpha \ge 0$, $\beta \in \mathbb{R}$; i.e., if $h(x) \sim x(\log x)^\alpha(\log\log x)^\beta$, $x \to \infty$, then $h^\leftarrow(x) \sim x(\log x)^{-\alpha}(\log\log x)^{-\beta}$, $x \to \infty$. If we replace $x$ by $x^{1/\gamma}$ and take $\beta = 0$, we obtain: $f(x) \sim x^\gamma(\log x)^\delta$, $\gamma > 0$, $\delta \in \mathbb{R}$, implies $f^\leftarrow(x) \sim \gamma^{\delta/\gamma} x^{1/\gamma}(\log x)^{-\delta/\gamma}$, $x \to \infty$.
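A quick numerical sanity check of the conjugate relation — an added illustration (not from the original text), using the special case $L^*(x) \sim 1/L(x)$: for $L(x) = \log x$ the defining product $L(x)L^*(xL(x)) = \log x/\log(x\log x)$ should drift toward 1, albeit very slowly:

```python
import math

L = math.log                             # slowly varying
L_star = lambda x: 1.0 / math.log(x)     # conjugate candidate: L*(x) ~ 1/L(x)

# L(x) * L*(x L(x)) = log x / (log x + log log x) -> 1 as x -> infinity
products = [L(x) * L_star(x * L(x)) for x in (1e6, 1e12, 1e24, 1e48)]
print(products)   # strictly increasing toward 1
```

The error here is of order $\log\log x/\log x$, a typical "slowly varying" speed of convergence.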

B.2 Extended Regular Variation (ERV); The class Π

By way of introduction for the class $\Pi$, which is a generalization of the class $RV$, we formulate the $RV$ property somewhat differently. A measurable function $f : \mathbb{R}_+ \to \mathbb{R}$ is $RV$ if there exists a positive function $a$ such that for all $x > 0$ the limit

$$\lim_{t\to\infty} \frac{f(tx)}{a(t)}$$

exists and is positive.

An obvious generalization is the following: Suppose $f : \mathbb{R}_+ \to \mathbb{R}$ is measurable and there exists a real function $a > 0$ such that for all $x > 0$ the limit

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} \qquad (B.2.1)$$

exists and the limit function is not constant (this is to avoid trivialities). First note that (B.2.1) is equivalent to the existence of

$$\dot f(x) := \lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} \qquad (B.2.2)$$

for all $x > 0$ with $\dot f$ not constant.

Next we identify the class of possible limit functions $\dot f$.
Theorem B.2.1 Suppose $f : \mathbb{R}_+ \to \mathbb{R}$ is measurable and $a$ is positive. If (B.2.2) holds with $\dot f$ not constant, then

$$\dot f(x) = c\,\frac{x^\gamma - 1}{\gamma} \,, \qquad x > 0 \,, \qquad (B.2.3)$$

for some $\gamma \in \mathbb{R}$, $c \ne 0$ (for $\gamma = 0$ read $\dot f(x) = c\log x$). Moreover, (B.2.1) holds with a function $a$ that is measurable and is $RV_\gamma$.
Proof. Since $\dot f$ is not constant, there exists $x_0 > 0$ such that $\dot f(x_0) \ne 0$. From (B.2.1) it follows that we can choose $a(t) = (f(x_0t) - f(t))/\dot f(x_0)$. Hence without loss of generality we may assume $a$ to be measurable. For $y > 0$ arbitrary we have

$$\frac{a(ty)}{a(t)} = \frac{\dfrac{f(tx_0y) - f(t)}{a(t)} - \dfrac{f(ty) - f(t)}{a(t)}}{\dfrac{f(tx_0y) - f(ty)}{a(ty)}} \;\to\; \frac{\dot f(x_0y) - \dot f(y)}{\dot f(x_0)} \,, \qquad t \to \infty \,.$$

Hence $A(y) := \lim_{t\to\infty} a(ty)/a(t)$ exists (and is nonnegative) for all $y > 0$. Since

$$\frac{a(txy)}{a(t)} = \frac{a(txy)}{a(tx)}\,\frac{a(tx)}{a(t)} \,,$$

we have

$$A(xy) = A(x)A(y) \qquad \text{for all } x, y > 0 \,. \qquad (B.2.4)$$

Since $a$ is measurable, the function $A$ is measurable. Moreover, the only measurable solutions of Cauchy's functional equation (B.2.4) are $A(y) = y^\gamma$ for some $\gamma \in \mathbb{R}$ (see the proof of Theorem B.1.3) and $A(y) = 0$ for $y > 0$.

However, if $A(y) = 0$ for $y > 0$, then since $A(y)\dot f(x) = \dot f(xy) - \dot f(y)$ for all $x, y > 0$, we have that $\dot f$ is constant, contrary to our assumption. Hence $a \in RV_\gamma$ for some $\gamma \in \mathbb{R}$. As a consequence we have

$$y^\gamma \dot f(x) = \dot f(xy) - \dot f(y) \qquad \text{for all } x, y > 0 \,. \qquad (B.2.5)$$

If $\gamma = 0$ we have Cauchy's functional equation again and $\dot f(x) = c\log x$ for some $c \ne 0$ and all $x > 0$.

Next suppose $\gamma \ne 0$. Interchanging $x$ and $y$ in (B.2.5) and subtracting the resulting relations, we get

$$\dot f(x)(1 - y^\gamma) = \dot f(y)(1 - x^\gamma) \qquad \text{for } x, y > 0 \,.$$

Hence $\dot f(x)/(1 - x^\gamma)$ is constant, i.e., $\dot f(x) = c\,(x^\gamma - 1)/\gamma$ for $x > 0$, with $c \ne 0$. $\square$

The following theorem states that for $\gamma \ne 0$ relation (B.2.1) defines classes of functions we have met before. Note that it is sufficient to consider (B.2.3) with $c > 0$, since replacing $f$ by $-f$ in (B.2.1) changes the sign of $c$.

Theorem B.2.2 Suppose the assumptions of Theorem B.2.1 are satisfied with $\gamma \ne 0$ and $c > 0$, i.e.,

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} = c\,\frac{x^\gamma - 1}{\gamma} \,.$$

1. If $\gamma > 0$ then $\lim_{t\to\infty} f(t)/a(t) = c/\gamma$, and hence $f \in RV_\gamma$.

2. If $\gamma < 0$ then $f(\infty) := \lim_{x\to\infty} f(x)$ exists, $\lim_{t\to\infty}(f(\infty) - f(t))/a(t) = -c/\gamma$, and hence $f(\infty) - f(x) \in RV_\gamma$.

Proof. The proofs of Theorem B.2.9 and Corollary B.2.11 below can easily be adapted to show that there is a nondecreasing function $g$ such that

$$f(t) - g(t) = o(a(t)) \,, \qquad t \to \infty \,. \qquad (B.2.6)$$

Since we may assume $a \in RV_\gamma$ (Theorem B.2.1), it follows that we also have

$$\lim_{t\to\infty} \frac{g(tx) - g(t)}{a(t)} = c\,\frac{x^\gamma - 1}{\gamma} \,. \qquad (B.2.7)$$

It will become apparent that it is sufficient to prove the theorem for $g$. Take $y > 1$ arbitrarily and define $t_1 = 1$ and $t_{n+1} = t_ny$ for $n = 1, 2, \ldots$. We have, by (B.2.7),

$$\lim_{n\to\infty} \frac{g(t_{n+2}) - g(t_{n+1})}{g(t_{n+1}) - g(t_n)} = y^\gamma \,. \qquad (B.2.8)$$

Suppose $\gamma > 0$. Then (B.2.8) immediately implies $g(t_n) \to \infty$, $n \to \infty$. Further, for any $\varepsilon > 0$ there exists $n_0$ such that for any $n > n_0$,

$$g(t_{n+1}) - g(t_{n_0+1}) = \sum_{k=n_0}^{n-1} \left\{ g(t_{k+2}) - g(t_{k+1}) \right\} \le \sum_{k=n_0}^{n-1} y^\gamma(1+\varepsilon)\left\{ g(t_{k+1}) - g(t_k) \right\} = y^\gamma(1+\varepsilon)\left\{ g(t_{n+1}) - g(t_{n_0}) \right\}$$

and a similar lower inequality. It follows that

$$\lim_{n\to\infty} \frac{g(t_{n+1})}{g(t_n)} = y^\gamma \qquad (B.2.9)$$

and hence

$$a(t_n) \sim \frac{\gamma}{c}\,\frac{g(t_{n+1}) - g(t_n)}{y^\gamma - 1} \sim \frac{\gamma}{c}\,g(t_n) \,. \qquad (B.2.10)$$

Further, for $x > 1$,

$$\frac{g(t_nx)}{g(t_n)} - 1 = \frac{g(t_nx) - g(t_n)}{a(t_n)}\,\frac{a(t_n)}{g(t_n)} \to c\,\frac{x^\gamma - 1}{\gamma}\cdot\frac{\gamma}{c} = x^\gamma - 1 \,, \qquad n \to \infty \,. \qquad (B.2.11)$$

For any $s > 0$ choose $n(s) \in \mathbb{N}$ such that $t_{n(s)} \le s < t_{n(s)+1}$. Then by (B.2.9) and (B.2.11),

$$\frac{g(sx)}{g(s)} \le \frac{g(t_{n(s)+1}x)}{g(t_{n(s)+1})}\,\frac{g(t_{n(s)+1})}{g(t_{n(s)})} \to x^\gamma y^\gamma \qquad (n \to \infty) \,.$$

Similarly,

$$\frac{g(sx)}{g(s)} \ge \frac{g(t_{n(s)}x)}{g(t_{n(s)})}\,\frac{g(t_{n(s)})}{g(t_{n(s)+1})} \to x^\gamma y^{-\gamma} \qquad (n \to \infty) \,.$$

Since $y > 1$ is arbitrary, we have proved $g \in RV_\gamma$. Combining with (B.2.7) gives $a(t)/g(t) \to \gamma/c$, $t \to \infty$. With (B.2.6) this implies $f(t) \sim c\,a(t)/\gamma$, $t \to \infty$, hence $f \in RV_\gamma$.

Suppose next $\gamma < 0$. Then (B.2.8) immediately implies $\lim_{n\to\infty} g(t_n) < \infty$. Write $h(x) := \lim_{t\to\infty} g(t) - g(x)$. We have

$$\frac{h(t_n)}{a(t_n)} = \sum_{k=n}^{\infty} \frac{g(t_{k+1}) - g(t_k)}{a(t_n)} \,.$$

Choose $\varepsilon > 0$ and $y > (1+\varepsilon)^{-1/\gamma}$. Note that since $a \in RV_\gamma$ the above expression is bounded above for $n \ge n_0$ by

$$c\,\frac{y^\gamma - 1}{\gamma}\,\frac{1+\varepsilon}{1 - y^\gamma(1+\varepsilon)} \,,$$

which tends to $-c/\gamma$ as $\varepsilon \downarrow 0$. A similar lower bound is easily obtained, and we conclude that

$$\lim_{n\to\infty} \frac{h(t_n)}{a(t_n)} = -\frac{c}{\gamma} \,.$$

Further, for $x > 1$,

$$\frac{h(t_nx)}{h(t_n)} = 1 - \frac{g(t_nx) - g(t_n)}{a(t_n)}\,\frac{a(t_n)}{h(t_n)} \to 1 - c\,\frac{x^\gamma - 1}{\gamma}\left(-\frac{\gamma}{c}\right) = x^\gamma \,.$$

The rest of the proof follows closely the case $\gamma > 0$. $\square$

Definition B.2.3 A measurable function $f : \mathbb{R}_+ \to \mathbb{R}$ is said to be of extended regular variation if there exists a function $a : \mathbb{R}_+ \to \mathbb{R}_+$ such that for some $\gamma \in \mathbb{R}$ and all $x > 0$,

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} \,.$$

Notation: $f \in ERV$ or $f \in ERV_\gamma$. The function $a$ is called an auxiliary function for $f$.

Definition B.2.4 A measurable function $f : \mathbb{R}_+ \to \mathbb{R}$ is said to belong to the class $\Pi$ if there exists a function $a : \mathbb{R}_+ \to \mathbb{R}_+$ such that for $x > 0$,

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} = \log x \,. \qquad (B.2.12)$$

Notation: $f \in \Pi$ or $f \in \Pi(a)$. The function $a$ is called an auxiliary function for $f$.


Example B.2.5 The functions $f(t) = \log t + o(1)$, $f(t) = (\log t)^\alpha(\log\log t)^\beta + o((\log t)^{\alpha-1})$, $\alpha > 0$, $\beta \in \mathbb{R}$, $f(t) = \exp((\log t)^\alpha) + o((\log t)^{\alpha-1}\exp((\log t)^\alpha))$, for $0 < \alpha < 1$, and $f(t) = t^{-1}\log\Gamma(t) + o(1)$, as $t \to \infty$, are in $\Pi$. The functions $f(t) = [\log t]$ and $f(t) = 2\log t + \sin\log t$ are in $RV_0$ but not in $\Pi$.
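The defining limit (B.2.12) is easy to probe numerically. As an added, hedged illustration (not from the original text), take $f(t) = (\log t)^2$, which lies in $\Pi$ with auxiliary function $a(t) = 2\log t$; the deviation from $\log x$ is exactly $(\log x)^2/(2\log t)$:

```python
import math

def pi_ratio(f, a, t, x):
    """(f(tx) - f(t)) / a(t), which should approach log x for f in Pi(a)."""
    return (f(t * x) - f(t)) / a(t)

f = lambda t: math.log(t) ** 2
a = lambda t: 2 * math.log(t)      # auxiliary function for this f

errors = [abs(pi_ratio(f, a, 1e12, x) - math.log(x)) for x in (0.5, 2.0, 10.0)]
print(errors)   # all of order 1 / log t, hence small
```

Note that the auxiliary function is only determined up to asymptotic equivalence (Remark B.2.6.1 below), so e.g. $2\log t + 1$ would work equally well here.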
Remark B.2.6 1. Note that a positive function $a_1$ is an auxiliary function for $f$ if and only if $a_1(t) \sim a(t)$, $t \to \infty$.

2. For the definition of $\Pi$ it is sufficient to require (B.2.12) for all $x$ in a set $A$ satisfying the following requirements: $\lambda(A) > 0$ and there exists a sequence $x_n \in A$, $n = 1, 2, \ldots$, such that $x_n \to 1$, $n \to \infty$.

3. We can weaken the definition of $\Pi$ as follows: there exist functions $a : \mathbb{R}_+ \to \mathbb{R}_+$ and $g : \mathbb{R}_+ \to \mathbb{R}$ such that for $x > 0$,

$$\lim_{t\to\infty} \frac{f(tx) - g(t)}{a(t)} = \log x \,.$$

Theorem B.2.7 If $f \in \Pi(a)$, then $\lim_{t\to\infty} a(tx)/a(t) = 1$ for all $x > 0$. Moreover, (B.2.12) holds with a function $a$ that is measurable and hence is $RV_0$.

Proof. This is a special case of Theorem B.2.1. $\square$
Theorem B.2.8 If $f \in \Pi(a)$ and $g : \mathbb{R}_+ \to \mathbb{R}$ is measurable and satisfies

$$\lim_{t\to\infty} \frac{f(t) - g(t)}{a(t)} = c \qquad (B.2.13)$$

for some $c \in \mathbb{R}$, then (B.2.12) is satisfied with $f$ replaced by $g$; hence $g \in \Pi(a)$.

This follows immediately from (B.2.12) and (B.2.13). Obviously, for a fixed auxiliary function $a$ the relation (B.2.13) between functions $f, g \in \Pi(a)$ is an equivalence relation. We shall see below (Proposition B.2.15(3)) that any equivalence class contains a smooth $\Pi$-function.
Theorem B.2.9 (Uniform convergence theorem) If $f \in \Pi$, then for $0 < a < b < \infty$ relation (B.2.12) holds uniformly for $x \in [a, b]$.

Proof. Define $F(t) := f(e^t)$ and $A(t) := a(e^t)$. It is sufficient to deduce a contradiction from the following assumption: there exist $\delta > 0$ and sequences $t_n \to \infty$, $x_n \to 0$, $n \to \infty$, such that for all $n$,

$$\left| \frac{F(x_n + t_n) - F(t_n)}{A(t_n)} \right| \ge \delta \,.$$

Consider the sets

$$J := [-\delta/5, +\delta/5] \,,$$
$$Y_{1,n} := \left\{ y : \left| \frac{F(t_n + y) - F(t_n)}{A(t_n)} \right| \ge \frac{\delta}{2}\,,\ y \in J \right\} ,$$
$$Y_{2,n} := \left\{ y : \left| \frac{F(t_n + x_n) - F(t_n + y)}{A(t_n)} \right| \ge \frac{\delta}{2}\,,\ y \in J \right\} .$$

The above sets are measurable for each $n$ and $Y_{1,n} \cup Y_{2,n} \supseteq J$; hence either $\lambda(Y_{1,n}) \ge \frac{1}{2}\lambda(J)$ or $\lambda(Y_{2,n}) \ge \frac{1}{2}\lambda(J)$ (or both), where $\lambda$ denotes Lebesgue measure. Define

$$Z_{1,n} := \left\{ z : \left| \frac{F(t_n + x_n) - F(t_n + x_n - z)}{A(t_n)} \right| \ge \frac{\delta}{2}\,,\ x_n - z \in J \right\} .$$

Then $\lambda(Z_{1,n}) = \lambda(Y_{2,n})$. Since $a \in RV_0$ (Theorem B.2.1) we have the inequality $A(t_n) \ge A(t_n + x_n - z)/2$ for $z \in Z_{1,n}$ and $n \ge n_0$ by Proposition B.1.9(5). As a consequence, $Z_{1,n} \subset Z_{2,n}$ for $n \ge n_0$, where $Z_{2,n}$ is defined by

$$Z_{2,n} := \left\{ z : \left| \frac{F(t_n + x_n) - F(t_n + x_n - z)}{A(t_n + x_n - z)} \right| \ge \frac{\delta}{4}\,,\ z \in J \right\}$$

for $n$ sufficiently large, since $x_n \to 0$. Hence we find that $\lambda(\limsup_{n\to\infty} Z_{2,n}) \ge \lambda(J)/2$ or $\lambda(\limsup_{n\to\infty} Y_{1,n}) \ge \lambda(J)/2$. This implies the existence of a real number $x_0$ contained in infinitely many $Y_{1,n}$ or infinitely many $Z_{2,n}$, which contradicts the assumption $\lim_{t\to\infty}(F(t + x_0) - F(t))/A(t) = x_0$. $\square$
Corollary B.2.10 If $f \in \Pi(a)$, for any $\varepsilon > 0$ there exist $t_0, c > 0$ such that for $t \ge t_0$, $x \ge 1$,

$$\frac{|f(tx) - f(t)|}{a(t)} \le c x^\varepsilon \,. \qquad (B.2.14)$$

Hence $f$ is locally bounded on $[t_0, \infty)$.

Proof. By the uniform convergence theorem (Theorem B.2.9) we have

$$\frac{|f(tu) - f(t)|}{a(t)} \le 2 \qquad (B.2.15)$$

for $t \ge t_1$ and $1 \le u \le e$. For $x > 1$ define $n \in \mathbb{N}$ by $e^n \le x < e^{n+1}$. Then

$$\frac{f(tx) - f(t)}{a(t)} = \sum_{k=0}^{n-1} \frac{f(e^{k+1}t) - f(e^kt)}{a(e^kt)}\,\frac{a(e^kt)}{a(t)} + \frac{f(tx) - f(e^nt)}{a(e^nt)}\,\frac{a(e^nt)}{a(t)} \,.$$

Using (B.2.15) and the inequality $a(tx)/a(t) \le c_1x^\varepsilon$ for some $c_1 > 0$, $t \ge t_2$ (Proposition B.1.9(5)), we find that for $t \ge t_0 := \max(t_1, t_2)$,

$$\frac{|f(tx) - f(t)|}{a(t)} \le 2c_1\sum_{k=0}^{n} e^{k\varepsilon} \le c\,e^{n\varepsilon} \le c x^\varepsilon \,.$$

For the last statement, take $t = t_0$ in (B.2.14). $\square$

Corollary B.2.11 If $f \in \Pi(a)$, there exists a nondecreasing function $g$ such that $f(t) - g(t) = o(a(t))$, $t \to \infty$. In particular, $g \in \Pi(a)$ by Theorem B.2.8.

Proof. By Corollary B.2.10 the function $f$ is locally integrable on $[t_0, \infty)$. Note that by Theorem B.2.9,

$$\lim_{t\to\infty} \int_1^e \frac{f(tx) - f(t)}{a(t)}\,\frac{dx}{x} = \int_1^e \log x\,\frac{dx}{x} = \frac{1}{2} \,. \qquad (B.2.16)$$

Now choose $t_1 \ge t_0$ such that $f(ex) - f(x) > 0$ for $x \ge t_1$. Then, for $t \ge t_1$,

$$\int_1^e f(tx)\,\frac{dx}{x} = \int_t^{te} f(x)\,\frac{dx}{x} = \int_{t_1}^{te} f(x)\,\frac{dx}{x} - \int_{t_1}^{t} f(x)\,\frac{dx}{x} = \int_{t_1}^{t_1e} f(x)\,\frac{dx}{x} + \int_{t_1}^{t} \frac{f(ex) - f(x)}{x}\,dx =: g_0(t) \,.$$

Note that $g_0$ is nondecreasing and by (B.2.16),

$$\lim_{t\to\infty} \frac{g_0(t) - f(t)}{a(t)} = \frac{1}{2} \,.$$

Now $g_0 \in \Pi(a)$ by Theorem B.2.8. Define $g(t) := g_0(te^{-1/2})$. Then $g \in \Pi(a)$ and $g(t) - f(t) = o(a(t))$, $t \to \infty$. $\square$

The following theorem gives a characterization of the class $\Pi$.

Theorem B.2.12 Suppose $f : \mathbb{R}_+ \to \mathbb{R}$ is measurable. For $t_0 > 0$ let $\varphi : (t_0, \infty) \to \mathbb{R}$ be defined by

$$\varphi(t) := f(t) - t^{-1}\int_{t_0}^t f(s)\, ds \,. \qquad (B.2.17)$$

The following statements are equivalent:

1. $f \in \Pi$.   (B.2.18)

2. The function $\varphi : (t_0, \infty) \to \mathbb{R}$ is well-defined for some $t_0 > 0$ and eventually positive, and

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{\varphi(t)} = \log x \,, \qquad x > 0 \,. \qquad (B.2.19)$$

3. The function $\varphi : (t_0, \infty) \to \mathbb{R}$ is well-defined for $t > t_0$ and slowly varying at infinity.

4. There exists $p \in RV_0$ such that

$$f(t) = p(t) + \int_{t_0}^t p(s)\,\frac{ds}{s} \,. \qquad (B.2.20)$$

5. There exist $c_1, c_2 \in \mathbb{R}$, $a_1, a_2 \in RV_0$ with $a_1(t) \sim a_2(t)$, $t \to \infty$, such that

$$f(t) = c_1 + c_2a_1(t) + \int_1^t a_2(s)\,\frac{ds}{s} \,. \qquad (B.2.21)$$

If $f$ satisfies (B.2.20) (or (B.2.21)), then $f \in \Pi(p)$ (or $f \in \Pi(a_2)$, respectively). Hence $p(t) \sim a_2(t) \sim \varphi(t)$, $t \to \infty$.
Proof. We start by proving that (1) implies (2). Suppose $f \in \Pi(a)$. Take $t_0$ as in Corollary B.2.10. Then $\varphi(t)$ is well-defined for $t > t_0$. Note that for $t > t_0$,

$$\frac{\varphi(t)}{a(t)} = \frac{t_0\,f(t)}{t\,a(t)} + \int_{t_0/t}^1 \frac{f(t) - f(tu)}{a(t)}\, du \,. \qquad (B.2.22)$$

From Corollary B.2.10 it follows that $f(t) = o(t^\varepsilon)$, $t \to \infty$, for any $\varepsilon > 0$ (take $t = t_0$ in (B.2.14)). Since $t\,a(t) \in RV_1$ (Theorem B.2.7), we have $f(t) = o(t\,a(t))$, $t \to \infty$.

We can apply Lebesgue's theorem on dominated convergence to the second term on the right-hand side in (B.2.22), since by Corollary B.2.10 for $tu \ge t_0$, $0 < u \le 1$,

$$\frac{|f(t) - f(tu)|}{a(tu)} \le c u^{-\varepsilon} \,,$$

and by Proposition B.1.9(5) for $tu \ge t_1$, $0 < u \le 1$,

$$0 < \frac{a(tu)}{a(t)} \le c_1u^{-\varepsilon} \,.$$

Hence $\lim_{t\to\infty} \varphi(t)/a(t) = -\int_0^1 \log u\, du = 1$, which proves that (1) implies (2).

For proving that (2) implies (3) see Theorem B.2.7.

Next we prove that (3) implies (4). By Fubini's theorem we have

$$\int_{t_0}^t \varphi(s)\,\frac{ds}{s} = \int_{t_0}^t \frac{f(s)}{s}\, ds - \int_{t_0}^t \frac{1}{s^2}\int_{t_0}^s f(u)\, du\, ds = \frac{1}{t}\int_{t_0}^t f(u)\, du = f(t) - \varphi(t) \,.$$

Hence (B.2.20) holds with $p = \varphi$.

Since (4) trivially implies (5), we finally prove that (5) implies (1). By the uniform convergence theorem (Theorem B.1.4) for functions in $RV$,

$$\frac{f(tx) - f(t)}{a_1(t)} = c_2\left(\frac{a_1(tx)}{a_1(t)} - 1\right) + \int_1^x \frac{a_2(tu)}{a_1(t)}\,\frac{du}{u} \;\to\; \log x$$

as $t \to \infty$, for all $x > 0$. $\square$
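The key quantitative fact in the proof — that $\varphi(t) \sim a(t)$ — can be illustrated numerically. As an added sketch (not from the original text): for $f(t) = (\log t)^2$ with auxiliary $a(t) = 2\log t$, one has exactly $\varphi(t) = 2\log t - 2 + O(1/t)$ when $t_0 = 1$, so $\varphi(t)/a(t) \to 1$ at the rate $1/\log t$. A crude midpoint quadrature recovers this:

```python
import math

def phi(f, t, t0=1.0, n=100_000):
    """phi(t) = f(t) - (1/t) * integral of f over (t0, t), via a midpoint rule."""
    h = (t - t0) / n
    integral = sum(f(t0 + (i + 0.5) * h) for i in range(n)) * h
    return f(t) - integral / t

f = lambda s: math.log(s) ** 2
a = lambda s: 2 * math.log(s)

t = 1e6
ratio = phi(f, t) / a(t)    # approaches 1; exactly 1 - 1/log t up to O(1/t) terms
print(ratio)
```

The quadrature error here is negligible relative to the $1/\log t$ deviation, so the observed ratio tracks the analytic value closely.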

Corollary B.2.13 If $f \in \Pi$, then $\lim_{t\to\infty} f(t) =: f(\infty) \le \infty$ exists. If the limit is infinite, then $f \in RV_0$. If the limit is finite, then $f(\infty) - f(t) \in RV_0$. Moreover,

$$a(t) = o(f(t)) \,, \qquad t \to \infty \,, \qquad (B.2.23)$$

and when $f(\infty) < \infty$,

$$a(t) = o(f(\infty) - f(t)) \,, \qquad t \to \infty \,.$$

Proof. Consider the representation (B.2.20). Theorem B.1.5 implies that $p(t) = o\left(\int_{t_0}^t p(s)/s\, ds\right)$, $t \to \infty$. Hence, if $\int^\infty p(s)/s\, ds < \infty$, then $p(t) \to 0$, $t \to \infty$, and $\lim_{t\to\infty} f(t) = \int_{t_0}^\infty p(s)/s\, ds$. Then $f(\infty) - f(t) \sim \int_t^\infty p(s)/s\, ds \in RV_0$ (Proposition B.1.9(4)). If $\int^\infty p(s)/s\, ds = \infty$, then $f(t) \sim \int_{t_0}^t p(s)/s\, ds \in RV_0$ (Proposition B.1.9(4)).

When comparing (B.2.12) and (B.2.19) one sees that $\varphi(t) \sim a(t)$, as $t \to \infty$. Now Theorem B.1.5 implies $\varphi(t) = o(f(t))$. Relation (B.2.23) follows. The second relation follows in a similar way. $\square$

Remark B.2.14 1. Note that from the proof of Corollary B.2.13 it follows, using (B.2.17), that $\varphi(t) \sim a(t)$, $t \to \infty$. As a consequence of Corollary B.2.13, the limit relation (B.2.13) above is strictly stronger than $f(t) \sim g(t)$, $t \to \infty$.

2. Theorem B.2.12 is also true (and the proof not much different) with $\varphi$ replaced by the function

$$\int_t^{te} f(u)\,\frac{du}{u} - f(t) \,.$$

3. The result of Corollary B.2.11 is obtained again from Theorem B.2.12 by taking $g(t) = \int_{t_0}^t p(s)/s\, ds$ with $p$ as in (B.2.20).

4. Suppose $f$ is locally integrable on $\mathbb{R}_+$ and $a \in RV_0$. Then

$$\frac{f(tx) - f(t)}{a(t)} \to 0 \,, \qquad t \to \infty \,, \qquad (B.2.24)$$

for $x > 0$, and

$$\frac{f(t) - t^{-1}\int_0^t f(s)\, ds}{a(t)} \to 0 \,, \qquad t \to \infty \,, \qquad (B.2.25)$$

are equivalent. The proof follows closely the proof of Theorem B.2.12.

5. From Theorem B.2.12(5) it is clear that for any $a \in RV_0$ there exists a function $f$ such that $f \in \Pi(a)$.

6. Let $t_1 > 0$ be such that $f$ is locally integrable on $(t_1, \infty)$. Then Theorem B.2.12 holds for any $t_0 > t_1$.
We mention some properties of functions that belong to the class $\Pi$.

Proposition B.2.15 1. If $f, g \in \Pi$ then $f + g \in \Pi$. If $f \in \Pi$ and $h \in RV_\alpha$, $\alpha > 0$, then $f \circ h \in \Pi$, where $f \circ h$ denotes the composition of the two functions. If $f \in \Pi$, $\lim_{t\to\infty} f(t) = \infty$, and $h$ is differentiable with $h' \in RV_\alpha$, $\alpha > -1$, then $h \circ f \in \Pi$.

2. If $f \in \Pi(a)$ is integrable on finite intervals of $\mathbb{R}_+$ and the function $f_1$ is defined by

$$f_1(t) := t^{-1}\int_0^t f(s)\, ds \,, \qquad t > 0 \,, \qquad (B.2.26)$$

then $f_1 \in \Pi(a)$. Conversely, if $f_1 \in \Pi(a)$ and $f$ is nondecreasing, then $f \in \Pi(a)$.

3. If $f \in \Pi(a)$, there exists a twice differentiable function $\bar f$ with $-\bar f'' \in RV_{-2}$ such that

$$\lim_{t\to\infty} \frac{\bar f(t) - f(t)}{a(t)} = 0 \,. \qquad (B.2.27)$$

In particular $\bar f$ is eventually concave. As a consequence of this, if $f \in \Pi$ is bounded on finite intervals of $\mathbb{R}_+$ and $\lim_{t\to\infty} f(t) = \infty$, then $\sup_{0 < x \le t} f(x) - f(t) = o(a(t))$, $t \to \infty$.

4. Suppose $f \in \Pi(a)$. For arbitrary $\delta_1, \delta_2 > 0$ there exists $t_0 = t_0(\delta_1, \delta_2)$ such that for $x \ge 1$, $t \ge t_0$,

$$(1 - \delta_2)\,\frac{1 - x^{-\delta_1}}{\delta_1} - \delta_2 < \frac{f(tx) - f(t)}{a(t)} < (1 + \delta_2)\,\frac{x^{\delta_1} - 1}{\delta_1} + \delta_2 \,. \qquad (B.2.28)$$

Note that conversely, if $f$ satisfies the above property, then $f \in \Pi(a)$.

5. Suppose

$$f(t) = f(t_0) + \int_{t_0}^t g(s)\, ds \,, \qquad t \ge t_0 \,, \qquad (B.2.29)$$

with $g \in RV_{-1}$. Then $f \in \Pi$. Conversely, if $f \in \Pi$ satisfies (B.2.29) with $g$ nonincreasing, then $g \in RV_{-1}$. Moreover, in this case $t\,g(t)$ is an auxiliary function for $f$. This property supplements a corresponding statement for functions in $RV_\alpha$, $\alpha \ne 0$ (cf. Proposition B.1.9(11)).
Proof. (1) The statement $f + g \in \Pi$ is a consequence of representation (B.2.20), since the sum of two slowly varying functions is slowly varying (see Proposition B.1.9(2)). If $f \in \Pi(a)$ and $h \in RV_\alpha$, then for $x > 0$ we have

$$\lim_{t\to\infty} \frac{f(h(tx)) - f(h(t))}{\alpha\,a(h(t))} = \log x$$

by the uniform convergence theorem (Theorem B.2.9).

For the last statement we expand the function $h$:

$$\frac{h(f(tx)) - h(f(t))}{a(t)\,h'(f(t))} = \frac{f(tx) - f(t)}{a(t)}\,\frac{h'\big(f(t) + \theta\{f(tx) - f(t)\}\big)}{h'(f(t))}$$

for some $0 \le \theta = \theta(x, t) \le 1$. Now the second factor on the right-hand side tends to 1 as $t \to \infty$, since $h' \in RV_\alpha$ and $f \in RV_0$ (see Corollary B.2.13), by the uniform convergence theorem (Theorem B.1.4).
(2) Define $\varphi(t) := f(t) - t^{-1}\int_0^t f(s)\, ds$ for $t > 0$. If $f \in \Pi(a)$, we have by Theorem B.2.12

$$\lim_{t\to\infty} \frac{\varphi(t)}{a(t)} = \lim_{t\to\infty} \frac{f(t) - f_1(t)}{a(t)} = 1 \,.$$

As a consequence, $f_1 \in \Pi(a)$ (see Theorem B.2.8).

Conversely, suppose $f_1 \in \Pi(a)$. Then for $x > 0$ we have by definition $\int_0^t \varphi(s)/s\, ds = f_1(t)$ and hence

$$\frac{f_1(tx) - f_1(t)}{a(t)} = \int_1^x \frac{\varphi(ts)}{a(t)}\,\frac{ds}{s} \,.$$

Now fix $x > 1$. Since $f_1 \in \Pi(a)$, the above expression tends to $\log x$ as $t \to \infty$. Since $f$ is nondecreasing, $t\varphi(t)$ is nondecreasing. This implies

$$(1 - x^{-1})\,\frac{\varphi(t)}{a(t)} \le \int_1^x \frac{\varphi(ts)}{a(t)}\,\frac{ds}{s}$$

for $t > 0$; hence

$$\limsup_{t\to\infty} \frac{\varphi(t)}{a(t)} \le \frac{\log x}{1 - x^{-1}} \qquad \text{for } x > 1 \,.$$

Similarly we find that

$$\liminf_{t\to\infty} \frac{\varphi(t)}{a(t)} \ge \frac{-\log x}{x^{-1} - 1} \qquad \text{for } 0 < x < 1 \,.$$

Finally, let $x \to 1$ to obtain $\varphi(t) \sim a(t)$, $t \to \infty$, which implies $\varphi \in RV_0$. The proof is finished by application of Theorem B.2.12.
(3) We may assume without loss of generality that $f$ is integrable on finite intervals of $\mathbb{R}_+$. Define the functions $f_i$ for $i = 1, 2, 3$ recursively by

$$f_i(t) := t^{-1}\int_0^t f_{i-1}(s)\, ds$$

for $t > 0$, where $f_0 = f$. Repeated application of Theorems B.2.8 and B.2.12 gives

$$f(t) - f_3(t) = \sum_{i=0}^{2}\left\{ f_i(t) - f_{i+1}(t) \right\} \sim 3a(t) \,, \qquad t \to \infty \,.$$

Hence $f_3 \in \Pi(a)$ by Theorem B.2.8. Define $\bar f$ by $\bar f(t) := f_3(e^3t)$. Then $\bar f(t) - f(t) = o(a(t))$, $t \to \infty$. Furthermore, $\bar f$ is twice differentiable and

$$t^2 f_3''(t) = \big(f_1(t) - f_2(t)\big) - 2\big(f_2(t) - f_3(t)\big) \sim -a(t) \,, \qquad t \to \infty \,,$$

by Theorem B.2.12.
(4) From Remark B.2.14(2) it follows that there exist functions $a_0, b$ such that $a_0(t) \sim a(t)$, $b(t) = o(a(t))$, $t \to \infty$, and

$$f(t) = \int_{t'}^t a_0(s)\,\frac{ds}{s} + b(t) \,, \qquad \text{for } t \ge t' \,. \qquad (B.2.30)$$

Then for all $\varepsilon, \delta_1, \delta_3, \delta_4 > 0$ there exists $t_0 = t_0(\varepsilon, \delta_1, \delta_3, \delta_4)$ such that for all $t \ge t_0$, $x \ge 1$, we have

$$f(tx) - f(t) = \int_1^x \frac{a_0(ts)}{s}\, ds + \frac{b(tx)}{a(tx)}\,a(tx) - b(t)
\le \left( (1+\delta_3)\int_1^x s^{\delta_1-1}\, ds + \varepsilon(1+\delta_4)x^{\delta_1} + \varepsilon \right) a(t)
= \left( \big(1 + \delta_3 + \varepsilon(1+\delta_4)\delta_1\big)\,\frac{x^{\delta_1} - 1}{\delta_1} + \varepsilon(2+\delta_4) \right) a(t) \,,$$

using $a_0(t) \sim a(t)$, $b(t) = o(a(t))$, and Proposition B.1.9(5). Hence $f$ satisfies the stated upper inequality if we take $\varepsilon$, $\delta_3$, and $\delta_4$ such that $\max\big(\delta_3 + \varepsilon(1+\delta_4)\delta_1,\ \varepsilon(2+\delta_4)\big) \le \delta_2$. The proof of the lower inequality is similar.
(5) We give the proof of the first statement. The proof of the other statement is similar. Let

$$\frac{f(tx) - f(t)}{t\,g(t)} = \int_1^x \frac{g(ts)}{g(t)}\, ds \,. \qquad (B.2.31)$$

If $g \in RV_{-1}$, then the right-hand side in (B.2.31) tends to $\log x$, as $t \to \infty$, by the uniform convergence theorem for regularly varying functions (Theorem B.1.4). Next suppose $f \in \Pi(a)$. We have

$$\frac{f(tx) - f(t)}{a(t)} = \frac{t\,g(t)}{a(t)}\int_1^x \frac{g(ts)}{g(t)}\, ds \,,$$

and the integral is at most $x - 1$ when $x > 1$, since $g$ is nonincreasing. Hence for $x > 1$, since $f \in \Pi$, we get

$$\liminf_{t\to\infty} \frac{t\,g(t)}{a(t)} \ge \frac{\log x}{x - 1} \,.$$

Similarly we find that $\limsup_{t\to\infty} t\,g(t)/a(t) \le (\log x)/(x - 1)$ for $0 < x < 1$. Let $x \to 1$ to obtain $t\,g(t) \sim a(t)$, $t \to \infty$, and the last function is slowly varying by Theorem B.2.7. $\square$
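Property (5) is easy to verify on a concrete integral. As an added sketch (not in the original text), take $g(s) = \log s/s \in RV_{-1}$, so that $f(t) = \int_1^t g(s)\, ds = (\log t)^2/2$ and the auxiliary function is $t\,g(t) = \log t$:

```python
import math

# f(t) = integral of g over (1, t), with g(s) = log(s)/s in RV_{-1}
f = lambda t: math.log(t) ** 2 / 2
aux = lambda t: t * (math.log(t) / t)    # t * g(t) = log t, the auxiliary function

def pi_limit(t, x):
    return (f(t * x) - f(t)) / aux(t)    # should approach log x

t = 1e12
devs = [abs(pi_limit(t, x) - math.log(x)) for x in (0.25, 0.5, 2.0, 4.0)]
print(devs)   # each equals (log x)^2 / (2 log t), hence small
```

This is the same example as for (B.2.12) above, now produced through the integral representation (B.2.29).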

Remark B.2.16 A special case of the current section is obtained when the auxiliary function $a$ satisfies $a(t) \to \rho > 0$, $t \to \infty$. Note that the specialization of Theorem B.2.12 then gives the following statement: Suppose $g : \mathbb{R}_+ \to \mathbb{R}_+$ is measurable. Then $g \in RV_\rho$ if and only if $\log g$ is locally integrable on $(t_0, \infty)$ for some $t_0 > 0$ and

$$\lim_{t\to\infty} \int_0^1 \log\frac{g(t)}{g(ts)}\, ds = \int_0^1 \log s^{-\rho}\, ds = \rho \,.$$

This can be seen by applying Theorem B.2.12 for $f(t) = \log g(t)$.
A uniform inequality in the spirit of Proposition B.1.10 is as follows.

Proposition B.2.17 If $f \in \Pi(a)$, there exists a positive function $a_0$ with $a_0(t) \sim a(t)$, $t \to \infty$, such that for all $\varepsilon, \delta > 0$ there is a $t_0 = t_0(\varepsilon, \delta)$ such that for $t, tx \ge t_0$,

$$\left| \frac{f(tx) - f(t)}{a_0(t)} - \log x \right| \le \varepsilon\max\left(x^\delta, x^{-\delta}\right) .$$

Proof. We choose $a_0(t) := f(t) - t^{-1}\int_0^t f(s)\, ds$. Then by Theorem B.2.12, $a(t) \sim a_0(t)$, $t \to \infty$, and

$$f(t) = a_0(t) + \int_0^t a_0(s)\,\frac{ds}{s} \,.$$

We have

$$\frac{f(tx) - f(t)}{a_0(t)} - \log x = \left(\frac{a_0(tx)}{a_0(t)} - 1\right) + \int_1^x \left(\frac{a_0(tu)}{a_0(t)} - 1\right)\frac{du}{u} \,.$$

Hence, using the inequalities of Proposition B.1.10,

$$\left| \frac{f(tx) - f(t)}{a_0(t)} - \log x \right| \le \varepsilon\max(x^\delta, x^{-\delta}) + \varepsilon\left| \int_{\min(x,1)}^{\max(x,1)} \max\left(u^{\delta-1}, u^{-\delta-1}\right) du \right| \le \varepsilon\left(1 + \frac{1}{\delta}\right)\max(x^\delta, x^{-\delta}) \,. \qquad \square$$

Combining Theorem B.2.2, Proposition B.1.10, and Proposition B.2.17, we obtain the following theorem:

Theorem B.2.18 (Drees (1998)) Suppose for a measurable function $f$ and a positive function $a$ we have

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma}$$

for all $x > 0$, where $\gamma$ is a real parameter, i.e., $f \in ERV_\gamma$. Then for all $\varepsilon, \delta > 0$ there is a $t_0 = t_0(\varepsilon, \delta)$ such that for $t, tx \ge t_0$,

$$\left| \frac{f(tx) - f(t)}{a_0(t)} - \frac{x^\gamma - 1}{\gamma} \right| \le \varepsilon x^\gamma \max\left(x^\delta, x^{-\delta}\right) ,$$

where

$$a_0(t) := \begin{cases} \gamma f(t)\,, & \gamma > 0\,, \\ -\gamma\,\big(f(\infty) - f(t)\big)\,, & \gamma < 0\,, \\ f(t) - t^{-1}\int_0^t f(s)\, ds\,, & \gamma = 0\,. \end{cases}$$
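The bound of Theorem B.2.18 can be sampled numerically. A hedged illustration (not from the original text): take $f(t) = t + \sqrt t \in ERV_1$, so the canonical choice is $a_0(t) = \gamma f(t) = f(t)$, and the deviation from $(x^\gamma - 1)/\gamma = x - 1$ decays like $t^{-1/2}$:

```python
import math

def drees_deviation(f, gamma, t, x):
    a0 = gamma * f(t)                  # the gamma > 0 choice of a0 in Theorem B.2.18
    return abs((f(t * x) - f(t)) / a0 - (x ** gamma - 1) / gamma)

f = lambda t: t + math.sqrt(t)         # f in ERV_1
eps, delta, t = 0.01, 0.5, 1e6

ok = all(
    drees_deviation(f, 1.0, t, x) <= eps * x * max(x ** delta, x ** -delta)
    for x in (1.0, 2.0, 10.0, 100.0)
)
print(ok)   # True: the uniform bound holds at these sample points
```

Because the deviation is $O(t^{-1/2})$ here while the right-hand side is fixed, the bound is comfortably satisfied once $t$ exceeds a moderate $t_0(\varepsilon, \delta)$.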

The following result is a generalization of part of Theorem B.2.12 (the kernel function $k$ below is constant in Theorem B.2.12).

Theorem B.2.19 Suppose $f \in \Pi(a)$ is integrable on finite intervals of $\mathbb{R}_+$.

1. If the measurable function $k : \mathbb{R}_+ \to \mathbb{R}$ is bounded on $(0, 1)$, then

$$\lim_{t\to\infty} \int_0^1 k(s)\,\frac{f(ts) - f(t)}{a(t)}\, ds = \int_0^1 k(s)\log s\, ds \,.$$

2. If $t^\varepsilon k(t)$ is integrable on $(1, \infty)$ for some $\varepsilon > 0$, then $\int_1^\infty k(s)f(ts)\, ds < \infty$ for $t > 0$, and

$$\lim_{t\to\infty} \int_1^\infty k(s)\,\frac{f(ts) - f(t)}{a(t)}\, ds = \int_1^\infty k(s)\log s\, ds \,. \qquad (B.2.32)$$

Proof. (1) Note that for $0 < \varepsilon < 1$ the function $t^{-\varepsilon}k(t)$ is integrable on $(0, 1)$. We proceed as in the first part of the proof of Theorem B.2.12. Applying Corollary B.2.10 we have

$$\int_{t_0/t}^1 k(s)\,\frac{f(ts) - f(t)}{a(t)}\, ds \to \int_0^1 k(s)\log s\, ds$$

by Lebesgue's theorem on dominated convergence. Since $k$ is bounded, $t\,a(t) \in RV_1$, and $f(t) = o(t^{1/2})$, $t \to \infty$, we have

$$\int_0^{t_0/t} k(s)\,\frac{f(ts) - f(t)}{a(t)}\, ds = \frac{\int_0^{t_0} k(s/t)f(s)\, ds - f(t)\int_0^{t_0} k(s/t)\, ds}{t\,a(t)} \to 0$$

as $t \to \infty$.

(2) The second statement is proved in a similar way. $\square$

Remark B.2.20 Theorem B.2.19(1) also holds under the alternative conditions: $f$ bounded on $(0, 1)$ and $\int_0^1 s^{-\varepsilon}k(s)\, ds < \infty$ for some $\varepsilon > 0$.
Theorem B.2.21 Suppose that $f$ is nondecreasing and $\phi$ is its left-continuous inverse function. Let $\gamma$ be a real parameter. The following statements are equivalent:

1. There exists a positive function $a$ such that for $x > 0$,

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} \,.$$

2. There exists a positive function $g$ such that for all $x$ for which $1 + \gamma x > 0$,

$$\lim_{t\uparrow x^*} \frac{\phi\big(t + x\,g(t)\big)}{\phi(t)} = (1 + \gamma x)^{1/\gamma} \,,$$

where $x^* := \lim_{t\to\infty} f(t)$.

The functions $a$ and $g$ are connected as follows: $g(t) = a(\phi(t))$.

Proof. Assume (1). For $\varepsilon > 0$ and all $t$,

$$f(\phi(t) - \varepsilon) \le t \le f(\phi(t) + \varepsilon) \,;$$

since $\phi(t) \to \infty$ as $t \uparrow x^*$, this implies $f\big((1-\varepsilon)\phi(t)\big) \le t \le f\big((1+\varepsilon)\phi(t)\big)$ for $t$ close to $x^*$. It follows that

$$\frac{(1-\varepsilon)^\gamma - 1}{\gamma} \leftarrow \frac{f((1-\varepsilon)\phi(t)) - f(\phi(t))}{a(\phi(t))} \le \frac{t - f(\phi(t))}{a(\phi(t))} \le \frac{f((1+\varepsilon)\phi(t)) - f(\phi(t))}{a(\phi(t))} \to \frac{(1+\varepsilon)^\gamma - 1}{\gamma}$$

as $t \uparrow x^*$, and consequently

$$\lim_{t\uparrow x^*} \frac{t - f(\phi(t))}{a(\phi(t))} = 0 \,.$$

Hence, by (1), for all $x > 0$,

$$\lim_{t\uparrow x^*} \frac{f(x\phi(t)) - t}{a(\phi(t))} = \frac{x^\gamma - 1}{\gamma} \,,$$

and by Lemma 1.1.1,

$$\lim_{t\uparrow x^*} \frac{\phi\big(t + x\,a(\phi(t))\big)}{\phi(t)} = (1 + \gamma x)^{1/\gamma} \,,$$

i.e., (2) holds with $g(t) = a(\phi(t))$. The converse is similar. $\square$
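A numerical sketch of the equivalence (an added illustration, not from the original text): take $f(t) = t^2$, so $\gamma = 2$, $a(t) = 2t^2$, and $x^* = \infty$; then $\phi(t) = \sqrt t$, $g(t) = a(\phi(t)) = 2t$, and statement 2 holds as an exact identity:

```python
import math

gamma = 2.0
f = lambda t: t ** 2               # nondecreasing, in ERV_gamma with a(t) = 2 t^2
phi = lambda t: math.sqrt(t)       # (left-continuous) inverse of f
g = lambda t: 2.0 * t              # g(t) = a(phi(t)) = 2 * phi(t)^2 = 2t

def statement2(t, x):
    # phi(t + x g(t)) / phi(t) = sqrt(1 + 2x) = (1 + gamma*x)^(1/gamma) here
    return phi(t + x * g(t)) / phi(t)

t = 1e9
vals = [(x, statement2(t, x), (1 + gamma * x) ** (1 / gamma)) for x in (-0.4, 0.5, 3.0)]
print(vals)   # second and third entries agree for each sampled x with 1 + gamma*x > 0
```

For a less special $f$ the identity would hold only in the limit $t \uparrow x^*$, which is exactly what the theorem asserts.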

B.3 Second-Order Extended Regular Variation (2ERV)

Recall that a measurable function $f$ is of extended regular variation ($f \in ERV$) if for some $\gamma \in \mathbb{R}$ and positive function $a$,

$$\lim_{t\to\infty} \frac{f(tx) - f(t)}{a(t)} = \frac{x^\gamma - 1}{\gamma} \qquad (B.3.1)$$

for all $x > 0$. Since one often needs to control the speed of convergence in (B.3.1), it is useful to build a theory in which the convergence rate in (B.3.1) is the same for all $x > 0$; that is, there exists a positive function $A$ with $\lim_{t\to\infty} A(t) = 0$ such that

$$H(x) := \lim_{t\to\infty} \frac{\dfrac{f(tx) - f(t)}{a(t)} - \dfrac{x^\gamma - 1}{\gamma}}{A(t)} \qquad (B.3.2)$$

exists for all $x > 0$. First we want to exclude a more or less trivial case. If $H(x) = c(x^\gamma - 1)/\gamma$ for some $c \in \mathbb{R}$, the limit relation can be reformulated as

$$\lim_{t\to\infty} \frac{\dfrac{f(tx) - f(t)}{a(t)(1 + cA(t))} - \dfrac{x^\gamma - 1}{\gamma}}{A(t)} = 0 \,,$$

which seems to be insufficiently informative. Hence we exclude this case.

Our first theorem identifies the form of the limit function in (B.3.2).

Theorem B.3.1 (de Haan and Stadtmüller (1996)) Suppose that for some measurable function $f$ and positive functions $a$ and $A$ the limit (B.3.2) exists for all $x > 0$, where the limit function is not a multiple of $(x^\gamma - 1)/\gamma$. Then there exist real constants $c_1, c_2$ and a parameter $\rho \le 0$ such that for all $x > 0$,

$$\lim_{t\to\infty} \frac{\dfrac{f(tx) - f(t)}{a(t)} - \dfrac{x^\gamma - 1}{\gamma}}{A(t)} = c_1\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\, du\, ds + c_2\int_1^x s^{\gamma+\rho-1}\, ds \,. \qquad (B.3.3)$$

Moreover, for $x > 0$,

$$\lim_{t\to\infty} \frac{\dfrac{a(tx)}{a(t)} - x^\gamma}{A(t)} = c_1x^\gamma\,\frac{x^\rho - 1}{\rho} \qquad (B.3.4)$$

and

$$\lim_{t\to\infty} \frac{A(tx)}{A(t)} = x^\rho \,. \qquad (B.3.5)$$

In view of the restrictions discussed before the theorem, the constant $c_1$ cannot be zero if $\rho = 0$.
Remark B.3.2 Relation (B.3.3) holds with $a$ replaced by $a_1$ and $A$ replaced by $A_1$ if and only if $A_1(t) \sim A(t)$ and $a_1(t)/a(t) - 1 = o(A(t))$, $t \to \infty$.

Remark B.3.3 Alternatively, one can write (B.3.2) as

$$\lim_{t\to\infty} \frac{f(tx) - f(t) - a(t)(x^\gamma - 1)/\gamma}{a_1(t)} = H(x)$$

with $a_1 := aA$. This will be used in the proof.

Definition B.3.4 A function $f$ satisfying the conditions of Theorem B.3.1 is said to be of second-order extended regular variation. Notation: $f \in 2ERV$ or $f \in 2ERV_{\gamma,\rho}$. The parameter $\rho$ is called the second-order parameter.
Proof (of Theorem B.3.1). Consider for $x, y > 0$ the identity

$$\frac{f(txy) - f(t) - a(t)\dfrac{(xy)^\gamma - 1}{\gamma}}{a_1(t)} = \frac{f(tx) - f(t) - a(t)\dfrac{x^\gamma - 1}{\gamma}}{a_1(t)} + x^\gamma\,\frac{f(txy) - f(tx) - a(tx)\dfrac{y^\gamma - 1}{\gamma}}{a_1(tx)}\,\frac{(tx)^{-\gamma}a_1(tx)}{t^{-\gamma}a_1(t)} + x^\gamma\,\frac{y^\gamma - 1}{\gamma}\,\frac{(tx)^{-\gamma}a(tx) - t^{-\gamma}a(t)}{t^{-\gamma}a_1(t)} \,.$$

Letting $t \to \infty$ on both sides, we obtain

$$x^{-\gamma}\big(H(xy) - H(x)\big) = \lim_{t\to\infty}\left\{ H(y)(1 + o(1))\,\frac{(tx)^{-\gamma}a_1(tx)}{t^{-\gamma}a_1(t)} + \frac{y^\gamma - 1}{\gamma}\,\frac{(tx)^{-\gamma}a(tx) - t^{-\gamma}a(t)}{t^{-\gamma}a_1(t)} \right\} . \qquad (B.3.6)$$

By assumption, there exist $y_1, y_2 > 0$ such that $(H(y_1), (y_1^\gamma - 1)/\gamma)$ and $(H(y_2), (y_2^\gamma - 1)/\gamma)$ are linearly independent; hence with $\lambda = (y_1^\gamma - 1)/(y_2^\gamma - 1)$ we obtain that $H(y_1) - \lambda H(y_2) \ne 0$. Now we subtract $\lambda$ times (B.3.6) at argument $y = y_2$ from (B.3.6) with argument $y = y_1$, and we obtain

$$\lim_{t\to\infty}\big(H(y_1) - \lambda H(y_2)\big)(1 + o(1))\,\frac{(tx)^{-\gamma}a_1(tx)}{t^{-\gamma}a_1(t)} = x^{-\gamma}\left\{ \big(H(xy_1) - H(x)\big) - \frac{y_1^\gamma - 1}{y_2^\gamma - 1}\big(H(xy_2) - H(x)\big) \right\} , \qquad x > 0 \,.$$

From this we conclude that $\lim_{t\to\infty}(tx)^{-\gamma}a_1(tx)/\big(t^{-\gamma}a_1(t)\big)$ exists for all $x > 0$. Since the limit should be finite for all $x > 0$, it must be positive for all $x > 0$. Hence $t^{-\gamma}a_1(t)$ is regularly varying with index $\rho$, say. The existence of this limit, together with relation (B.3.6), implies that $\lim_{t\to\infty}\big((tx)^{-\gamma}a(tx) - t^{-\gamma}a(t)\big)/\big(t^{-\gamma}a_1(t)\big)$ also exists for all $x > 0$. Hence we obtain (B.3.4) for some $c_1 \in \mathbb{R}$ by Theorem B.2.1. From Theorem B.2.2 we must have $\rho \le 0$.

As a result, we obtain the following functional equation for $H$:

$$H(xy) = H(y)x^{\rho+\gamma} + H(x) + c_1x^\gamma\,\frac{x^\rho - 1}{\rho}\,\frac{y^\gamma - 1}{\gamma} \qquad \text{for } x, y > 0 \,. \qquad (B.3.7)$$

A simple calculation verifies that

$$H_1(x) = c_1\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\, du\, ds$$

is a solution of (B.3.7). Obviously, the function $G(x) = H(x) - H_1(x)$ satisfies the homogeneous equation

$$G(xy) = G(x) + G(y)x^{\rho+\gamma} \qquad \text{for } x, y > 0 \,.$$

If $\rho + \gamma = 0$, this is Cauchy's equation, having the unique solution $G(x) = c_2\log x$ in the class of measurable functions. If $\rho + \gamma \ne 0$, by symmetry we obtain

$$G(xy) = G(y) + G(x)y^{\rho+\gamma} \,,$$

and hence $G(x)(1 - y^{\rho+\gamma}) = G(y)(1 - x^{\rho+\gamma})$ for $x, y > 0$, which implies that with some $c_2 \in \mathbb{R}$, $G(x) = c_2(x^{\rho+\gamma} - 1)/(\rho + \gamma)$. Hence the general solution is

$$c_1\int_1^x s^{\gamma-1}\int_1^s u^{\rho-1}\, du\, ds + c_2\int_1^x s^{\rho+\gamma-1}\, ds \,. \qquad \square$$
Remark B.3.5 1. The function $A$, describing the rate of convergence in (B.3.2), is regularly varying with index $\rho$. So if $\rho < 0$ we have an algebraic speed of convergence in (B.3.2). In case $\rho = 0$ it is much slower, for example, logarithmic.

2. From (B.3.3) we see that $H(x)$ can be written as

$$H(x) = \begin{cases} \dfrac{1}{\rho}\left(\dfrac{x^{\gamma+\rho} - 1}{\gamma+\rho} - \dfrac{x^\gamma - 1}{\gamma}\right), & \text{not both } \rho \text{ and } \gamma \text{ zero}\,, \\[2mm] \tfrac{1}{2}(\log x)^2\,, & \rho = \gamma = 0\,, \end{cases} \qquad (B.3.8)$$

(with the first expression read as a limit when $\rho = 0$ or $\gamma + \rho = 0$), if we replace $a$ and $A$ in (B.3.3) with $\bar a$ and $\bar A$, respectively,

$$\bar A(t) := (c_1 + c_2\rho)\,A(t) \,, \qquad \bar a(t) := a(t)\big(1 + c_2A(t)\big) \,.$$

This corresponds to $c_1 = 1$ and $c_2 = 0$ in (B.3.3). This representation has been chosen in such a way that $H(1) = 0$ and $H'(x) = x^{\gamma-1}(x^\rho - 1)/\rho$.
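As an added numerical illustration (not part of the original text), take $f(t) = t + \log t$. Then $f \in 2ERV_{1,-1}$ with $a(t) = t$ and $A(t) = 1/t$, and the second-order limit is $H(x) = \log x$, i.e. $(x^{\gamma+\rho} - 1)/(\gamma+\rho)$ read as a limit for $\gamma + \rho = 0$:

```python
import math

gamma, rho = 1.0, -1.0
f = lambda t: t + math.log(t)     # in 2ERV_{1,-1} with a(t) = t, A(t) = 1/t
a = lambda t: t
A = lambda t: 1.0 / t

def second_order(t, x):
    first_order = (f(t * x) - f(t)) / a(t) - (x ** gamma - 1) / gamma
    return first_order / A(t)     # converges to H(x) = log x for this f

t = 1e7
devs = [abs(second_order(t, x) - math.log(x)) for x in (0.5, 2.0, 10.0)]
print(devs)   # essentially zero: for this f the second-order relation is an identity
```

For generic $f$ the inner quantity would only converge at rate $A(t) = t^{-|\rho|}$; the algebraic rate is exactly what "second-order parameter $\rho < 0$" encodes.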


Next we prove that in some cases we can relate $2ERV$ functions to classes of functions we have met before.

Theorem B.3.6 (de Haan and Stadtmüller (1996)) Suppose that the function $f$ satisfies the conditions of Theorem B.3.1. Then:

1. in case $\rho = \gamma = 0$,

$$f(t) = h(t) + \int_{t_0}^t \frac{h(s)}{s}\, ds$$

with $h \in \Pi$;

2. in case $\rho = 0$, $\gamma \ne 0$, $t^{-\gamma}\tilde f(t) \in \Pi$, where

$$\tilde f(t) := \begin{cases} f(t)\,, & \gamma > 0\,, \\ f(\infty) - f(t)\,, & \gamma < 0\,; \end{cases}$$

3. in case $\rho < 0$, there exists some nonzero constant $c$ such that

$$f(t) - c\,\frac{t^\gamma}{\gamma}$$

is in $ERV_{\gamma+\rho}$ (for $\gamma = 0$ read $c\log t$).

Conversely, any of the properties (1), (2), (3) implies that $f$ is in $2ERV$.
Remark B.3.7 More precisely: In case $\rho = 0$ and $\gamma > 0$,

$$\lim_{t\to\infty} \frac{(tx)^{-\gamma}f(tx) - t^{-\gamma}f(t)}{c_1t^{-\gamma}f(t)A(t)} = \log x \qquad (B.3.9)$$

or, equivalently,

$$\lim_{t\to\infty} \frac{\dfrac{f(tx)}{f(t)} - x^\gamma}{c_1A(t)} = x^\gamma\log x \,. \qquad (B.3.10)$$

In case $\rho = 0$ and $\gamma < 0$, (B.3.9) and (B.3.10) hold with $f$ replaced by $f(\infty) - f$. In case $\rho < 0$, $\lim_{t\to\infty} t^{-\gamma}a(t) = c$ exists in $(0, \infty)$, and

$$\lim_{t\to\infty} \frac{f(tx) - f(t) - c\,t^\gamma\dfrac{x^\gamma - 1}{\gamma}}{a(t)A(t)\left(\dfrac{c_1}{\rho} + c_2\right)} = \frac{x^{\gamma+\rho} - 1}{\gamma+\rho} \,.$$

Remark B.3.8 1. It follows from the representations of Theorem B.3.6 that (B.3.3) holds locally uniformly in $(0, \infty)$, by the properties of the function classes to which the $f$'s belong in the different cases. The representations also lead to Potter bounds for relation (B.3.3).

2. Note that in most cases the existence of a second-order relation makes the first-order relation simpler; for example, in case $\rho < 0$ and $\rho + \gamma = 0$ one has $f(t) \sim c_3t^\gamma$, as $t \to \infty$.

Proof (of Theorem B.3.6). (1) For $\gamma = \rho = 0$ relation (B.3.3) implies

$$\lim_{t\to\infty} \frac{f(txy) - f(ty) - f(tx) + f(t)}{a_1(t)} = (\log x)(\log y)$$

for $x, y > 0$ with $a_1 := aA$ (we take $c_1 = 1$, $c_2 = 0$, as we may by Remark B.3.5(2)). Hence $g_x \in \Pi(a_1\log x)$ for $x > 0$, where

$$g_x(t) := f(tx) - f(t) \,.$$

Now Theorem B.2.12 implies that

$$\lim_{t\to\infty} \frac{h_x(t)}{a_1(t)\log x} = 1 \,, \qquad (B.3.11)$$

where

$$h_x(t) := g_x(t) - \frac{1}{t}\int_0^t g_x(s)\, ds = h(tx) - h(t)$$

with

$$h(t) := f(t) - \frac{1}{t}\int_0^t f(s)\, ds \,.$$

Hence (B.3.11) translates into

$$\lim_{t\to\infty} \frac{h(tx) - h(t)}{a_1(t)} = \log x$$

for $x > 0$, i.e., $h \in \Pi(a_1)$. Finally, by Fubini's theorem,

$$f(t) = h(t) + \int_0^t \frac{h(s)}{s}\, ds \,. \qquad (B.3.12)$$

Conversely, if (B.3.12) holds and $h \in \Pi(a_1)$, then for $x > 0$,

$$\lim_{t\to\infty} \frac{f(tx) - f(t) - (h(t) + a_1(t))\log x}{a_1(t)} = \lim_{t\to\infty}\left\{ \frac{h(tx) - h(t)}{a_1(t)} + \int_1^x \left(\frac{h(st) - h(t)}{a_1(t)} - 1\right)\frac{ds}{s} \right\} = \log x + \int_1^x (\log s - 1)\,\frac{ds}{s} = \frac{1}{2}(\log x)^2 \,.$$

(2) We can assume (if necessary, changing the function a somewhat) that c₂ = 0 in
(B.3.3). Write (B.3.4) as

\[
\lim_{t\to\infty}\frac{\gamma^{-1}\bigl(a(tx)/a(t)-x^{\gamma}\bigr)}{A(t)}
=\frac{c_{1}\,x^{\gamma}\log x}{\gamma}
\]


and subtract from (B.3.3), i.e.,

\[
\lim_{t\to\infty}\frac1{A(t)}\left\{\frac{f(tx)-f(t)}{a(t)}-\frac{x^{\gamma}-1}{\gamma}
-\gamma^{-1}\Bigl(\frac{a(tx)}{a(t)}-x^{\gamma}\Bigr)\right\}
=-\,c_{1}\,\frac{x^{\gamma}-1}{\gamma^{2}}\,.
\]

We get

\[
\lim_{t\to\infty}\frac{\bigl\{f(tx)-\gamma^{-1}a(tx)\bigr\}-\bigl\{f(t)-\gamma^{-1}a(t)\bigr\}}{a_{1}(t)}
=-\,c_{1}\,\frac{x^{\gamma}-1}{\gamma^{2}}
\tag{B.3.13}
\]

with a₁ := aA as before. Consider γ > 0 first. By Theorem B.2.2,

\[
\lim_{t\to\infty}\frac{f(t)-\gamma^{-1}a(t)}{a_{1}(t)}=-\frac{c_{1}}{\gamma^{2}}\,.
\tag{B.3.14}
\]

Then we have for x > 0,

\[
\lim_{t\to\infty}\frac{f(tx)-\gamma^{-1}a(tx)}{a_{1}(t)}=-\frac{c_{1}}{\gamma^{2}}\,x^{\gamma}\,.
\tag{B.3.15}
\]

Subtract (B.3.14) from x^{−γ} times (B.3.15) to obtain

\[
(tx)^{-\gamma}f(tx)-t^{-\gamma}f(t)
-\gamma^{-1}\bigl((tx)^{-\gamma}a(tx)-t^{-\gamma}a(t)\bigr)
=o\bigl(t^{-\gamma}a_{1}(t)\bigr)\,.
\]

The result follows in view of (B.3.4). The proof for γ < 0 is similar. Conversely, if,
for example, t^{−γ} f(t) ∈ Π(a), then

\[
\lim_{t\to\infty}\frac{f(tx)-f(t)-\gamma f(t)\,(x^{\gamma}-1)/\gamma}{t^{\gamma}a(t)}
=x^{\gamma}\log x\,.
\]

If f(∞) := lim_{t→∞} f(t) is finite and, for example, t^{−γ}(f(∞) − f(t)) ∈ Π(a), then

\[
\lim_{t\to\infty}\frac{f(tx)-f(t)+\gamma\,\bigl(f(\infty)-f(t)\bigr)\,(x^{\gamma}-1)/\gamma}{t^{\gamma}a(t)}
=-\,x^{\gamma}\log x\,.
\]

Hence f ∈ 2ERV.
(3) In view of Theorem B.2.2, relation (B.3.4) implies for ρ < 0 and c₁ ≠ 0 in
(B.3.3) that

\[
c:=\lim_{t\to\infty}t^{-\gamma}a(t)
\]

exists in (0, ∞) and

\[
\lim_{t\to\infty}\frac{c-t^{-\gamma}a(t)}{t^{-\gamma}a_{1}(t)}=-\frac{c_{1}}{\rho}\,,
\]

i.e.,

\[
a(t)=c\,t^{\gamma}+\frac{c_{1}}{\rho}\,a_{1}(t)\,\bigl(1+o(1)\bigr)\,.
\]


Hence for x > 0,

\[
\lim_{t\to\infty}\frac{f(tx)-f(t)-\bigl\{c\,t^{\gamma}+c_{1}\rho^{-1}a_{1}(t)(1+o(1))\bigr\}\,(x^{\gamma}-1)/\gamma}{a_{1}(t)}
=\frac{c_{1}}{\rho}\Bigl(\frac{x^{\gamma+\rho}-1}{\gamma+\rho}-\frac{x^{\gamma}-1}{\gamma}\Bigr)
+c_{2}\,\frac{x^{\gamma+\rho}-1}{\gamma+\rho}
+\frac{c_{1}}{\rho}\,\frac{x^{\gamma}-1}{\gamma}\,,
\]

i.e.,

\[
\lim_{t\to\infty}\frac{\bigl\{f(tx)-c\,(tx)^{\gamma}/\gamma\bigr\}-\bigl\{f(t)-c\,t^{\gamma}/\gamma\bigr\}}{a_{1}(t)}
=\Bigl(\frac{c_{1}}{\rho}+c_{2}\Bigr)\frac{x^{\gamma+\rho}-1}{\gamma+\rho}\,.
\tag{B.3.16}
\]

Note that in order not to have a trivial limit we need c₁/ρ + c₂ ≠ 0. Conversely,
if a function f satisfies (B.3.16), then

\[
\lim_{t\to\infty}\frac{f(tx)-f(t)-c\,t^{\gamma}(x^{\gamma}-1)/\gamma}{(c_{1}/\rho+c_{2})\,a_{1}(t)}
=\frac{x^{\gamma+\rho}-1}{\gamma+\rho}
\]

for x > 0, i.e., f ∈ 2ERV.

For ρ < 0 and c₁ = 0, we have by Lemma B.3.9 below that

\[
a(t)=c\,t^{\gamma}+o\bigl(a_{1}(t)\bigr)\,,\qquad t\to\infty\,,
\]

with c ≠ 0, since a ∈ RV_γ and a₁ ∈ RV_{γ+ρ}. Hence (B.3.16) holds, with c₁ = 0.


Lemma B.3.9 (Ash, Erdős, and Rubel (1974)) Let r be a regularly varying function
with index less than zero. If for some measurable function q,

\[
\lim_{t\to\infty}\frac{q(tx)-q(t)}{r(t)}=0
\]

for all x > 0, then

\[
C:=\lim_{t\to\infty}q(t)
\]

exists (finite) and

\[
C-q(t)=o\bigl(r(t)\bigr)\,,\qquad t\to\infty\,.
\]

Proof. Take Λ > 1. For n = 1, 2, ... and y > x,

\[
\frac{|q(y)-q(x)|}{r(x)}
\le\frac{r(y)}{r(x)}\sum_{k=1}^{n}\frac{|q(\Lambda^{k}y)-q(\Lambda^{k-1}y)|}{r(\Lambda^{k-1}y)}\,\frac{r(\Lambda^{k-1}y)}{r(y)}
+\frac{|q(\Lambda^{n}y)-q(\Lambda^{n}x)|}{r(\Lambda^{n}x)}\,\frac{r(\Lambda^{n}x)}{r(x)}
+\sum_{k=1}^{n}\frac{|q(\Lambda^{k}x)-q(\Lambda^{k-1}x)|}{r(\Lambda^{k-1}x)}\,\frac{r(\Lambda^{k-1}x)}{r(x)}\,.
\]

Now

\[
\limsup_{x\to\infty}\frac{r(\Lambda x)}{r(x)}=:\delta<1\,,\qquad
\limsup_{x\to\infty}\sup_{y\ge x}\frac{r(y)}{r(x)}=1\,.
\]

Hence with ε > 0, for x ≥ x₀ and y > x,

\[
\frac{r(\Lambda x)}{r(x)}\le\delta<1\,,\qquad
\frac{r(y)}{r(x)}\le2\,,\qquad
\frac{|q(\Lambda x)-q(x)|}{r(x)}\le\varepsilon\,,
\]

so that

\[
\frac{|q(y)-q(x)|}{r(x)}
\le2\varepsilon\sum_{k=1}^{n}\delta^{k-1}
+\varepsilon\,\delta^{n}
+\varepsilon\sum_{k=1}^{n}\delta^{k-1}
\le\frac{3\varepsilon}{1-\delta}
\tag{B.3.17}
\]

as n → ∞ (the middle term tends to zero by the hypothesis applied with the fixed
ratio y/x). Since lim_{x→∞} r(x) = 0, for ε̃ > 0 we can find x₁ such that

\[
|q(y)-q(x)|\le\tilde\varepsilon
\]

for y > x ≥ x₁. Hence C := lim_{x→∞} q(x) exists by Cauchy's criterion. Letting
y → ∞ in (B.3.17) we conclude that

\[
C-q(x)=o\bigl(r(x)\bigr)\,,\qquad x\to\infty\,. \qquad\Box
\]
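A quick numerical illustration of the lemma (with hypothetical q and r, not from the text): take r(t) = 1/t and q(t) = C + 1/(t log t); then (q(tx) − q(t))/r(t) → 0 for each fixed x, and indeed C − q(t) = o(r(t)).

```python
# Illustration of Lemma B.3.9 on an example pair (q, r); both are ad hoc choices.
import math

C = 2.0
r = lambda t: 1.0/t                       # regularly varying with index -1
q = lambda t: C + 1.0/(t*math.log(t))     # satisfies (q(tx)-q(t))/r(t) -> 0

for t in (1e4, 1e6, 1e8):
    print((q(2*t) - q(t))/r(t), (C - q(t))/r(t))  # both columns tend to 0
```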

We finish the main theorems by establishing uniform inequalities for the class
2ERV.

Theorem B.3.10 (Drees (1998); cf. Cheng and Jiang (2001)) Let f be a measurable function. Suppose for some positive function a, some positive or negative function
A with lim_{t→∞} A(t) = 0, and parameters γ ∈ ℝ, ρ ≤ 0, that

\[
\lim_{t\to\infty}\frac{\dfrac{f(tx)-f(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)}
=\int_{1}^{x}s^{\gamma-1}\int_{1}^{s}u^{\rho-1}\,du\,ds
\tag{B.3.18}
\]

for all x > 0, i.e., f ∈ 2ERV. Then for all ε, δ > 0 there exists t₀ = t₀(ε, δ) such
that for all t, tx ≥ t₀,

\[
\left|\frac{\dfrac{f(tx)-f(t)}{a_{0}(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A_{0}(t)}
-\Psi_{\gamma,\rho}(x)\right|
\le\varepsilon\max\bigl(x^{\gamma+\rho+\delta},\,x^{\gamma+\rho-\delta}\bigr)\,,
\tag{B.3.19}
\]

where

\[
\Psi_{\gamma,\rho}(x):=
\begin{cases}
\dfrac{x^{\gamma+\rho}-1}{\gamma+\rho}\,, & \rho<0\,,\\[1ex]
\dfrac{1}{\gamma}\,x^{\gamma}\log x\,, & \rho=0\ne\gamma\,,\\[1ex]
\tfrac12(\log x)^{2}\,, & \rho=\gamma=0\,,
\end{cases}
\tag{B.3.20}
\]

\[
a_{0}(t):=
\begin{cases}
c\,t^{\gamma}\,, & \rho<0\,,\\
-\gamma\,\bigl(f(\infty)-f(t)\bigr)\,, & \gamma<\rho=0\,,\\
\gamma f(t)\,, & \gamma>\rho=0\,,\\
\hat f(t)+\hat{\hat f}(t)\,, & \gamma=\rho=0\,,
\end{cases}
\tag{B.3.21}
\]


with c := lim_{t→∞} t^{−γ} a(t),

\[
A_{0}(t):=
\begin{cases}
-(\gamma+\rho)\,\dfrac{\tilde f(\infty)-\tilde f(t)}{a_{0}(t)}\,, & \gamma+\rho<0\,,\\[1.2ex]
\dfrac{\hat{\tilde f}(t)}{a_{0}(t)}\,, & \gamma+\rho=0\,,\ \rho<0\,,\\[1.2ex]
(\gamma+\rho)\,\dfrac{\tilde f(t)}{a_{0}(t)}\,, & \gamma+\rho>0\,,\ \rho<0\,,\\[1.2ex]
\dfrac{\hat{\tilde f}(t)}{\tilde f(t)}\,, & \gamma\ne\rho=0\,,\\[1.2ex]
\dfrac{\hat{\hat f}(t)}{a_{0}(t)}\,, & \gamma=\rho=0\,,
\end{cases}
\tag{B.3.22}
\]

where for an integrable function h,

\[
\hat h(t):=h(t)-\frac1t\int_{0}^{t}h(s)\,ds
\tag{B.3.23}
\]

and

\[
\tilde f(t):=
\begin{cases}
f(t)-c\,\dfrac{t^{\gamma}}{\gamma}\,, & \rho<0\,,\\[1ex]
t^{-\gamma}\,\bigl(f(\infty)-f(t)\bigr)\,, & \gamma<\rho=0\,,\\[1ex]
t^{-\gamma}f(t)\,, & \gamma>\rho=0\,,\\[1ex]
\hat f(t)\,, & \gamma=\rho=0\,.
\end{cases}
\tag{B.3.24}
\]

Moreover,

\[
\left|\frac{a_{0}(tx)/a_{0}(t)-x^{\gamma}}{A_{0}(t)}-x^{\gamma}\,\frac{x^{\rho}-1}{\rho}\right|
\le\varepsilon\max\bigl(x^{\gamma+\rho+\delta},\,x^{\gamma+\rho-\delta}\bigr)\,.
\tag{B.3.25}
\]
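To see why a₀ and A₀ are the natural normalizations, consider the model function f(t) = t^γ/γ + t^{γ+ρ}/(γ+ρ) with ρ < 0 < γ + ρ (an ad hoc example, not from the text): then c = 1, a₀(t) = t^γ, f̃(t) = t^{γ+ρ}/(γ+ρ), A₀(t) = t^ρ, and the expression inside the absolute value in (B.3.19) vanishes identically.

```python
# With the normalization of Theorem B.3.10, the model f below satisfies
# (B.3.19) with zero error; gamma, rho and f are hypothetical example choices.
gamma, rho = 0.5, -0.2
f = lambda t: t**gamma/gamma + t**(gamma+rho)/(gamma+rho)

def lhs(t, x):
    a0 = t**gamma                              # a0(t) = c*t^gamma with c = 1
    A0 = t**rho                                # A0(t) = (gamma+rho)*f_tilde(t)/a0(t)
    Psi = (x**(gamma+rho) - 1)/(gamma+rho)     # Psi_{gamma,rho}(x) for rho < 0
    return ((f(t*x) - f(t))/a0 - (x**gamma - 1)/gamma)/A0 - Psi

for t, x in ((1e3, 2.0), (1e5, 0.5), (1e7, 10.0)):
    print(lhs(t, x))   # zero up to floating-point rounding
```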

Proof. From Theorem B.3.6 it is clear that in all cases except γ = ρ = 0 the
limit relation (B.3.1) can be reformulated as a relation of extended regular variation
(first order) for a simple transform of f. Hence those cases are covered by Theorem
B.2.18. It remains to consider the case γ = ρ = 0. We use Theorem B.3.6:

\[
f(t)=h(t)+\int_{0}^{t}\frac{h(s)}{s}\,ds
\]

with h ∈ Π. Now Theorem B.2.18 implies

\[
\left|\frac{h(tx)-h(t)}{k(t)}-\log x\right|\le\varepsilon\max\bigl(x^{\delta},x^{-\delta}\bigr)
\]

for some positive function k, and therefore

\[
\left|\frac{\dfrac{f(tx)-f(t)}{a_{0}(t)}-\log x}{A_{0}(t)}-\tfrac12(\log x)^{2}\right|
\le\left|\frac{h(tx)-h(t)}{k(t)}-\log x\right|
+\int_{1}^{x}\left|\frac{h(tu)-h(t)}{k(t)}-\log u\right|\frac{du}{u}
\le\varepsilon\Bigl(1+\frac1\delta\Bigr)\max\bigl(x^{\delta},x^{-\delta}\bigr)\,. \qquad\Box
\]

Corollary B.3.11 It follows that if ρ < 0, or ρ = 0 and γ < 0 (that is, if
ρ + γ₋ < 0, where γ₋ := min(γ, 0)), then

\[
\lim_{t\to\infty}\frac{\dfrac{f(tx)-f(t)}{a(t)}-\dfrac{x^{\gamma}-1}{\gamma}}{A(t)}
=H_{\gamma,\rho}(x)
\]

holds locally uniformly for x ∈ (0, ∞).


Remark B.3.12 Analogous of Theorems B.3.1, B.3.6, and B.3.10 for third-order
extended regular variation can be found in the Appendix of Fraga Alves, de Haan and
Lin (2003).
Next we show that a function / satisfying the second-order condition is always
close to a smooth function of the same kind.
Theorem B.3.13 Suppose
lim f*)-n-W-n/Y
'-oo

= H(x)

(B .3. 2 6)

a\{t)

with H not a multiple of(xy l)/y. Then there exists a twice differentiable function
f\ with
lim (f(t) - Mt)) lax(t) = 0 ,

(B.3.27)

f-oo

(a(t) - t f[(t)) lax (f) = 0 ,


lim a(t)Ai(t)/ai(t)

= 1;

(B.3.28)
(B.3.29)

with h = / orh = f\we have


h(tx)-h(t)

hm
*->>

_ xy-i

= # (x)
Ai(0

and such that with


A i ( f ) : = ^ 7 7 r - K + 1,
/i(0

(B.3.30)

sign (A\(t)) is constant eventually ,

(B.3.31)

we have

lim Ai(r) = 0 ,

(B.3.32)

t-+oo

\Ai\eRVp.

(B.3.33)

Proof For the case y = p = 0 the proof is given in Lemma B.3.14 below. For other
values of y and p separate proofs apply. As an example we give the proof for p = 0,
y >0.
Assume that the function a\ is positive (for negative a\ a similar proof applies).
Then (B.3.26) implies by Theorem B.3.6,

B.3 Second-Order Extended Regular Variation (2ERV)


lim
'-oo

t-Yai(f)/Y

= logjc

395
(B.3.34)

for all x > 0; hence (B.3.26) holds with a(t) = yf(t) +


y-la\(t).
y
Now (B .3.34) says that the function t ~ f (t) is in the class n ; hence by Proposition
B.2.15(3) there is a function g\ with
lim
t-+oo
and such that

= lim
t-^oo

t-Ya\{t)

= 0

(B.3.35)

a\{t)

lim t^-l
= -1 .
'-* 00 *i(0
Combining (B.3.34), (B.3.35), and

(B.3.36)

gi(tx)-gi(t)

lim
^

= log x

tgx(t)

we obtain
lim

&

=1,

i.e.,
lim
t-+oo

^^-

= 1.

(B.3.37)

a\(t)

We take fi(t) := t*g\{t). Then (B.3.27) holds by (B.3.35). Further,


aft) ~ tf[(f)

yfit) + y-'aiCQ - t/,(Q

ai(0

ai(0
/(0-* gi(0 ,
y

= y

-i , y ^ g K O - f ^ g i C O ) '

1- y

ai(0

ai(0

/ ( o - ^ g i ( o , -i

*y+Vi(o

J
= y
hy
y 0,
f - oo, by (B.3.35) and (B.3.37). Hence (B.3.28) holds. Finally, by (B.3.28),
(B.3.36), and (B.3.37),

a(OAi(f)~f/i(f)Ai(0
=
= (Y +

t2f'i(t)-(y-l)tf'l(t)
l)tr+lg'l(t)+ty+2g'i(t)

~ y ' y + V i ( 0 ~ i(0 .
Hence (B.3.31), (B.3.32), (B.3.33), and (B.3.29) hold.

t -* o o .

396

B Regular Variation and Extensions

Lemma B.3.14 (A.A. Balkema) Let <f> be measurable and for all x,
x2
+ a2(t) +o(a2(t))

</>(t + JC) = a0(t) +ai(t)x

t -+ oo .

(B.3.38)

Since this relation is essentially 2ERV, we know that it holds locally uniformly.
Let y be a C2 function with compact support that satisfies
fxky(x)dx=0

for

xky(x)dx

= l for

= 1,2,
k = 0,

and define r/r = </> + y as


*(t):=

]4>(f +

s)y(s)ds

and for m = 1, 2 , . . . ,
with ^ ( 0 ) = ^ .
Then x/r^ satisfies, for all x,
lim

y
f

,\/"

= 1.

(B.3.39)

Fork = 0,1,2,
ak(t)-1r<kHt)

= o(a2(f)),

and in particular,
4>(0 - iKO = o(a2(t)) ,

<H t -> oc .

Prcra/ If /? is a polynomial of degree & < 2, then so is p := p y. Indeed, p ^ =


//*) y is constant. Since r/r^ = <f> y ^ for fc < 2, we have
if,V\t + x)=

f </>(t + x - s)y(2)(s)

ds

= J Uo(0 + (x~ s)ai(t) + ^ Z L f l 2 ( / ) J y (2) (j)


=

fo{a2(t))Ym(s)ds

rf5

B.3 Second-Order Extended Regular Variation (2ERV)

397

(since p y ( 2 ) = p ^ y vanishes for any polynomial of degree < 2). T h e first


term equals a2(t) and is independent of x. The second term is o ( 2 ( 0 ) since y has
bounded support by the local uniformity in (B.3.38). This proves (B.3.39).
N o w write as above, with x = 0, k = 0 , 1 ,
y(k\s)

Vr<*>(0 = / la0(t) - sax(f) + ja2(f)\

ds + j o(a2(f))y<k\s)

ds .

T h e second integral is 0 ( ^ 2 ( 0 ) and ^ e first integral reduces to

ak(t)j^-y{k\s)ds=ak(t)
since f(sj /j\)y^k\s)
p < k) and

ds vanishes for j < k (note that p y^k\s)

for 7 > k.

= 0 for all s if

R e m a r k B.3.15 A simpler case of second-order behavior is related to regular variation rather than extended regular variation. Suppose f e RVy for some y R and
there is some positive or negative function A such that
/('*)

XY

lim -*>
*-> A(0

#(*)

=:

(B.3.40)

exists for x > 0 with if not constant. If (B.3.40) holds, we say that the function / is
of second-order regular variation. Then
,.

&>

(tx)-yf(tx)-t-yf(t)

= H{X)>

t-vf(t)Ait)

hence the function t~y f(t) is extended regularly varying, H(x) = (xp X)/p for
some p < 0, and the theory of Section B.2 applies. Using the inequalities of Theorem
B.2.18 one then gets, if p < 0,
lim
'7~

" '
A(0

'
/>

In view of applications in Chapters 3,4, and 5 we now show h o w the second-order


condition for / translates into a second-order condition for log / .
Lemma B.3.16 Let f be a measurable function with lim^oo f(t) = /(oo)
(0, 00]. Assume there exist functions; a positive and A with lim^oo A(t) = 0 and
not changing sign eventually, such that

398

B Regular Variation and Extensions


f(tx)-f(t)
a(t)

lim

xy-i

= Hyfp(x),

A(t)

t-+oo

where

i / vv-Y+P
+'-i
p \

xy

y+p

(B.3.41)

-i\
Y

Suppose y T p.
Then
f0,

ait)

K+
fit)
lim
f->>00
A(t)

< P < 0,

oo , p < y < 0 or (0 < y < p and I ^ 0 ) ory = p,


l 7+^ , (0 < Y < ~P <dl = 0) or y > -p > 0,

(B.3.42)
where for y > Owe define I := lim^oo f(t) a(t)/y. Furthermore, in case y > 0
assume p < 0, and we have
lim

log f(tx)-\og f(t) _


a(t)/f(t)

Q(t)

t-*oo

xy--\

= Hy_tfi,(x),

(B.3.43)

where
Y < P < 0,

A(0,

Q(t) = \ y+ jL , p < y <0or(0


I Y+P

A(0,

< y < p and I ^ 0) or y = p,

(0 < y < p and l =

0)ory>p>0,

12(01 RVp>,and
P,

y < P < 0,

,
, y , p < y < o,
p = \
y , (0 < y < p and/ ^ 0),
p,
(0 < y < p ad/ = 0) or y > p > 0 .
If y > 0 anJ p = 0, f/i ftmif in (B.3.43) equals zero for any Q(t) satisfying
A(t) = 0 ( 0 ( 0 ) orequivalentlya(t)/f(t)
- y = 0(Q(t)).
Proof We start with the proof of (B.3.42) and separately analyze the cases y < 0,
y = 0, 0 < y < p, y = p, and y > p.
Start with y < 0. Then a(t)/A(t) e RVy-p and by assumption /(oo) > 0.
Hence
lim
f-oo

7&ZI=
A(r)

Hm
hm

(')

(0.

K"P<0,

oo , y - p > 0 .

-> /(r)A(r)
Next consider y = 0. Then p < 0 since we assume y ^ p. Then f(t)eR
Vb and
from (B.3.4) and Theorem B.1.6, there exists lim,_,.oo fl(0 and it is positive. Hence,
limbec a(f)/(/(fM(0) = 00.
Next we consider the various possibilities when y is positive. Note that from
(B.3.4),

B.3 Second-Order Extended Regular Variation (2ERV)


(tx)-yg(tx) - t-Ya{t)
lim
7T-T7-;
=
'-oo
t-Ya(t)A(t)
and combining this with (B.3.41), we have

(/("> - V) ~ (/ ~ f)

lim -*
'^

V^
-a(t)A(t)

399

xf>-\
>

(B.3.44)

xr+> - 1

(B.3.45)

Y+P

Then if y + p > 0,

no-"--

00
lim
'-+
-$a(t)A(t)

Y+p'

that is,
lim^
^
r-*oo A(t)
y + p
Next if y + p < 0, there exists / := lim^oo fit) a{t)/y and

(f{t)-a-f)-i
lim

Hence
J \lJ

1*

'-oo

lim

A(t)

~'-K>O f(t)

ait)Ait)

ait)Ait)

oo , 0 < y < p and / ^ 0,


L

y+p , 0 < y < -p and / = 0 .

Finally, consider y = p > 0. From (B.3.45) it follows that ait)Ait)/y


o if it)-ait)/y)md
so

'-oo

Ait)

' - c o /(/)

-ait)Ait)

We have now proved (B.3.42).


For the proof of (B.3.43), first note that ]xmt-+oo ait)/fit)
1.2.9). Then for y < 0, from (B.3.41),

108

= Y+ (cf. Lemma

OjW ) = ^ 0 + W) [^T1 + A(0y.p(x) +O(A(0)J)


=

76) I T -

A(t)Hy p(x)+ A

' < <')

400

B Regular Variation and Extensions

-imn

xy

-i

+ A(t)HYtP(x) + o(A(t))\

hence
lim

log/(**)-log/(*)

*y - 1

fit)

f-00

-W+-\J$><TT)+>{%)
The result follows for this case.
Now consider y > 0. Again from (B.3.41),
-yf(fX)

m = x~r + %l-^f-+x~vW)
= 1 + (x-r - 1) ( l - ^

{mH

M+oiMt))]

+x-y^

{A(t)Hy,p(X)

o(A(t))};

hence
r

lim

log f(fx)~

f->oo

log / ( Q

.
logjc
6

().

VS

+ X-rA(t)HYiP(x)

; a(t) \f(t)

+ o(A(t)) +

fa(t)

\W)- )-

Combining this with (B.3.42), the result follows for y > 0.

Remark B.3.17 Relation (B.3.45) is true for y # 0 and p < 0.


Remark B.3.18 From Lemma B.3.16 it follows that
0,
y/p ,
#v p = lim

a(t)/f(t)
Q(t)

- K+

y < p < 0,
(lim^oo f(t) - a(t)/y = 0
and 0 < y < p) or y > p > 0,

-1 ,

(lim^oc f(t)-a(t)/y

^0

and 0 < y < p) or p < y <0 .


(B.3.46)
Next we consider the special case of / nondecreasing and give equivalent conditions in terms of <j> := / * " (the inverse function of / ) . This is relevant for extremevalue statistics where (B.3.3) is a condition in terms of the quantile function (so the

B.4 ERV with an Extra Parameter

401

inverse of a probability distribution) and one wants to have conditions in terms of the
distribution function itself.
Theorem B.3.19 Suppose that f is nondecreasing and <j> is its left-continuous
function. Then (B.3.3) is equivalent to
<j>(t+xa(<t>(t))) _ ( ! _ ! _

inverse

yx)l/Y

HP ,
^
AfAf.n
= -(l + YxrM,YHl
+ yx)1'*) (B.3.47)
rt/(oo)
A(0(r))
locally uniformly for x e ( l / m a x ( 0 , y ) , l/max(y, 0)), where H is the limit
function in (B.3.3)
Remark B.3.20 1. The result is also true for the right-continuous inverse of / .
2. In case y = 0 we define (1 + yx)l/y = e*.
3. For specific parameters we can give more specific statements, such as, for example:
(a) if a = 0, y > 0, then rl^<f>(t) e II;
(b) if a = 0, y < 0, then f ^ ( / ( o o ) - rl) 6 II;
(c) if a < 0, y = 0, then \imt^oo(e-cit+x) <t>(t + *)) - c/(e~ct<t>(t) - c) = ea\
x R.
Proof Since by Remark B.3.8(l) relation (B.3.3) holds locally uniformly we can
replace x by x(t) = 1 + sA(t) in (B.3.3) and get
r

lim
'-oo

fit + teAjt)) - f(t) - q(Q((l + eAjtW - \)IY


a(0A(f)

= 0,

(B.3.48)

hence
hm

= e.
fl(0A(0
Applying this for s > 0 and < 0 and using /((0(f))") * < / ( ( 0 ( O ) + ) we
obtain lim, t / ( o o ){/(0(O) - f}/{a(0(f))A(</>(O)} = 0. This and (B.3.3) imply that
f(4>(t)x)-t

_ JC^-1

lim ^/JL/X,
= H(x) .
ff/(oo)
A(0(O)
Now Ft(x) : = (/(</>(*)*) t)/a((/>(t)), x > 0,t < / ( o o ) , is a family (with respect
to 0 of nondecreasing functions; furthermore, (xy l)/y has a positive continuous
derivative, the function //() is continuous, and the function A satisfies A ( 0 ( 0 ) - 0,
r t / ( o o ) . Therefore we can apply an obvious generalization ofVervaat's lemma (see
Lemma A.0.2) to deduce (B.3.47). The converse implication is similar.

B.4 ERV with an Extra Parameter


Definition B.4.1 Let fs (t) be a measurable function for 0 < s < 1 and t > 0. We say
that the function / is jointly regularly varying if there exists a continuous function y
defined on [0,1] such that for x > 0,

402

B Regular Variation and Extensions


fsitx)
- * * ' >
fs(t)

lim sup
'-0<5<l

(B.4.1)

= o.

We define analogously joint extended regular variation:


lim sup
'"0<5<1

fs(tx)-fs(t)

**'>-!

as(t)

Y(s)

= 0,

(B.4.2)

where as (t) is positive. The definition ofjoint (extended) regular variation of secondorder is analogous. The function y is called the index function.
This concept of joint (extended) regular variation is used in Chapter 9 and we
develop some properties that are needed in that chapter.
Theorem B.4.2 Suppose f is jointly regularly varying. For any positive e, 8 > 0
there exists to = to(s, 8) such that for t, tx > to, 0 < s < 1,
fs{tx)

rYis)

fsH)

<exy{s)m<ix(x\x-8)

(B.4.3)

and
(1 - e)xyto min (xs, x~s) < ^ -

7(0

< (1 + e ) * ^ max (*',*-')

(B.4.4)

Proof Clearly it is sufficient to prove the statements for y = 0 for all s. The first
step is to prove that (B.4.1) holds locally uniformly for x e (0, oo). It is sufficient
to deduce a contradiction from the following assumption: suppose there exist 8 > 0
and sequences sn > so, tn -> oo, xn > 0, as n -> oo, such that
\FSn(tn + xn) - FSn(tn)\

>8

for n = 1, 2 , . . . , where Fs(x) := log/ 5 (e*). The rest of the proof of the local
uniformity is exactly like the proof of Theorem B.2.9. Now we know that for any
s (0, 1), there exists a fy such that if t > to, x [1, e], s e [0,1],
|log/,(/x)-log/5(0|<.
Take any x > 1. We write x = eny, y [1, e) for some nonnegative integer n. Then
|log/,(^)-log/,(OI
n

< I ] | log fsW)

- log Mte1-1)|

+ | log fs(tenv)

- log

fs(ien)\

<{n + Y)s\ < e\ logx + i .


For the last inequality we use logjc > n. This proves (B.4.4) for x > 1. By
interchanging the role of t and tx in case x < 1, we get the full statement (B.4.4). The
proof that (B.4:4) implies (B.4.3) is analogous td the proof of Proposition B.1.10.
Next we shall prove uniform inequalities for joint extended regular variation.

B.4 ERV with an Extra Parameter

403

Theorem B.4.3 Suppose f is jointly extended regularly varying, i.e., (B.4.2) holds.
For any e, 8 > 0 there exists to = to(s, 8) such that for t, tx > to, 0 < s < 1,
fs(tx)-fs(t)

**'>-!

as(t)

y(s)

<e

1+JC

y(s)

max (*'.*-'))

(B.4.5)

Moreover, the function as(t) is jointly regularly varying with index function y.
Proof. The proof of the last statement is analogous to that of Theorem B .2.1 and it is
left to the reader.
The defining relation (B.4.2) holds locally uniformly for x e (0, oo). This can be
proved in the same way as for the corresponding result in the previous theorem, now
following the proof of Theorem B.2.9.
By Theorem B.4.2 for any e, 8 > 0 we canfindto such that for t, tx > to we have
\as(tx)

rY(s)

as(t)

<exy(s>max(xs,x-s)

as(tx)
(1 - e)xYM min (x\ x~s) < ^ ^ - < (1 + s)x^s) max (*', x~s),
as(t)

(B.4.6)

(B.4.7)

and
yyis) - 1
y(s)

fs(ty) - fs{t)
as(t)

<

(B.4.8)

for y e [1, e]. Take any x > 1. We write x = eny, where y e [1, e] with some
nonnegative integer n. Note that
fs(teny) - fs(t)
as(t)

(e"y)y^ - 1
y(s)

(fs(teny) - fs(te")
as(ten)

yM

as(ten)

- l\

MO

y(s.)

+
+

(aAte^
V as(t)

\ yxW - 1
J y(s)

^ (a_Ate^ _ ^ A >*>f-^ V as(t)


J
y(s)

Applying (B.4.6), (B.4.7), (B.4.8), and ( y - l)/y(s) < ( e ^ - l)/y(j) , we


get with
/i^

, e^>-l\

supyO)

y(s) + s

inf y(s)
e+1I] y(s) _ i
/ eg+inf

404

B Regular Variation and Extensions

that
\fAteny)-Mt)

("#'s-l

MO

Y(s)

< * ]T(1 + e)JWW + > ei(yisH)i=0

(
<sC

i=0
ey(s)

1 + +

Y(s)

_ i \ tf(n+i)(y(*)+) _ i

y(s)

e(n+l){Y(s)+e)

*?<*>+*-1

_ 1

Y(s) + 6

Now on the set {s e [0,1]: y(s) + s < y/e] we have


e(n+lXy(s)+e)

_ i

y(s) + e

1
~ -y(s)

1
- e ~~ Je'

and on the set {s e [0,1] : y (s) + s > ^/e} w e have


(n+l)(x(j)+)

_ i

e (n+l)(y(j)+e+2v'i)

V?

xCO + e

(consider y(s) + e < 0 and y(.s) + e > 0 separately and use that (1 e~x)/x < 1
for * > 0 and that (e* l)/x is increasing for x > 0).
Hence (with y = jce~n as before)
Mteny) - fs(t)
as(t)

(eny)y^-l
Y(s)

< c (v^ + ^ K ( 5 ) + + 2 ^ K ( ' H + 2 ^)


since logx > n. This proves (B.4.5).
Theorem B.4.4 Suppose f is jointly extended regularly varying, i.e., (B.4.2) holds.
Moreover, suppose f is positive. Then
lim sup

as(0

'-+0<5<1 fs(t)

- Y+(s) = 0,

(B.4.9)

where y+(s) := max(0, y(s)). For any e, 8 > 0 there exists to = to(s, 8) such that if
t, t x > to, 0 < s < 1,
log/,(^)-lQg/,(0
as(t)/fs(t)

**-<*>-!
< s ( l + xy~(s) max(A jT a ) ,
y-(s)

where y~(s) := min(0, y(s)).

(B.4.10)

B.4 ERV with an Extra Parameter

405

Proof. For (B.4.9) we need to prove that if tn - * oo, sn - * so,


fsn(tn)

flM*)), K(*0)>0,
oo ,

<*sH(fn)

y(^o) < 0 .

Suppose Y(SQ) > 0. By Theorem B.4.3 for any e (0, y(so)) there is a fo such
that for tn > to,
1

f*m(tn) - fsH(to)
aSn(tn)

(?)

a .(. + en

yfo.)

y(sn)

It follows that
lim

fsnitn)

fsn{t0)

(B.4.11)

Now Theorem B.4.3 implies (take t = to and JC - oo in the statement of the theorem)
that
lim as(t) = oo
(B.4.12)
f-*00

uniformly for those s with y(s) > y(to)/2.

Combining (B.4.11) and (B.4.12) gives


1
y(s0)

fsn{tn)
- aSn{fn)

n lim
0

Next suppose y(so) < 0. By Theorem B.4.3 the function ir5(t) defined by

dx

iMO

J\

r dx

X*

Jt

X^

is well defined for those s for which y (s) < 1. By partial integration
ft

fa

rt

/oo

rt

rfu

/ fs(u) u = / / fs(x) -jdu


- / fs(u)
u
J to
Jto Ju
*z
Jt0

= '/

/-W^-W
x

Jt

fsM-2

Jto

r
dx
= W ) + /s(0-'o / fsM
,
x
Jt0

i.e.,

/'
dw
r
dx
fs(t) = / irs(u)
xlrs(t) +10 / / , ( * ) .
u
x
Jro
Jto
Next we shall prove
lim sup

1^(0
as(f)

1
l-y(s)

= 0

(B.4.13)

(BAH)

406

B Regular Variation and Extensions

for 0 < c < 1, where Ec := {s e [0, 1] : y(s) < c}. This implies that x/r is jointly
regularly varying and also that it is sufficient to prove (B.4.9) where a is replaced
with ty.
Now,
ro
fsitx) - MO dx
as(t) Jx
as(t)
x2'
For any e e (0, c), by Theorem B.4.3 there exists to such that for t, tx > to,

Jl

y(s) ) x 2 | - V i

as(t)

dx
) x*

V+

Hence

iim M >= rr-oo

xy(s)

_ j

d x

y(s)
x2 1 - y(s)
t^oo as(t) Jx
uniformly for s e Ec. Hence it is sufficient to prove
fsn{tn)
r
hm = oo
for y{so) < 0, where V" is jointly regularly varying with index function y. Now by
(B.4.13),

liminfAM> l i m i n f / 1

M^<^_x

> liminf / (1 - 2e)uY{Sn)~l+2e du - 1


n
^ Jto/tn
> liminf / (1 - Is)^6'1
n
^ Jto/tn
V

t.1

du - 1

O Nl ~ CO/**)3*

= liminf (1 - 2s)
n-+oo
_ 1 - 2e

1
3s

which tends to infinity as s | 0. The proof of (B.4.9) is complete.


For (B.4.10) by Theorem B.4.3 we need to prove only that for any sn -> so,
xn -> x0 > 0, tn -> oo,
\Qg fsn(tnXn)-log fSn{tn)
^
aSn(tn)/fSn(tn)

Hm
n

^-(J0) - 1
Y-(so)

We write
] n o (fsn(tnX)-fSn(tn)
lQ
H m

n-*00

g^nfa^)-l0g/5 w fa)

aSn(tn)/fSn(tn)

1Qg
=

n-*oo

asn(tn)

aSn(tn)

fsnitn)^1)

aSn(tn)/fSn(tn)

B.4 ERV with an Extra Parameter


For yOo) > 0, lim^oo aSn(tn) /fSn(tn)

= y(s0)', hence this converges to

I
/xy(so) - 1
\
- log I . . yOo) + 1 1 = logx 0
y(s(so)
\ yfao)
/
For y (s0) < 0, lim^oo aSfl (tn)/fsn (tn) = 0; hence the limit equals
hm
"-*

fSn(tnxn) - fSn(tn)

aSn(tn)

*0y(5o) - 1
.
y(so)

407

References

1. K. Aarssen and L. de Haan: On the maximal life span of humans. Mathematical Population Studies 4,259-281 (1994).
2. M. Ancona-Navarrete and J. Tawn: A comparison of methods for estimating the extremal
index. Extremes 3, 5-38 (2000).
3. B.C. Arnold, N. Balakrishnan, and H.N. Nagaraja: A First Course in Order Statistics.
Wiley, New York (1992).
4. J.M. Ash, P. ErdGs and L.A. Rubel: Very slowly varying functions. Aeq. Math. 10, 1-9
(1974).
5. A.A. Balkema and L. de Haan: A.s. continuity of stable moving average processes with
index < 1. Ann. Appl. Probab. 16, 333-343 (1988).
6. A.A. Balkema, L. de Haan, and R.L. Karandikar: Asymptotic distributions of the maximum of n independent stochastic processes. /. Appl. Prob. 30, 66-81 (1993).
7. O. Barndorff-Nielsen: On the limit behaviour of extreme order statistics. Ann. Math.
Statist. 34,992-1002 (1963).
8. J. Beirlant and J.L. Teugels: Asymptotics of Hill's estimator. Theory Probab. Appl. 31,
463-469(1986).
9. J. Beirlant, P. Vynckier, and J. L. Teugels: Tail index estimation, Pareto quantile plots and
regression diagnostics. J. Amer. Statist. Association 91,1659-1667 (19%).
10. J. Beirlant, J.L. Teugels, and P. Vynckier: Practical Analysis of Extreme Values. Leuven
University Press, Leuven, Belgium (1996).
11. P. Billingsley: Convergence of Probability Measures. Wiley, New York (1968).
12. P. Billingsley: Weak Convergence of Measures: Applications in Probability. SLAM,
Philadelphia (1971).
13. P. Billingsley: Probability and Measure. Wiley, New York (1979).
14. L. Breiman: Probability. Addison-Wesley (1968); Republished by SIAM, Philadelphia
(1992).
15. B. Brown and S. Resnick: Extreme values of independent stochastic processes. J. Appl.
Probab. 14,732-739 (1977).
16. N. G de Bruijn: Pairs of slowly oscillating functions occurring in asymptotic problems
concerning the Laplace transform. Nw. Arch. Wish. 7,20-26 (1959).
17. S. Cheng and C. Jiang: The Edgeworth expansion for distributions of extreme values.
Science in China 44,427-437 (2001).
18. K.L. Chung: A Course in Probability Theory. 2nd Edition, Academic Press, New YorkLondon (1974).

410

References

19. M. Csorg<5 and L. Horvath: Weighted Approximations in Probability and Statistics. John
Wiley & Sons, Chichester, England (1993).
20. D. J. Daley and D. Vere-Jones: An Introduction to the Theory ofPoint Processes. Springer,
Berlin (1988).
21. J. Danielsson, L. de Haan, L. Peng, and C. G de Vries: Using a bootstrap method to
choose the sample fraction in tail index estimation. J. Multivariate Analysis 76,226-248
(2001).
22. A.L.M. Dekkers and L. de Haan: Optimal choice of sample fraction in extreme-value
estimation. /. Multivariate Analysis 47, 173-195 (1993).
23. A.L.M. Dekkers, J.H.J. Einmahl, and L. de Haan: A moment estimator for the index of
an extreme-value distribution. Ann. Statist. 17, 1833-1855 (1989).
24. D. Dietrich, L. de Haan, and J. Hiisler: Testing extreme value conditions. Extremes 5,
71-85 (2002).
25. G. Draisma, H. Drees, A. Ferreira, and L. de Haan: Bivariate tail estimation: dependence
in asymptotic independence. Bernoulli 10, 251-280 (2004).
26. G Draisma, L. de Haan, L. Peng, and T.T. Pereira: A bootstrap based method to achieve
optimality in estimating the extreme value index. Extremes 2, 367-404 (1999).
27. H. Drees: On smooth statistical tail functionals. Scand. J. Statist. 25,187-210 (1998).
28. H. Drees: Weighted approximations of tail processes for ^-mixing random variables.
Ann. Appl Probab. 10, 1274-1301 (2000).
29. H. Drees: Tail empirical processes under mixing conditions. In: H.G Dehling, T. Mikosch,
and M. Sorensen (eds.) Empirical Process Techniques for Dependent Data. Birkhauser,
Boston, 325-342 (2002).
30. H. Drees: Extreme quantile estimation for dependent data with applications to finance.
Bernoulli 9, 617-657 (2003).
31. H. Drees, A. Ferreira, and L. de Haan: On maximum likelihood estimation of the extreme
value index. Ann. Appl. Probab. 14, 1179-1201 (2003).
32. H. Drees, L. de Haan, and D. Li: On large deviations for extremes. Stat. Prob. Letters 64,
51-62(2003).
33. H. Drees, L. de Haan, and D. Li: Approximations to the tail empirical distribution function with application to testing extreme value conditions. To appear in /. Statist. Plann.
Inference (2006).
34. H. Drees and E. Kaufmann: Selecting the optimal sample fraction in univariate extreme
value estimation. Stock Proc. Appl. 75, 149-172 (1998).
35. W.F. Eddy and J.D. Gale: The convex hull of a spherically symmetric sample. Adv. Appl.
Prob. 13,751-763(1981).
36. J.H.J. Einmahl: Multivariate empirical processes. PhD thesis, CWI Tract 32, Amsterdam
(1987).
37. J.H.J. Einmahl: The empirical distribution function as a tail estimator. Statistica Neerlandica 44, 79-82 (1990).
38. J.H.J. Einmahl: ABahadur-Kiefer theorem beyond the largest observation. J. Multivariate
Anal. 55, 29-38 (1995).
39. J.H.J. Einmahl: Poisson and Gaussian approximation of weighted local empirical processes. Stock Proc. Appl. 70, 31-58 (1997).
40. J.H.J. Einmahl, L. de Haan, and V. Piterbarg: Non-parametric estimation of the spectral
measure of an extreme value distribution. Ann. Statist. 29, 1401-1423 (2001).
41. J.H.J. Einmahl and T. Lin: Asymptotic normality of extreme value estimators on C[0, 1].
Ann. Statist. 34,469-492 (2006).
42. P. Embrechts, C. Kluppelberg, and T. Mikosch: Modelling Extremal Events for Insurance
and Finance. Springer-Verlag, Berlin Heidelberg (1997).

References

411

43. P. Embrechts, L. de Haan, and X. Huang: Modelling Multivariate Extremes. In: P. Embrechts (ed.) Extremes and Integrated Risk Measures. Risk Waters Group, 59-67 (2000).
44. M. Falk: Some best estimators for distributions with finite endpoint. Statistics 27,115-125
(1995).
45. M. Falk, J. Htisler, and R.-D. Reiss: Laws of Small Numbers: Extremes and Rare Events.
Birkhauser, Basel (1994).
46. W. Feller: An Introduction to Probability Theory and Its Applications. Vol. 1, 3rd edition,
John Wiley & Sons, New York (1968).
47. A. Ferreira and C. de Vries: Optimal confidence intervals for the tail index and high
quantiles. Discussion paper, Tinbergen Institute, the Netherlands (2004).
48. R.A. Fisher and L.H.C. Tippett: Limiting forms of the frequency distribution of the largest
or smallest member of a sample. Proc. Cambridge Philos. Soc. 24, 180-190 (1928).
49. M.I. Fraga Alves, M.I. Gomes, and L. de Haan: Anew class of semi-parametric estimators
of the second order parameter. Portugalia Mathematica 60, 193-213 (2003).
50. M.I. Fraga Alves, L. de Haan and Tao Lin: Estimation of the parameter controlling the
speed of convergence in extreme value theory. Math. Methods Statist. 12,155-176 (2003).
51. M. Frechet: Sur la loi de probabilite de l'6cart maximum. Ann. Soc. Math. Polon. 6,
93-116(1927).
52. J. Geffroy: Contributions a la theorie des valeurs extremes. Publ. Inst. Statist. Univ. Paris
1 8, 37-185 (1958).
53. J.L. Geluk and L. de Haan: Regular variation, Extensions and Tauberian Theorems. CWI
Tract 40, Amsterdam (1987).
54. E. Gine\ M. G. Hahn, and P. Vatan: Max-infinitely divisible and max-stable sample continuous processes. Probab. Th. Rel. Fields 87, 139-165 (1990).

Index

Asymptotic independence, 226, 229, 259, 261, 265, 275, 285
Auxiliary functions, 44, 62, 374, 375
Bahadur-Kiefer representation, 143
Brownian motion, 49, 308, 323
Asymptotic normality
Hill estimator, 76
Intermediate order statistics, 49
Maximum likelihood estimators, 95, 99
Negative Hill estimator, 113, 115
Pickands estimator, 88
Probability-weighted moment estimators, 112
Tail (empirical) quantile process, 51, 161
Tail empirical distribution function, 159
γ positive, 161
Testing the extreme value condition, 164, 170, 175
γ positive, 173
Case studies
Life span, 14, 68, 124, 152
S&P 500, 13, 68, 122, 151
Sea level, 12, 67, 121, 127, 149, 207, 245, 246, 249, 251, 271, 288
Convergence of moments, 176
Dependence, 210, 220-222, 259
Coefficients, 258
H, 259, 268
K, 259, 328
Extremal index, 198
Residual index, 263, 265, 269, 275, 286
Sibuya's, 259

Conditions
D and D', 197
β-mixing, 199
Estimation, 235
H, 259
L, 235, 236, 247, 252, 260, 268
Extremal index, 199
Level sets, 235, 244, 245
Residual index, 265
Sibuya's coefficient, 269
Spatial, 323
Functions, 221
L, 222, 232, 258, 262, 265, 328
R, 225, 232
Copula, 221
Pickands', 225, 226
Sibuya's, 225
Level sets, 223
Spatial, 195, 322
Spectral measure, 220
Temporal, 195
Diagram of estimates, 120, 121, 123, 124, 148, 150, 152, 288, 289
Distributions
Beta, 18, 34
Cauchy, 18, 34, 61, 76, 120
Double-exponential, 10
Exponential, 18, 34, 60, 154, 174, 179, 194, 196, 322
Extreme value, see Max-stable
Fréchet class of, 10
Gamma, 18, 34, 60, 61
Generalized Pareto, 34, 65, 89, 110, 124, 163, 326, 328
Geometric, 35
Gumbel, 10
Max-infinitely divisible, 231
Max-stable
Estimation, 252
Multivariate, 208, 217, 221, 226, 231, 235
Simple, 217, 230-232, 235, 247
Univariate, 4, 6, 9
Normal, 11, 18, 61, 120, 179, 194, 197, 221, 230, 231, 322
Poisson, 35
Reverse-Weibull class of, 10
Student-t, 62, 322
Uniform, 94, 120, 196
Von Mises', 9, 294
Domain of attraction
Infinite-dimensional, 311, 325, 328
Multivariate, 226
Speed of convergence, 179
Testing, 163
Univariate, 4, 10, 14, 19, 44
Drees, H.
Mixing Conditions, 199
Tail (empirical) quantile process, 51, 114
Uniform inequalities, 369, 383, 392
Edgeworth expansion, 185
Empirical
Distribution function, 62, 63, 66, 72, 127, 128, 159, 249
Exponent measure, 280, 336
Mean excess function, 112
Quantile, 13, 127
Spectral measure, 251
Tail distribution function, 52, 76, 155, 159, 163, 333, 336
Left-continuous, 236, 249, 266
Tail quantile process, 50, 51, 62, 76, 88, 114, 155, 161, 163, 200, 236, 333
Endpoint estimation, 145
Maximum likelihood, 147
Moment, 147
Excursion stability, 326
Exponent measure
Finite-dimensional, 211, 213, 214, 222, 229, 231, 235, 248, 259, 272, 276, 286
Estimation, 235, 273, 280
Infinite-dimensional, 296, 301, 302, 332, 344
Estimation, 332, 335, 336
Extended Regular Variation, see Regular
variation
Extremal index, 198
Extreme value index, 6, 37
Estimators, 63, 82
Hill, 20, 69, 85, 100, 113, 116, 148, 162, 173, 265
Maximum likelihood, 89, 91, 116, 338
Moment, 20, 100, 103, 116, 126, 148, 278, 288, 338
Negative Hill, 20, 113, 116, 126, 148
Pickands, 83, 116, 125
Probability-weighted moment, 110, 111, 116, 148
Failure set probability estimation
Finite-dimensional, 271
Asymptotically independent components, 285
Positive exponent measure, 276
Upper quadrants, 261
Infinite-dimensional, 349
Index function, 295, 311, 402
Estimation, 338
Inverse function
Generalized, 366
Left-continuous, 5, 34
Right-continuous, 34
Karamata's theorem, 363
Large deviations theorem, 187
Law of the iterated logarithm, 188, 193
Location parameter
Finite-dimensional, 9
Estimation, 139, 278
Infinite-dimensional, 294
Estimation, 338
Max-moving average process, 196
Max-stable process, 296, 310, 311, 314
Simple, 296, 298, 301, 303, 314, 316, 328
Spectral representation, 315, 316
Stationary, 308, 315, 320, 321, 323
Mejzler's theorem, 201
Order statistics
Central, 40
Extreme, 37, 40, 65
Nondegenerate behavior, 38, 60
Poisson point process, 37, 199
Intermediate, 40, 49, 65, 129, 163, 338
Asymptotic normality, 41
Piston, 320
Poisson point process, 39, 198, 204, 214, 272, 302, 304, 314, 323, 329
Extreme order statistics, 37, 60, 61, 199
Potter's inequalities, 367
Quantile estimation, 82, 134
γ positive, 138
Maximum likelihood, 139
Moment, 140
Probability-weighted moment, 154
Rényi's representation, 37, 60, 71, 83
Rank, 236, 249, 266
Regular variation, 23, 361, 362
Class Π, 371, 375
Properties, 379
Uniform inequalities, 382
Conjugate slowly, 371
Extended, 295, 371, 374
Jointly, 402
Second-order, 385, 386
Theorems on inverse function, 384, 401
Uniform convergence theorem, 375
Uniform inequalities, 383, 392
Index of, 23, 362
Jointly, 401
Index function, 295, 402, see also Index function
Uniform inequalities, 402, 403
Karamata's theorem, 363
Properties, 366
Representation theorem, 365
Second-order, 397
Uniform convergence theorem, 363
Uniform inequalities, 369
Scale parameter
Finite-dimensional, 9, 128
Estimation, 91, 111, 130, 148, 149, 152, 153, 278
Infinite-dimensional, 294
Estimation, 338
Second-order, see also Regular variation
Comparison, 117
Condition, 43, 44, 50, 76, 239, 263, 266, 269, 278, 345, 397
γ positive, 48
Speed of convergence, 179
Uniform inequalities, 46, 48
von Mises', see Von Mises'
Parameter, 386
Simulations
Extreme value index estimation, 116, 120
Quantile estimation, 148
Testing, 169
Skorohod representation theorem, 357
Slow variation, 23, 362
Smirnov's lemma, 41
Spectral functions, 315, 316
Spectral measure
Finite-dimensional, 215, 226, 232, 233, 235, 275
H, 218, 226, 232
Φ, 218, 226, 232, 247, 252
Ψ, 215, 223, 226
Dependence, 220
Estimation, 247, 249
Independence, 220, 230
Infinite-dimensional, 303, 322, 328
Spectral representation of simple max-stable process, 315, 316
Stationarity
Finite-dimensional, 155, 195
Infinite-dimensional, see Max-stable processes, Simple, Stationary
Strong law of large numbers, 188, 192
Tail distribution function, 155
Tail empirical process, see Empirical tail distribution function
Tail index, 69
Tail probability estimation, 82,142
γ positive, 145
Maximum likelihood, 145
Moment, 145
Vervaat lemma, 357
Von Mises'
Conditions
First-order, 15, 16, 18, 22, 41
Second-order, 49, 54, 56
Distribution, see Distributions
Weak law of large numbers, 188
