Dr Richard Rossmanith
Kellogg College
University of Oxford
Acknowledgements
I thank my employer, the Financial and Commodity Risk Consulting Division of Arthur Andersen, Frankfurt (now d-fine GmbH), for the opportunity to participate in the Mathematical Finance course at the University of Oxford and to prepare this MSc thesis. The company financed this course and gave me part of the necessary time off.
I am especially grateful to Dr Hans-Peter Deutsch, who had the idea for our group's participation in the course and who made it all possible. For fruitful discussions about interest rate dynamics and time series analysis, and mutual support concerning library and IT issues, I include my colleagues Dr Christian Hoffmann, Peter Tichatschke, Dr Frank Hofmann, Michael Giberman, Dr Andreas Werner and Jürgen Topper in my thank-you list.
I am indebted to our client Bayerische Landesbank, its team manager Dr Walter Prem, and its
project manager Oliver Bopp for the permission to use their test and development market data
base servers for my empirical evaluations, even on weekends. Special thanks also to Kai Radde
and Jens Erler for our discussions on the mathematical methods implemented.
From the University of Oxford, I want to express my gratitude to Dr Jeff Dewynne, Dr Sam
Howison, and Dr Paul Wilmott for launching the course and for their lectures. Particular thanks
go to my academic supervisor Jeff Dewynne, also for his understanding that my academic work
was squeezed between the time for my profession and the time for my family. I thank all
external practitioner lecturers for their lectures, but most notably Dr Jamil Baz, Dr Chris Hunter
and Dr Riccardo Rebonato for the insights into interest rate dynamics I gained from theirs.
Special thanks to Tiffany Fliss for organising the course in Oxford. I also thank her successors
Anna Turner, Rosalind Sainty, and Riaz Ahmad for their efforts.
I am also grateful to Kellogg College generally and to its President Dr Geoffrey Thomas
personally for making our time and stay there very enjoyable.
Outside of Oxford, in Scotland, I thank the School of Mathematics and Statistics at the
University of St. Andrews, and the Head of the School, Professor Dr Edmund Robertson, for
their organisational, office, IT and library support during my term as Honorary Lecturer there.
Additionally, I want to state that it was a great pleasure to work together with an expert such as
Professor Dr Alan Cairns on our joint introductory lecture on Financial Mathematics. In
Germany, from the Universität Augsburg and the Ludwig-Maximilians-Universität München,
thanks to Dr Gregor Dorfleitner, Dr Thomas Klein, and Gregor Rossmanith for useful
discussions and references.
Finally, I thank my family for their love and encouragement, and especially my wife Béatrice
for her understanding of the time constraints involved, and my mother Gertraud, and my
mother-in-law Christiane Sacreste for their support and backup, not only during weekends.
Abstract
Incomplete data is a very common problem financial institutions face when they collect
financial time series in IT data bases from commercial data vendors for regulatory, accounting,
and benchmarking purposes. The thesis at hand presents several data completion techniques,
some of which are productively used in practice, and others which are less established. Optimal
completion methods are then recommended, based on an empirical study for the completion of
swap and forward rate curves in the currencies Deutsche Mark (respectively Euro), Pound
Sterling, and US Dollar. The source code of all completion routines, programmed in a standard
professional market data base environment, is listed in an appendix (it is available in electronic
form upon request).
Dedication
Pour Béatrice et Lukas.
Contents

CHAPTER 1   INTRODUCTION
  1.1
  1.2   SEPARATION OF MARKET DATA FOR TRADING AND FOR RISK CONTROL
  1.3
  1.4
  1.5
  1.6
CHAPTER 2   STRUCTURAL INTERPOLATION
  2.1
  2.2
  2.3
  2.4
  2.5
CHAPTER 3   PREVIOUS-DAY EXTRAPOLATION
  3.1
  3.2
  3.3
  3.4
CHAPTER 4   CONDITIONAL EXPECTATION
  4.1
  4.2
  4.3
  4.4
  4.5
CHAPTER 5
  5.1
  5.2
  5.3
  5.4   COMPLETION ALGORITHM
  5.5
CHAPTER 6
  6.1
  6.2
  6.3   COMPLETION ALGORITHM
  6.4
  6.5
CHAPTER 7
  7.1
  7.2
  7.3
  7.4
  7.5
  7.6
CHAPTER 8
CHAPTER 9
BIBLIOGRAPHY
  A.1   FORWARD.FE
  A.2   BACKEXTRAPOLATE.FE
  A.3   EXTRAPOLATION.FE
  A.4   COVARMATRIX.FE
  A.5   CONDITIONALDISTRIBUTION.FE
  A.6   PCAESTIMATION.FE
  A.7   COMPLETE.FE
  B.2
  B.3
Chapter 1
Introduction
1.1
Modern Risk Control of a large bank's trading portfolio does not involve just a single piece of
software, but a collection of IT systems, as illustrated by the following data flow chart:
[Figure: data flow chart. The Front Office (FO) systems (FX options, etc.) and the market data base feed the Risk Engine, which produces the Value-at-Risk report.]
The risk engine collects all trading positions from the Front Office (FO) systems. It then
evaluates their prices based on the data provided by the trade-independent market data base, and
it assesses the associated risks according to some internal statistical model. All this is finally
summarised in a Value-at-Risk report for the bank's management.
In this framework, many banks have focused on the perfection of their internal mathematical
model, a task both financially and scientifically rewarding. They frequently found, however,
that the reliability of the model's results depended heavily on the quality of the input market
data1, a generally much neglected topic. Consequently, there has been a recent2 perceptible
shift in attention towards the implementation of comprehensive and redundant market data base
systems which provide data completion, data validation, and data cleansing.
I have professionally consulted several large German banks on this topic, and together with the
clients' management co-directed the subsequent system and process integration projects. For
1 Particularly sensitive risk models are the ones which use empirical distributions (historical simulation), but other methods (e.g. Monte Carlo simulation) are affected as well.
2 This started around A.D. 2000, just after the (overrated) Euro introduction and millennium bug issues.
this thesis, I have now taken the time to study the particular problem of data completion in
detail. So in the following text, we will encounter several completion techniques, some of a
heuristic nature, and others of a more profound statistical nature. The empirical sample data for
our studies are swap rate curve time series in major currencies.
1.2
Separation of market data for trading and for Risk Control
Apart from regulatory compliance (trader independence), which is of course a legal matter,
there are other intrinsic reasons why market data for the Front Office and market data for Risk
Control should be treated as distinct:
The first difference concerns the consumption of data. On the one hand, an equity (or swap, or
FX, etc.) trader needs an up-to-date intra-day, high-performance partial view on the market, i.e.
current levels of equity (or swap, or FX, or …) prices. Hence each Front Office system usually
has a direct unfiltered connection to a single data provider. On the other hand, in Risk Control,
all financial markets need to be considered.3 This drastically increases the data volume, and so
the (technical) speed of the data feed cannot be as high. Therefore, one usually takes a onesnapshot-per-day view on the market here.
A second difference is how market data is processed. The data vendors (cf. Section 1.3 below)
usually deliver data fields for parameters which for the trader specify the conditions under
which a deal is struck.4 If such a data field, due to some technical error, contains a wrong (or
missing) value, any sufficiently experienced trader will spot the mistake immediately. However,
in a fully automated Risk Control process (as described in the preceding Section 1.1), a large
data set is compressed into a single number for the Value-at-Risk report. By simply looking at
this report, it is virtually impossible to decide if the input data set contains (significantly)
erroneous values. Therefore, the input values must be filtered for errors, and missing values
must be compensated for.
A third and significant difference is the interpretation of the data. For option valuation
purposes, one usually assumes a risk-neutral measure to obtain no-arbitrage prices. A market
snapshot including all underlying prices and associated parameters (discount factors, implied
option volatilities, etc.) is generally sufficient for this task. In contrast, the real world measure
applies in Risk Control.
Of course, one might argue in the latter case that the change of measures from "risk-neutral" to
"real-world" (or "risk-averse") affects only the market parameter drifts, but not their volatilities
and correlations. So in principle, implied option volatilities could be used instead of historical
volatilities. But we would encounter difficulties with this approach: From a practical point of
view, there are often not enough liquid traded options for each underlying. This is particularly
true if one wants to calculate implied correlations. Conceptually, even if we could obtain
sufficient implied volatility data (from traded options), we still would have to assess the risk
associated with the (traded) implied volatility itself ("vega risk"), i.e. the (historical) volatility
of the (implied) volatility.
So in order to model future profit/loss distributions for Risk Control, real-world time series
must be considered.5 This adds the historical dimension to market data, and to all financial
instruments, so that interest rate curves become two-dimensional objects, implied volatility
surfaces three-dimensional, etc.
1.3
Among the companies that independently collect market data and sell it to financial institutions
are Bloomberg, Reuters, Datastream, Telerate/Bridge, Telekurs, ISMA, DRI, Olsen, and RiskMetrics. The first two are the leading providers and cover market data in general, whereas most of the others have particular strengths in certain areas. In particular, the provider RiskMetrics
has specialised in empirical distribution parameters (historical volatilities and correlations).
In principle, a bank could purchase today's general market data from one provider, plus
RiskMetrics volatilities and correlations, and assess its risk via, say, a Monte Carlo simulation.
This is indeed what several small banks are doing.
Many larger banks however feel they have to produce their own market data, for several
reasons: The first is vendor independence through redundancy, so they collect, say, Bloomberg
and Reuters (and ISMA, etc.; possibly supplemented by bank-internal sources) data in a
centralised data base. The second is data quality, so they apply data filtering, cleansing and
completion processes described in Section 1.2 to the data; this requires technology (automatic
routines) and staff (manual interaction). The third incentive for internal centralised market data
is method transparency, since now they can specify (and change, if so required) their own drift
and covariance estimators (length of input time series, equally or exponentially weighted, etc.),
rather than rely on the ready-made numbers provided by RiskMetrics.6
An argument which is also often heard in German banks is that they want to put more weight on
risk factors which are specific to the German market, and which in their opinion are not
sufficiently catered for by American or British based data vendors.
1.4
As has been outlined before, it is imperative to obtain a complete market data set for the
purpose of Risk Control. But because of market holidays, technical failures and networking
problems, periods with low liquidity in certain markets,7 or other problems, the data set
purchased from vendors as described in Section 1.3 will, in general, not be complete.
The first line of defence here is, of course, vendor redundancy (mentioned in Section 1.3 as
well). But this will not work in every case. Apart from the notorious market holidays, consider
the much more common case of an incomplete interest rate curve: It does not make sense to
5 This is regardless of whether one uses a full historical simulation, or only a set of historically estimated statistical parameters, such as in the delta-normal model, or the variance-covariance method, or the Monte Carlo simulation.
6 This holds true for J.P. Morgan themselves [13]: "RiskMetrics is based on, but differs significantly from, the risk measurement methodology developed by J.P. Morgan for the measurement, management, and control of market risks in its trading, arbitrage, and own investment account activities."
7 Or even too much liquidity: in stressful times (often close to the end of a month or a year), brokers mainly work on the phone for their primary clients, the traders. They do not always have enough resources to update their Reuters (or other) pages in time, where they publish so-called indicative market values. These pages are independently downloaded and processed by their secondary clients, the controllers. Unfortunately, market data for controlling is most important at precisely these times: the end of a month or a year, or generally at times of big market moves.
replace a single grid point of a curve delivered by, say, Olsen, with a value delivered by, say,
Bloomberg, since it may distort the overall shape of the curve because of different general
levels of the Bloomberg and Olsen curves (due to different contributing brokers, or different
curve construction and interpolation methods). So the market data set will still be incomplete.
Therefore, methods are needed which will fit missing values into an incomplete market data set
in a way that affects the delivered information as little as possible. Several such methods will be
described in the following chapters.
Some of the above methods have been implemented in the market data projects mentioned in
Section 1.1. Keep in mind, however, that a bank, as a competitive institution, cannot strive only
for precision, but also has to control the cost of its IT projects. Therefore, if a simpler method
under-performs a more sophisticated one only insignificantly, the former will still be preferred
to the latter. The weights associated with these possibly conflicting goals of numerical
excellence and cost control depend on the business priorities of each institution, and the actual
choice can only be made individually.
1.5
Apart from missing values, a second problem affects data quality, namely the so-called "outliers".
These are values which are incorrectly delivered by a data vendor for various reasons, including
technical problems (network problems, incorrect data mappings, …), conceptual problems
(wrong or unsuitable mathematical techniques, wrong reference data, sudden changes of raw
data contributors, …), or human error (manual input).8
The redundancy principle mentioned in Section 1.3 is of some help here, although its merits are
limited, due to one problematic feature of many errors appearing in practice, namely their
common source: often the faulty value stems directly from the contributing exchange or broker,
and many data vendors will forward it to the client.
So we need a data validation and cleansing process. Such a process requires a certain amount of
human judgement and manual interaction, but it can be usefully supported by coded and fully
automated outlier filtering routines which inspect all delivered data and flag certain values as
"suspect".
Such filtering routines usually compare a value as it is delivered with some estimated
"expectation" of what it should be.9 They will mark the delivered value as "suspect" if it
deviates from its expected value by more than a specified threshold. The expected value of
a delivery can be obtained by first treating it as missing, and then applying any data
8 The infamous "decimal point error" is in fact less frequent than is generally assumed. However, many errors seem indeed to be caused manually; e.g. Eurodollar futures seem to be often precisely 100 basis points off, as suggested by the following series: 96.625%, 96.750%, 96.875%, 96.125%, 97.000%.
9 The term "expectation" is used in a broad colloquial sense here, and not (yet) in the strict sense of mathematical statistics.
completion method to estimate it. We see that the problem of data validation, or at least the
part which allows automation, is in fact very similar to the problem of data completion as
described in Section 1.4. The "suspect" threshold can either be a fixed value, or be made
flexible, i.e. dependent on the delivered data as well. In the latter case, data validation requires the
additional estimation of an error bar, so it goes somewhat beyond data completion.10
After a value has been classified as an outlier, it must be deleted, and replaced by a corrected
value. This again leads to the problem of data completion.
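The two-step logic described above (estimate the value as if it were missing, then compare against a threshold) can be sketched as follows. This is an illustrative Python fragment, not the thesis's Formula Engine implementation; the function name and the 20 bp threshold are assumptions:

```python
def flag_suspects(delivered, estimated, threshold=0.0020):
    """Flag a delivered value as 'suspect' if it deviates from its expected
    value (obtained by treating it as missing and applying some data
    completion method) by more than a fixed threshold, here 20 basis points."""
    return {key: abs(value - estimated[key]) > threshold
            for key, value in delivered.items() if key in estimated}
```

A flexible threshold would replace the constant by an estimated error bar, as noted in the text.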
We may conclude that the problems of data filtering, data validation, data correction, and data
completion are closely related. For the sake of this thesis, it is sufficient if we restrict our
attention to the problem of data completion (although there is certainly room for further
analysis; see Chapter 9 for an outlook).
1.6
A number of commercial software products support financial market data base systems, such as
Fame, Xenomorph, Odyssey, and Asset Control. The latter has a particularly broad user base in
Europe.
Technically, it is based on the usual server/client architecture. The server connects static
reference data for financial instruments (such as data vendor codes, instrument ID codes,
currency ISO codes, instrument maturity information, etc.; stored in a standard relational data
base such as Oracle or Sybase) with dynamic market data (time series for bid, ask, last trade,
and other prices, stored in UNIX flat files for fast access).
One of the key features of Asset Control is its set of data completion and validation functions.
Conceptually, these are filtering routines as mentioned in Section 1.5. Technically, they are
(built-in and/or customisable) Formula Engine programs which can directly be linked to any
market data time series.11 The Formula Engine is a macro language with a C-like syntax.
The completion methods examined in this thesis have all been coded in the Formula Engine.
The source code is listed in Appendix A.
The Asset Control software is documented in [1].
10 E.g., in the case of the conditional expectation method mentioned in Section 1.4, and described in detail in Chapter 4, one would additionally have to estimate the conditional variance.
11 The validation functions are triggered whenever a newly delivered value is appended to a time series. Each value is then marked with a so-called status flag. There are flags for "normal" values, "suspect" values, "estimated" (interpolated, completed) values, "manually validated" values, etc. The user can flexibly define additional flags.
Chapter 2
Structural interpolation
Many financial calculations12 require the interpolation of interest rates along the yield curve. So
it is quite natural to start with this interpolation method for our problem of data completion. We
also refer to this method as interpolation "along the maturity axis", or as "structural interpolation".
Notice that it does not take into account the historical dimension of interest rates (the rate
changes from one trading day to the next), but only the market data as of today (the structural
dimension).
2.1
We can choose from various interpolation techniques which all have their advantages and disadvantages:
Linear interpolation is popular because it is simple to implement. The question remains in
which representation we want to interpolate the curve: linearly on par rates, or on zero rates, or on
discount factors? Using linear interpolation everywhere leads to inconsistencies and to
magnified interpolation errors. Using linear interpolation for one fixed representation (e.g. zero
rates) may lead to discontinuities in other representations (e.g. instantaneous forward rates).
Spline interpolation can be used to obtain continuous forward rates, but it displays an undesired
non-local property: small value changes in one point of the curve can lead to large interpolated
value changes, even in remote intervals of the curve [5]. This means that e.g. a small data error
in the 1Y rate may lead to a magnified interpolation error in the [15Y, 20Y] rate interval.
Because of this erratic behaviour, splines are seldom used for yield curves in practice.13
A more sophisticated interpolation method used by some trading IT systems (cf. [7]) is based on
Hermite polynomials. It delivers continuous forward rates, but avoids the mentioned problem of
global error propagation. This however comes at the price of more complexity, and less
intuition.
The favoured method for many practical financial applications is log-linear interpolation on
discount factors.14 It is almost as simple as linear interpolation on rates, and hence delivers
sufficient interpolation quality at a reasonable (implementation) price.15 It is also financially
intuitive: without further knowledge of the interval between two known values of the discount factor
curve, we expect the value of money put in the bank to grow exponentially (the "Zinseszins"
phenomenon, i.e. compound interest).
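For concreteness, log-linear interpolation of discount factors can be sketched in a few lines. This is illustrative Python, not the Formula Engine code of the thesis, and the grid and values below are only examples:

```python
import math

def loglinear_discount(T, grid, B):
    """Interpolate the discount factor at maturity T, assuming that log B(T)
    is piecewise linear between neighbouring grid points, i.e. that money
    grows exponentially within each interval."""
    for i in range(len(grid) - 1):
        if grid[i] <= T <= grid[i + 1]:
            w = (T - grid[i]) / (grid[i + 1] - grid[i])
            return math.exp((1 - w) * math.log(B[i]) + w * math.log(B[i + 1]))
    raise ValueError("maturity outside the grid")

# Halfway between two grid points, the result is the geometric mean
# of the neighbouring discount factors:
b = loglinear_discount(1.5, [1.0, 2.0], [0.9683, 0.9292])
# b ≈ sqrt(0.9683 * 0.9292) ≈ 0.9485
```

Note that linear interpolation on the rates themselves would give a slightly different value; the exponential (geometric) behaviour within each interval is exactly the compound-interest intuition mentioned above.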
2.2
Let B(T) denote the discount factor (price of a zero coupon bond) for maturity time T > 0,
observed in the market today (time t = 0). Notice that B(T) has a term structure, so it will be a
12 Such as "bootstrapping", i.e. the derivation of yields from traded instrument data, or the transformation of the yield curve from one representation (as par rates, zero rates, forward rates, discount factors) to another.
13 Nevertheless, most financial pricing libraries and trading systems offer this interpolation method.
14 In the case where continuity of forward rates is not a requirement; cf. Footnote 17.
15 According to [20], it is a popular method for productive use at Deutsche Bank.
curve, if it is given in continuous time T. However, in practice, market discount factors will be
given only at discrete maturity dates T1, …, TN. So in order to obtain values between two grid
points Ti and Ti+1, we must interpolate. In accordance with the favoured method of
Section 2.1, namely log-linear interpolation on discount factors, we make the following
Assumption: The log discount factor curve log B(T ) is piecewise linear (between neighbouring
grid points on the maturity axis T).
Since the (instantaneous) forward rate curve is the (negative) derivative of the log discount
factor curve (cf. [4], equation 15.2, or [16], equation 1.3),

    f(T) = - ∂ log B(T) / ∂T,

the above assumption implies that the forward rate curve is piecewise constant.16
Example (01/10/1996):

    maturity T    discount factor B(T)    forward rate f(T)
    0             100.00%                 3.218%
    1             96.83%                  4.123%
    2             92.92%                  5.381%
    3             88.05%                  6.362%
    4             82.63%                  6.903%
    5             77.12%                  7.267%

[Figure: the log discount factor curve (piecewise linear) and the forward rate curve (piecewise constant) plotted against maturity.]
16 More precisely, a left-continuous step function; notice that we take right-normed derivatives, since we are looking forward; otherwise we should talk about "backward rates".
17 In particular, forward rates are not necessarily continuous everywhere; cf. Footnote 14.
2.3
Interpolation algorithm
Suppose that in Example 2.2, the forward rates f (0), f (1), f (3), f (4), f (5) have been delivered by
some data provider, but the data point f (2) is missing. Then B(3) is unknown, as well as B(4)
and B(5). However, we do know the difference between log B(3) and log B(4), from the (non-missing, i.e. delivered) data point f (3). By the above assumption, we want to interpolate the log
discount curve linearly, so we put a straight line between log B(2), log B(3), log B(4). Since the
slope between log B(3) and log B(4) is already determined, we must have the same slope
between log B(2) and log B(3). This determines f (2) := f (3). If f (3) is also missing, we use
f (2) := f (4) instead (with a similar argument).
In other words, we interpolate "from right to left", from the first delivered data point to the
missing data point.18
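This completion rule can be sketched in a few lines of Python (an illustrative analogue of the Formula Engine routine backextrapolate.fe; the list-based data layout is an assumption):

```python
def complete_forwards(forwards):
    """Fill missing forward rates (None) from right to left: a missing f(T)
    takes the value of the nearest delivered rate to its right, which keeps
    the interpolated log discount curve piecewise linear.
    A missing value at the very right end has no right neighbour and
    simply stays None in this sketch."""
    out = list(forwards)
    for i in range(len(out) - 2, -1, -1):
        if out[i] is None:
            out[i] = out[i + 1]  # f(T) := f(T + 1), possibly chained
    return out

# Example 2.3: f(2) is missing and is estimated as f(2) := f(3)
rates = [0.03218, 0.04123, None, 0.06362, 0.06903, 0.07267]
completed = complete_forwards(rates)
# completed[2] == 0.06362
```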
2.4
Code implementation
The formula f (T ) = log B(T ) - log B(T+1 ) of Example 2.2 is implemented19 in the routine
forward.fe (cf. Appendix A.1), the actual interpolation method is implemented in the
routine backextrapolate.fe (cf. Appendix A.2).
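In the same spirit, the discrete forward rate formula can be sketched as follows (an illustrative Python analogue of forward.fe, omitting the swap rate bootstrapping step the production routine performs first):

```python
import math

def forwards_from_discounts(B):
    """Discrete forward rates on an annual grid: f(T) = log B(T) - log B(T+1)."""
    return [math.log(B[T]) - math.log(B[T + 1]) for T in range(len(B) - 1)]

# The discount factors of Example 2.2:
f = forwards_from_discounts([1.0, 0.9683, 0.9292, 0.8805, 0.8263, 0.7712])
# f[0] ≈ 3.22%, f[1] ≈ 4.12%, reproducing the table up to rounding
```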
2.5
Suppose that in Example 2.2, the data provider did not deliver the data point for the maturity
interval [0, 1). Then the true value 3.218% is replaced by the interpolated value 4.123% from
the neighbouring maturity interval, resulting in an interpolation error of 91 bp.20 Compared with
the previous day's (30/09/1996) true value of 3.266%, this creates a single-day jump scenario of
86 bp on the short end of the DEM curve, which grossly distorts the risk calculation.
Notice that this is not at all a contrived problem. Particularly at the short and long ends of yield
curves, data deliveries are sometimes incomplete.
Notice also that this problem does not stem from the fact that we chose the log-linear method in
Section 2.1. It simply stems from the fact that we ignore the yield curve history, and only
consider today's values. All methods listed in Section 2.1 suffer from this problem.
18 More precisely, by Footnote 16, there exists for every maturity T a small period ε > 0 such that f(T) = f(T+ε). Now if f(T) is unknown, we look for the smallest ε > 0 such that f(T+ε) is known, and we estimate f(T) := f(T+ε).
19 Together with a standard formula which first deduces discount factors from swap rates.
20 1 bp = 1 basis point = 0.01% = 10⁻⁴.
Chapter 3
Previous-day extrapolation
In order to avoid the problem with yield curve interpolation described in Section 2.5, we need
to take the history into account. The simplest solution is to compare today's possibly
incomplete values with yesterday's values, which are assumed to be complete (or at least
completed in turn). We discuss variants of this approach.
3.1
This data completion method is simple: if today's value is missing, use yesterday's value
instead. In the example of Section 2.5, we estimate the short end of the 01/10/1996 DEM curve
with the delivered value 3.266% for the previous day 30/09/1996. Since the true 01/10/1996
value is 3.218%, the estimation error reduces to only 5 bp.21
Of course, we can also search for particular problem cases for this method: if the previous day's
value is not actually delivered, but estimated in turn, the interpolation errors tend to increase.
3.2
                             [0, 1)       [1, 2)
    previous day's values    3.266%       4.211%
    today's values           (3.218%)     4.123%
    multiplicative change22  (-1.470%)    -2.090%
    extrapolated change      -2.090%
    completed values         3.198%       4.123%
    estimation error         2 bp         0 bp
The values in parentheses are supposed to be missing in the data delivery. The estimate 3.198%
is computed as follows: the delivered relative change of -2.090% for the maturity
21 The magnitude of the error tolerance depends of course on the type of application. For the value-at-risk calculations described in Section 1.1, single-digit errors are usually deemed acceptable. A higher level of accuracy is generally required by profit and loss calculations.
22 Meaning change = today / yesterday - 1. To be consistent with later methods (cf. 4.1), we assume geometrical Brownian motion (resulting in a log-normal distribution). If we assumed Brownian motion (normal distribution) instead, one would have to use additive changes (= today - yesterday). The error in the example would then be 4 bp, which is still acceptable.
It is common to assume log-normally distributed forward rates, see e.g. [16], Section 1.5. For an empirical study of issues related to log-normal distributional assumptions for forward rates and for swap rates, see [17].
interval [1, 2) is extrapolated to the interval [0, 1), and applied to the previous day's delivered
value 3.266%. The resulting estimated value 3.198% differs from the true value 3.218% only by
an acceptably small error of 2 bp.
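The arithmetic of this example can be reproduced in a few lines (a sketch using the table's numbers; the variable names and dictionary layout are mine):

```python
prev = {"[0,1)": 0.03266, "[1,2)": 0.04211}   # 30/09/1996, both delivered
today = {"[1,2)": 0.04123}                     # 01/10/1996, short end missing

# Multiplicative change of the nearest delivered neighbour:
change = today["[1,2)"] / prev["[1,2)"] - 1    # about -2.090%

# Extrapolate the change to the missing interval and apply it to the
# previous day's value there:
estimate = prev["[0,1)"] * (1 + change)
# round(estimate, 5) == 0.03198, i.e. 2 bp from the true value 3.218%
```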
3.3
A variant interpolation method for the univariate case23 has been published as an earlier
thesis [21] for the current Diploma in Mathematical Finance program. Let us quickly present the
essential idea:
It proposes to interpolate a missing value f(t) of some day t not only from the previous trading
day t-1, but as a convex combination of several past values, and possibly even future values. So
the missing value is estimated as

    f(t) := Σ_{i = -d, …, d} wi(t) f(t+i),

where the weights are chosen according to

    Σ_{i = -d, …, d} wi(t) = 1.
Since the simple previous-value extrapolation of Section 3.1 is included in this method (just set
w-1(t) = 1 and wi(t) = 0 for all i ≠ -1), it should give at least as good results. But notice that,
except for that special case, this approach is not compatible with Markov processes, such as
Brownian motion. Nevertheless it is claimed in Theorem 4 in Section 2.3.1 of [21] that the
above estimation will be unbiased, if the underlying process is a Brownian motion. In fact this
is only true for the unconditional expectation, but not for the conditional expectation given all
past values, which actually is more relevant in our context.
Apart from this criticism, it should be noted that the thesis [21] also proposes an algorithm for
outlier detection, and subsequent outlier correction via replacement with the estimated value (cf.
our remarks in Section 1.5), by constructing optimal (in a certain sense) weights wi(t ). The
thesis also gives estimation formulas for the interpolation error.
We will not pursue this path any further in this thesis. Instead, we will look at the problem from
a different angle, by studying multivariate data completion methods in the following chapters.
Let us just point out here that we will encounter somewhat similar problems with non-Markov
effects, and with the distinction between conditional and unconditional distributions. We will
sort out these problems in Section 5.3 by switching to GARCH processes instead of Markov
processes.
3.4
Code implementation
See Appendix A.3 for the routine extrapolation.fe which implements the absolute value
extrapolation method of Section 3.1.
For the relative value extrapolation method of Section 3.2, there is no separate routine. It is
implemented by applying the routine backextrapolate.fe (see Appendix A.2) to the
daily changes (rather than the absolute values as in Section 2.4). This application is performed
by a particular case (switch (method) case "RELATIVE") of the general wrapper
function complete.fe (see Appendix A.7).
23 E.g. for a stock index, or, in our case of yield curves, for an individual forward rate or an individual swap rate.
Chapter 4
Conditional expectation
In the preceding sections, we presented heuristic data completion methods for time series of
yield curves: either along the structure (Chapter 2), or along the history (Section 3.1), or some
mixture of both (Section 3.2).
We now approach the latter idea in a more sophisticated way. We want to use the maximum available
information from the delivered data set in both the structural and the historical dimension, i.e.,
we want to not only consider neighbouring, but rather all structure grid points, and we do not
want to restrict the reference period to the previous day only, but extend it farther into the past.
According to statistical theory, the best tool24 to achieve these means is the conditional
expectation operator.25
Before we apply this concept to our problem, let us fix some notation.
4.1
In the risk control practice of many banks, it is customary to start with the RiskMetrics model of
financial returns (cf. [13], Section 4.6, equation 4.54). We adapt this model to our situation. Let
f (t, T ) be the (instantaneous) forward rate (process) for maturity term T, observed at calendar
date t. For a given maturity grid T1, …, Tm, we write fi(t) = f (t, Ti). We assume joint
geometrical Brownian motion without drift for f1, …, fm,

    d(log fi) = σi dwi,

where w1, …, wm are standard Wiener processes with

    dwi(t) ~ N(0, dt),
    Cov[ dwi(t), dwj(u) ] = ρij dt  if u = t,   and   = 0  if u ≠ t.
As usual, the covariance of the (log) forward rates then is σij dt := Cov[ d log fi(t),
d log fj(t) ] = σi σj ρij dt; notice that in particular, σii = σi^2.
Let us clarify the somewhat sloppy formulation "without drift" employed above: In
accordance with RiskMetrics, we more precisely assume that the drift of the
arithmetical Brownian motion (log fi) is zero. Then Itô's Lemma (cf. [4], Section 3.4.2,
or [10], Section 10.6) implies dfi / fi = (σi^2 / 2) dt + σi dwi, and we obtain a non-zero drift
σi^2 / 2 for the geometrical Brownian motion fi itself.
In view of option pricing applications, there is an additional drift aspect: in the Libor
market model (cf. [2] and [12]), or more generally in the Heath-Jarrow-Morton
framework (cf. [5] and [9]), the assumption of zero drift for one particular forward rate
24 The term best is used in the sense of sufficiency. See [15] for details.
25 For a short introduction to conditional expectation, see my homework essay [19] for the current M.Sc. course.
imposes the risk-neutral martingale measure for this particular rate on all other
forward rates, which results in non-zero drifts for the other rates (see also [11]). Notice
however that we work in the so-called real world measure, and not in any risk-neutral
one. So let us emphasize that the assumption of zero drift everywhere is perfectly good
for data completion purposes. For an empirical justification, see [13], Section 5.3.1.1.
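The drift remark above can be verified with a quick Monte Carlo experiment. The following Python sketch (illustration only, not part of the thesis code; all parameter values are arbitrary) simulates driftless log-rates and checks that the mean of f itself grows like exp(σ^2 t / 2):

```python
import numpy as np

# Monte Carlo check of the drift remark above: if d(log f) = sigma dw is
# driftless, then f itself has drift sigma^2 / 2, i.e.
# E[f(t)] = f(0) * exp(sigma^2 * t / 2).
rng = np.random.default_rng(0)
sigma, t, f0, n_paths = 0.2, 1.0, 1.0, 500_000

# log f(t) = log f(0) + sigma * W(t), with W(t) ~ N(0, t)
log_f_t = np.log(f0) + sigma * np.sqrt(t) * rng.standard_normal(n_paths)
mc_mean = np.exp(log_f_t).mean()
exact_mean = f0 * np.exp(sigma ** 2 * t / 2)

assert abs(mc_mean - exact_mean) < 5e-3
```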
We allow correlation ρij between grid points i, j ∈ {1, …, m}, but we do not allow
autocorrelation.26 This seems in contrast to the RiskMetrics model, which explicitly
assumes autocorrelation in [13], Section 4.6. But RiskMetrics is not stubborn here,
since it also provides evidence that returns are not autocorrelated ([13],
Section 4.3.2.3), although not statistically independent, i.e. the underlying process is
non-Markov. However, for the normal distribution, which is assumed in [13],
equation 4.54 as well, it is well-known that zero correlation is equivalent to
independence.27 Later on, for the maximum likelihood estimation of the conditional
expectation (cf. [13], Section 8.2.2), they work under the assumption of statistical
independence between time periods. We will do just that, and drop autocorrelation for
the moment.28

4.2 Covariance estimation
For any particular realisation ω of the market,30 and for any grid point i = 1, …, m, we
denote by fi^ω the path followed by the forward rate fi for the maturity term Ti. Given the series
of realised values

    fi^ω(t), fi^ω(t − Δt), fi^ω(t − 2Δt), …, fi^ω(t − dΔt)

of this path at historical calendar dates t, t − Δt, t − 2Δt, …, t − dΔt (for some historical depth d,
and equidistant time steps Δt), we first construct the series of its log-changes

    ri(s) := Δ(log fi^ω)(s) := log fi^ω(s) − log fi^ω(s − Δt),    s = t, t − Δt, …, t − (d−1)Δt.

For i, j = 1, …, m, we then estimate the covariance parameter σij of 4.1 by the exponentially
weighted moving average (EWMA) of the products of log-change series i and j,

    σij(t) := (1/Δt) · (1 − λ)/(1 − λ^d) · Σ_{k=0}^{d-1} λ^k ri(t − kΔt) rj(t − kΔt)

for some decay factor λ ∈ (0, 1], where in the limit case λ = 1 of the equally weighted moving
average, the singular term (1 − λ)/(1 − λ^d) is replaced by its limit value 1/d. Our formula
corresponds to the formulae given by RiskMetrics in [13], Section 5.2.1, Table 5.1. We have,
26 N.B. Autocorrelation is implicitly already excluded by our notation. Recall that all Wiener processes are Markov, and thus changes over time are independent.
27 To be fair, one has to add that leptokurtotic distributions are indeed discussed in [13], Section 4.5, and the choice of the normal distribution is only made for the sake of simplicity and analytical tractability.
28 We will extend our model in Section 5.3 further below, such that these seemingly inconsistent statements will be reconciled.
29 We will relax this restriction later on, in Section 5.3. Cf. Footnote 28.
30 In the language of random variables. For a short introduction to random variables, see my homework essay [18] for the current M.Sc. course.
however, corrected the weight parameters (cf. ibid., equations 5.1 and 5.2, and notice that
Σ_{k=0}^{d-1} λ^k = (1 − λ^d)/(1 − λ)).
Then σij(t) is an unbiased estimator for the covariance σij. This necessarily implies bias for
volatility and correlation, since both are non-linear functions of the covariance (due to Jensen's
inequality for the expectation operator, cf. [8], Section 5.6, Exercise 15).
We still have to choose the decay factor λ. Under our assumption of constant process
parameters σij, the choice λ = 1 yields the best linear unbiased estimator (BLUE) and the
maximum likelihood estimator (ML) at the same time (but notice that even then, the
estimation σij(t) will depend on calendar time t). However, our framework also works for
mildly time varying distributional parameters σij. If we relax our assumptions to this more
general case, it may be more favourable to choose a value λ < 1, where recent data carry a
higher weight than past ones. Empirical studies by RiskMetrics evince that for the case Δt = 1
(trading) day, the value

    λ = 0.94

yields optimal results (cf. [13], Section 5.3.2.2, and Table 5.9). Since we are indeed concerned
with daily market data (cf. Section 1.2), we will henceforth use this value.31
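For illustration, the estimator of this section can be sketched in a few lines of Python (the function name ewma_cov and the toy data are mine; the thesis's actual implementation is the Formula Engine routine covarmatrix.fe of Appendix A.4):

```python
import numpy as np

def ewma_cov(r, lam=0.94, dt=1.0):
    """EWMA covariance estimate as in Section 4.2 (a sketch).

    r : (d, m) array of log-changes, row k holding r_i(t - k*dt)
        (row 0 is the most recent observation).
    Returns the m x m matrix (sigma_ij(t)).
    """
    d = r.shape[0]
    # bias-corrected weights (1 - lam) lam^k / (1 - lam^d); for lam = 1
    # the singular prefactor is replaced by the equal weight 1/d
    if lam == 1.0:
        w = np.full(d, 1.0 / d)
    else:
        w = (1 - lam) * lam ** np.arange(d) / (1 - lam ** d)
    return (r.T * w) @ r / dt

# toy usage with simulated i.i.d. normal log-changes
rng = np.random.default_rng(1)
r = rng.standard_normal((250, 3)) * 0.01
cov = ewma_cov(r)               # decayed estimate, lam = 0.94
cov_eq = ewma_cov(r, lam=1.0)   # equally weighted limit case
```

Note that the weights sum to one for every λ, which is exactly the bias correction discussed above.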
4.3

Partition a zero-mean, normally distributed random vector r^T = ( X | Y ) into an observed
part X and a missing part Y, and partition its covariance matrix accordingly,32

    Σ = ( V   U  )
        ( U^T W )  ∈ R^{m×m},

where V is the covariance matrix of X, W that of Y, and U the cross-covariance block. Then
the conditional expectation of Y, given the observation X = x, is

    E[ Y | X = x ] = U^T V^-1 x.
4.4 Estimation algorithm
Fix the notation of 4.2, and set Δt = 1, where (calendar) time is measured in (trading) days.
Suppose that we have a complete historically realised data matrix

    ( fi(t − k) : i = 1, …, m; k = 1, …, d ),
with zero mean, i.e. null unconditional expectation vector, and with the estimated covariance matrix

    Σ := ( σij(t − 1) : i, j = 1, …, m ).

31 See Section 5.3 for the theoretical framework motivated by these empirical results. See also Footnote 29.
32 As for notation, we use column vectors. Transposed vectors and matrices are denoted by the capital letter T in the exponent.
Now consider an incompletely delivered data vector ( f1(t), …, fm(t) ) for today's forward
curve. By applying a permutation on {1, …, m} if necessary, we may assume w.l.o.g. that
f1(t), …, fa(t) are known values, and that fa+1(t), …, fm(t) are missing. In the notation of
Section 4.3, this can be written as X := ( f1(t), …, fa(t) ) with realised values x, and
Y := ( fa+1(t), …, fm(t) ); the missing values can then be estimated by

    ŷ := E [ Y | X = x ].
4.5 Code implementation
The main work, namely the calculation of the conditional expectation, is performed by the
routine conditionaldistribution.fe, which is listed in Appendix A.5. (It actually
calculates not only the conditional expectation, but the conditional covariance matrix as well;
this is only for the sake of completeness of the code, and it is not used here.34) It assumes that
all parameters are already ordered such that delivered values come first, followed by the
missing values. If this is not the case, the calling routine must apply a suitable permutation.
Such a calling routine is included in one particular case (switch (method) case "CE")
of the general wrapper function complete.fe (cf. Appendix A.7), which actually starts by
estimating the (unconditional historical) distribution parameters with the help of the subroutine
covarmatrix.fe (listed in Appendix A.4).
34 But it may become useful for the study of the questions listed in the outlook Chapter 9.
Chapter 5
Expectation Maximisation

5.1
In [13], Section 8.2 (and not only there), the expectation maximisation algorithm is presented
as the state-of-the-art data completion algorithm. So we must also include it in our analysis
here. Unfortunately, the exposition in [13] is rather difficult to read; let us instead refer to the
more general, but also much more clearly written original paper [3] of Dempster et al.:

    This paper presents a general approach to iterative computation of maximum
    likelihood estimates when the observations can be viewed as incomplete data.
    […] Each iteration of the EM algorithm involves two steps which we call the
    expectation step (E-step) and the maximization step (M-step). […] We now
    present a simple characterization of the EM algorithm which can usually be
    applied when […]35 holds. Suppose that φ(p) denotes the current value of φ
    after p cycles of the algorithm. The next cycle can be described in two steps, as
    follows:

    E-step: Estimate the [missing values] by finding y(p) = E [ y | x, φ(p) ].
    M-step: Determine φ(p+1) as the […] maximum likelihood36 estimator of φ, which
    depends on (x, y) only through y.
What now, one may ask, is the difference from the conditional expectation described in Chapter 4?
Certainly not the E-step, which is just another application of Section 4.3. The tiny difference
lies with the M-step, where the estimator for φ is now given as

    φ(p+1) = ( σij(t) : i, j = 1, …, m ).

Notice that here we use the covariance estimator at time t, rather than at time t−1 as in
Section 4.4. This means that the completed (in the E-step) data vector for today's (originally
incomplete) data delivery is included in the estimation (in the M-step) of the distributional
parameters, which in turn govern the estimation of the completed data vector (in the next
E-step). This can obviously only be achieved by an iterative approach.
5.2
Philosophically, one may argue about which makes more sense: the inclusion or the exclusion
of missing (respectively, to-be-completed) values in the estimation of the distribution
parameters which govern the completion. On a numerical level however, in anticipation of our
empirical study in Chapter 7, the results are virtually indistinguishable. It is in fact very difficult
35 The condition mentioned here is that we work in the framework of a regular exponential family with parameter vector φ, complete data vector x, sufficient statistics t(x), and observed data y. This condition is indeed satisfied here: We work in the framework of a normal distribution family (instead of the more general exponential family) with covariance matrix Σ (instead of the more general parameter φ), complete data vector (x, y) (instead of x in the original text), missing data y = projection(x, y) (instead of the more general form t(x) in the original text), and observed data x (instead of y in the original text). To avoid confusion, I have adjusted the notation in the quotation accordingly.
36 N.B. maximum likelihood for y = y(p) to happen
to construct a pathological example where the EM algorithm does not converge after a single
cycle (and thus degenerates into a simple conditional expectation estimation as in Chapter 4).
Finally, with an eye on implementation, it becomes obvious that conditional expectation is only
one part of expectation maximisation, namely the first E-step; one saves the coding effort37 for
the M-step, the iterative loop, and the stop criterion.
This sounds as if EM would not make much sense. But remember that we have a very special
case of data here, namely time series data. In Chapter 4, we exploited this calendar structure in
order to construct an algorithm which is simpler, but equally useful (namely single-step
conditional expectation). However, the general EM algorithm works also for chronologically
unordered data sets, and so it is not just more complex, but in general also more powerful.
5.3
After what has been said in the preceding Section 5.2 about the virtual equality of results of the
EM algorithm and of simple conditional expectation, it seems unnecessary to discuss theoretical
convergence properties of the EM algorithm. Nevertheless, some minor theoretical
inconsistencies seem to have leaked into the preceding exposition. By this, I do not mean actual
logical contradictions, but rather some at first sight unnatural choices in our process model.
This section is devoted to clarifying these issues, and to showing that the choices made are, at
second glance, indeed practicable.
First let me point out what has been impure in our process model so far. In order to be able to
prove theoretical convergence properties for the EM algorithm, maximum likelihood (ML)
estimation of the process parameters σij is required (cf. Section 5.1 and [3]). Under our process
assumption of independent process innovations Δ(log fi)(t), the ML estimator is given by λ = 1,
as has been pointed out in Section 4.2. But we have chosen λ < 1; more precisely, the industry
standard λ = 0.94 as introduced by RiskMetrics. So how does this fit together?
Let us rewrite the covariance estimator of Section 4.2, in order to embark on the justification of
our Ansatz. As usual, let us assume Δt = 1 (trading day), and consider for simplicity only the
case i = j, i.e. only the (univariate) variance estimator for some yield curve grid point
i ∈ {1, …, m}:

    σii(t) = (1 − λ)/(1 − λ^d) · Σ_{k=0}^{d-1} λ^k ri(t − k)^2

           = (1 − λ)/(1 − λ^d) · ( ri(t)^2 + Σ_{k=0}^{d-1} λ^{k+1} ri(t − (k+1))^2 − λ^d ri(t − d)^2 )

           = (1 − λ)/(1 − λ^d) · ri(t)^2 + λ σii(t − 1) − (1 − λ) λ^d/(1 − λ^d) · ri(t − d)^2.
If we compare this with Section 31.1.2 of [4], we find that the above equation describes a
GARCH(p, q) process with volatility forecast

    σ(t + 1)^2 = a0 + Σ_{h=1}^{p} b_h σ(t + 1 − h)^2 + Σ_{k=1}^{q} a_k ri(t + 1 − k)^2,

where

    σ(t + 1)^2 = σii(t),
    p = 1,
    q = d + 1,
    a0 = 0,
    a1 = (1 − λ)/(1 − λ^d),
    a_{d+1} = −(1 − λ) λ^d/(1 − λ^d),
    b1 = λ,

and all other coefficients vanish. Notice that we have to shift the volatility forecast by 1 day in
order to make our approach compatible with the GARCH model. This introduces the mildly
time varying distributional parameters, as quoted from 4.2.

37 And also, for productive use in a bank's internal IT environment, the extensive software testing effort.
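The recursive form of the EWMA variance estimator derived above can be confirmed numerically; the following Python fragment (illustration only, arbitrary toy data) compares it with the direct weighted sum:

```python
import numpy as np

# Numerical check of the recursive EWMA variance update derived above:
#   sigma_ii(t) = c*r(t)^2 + lam*sigma_ii(t-1) - c*lam^d*r(t-d)^2,
# with c = (1 - lam)/(1 - lam^d), against the direct weighted sum.
rng = np.random.default_rng(2)
lam, d = 0.94, 50
r = rng.standard_normal(d + 1) * 0.01     # r[0] = r(t), ..., r[d] = r(t-d)
c = (1 - lam) / (1 - lam ** d)

def ewma_var(window):                     # direct sum over d log-changes
    return c * np.sum(lam ** np.arange(d) * window[:d] ** 2)

direct = ewma_var(r[:d])                  # sigma_ii(t)   from r(t), ..., r(t-d+1)
previous = ewma_var(r[1:])                # sigma_ii(t-1) from r(t-1), ..., r(t-d)
recursive = c * r[0] ** 2 + lam * previous - c * lam ** d * r[d] ** 2
assert np.isclose(direct, recursive)
```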
Notice that a GARCH process cannot be a Markov process, since the process innovations no
longer are independent. But also notice that we have nowhere actually used the assumption of
independent process innovations, neither for the conditional expectation method, nor for the
EM method. In both cases, two distinct steps have to be performed: firstly, the estimation of the
process parameters with an estimation formula that may (λ < 1) or may not (λ = 1) implicitly
assume some form of statistical dependence (as outlined above); secondly, the calculation of the
conditional expectation for one grid point of the yield curve, given the already determined
process parameters, and depending on the other yield curve grid values for the same trading
day. In other words, we are not conditioning upon past values in this second step. This may
sound a little complicated, but it merely boils down to proper accounting, or in our case: a clear
distinction between the maturity dimension of the yield curve (along the curve structure), and
its historical dimension (along the time series).
How do we now obtain a maximum likelihood estimator? In our special version of the GARCH
model, the underlying process parameter is no longer the covariance σij, but the decay factor λ.
Now if we are given the ML estimator for λ, then Theorem 5.1.1 of [22] shows that the ML
estimator for σij is given by the formula in Section 4.2. However, since the (unconditional)
distribution for the GARCH process is not known analytically (in the sense that we can write
down an analytical Lebesgue density function, cf. [18]), the ML value must be determined
numerically. This is just what RiskMetrics has done for many financial time series, and it found
that λ is always reasonably close to 0.94. So for all our practical purposes, we can rest assured
that the formula in Section 4.2 with λ = 0.94 is a very close approximation of the ML estimator
for the covariance σij in our GARCH model.
This refines the theoretical foundation for our heuristic study. It should also illuminate the not
very lucid exposition of the EWMA and the EM algorithm in [13].
5.4 Completion algorithm
Let us now continue with the practical aspects of our study, and present the adaptation of the
EM algorithm to our situation.
For this, we use a similar framework as described in Section 4.4. In particular, we start the EM
algorithm with the initial parameter estimation (which has been left unspecified in Section 5.1)

    φ(0) := ( σij(t − 1) : i, j = 1, …, m ).

We then alternately calculate y(p) and φ(p) as described in Section 5.1, until either
|| y(p) − y(p−1) ||^2 < ε (numerical stop criterion), or p = 10 (maximum number of iterations).
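The resulting iteration can be sketched in Python as follows (function names and the boolean-mask interface are mine; the productive implementation is the "EM" case of complete.fe, cf. Section 5.5):

```python
import numpy as np

def em_complete(history, x, known, lam=0.94, max_iter=10, eps=1e-10):
    """EM completion in the spirit of Sections 5.1 and 5.4 (a sketch).

    history : (d, m) matrix of past log-changes (most recent first),
    x       : delivered values of today's log-change vector,
    known   : boolean mask of length m, True where a value was delivered.
    """
    known = np.asarray(known)
    m = len(known)

    def ewma_cov(r):                      # estimator of Section 4.2
        d = r.shape[0]
        w = (1 - lam) * lam ** np.arange(d) / (1 - lam ** d)
        return (r.T * w) @ r

    def e_step(cov):                      # conditional expectation, Section 4.3
        V = cov[np.ix_(known, known)]
        U = cov[np.ix_(known, ~known)]
        return U.T @ np.linalg.solve(V, x)

    y = e_step(ewma_cov(history))         # phi^(0): yesterday's history only
    for _ in range(max_iter):
        today = np.empty(m)
        today[known], today[~known] = x, y
        # M-step: re-estimate including today's completed vector
        cov = ewma_cov(np.vstack([today, history[:-1]]))
        y_new = e_step(cov)               # next E-step
        if np.sum((y_new - y) ** 2) < eps:
            return y_new
        y = y_new
    return y
```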
5.5 Code implementation
This is very similar to the coding of the conditional expectation (cf. 4.5). The difference is the
additional iteration loop in the expectation maximisation case (switch (method) case
"EM") of the general wrapper function complete.fe (Appendix A.7). The stop criterion is
defined by the threshold ε = 10^-10 = 0.01 (bp)^2, which suffices for our purposes.
Chapter 6
Principal component analysis

6.1
Principal component analysis is a general technique. It is often adapted to the special task of
pricing exotic interest rate options (cf. Chapter 3 of [16]). In most applications, it allows the
financial analyst to think of the interest rate curve in more abstract, geometric terms such as
curve level, slope, or curvature, rather than the concrete maturity buckets represented by the
curve grid points. Let us look at this method in more detail.
We start with the forward yield curve process f defined in 4.1. Its covariance matrix Σ := ( σij :
i, j = 1, …, m ) is positive semi-definite. In particular, it is symmetric, so there exists an
orthogonal matrix Q ∈ R^{m×m} such that

    Q^T Σ Q =: D =: diag( λ1, …, λm )

is a diagonal matrix. The numbers λ1, …, λm ≥ 0 on the diagonal of D are the (non-negative38)
eigenvalues of Σ, and the columns q1, …, qm of Q are the associated eigenvectors, which we
call the principal axes (of the forward yield curve).
Consider the log-change history ( ri(s) : s = t, t−1, …, t−d ) of a realisation of f on some
maturity grid point i ∈ {1, …, m}, as given in the discrete setting of 4.2, with integer-valued
time step Δt = 1 (trading day), and integer-valued time parameter s. We write r(s) := ( r1(s), …,
rm(s) )^T in (column) vector notation with respect to the canonical basis39 (corresponding to the
maturity grid points) of the state space R^m.
Instead of this canonical representation, we now want to study this vector-valued time series
with respect to the principal axes. For this, we have to perform a rotation40 of the coordinate
system from the canonical axes to the principal axes, described by the variable transformation

    R^m → R^m,   r ↦ z := Q^T r.

We write z = ( z1, …, zm )^T, and we call these new variables z1, …, zm the principal components
(of the yield curve). Their covariance matrix is D, and therefore they are uncorrelated, with
respective variances λ1, …, λm.
By permuting the eigenvectors (principal axes) q1, …, qm if necessary, we may w.l.o.g. assume
that the eigenvalues are ordered via λ1 ≥ … ≥ λm ≥ 0. It has been established by many empirical
studies (e.g., see Figure 3.6 of [16], or Abbildung 33.1 of [4]) that under this assumption, the
first principal component z1 can intuitively be interpreted as (changes in) the average level of
the yield curve, the second principal component z2 as (changes in) its average slope, and the
third principal component z3 as (changes in) its average curvature.41 They are colloquially
referred to as shift, twist, and hump (of the yield curve), respectively. They almost
comprehensively describe the dynamics of the yield curve, since their combined variance ( λ1 +
λ2 + λ3 ) makes up most of the total variance ( λ1 + … + λm ), typically around 98-99%
(cf. [4], Section 33.2).
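As an illustration of this section, the following Python sketch (with a synthetic, exponentially decaying correlation structure; all numbers are toy values, not empirical yield curve data) diagonalises a curve covariance matrix and computes the variance shares of the principal components:

```python
import numpy as np

# Eigendecomposition of a (synthetic) yield-curve covariance matrix Sigma,
# as in Section 6.1: Q^T Sigma Q = D = diag(lambda_1, ..., lambda_m).
m = 10
# toy exponential correlation between maturity buckets, equal volatilities
corr = np.exp(-0.3 * np.abs(np.subtract.outer(np.arange(m), np.arange(m))))
sigma = np.full(m, 0.01)
Sigma = corr * np.outer(sigma, sigma)

eigval, Q = np.linalg.eigh(Sigma)        # eigh returns ascending eigenvalues
order = np.argsort(eigval)[::-1]         # sort: lambda_1 >= ... >= lambda_m
eigval, Q = eigval[order], Q[:, order]

explained = eigval / eigval.sum()
# with a smooth correlation structure, the first few principal components
# (shift, twist, hump) carry most of the total variance
print(np.cumsum(explained)[:3])
```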
6.2
We now want to adapt principal component analysis to Risk Control by constructing a data
completion algorithm based on principal component analysis. The idea is to extrapolate
principal components zi instead of original rates ri (i = 1, …, m). Since the principal components
are uncorrelated by definition, each of them can be estimated independently of the others.
Estimation here simply means replacement of the variable with its expectation, which is zero in
our model.
There is one difficulty with this approach: Missing values in the rates ri do not correspond
bijectively to missing values in the principal components zi, since the transformation z = Q^T r
has smeared the holes of r all over z, colloquially speaking. So we must choose which of the
zi we want to estimate, i.e. in our case, set to zero. We want to make our choice in such a
manner that we leave the big principal components (shift, twist, hump, etc.) unchanged, and
set only the principal components with the smallest influence on curve movements to zero, i.e.
those zi corresponding to the smallest λi. This makes sense, since replacing a variable zi by its
expectation E[zi] means forcing its variance λi to be zero, so picking the smallest eigenvalues λi
keeps a maximum of the market move information provided by the data vendors.
Let us examine this approach in detail: As in Section 4.4, let us partition r^T = ( r1, …, ra | ra+1,
…, rm ) = ( x | y ) into the a known values and the b := m − a missing values. If we partition Q^T
accordingly, we can write

    z = Q^T r = ( R  H ) ( x )
                ( K  S ) ( y ).

Since x is known, this equation system gives us a conditions. We need b additional conditions
for the estimator ỹ of y. For this, we set the last b entries of z to zero, i.e. we estimate
z̃^T = ( z̄ | 0, …, 0 ), where z̄ = ( z1, …, za )^T is the appropriate truncation of z. This gives

    ( R  H ) ( x )   ( z̄ )
    ( K  S ) ( ỹ ) = ( 0 ),

and it follows from the lower part of this equation system that the estimation of y is given by

    ỹ = −S^-1 K x.
At this point, we do not know if S is invertible. However, since Q has full rank, it follows that
the m×b matrix formed by stacking H above S has rank b. By permuting its rows if necessary,
we may even assume that the last b rows are linearly independent, i.e. that S ∈ R^{b×b} is in
fact invertible. Notice however that we have just permuted the rows of Q^T, i.e. the columns
of Q, and so we can no longer uphold the assumption of Section 6.1 that the eigenvectors
q1, …, qm are ordered in such a way that the eigenvalues satisfy λ1 ≥ … ≥ λm ≥ 0. We must
relax this a little, and can only assume that among all permutations of q1, …, qm which result
in an invertible submatrix S, we have picked one such that λa+1 + … + λm is minimal (since
there are exactly m! permutations on {1, …, m}, in particular only finitely many, the minimum
exists and is actually attained).
6.3 Completion algorithm
6.4 Code implementation
6.5

Let us collect the two estimation formulas derived so far,

    ỹ = −S^-1 K x   and   ŷ = U^T V^-1 x ( = E [ Y | X = x ] ),

in the notation

    r = ( x | y )^T,   Σ = ( V  U ; U^T  W ),   Q^T = ( R  H ; K  S ),   Q^T Σ Q = D = diag( λ1, …, λm ).

Formally, the two estimation formulas for y look very similar. We want to examine the
circumstances under which they coincide. In other words: when is ỹ = ŷ?
Let us put ourselves in the position of an interest rate option trader, and let us assume that the
dynamics of the yield curve only depend on the first two or three, at most a < m, principal
components. Then λa+1 = … = λm = 0, and therefore the relationship Q^T Σ = D Q^T can be
rewritten in block matrix form as
    ( R  H ) ( V    U )   ( D'  0 ) ( R  H )
    ( K  S ) ( U^T  W ) = ( 0   0 ) ( K  S ),

where D' = diag( λ1, …, λa ) is the appropriately truncated submatrix of D. The lower left hand
corner of this matrix equation yields K V + S U^T = 0 R + 0 K = 0. But then S^-1 K + U^T V^-1 = 0,
and so ỹ = ŷ.
Therefore, unsurprisingly, under the usual practical assumptions and approximations, data
completion via principal component analysis yields the same results as taking direct conditional
expectations. Principal component analysis is however more difficult to implement (cf. 6.4).
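This equivalence can also be confirmed numerically. The following Python fragment (toy data, illustration only) builds a covariance matrix of exact rank a and checks that both estimation formulas agree:

```python
import numpy as np

# Numerical confirmation of Section 6.5: if lambda_{a+1} = ... = lambda_m = 0
# (covariance of exact rank a), PCA completion and conditional expectation
# give the same estimate for the missing values.
rng = np.random.default_rng(4)
m, a = 6, 4
A = rng.standard_normal((m, a))
Sigma = A @ A.T                        # rank a by construction

x = rng.standard_normal(a)             # delivered values (first a grid points)

# conditional expectation: y_hat = U^T V^{-1} x
V, U = Sigma[:a, :a], Sigma[:a, a:]
y_hat = U.T @ np.linalg.solve(V, x)

# PCA completion: y_tilde = -S^{-1} K x
eigval, Q = np.linalg.eigh(Sigma)
QT = Q[:, np.argsort(eigval)[::-1]].T  # descending eigenvalue order
K, S = QT[a:, :a], QT[a:, a:]
y_tilde = -np.linalg.solve(S, K @ x)

assert np.allclose(y_hat, y_tilde)
```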
Chapter 7
Method comparison
We now want to compare the prediction quality of the data completion methods described in the
preceding chapters.
7.1
Let us pick two trading days, 10/09/2001 and 20/09/2001,42 for the US dollar swap43 curve with
terms 1Y to 10Y. We simulate an incomplete data delivery by deleting the 8Y to 10Y rates. These
gaps are then filled again by our data completion methods. The results are as follows:
Term         1Y    2Y    3Y    4Y    5Y    6Y    7Y    8Y    9Y    10Y
Bloomberg  USSA1 USSW2 USSW3 USSW4 USSW5 USSW6 USSW7 USSW8 USSW9 USSW10

10/09/2001  3.460 4.027 4.503 4.831 5.067 5.241 5.372 5.475 5.556 5.635
simulation  3.460 4.027 4.503 4.831 5.067 5.241 5.372   n/a   n/a   n/a
20/09/2001  2.710 3.484 4.032 4.397 4.661 4.880 5.081 5.224 5.342 5.453
simulation  2.710 3.484 4.032 4.397 4.661 4.880 5.081   n/a   n/a   n/a

Completed values for the 8Y, 9Y, and 10Y rates, with errors in bp:

STRUCT (structural interpolation)
  10/09/2001: 5.372 (-10.3)   5.372 (-18.4)   5.372 (-26.3)
  20/09/2001: 5.081 (-14.3)   5.081 (-26.1)   5.081 (-37.2)

EXTRA (absolute value extrapolation)
  10/09/2001: 5.430 (-4.5)    5.504 (-5.2)    5.579 (-5.6)
  20/09/2001: 5.222 (-0.2)    5.339 (-0.3)    5.450 (-0.3)

RELATIVE (relative value extrapolation)
  10/09/2001: 5.469 (-0.6)    5.543 (-1.3)    5.619 (-1.6)
  20/09/2001: 5.205 (-1.9)    5.321 (-2.1)    5.432 (-2.1)

CE (conditional expectation)
  10/09/2001: 5.470 (-0.5)    5.547 (-0.9)    5.623 (-1.2)
  20/09/2001: 5.223 (-0.1)    5.347 (+0.5)    5.462 (+0.9)

EM (expectation maximisation)
  10/09/2001: 5.470 (-0.5)    5.547 (-0.9)    5.623 (-1.2)
  20/09/2001: 5.223 (-0.1)    5.347 (+0.5)    5.462 (+0.9)

PCA (principal component analysis)
  10/09/2001: 5.470 (-0.5)    5.547 (-0.9)    5.622 (-1.3)
  20/09/2001: 5.214 (-1.0)    5.344 (+0.2)    5.463 (+1.0)

42 Notice how the trading days have been picked before and after September 11, 2001.
43 In Chapters 2 to 6, we have considered forward curves, but here we examine the swap curve. However, this does not make much difference for the sake of this illustrative example. Later on, we will compare the impact of all data completion methods on several yield curves in both forward and swap quotation. See also Footnote 45.
It is perhaps somewhat surprising that September 11 seems not to have had any impact on the
prediction quality of any of the methods, but apart from that, the estimation comparison clearly
confirms the conjectures of our theoretical discussions in Chapters 2 to 6: Structural
interpolation (respectively extrapolation) produces unacceptably large errors. Previous-day
extrapolation performs much better, but we cannot decide between the absolute and the
relative alternative. Conditional expectation shows the best results. Expectation maximisation
is identical to conditional expectation. The results of the PCA method are also quite good here
(in spite of the implementation problems mentioned in 6.4).
Notice that these are only exemplary results for two trading days in a single currency
with missing values at the long end of the yield curve. In order to make a more profound
statement about the quality of the methods, we must examine a larger data set in the following
sections.
7.2
In addition to US$ (cf. Section 7.1), I have also downloaded from Bloomberg daily swap rate
time series for Deutsche Mark (tickers DMSW1, DMSW2, …, DMSW10)44 and Pound Sterling (tickers
BPSW1, BPSW2, …, BPSW10). The terms are 1Y, 2Y, …, 10Y, and the histories are about five
years long.
Time series deliveries with missing values have then been simulated for each currency in the
following manner:

- Loop through the (ten) grid points for 1Y, …, 10Y, and delete each value with
  probability 1/10 (i.e., on average, one value is deleted per curve).
- Pass the incomplete curve to our six competing data completion methods.
- Collect from each method the completed data vector and compare it with the
  original (true) values. Store the estimation error for each completed grid
  point (rounded to full basis points, with positive/negative sign).
- Repeat the above ten times for each day, with different values missing each time.
- Collect all estimation errors for all our six completion methods.
Notice how this algorithm does not measure errors as absolute (or squared) deviations from
the true value, but as a simple difference (estimated value − true value) with positive or negative
sign. Therefore, we can distinguish between over- and underestimation, and analyse the
empirical error distribution.
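The simulation procedure can be sketched in Python as follows (the function name, the array interface, and the stand-in completion method used for testing are mine; the actual experiment was run as a UNIX shell / Formula Engine hybrid, cf. Footnote 46):

```python
import numpy as np

def simulate_errors(curves, complete_fn, n_repeats=10, p_missing=0.1, seed=0):
    """Simulation experiment of Section 7.2 (a sketch).

    curves      : (n_days, m) array of true curve values, quoted in percent,
    complete_fn : callable (delivered_values, known_mask) -> completed m-vector.
    Returns the list of signed errors (estimated - true), rounded to full
    basis points as in the text.
    """
    rng = np.random.default_rng(seed)
    errors = []
    for curve in curves:
        for _ in range(n_repeats):
            known = rng.random(len(curve)) >= p_missing  # delete with prob. 1/10
            if known.all():
                continue                                 # nothing to complete
            filled = complete_fn(curve[known], known)
            diff_bp = (filled[~known] - curve[~known]) * 100.0  # percent -> bp
            errors.extend(np.rint(diff_bp).astype(int))
    return errors
```

Any of the six completion methods of the preceding chapters can be plugged in as complete_fn.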
7.3 Experimental set-up
All curves have undergone the above procedure in both swap45 and forward rate notation (where
the swap rates have then been converted to forward rates with 1Y-tenors by the Formula Engine
routine forward.fe listed in Appendix A.1). Additionally, the above experiment has been
performed in a modified way where the completion has been performed on the forward rates,
but the estimation error has been measured in par (swap) rate notation.
44 The DEM time series of course reflect EUR swap rates since 1998, but the tickers DMSWx have been continued by Bloomberg. They have been chosen for the experiment instead of their Euro equivalents because of their longer histories.
45 As has been pointed out before in Footnote 43, our data completion methods have been defined on forward curves in Chapters 2 to 6. It is however easy to see that they all carry over to par curves (swap curves) as well, with one exception, namely structural interpolation (Chapter 2). Bearing this in mind, it is nevertheless possible to numerically apply this method also to swap curves, and measure the resulting errors.
So altogether six methods, three currencies, and three quotation variants have been examined:

- Completion of forward rates and error measurement in forward rates: see
  Sections 7.4.1, 7.5.1, 7.6.1.
- Completion of forward rates and error measurement in par rates: see
  Sections 7.4.2, 7.5.2, 7.6.2.
- Completion of par rates and error measurement (experiment evaluation) in par rates: see
  Sections 7.4.3, 7.5.3, 7.6.3.
The above experiments have been performed automatically by hybrid UNIX shell script and
Formula Engine routines.46
7.4
7.4.1
The experiment has simulated 10293 deliveries of randomly missing data, which have been
completed by our methods. If the estimation errors are measured with positive/negative signs
(over-/underestimation), the following statistical quality parameters are obtained:
(in bp)   STRUCT    EXTRA   RELATIVE   CE / EM
AVG         21.2      0.0        0.1       0.0
STDEV       24.8      9.6       14.8       9.0
MAX        257.0    136.0      284.0     115.0
MIN       -224.0   -117.0     -220.0    -117.0
Conditional expectation is the best method, although it does produce outliers. The EM
algorithm yields identical results.47 Surprisingly, simple previous-day extrapolation is not
46 This is a common technique for programming the Asset Control data base. From a mathematical point of view, not much insight can be gained from the script source codes, since they are mostly concerned with book-keeping: loading all subroutines, formatting the curve data, running the loops, passing the data to the mathematical routines, collecting the returned data, and storing the errors in log files. Therefore I have decided not to include the (lengthy) code listing in Appendix A.
47 Separate experiments have been performed for CE and for EM, but the results are always identical. So we will not list the results in separate columns. This remains equally true for the experiments described in the following Sections 7.4.2 ff. See also our theoretical discussion in Section 5.2.
7.4.2
The next table shows the results of a similar experiment, in which DEM forward curves have been completed (a simulation with 10381 randomly missing data points). This time, however, the completed curves have been transformed back into par curves before the deviations from the true values were determined:
(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG          12.4      0.0         0.1        0.0
STDEV        20.2      2.3         3.3        2.2
MAX         212.0     18.0        56.0       38.0
MIN         -46.0    -21.0       -53.0      -30.0
The errors are much less pronounced in par notation than in forward notation (cf. the histograms in Appendix B.1.2).
This is easily explained: according to our simulation algorithm described in Section 7.2, on average one forward rate (out of ten rates per curve) is missing. Each swap rate of the completed and transformed curve therefore consists of roughly one estimated forward rate, but several originally delivered ones. This averaging effect of the transformation from forward curve to swap curve reduces the statistical error. However, we can now more easily compare completion of forward curves to completion of par curves in the following section.
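The averaging effect can be illustrated with a toy first-order calculation. This is a crude sketch: the par rate is approximated as a plain average of the forward rates covering its maturity interval, which ignores discounting:

```python
# toy flat forward curve (in percent); one of its ten rates carries
# a single 10 bp estimation error
forwards = [5.0] * 10
bumped = list(forwards)
bumped[4] += 0.10

# first-order proxy: the 10y par rate behaves like an average of the
# forward rates covering its maturity interval (discounting ignored)
def swap_proxy(fs):
    return sum(fs) / len(fs)

diff_bp = (swap_proxy(bumped) - swap_proxy(forwards)) * 100
# the 10 bp forward error is diluted to about 1 bp in the par rate
```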
7.4.3
In the preceding experiments (Sections 7.4.1 and 7.4.2 above), we have completed the yield curves in forward notation, according to the definitions of our methods.
In theory, this also makes sense: structural interpolation for forward rates has been derived from the principle of log-linear interpolation on discount factors (Chapter 2), and it is not at all clear what the result will resemble if we apply exactly the same method to swap rates. In the case of previous-day extrapolation (cf. Chapter 3), one would expect forward rates, rather than swap rates, to have zero drift. For the completion routines CE, EM, and PCA (Chapters 4 to 6), which are all based on the covariance matrix, recall that the maturity intervals of swap rates overlap, so there is some necessary interdependence between swap rates stemming from no-arbitrage arguments. The covariance matrix of swap rates therefore reflects these no-arbitrage dependencies, plus additional statistical correlations. But the conversion to forward curves, where the maturity intervals do not overlap, removes these dependencies; the correlations are then of a purely statistical nature.
In practice, however, there is one additional problem which we have ignored so far. The rates have originally been delivered as swap rates. If the delivered curve is incomplete, we cannot convert it to forward rates, complete the curve, and then convert it back; we must complete the curve directly in swap-rate notation. This is what we do now, by applying our methods directly to swap curves (without adjustment). We will find out whether the results are worse, or still acceptable (10366 simulated missing data points):
(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG          12.5      0.0         0.0        0.0
STDEV        12.2      4.3         1.7        1.5
MAX         114.0     21.0        21.0       36.0
MIN         -37.0    -22.0       -21.0      -36.0
The results are very similar to those of completion on forward rates, evaluated on par rates, in Section 7.4.2, which is the natural benchmark for comparison. But recall that, contrary to Section 7.4.2, there is no averaging effect here which might reduce the statistical error. We can therefore also compare the results to Section 7.4.1, where both completion and measurement were performed on forward rates; we find that they are much better. See also the histograms in Appendix B.1.3.
It therefore seems sensible to apply the successful completion methods, previous-day extrapolation (including relative value extrapolation) and conditional expectation (including EM), directly to the delivered swap rates, and not to change the curve representation from par to forward notation.
We will now examine if these statements carry over to yield curves in other currencies.
7.5
7.5.1
The statistical experiment for the GBP forward curve completion has been performed on 9587 randomly missing data points. The results give a picture similar to that for DEM in Section 7.4.1:
(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG          -5.9      0.2         0.3        0.2
STDEV        28.7     14.3        22.4       14.3
MAX         266.0    217.0       484.0      288.0
MIN        -351.0   -359.0      -348.0     -342.0
One striking result is the large outlier of 484 bp produced by the relative value extrapolation. The histograms in Appendix B.2.1 do not add much to the general picture here.
7.5.2
The forward completion with evaluation in par notation has been performed on 9622 missing
values, with the following result:
(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG          -1.7      0.1         0.1        0.1
STDEV        21.7      3.1         4.8        3.1
MAX         145.0     24.0       102.0       55.0
MIN        -143.0    -33.0       -51.0      -29.0
The interpretation is similar to the DEM case in Section 7.4.2. This is also supported by the
histograms in Appendix B.2.2.
7.5.3
The experimental run for the par completion / par evaluation case of the GBP swap curve has
simulated 9777 missing values. The results are very close to the DEM case of 7.4.3. (The same
holds true for the GBP histograms in Appendix B.2.3.)
(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG          -2.4      0.2         0.0        0.0
STDEV        12.2      5.2         2.5        2.1
MAX          84.0     45.0        42.0       47.0
MIN        -103.0    -37.0       -34.0      -24.0
7.6
The experiments for USD confirm the general findings of the preceding Sections 7.4 and 7.5 for DEM and GBP. The results are therefore listed in the following without further comment. The histograms for the USD curves are displayed in Appendix B.3.
7.6.1

(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG          10.9      0.0         0.2        0.0
STDEV        26.2     15.5        22.9       11.7
MAX         384.0    180.0       391.0      189.0
MIN        -160.0   -386.0      -473.0     -208.0
7.6.2

(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG           8.2      0.1         0.1        0.0
STDEV        20.2      3.2         5.2        2.6
MAX         267.0     42.0       123.0       75.0
MIN         -47.0    -39.0       -82.0      -24.0
7.6.3

(in bp)    STRUCT    EXTRA    RELATIVE    CE / EM
AVG           7.3      0.2         0.0        0.0
STDEV        11.5      5.9         2.6        2.2
MAX         120.0     26.0        45.0       22.0
MIN         -37.0    -45.0       -43.0      -42.0
Chapter 8
48
Empirically exact up to basis points, in each single run of the experiment. One can however construct bizarre
synthetic examples where this is not so. These examples have nothing in common with real-world yield curves.
Chapter 9
In conjunction with the topic discussed here, several related questions arise. We list some of them:

- We have studied randomly missing data, which correspond to practical problems such as human failure, networking problems, and perhaps temporarily low liquidity. Another question is to study the impact of systematic errors on the performance of our data completion methods. Such errors include missing data for the same grid point several days in a row, missing data for whole intervals of the curve (short, middle, or long section), and inconsistent data sources for different parts of the curve (long/short end) or for different periods in the time series history. Such problems also occur in practice, and the reasons include permanently low liquidity for certain instruments, and changes of the data provider (Bloomberg, Reuters, ...) or data contributor (broker or exchange).

- It was mentioned in Section 1.5 that data validation and correction is closely related to data completion, but requires the additional provision of error bars and tolerance levels. It would be worthwhile to perform a similar empirical study for this case as well.49

- Extend the current empirical study to other types of market data (volatility matrices, sets of related stock prices or stock indices, FX rates, commodity prices, deposit rates, bond curves, credit spreads, etc.).

- Study the move from Markov processes to GARCH processes (Section 5.3) in more detail. In particular, notice that the volatility will then not increase with the square root of time, as it does for Brownian motion (cf. [4], p. 511 ff.). Nevertheless it may happen that, in the data flow chart presented in Section 1.1, the market data base publishes a daily covariance matrix estimated with the EWMA formula of Section 4.2 for the case of a time step of 1 (trading) day, and the Risk Engine scales the corresponding daily volatilities to annual volatilities with the factor √255. Perform an empirical evaluation on various kinds of market data to find out whether this conceptual inconsistency can be neglected in practice because the scaling errors are insignificant.

- Perform an empirical study for the algorithm suggested in [21] (also briefly mentioned here in Section 3.3), and compare it to the algorithms described here. Generalise it to the multivariate case, and check whether it can suitably be combined with our methods. Calculate its (average) conditional bias (it is unconditionally unbiased, cf. Section 3.3), and examine whether this can be rectified with a GARCH-type approach (cf. Section 5.3).

- Patch the eigenvalue calculation function of Asset Control with a C routine from [14].
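The volatility scaling questioned above (an EWMA daily estimate annualised with √255) can be sketched as follows. The numbers are synthetic; under a GARCH model, the square-root-of-time rule in the last line would only be approximate:

```python
import math

lam, trading_days = 0.94, 255
returns = [0.001, -0.002, 0.0015, -0.0005, 0.002]  # synthetic daily changes

# EWMA variance with zero a-priori drift (RiskMetrics convention, Section 4.2)
var = 0.0
for r in returns:
    var = lam * var + (1 - lam) * r * r

vol_daily = math.sqrt(var)
vol_annual = vol_daily * math.sqrt(trading_days)  # square-root-of-time scaling
```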
Bibliography
[1]
Asset Control International BV, Asset Control User Manual, Version 4.1, 27 Feb 2001.
[2]
A. Brace, D. Gatarek, M. Musiela, The market model of interest rate dynamics, Math.
Finance 7, (1997), 127-154.
[3]
A.P. Dempster, N.M. Laird, D.B. Rubin, Maximum Likelihood from Incomplete Data via
the EM Algorithm, J. Royal Statistical Society, Series B, 39 (No. 1), (1977), 1-22.
[4]
[5]
Jeffrey N. Dewynne, Heath, Jarrow, and Morton, lecture notes module 5, Diploma Math.
Finance, Department for Continuing Education, University of Oxford, 1999.
[6]
[7]
Front Arena, DY Yield Curves ATLAS, Document FCA 1031-3C, Version 3.2, December
2001.
[8]
G.R. Grimmett, D.R. Stirzaker, Probability and Random Processes, 2nd ed., Oxford
Science Publications, Oxford 1992.
[9]
D. Heath, R. Jarrow, and A. Morton, Bond pricing and the term structure of interest rates:
A new methodology for contingent claims valuation, Econometrica 60 (No. 1), (1992),
77-105.
[10] John C. Hull, Options, Futures, and Other Derivatives, 4th ed., Prentice-Hall
International, London 2000.
[11] Chris Hunter, Practical implementation of Libor Market Models, lecture notes module 9,
M.Sc. Math. Finance, Department for Continuing Education, University of Oxford, 2002.
[12] Farshid Jamshidian, Libor and swap market models and measures, Finance and
Stochastics 1, (1997), 293-330.
[13] J.P. Morgan, Reuters, RiskMetrics Technical Document, Fourth Edition,
New York 1996.
[14] William H. Press et al., Numerical Recipes in C: The Art of Scientific Computing, 2nd
edition, Cambridge University Press, Cambridge 1992.
[15] M.M. Rao, Conditional measures and applications, Marcel Dekker, New York 1993.
[16] Riccardo Rebonato, Interest-rate option models, 2nd ed., John Wiley & Sons,
Chichester 1998.
[17] Riccardo Rebonato, On the pricing implications of the joint lognormal assumption for the
swaption and cap markets, J. Comp. Fin. 2 (No. 3), (1999), 57-76.
[18] Richard Rossmanith, Short cut to probability-theoretic random variables, assignment
report # 7, M.Sc. Math. Finance, Kellogg College, University of Oxford, 15 April 2002.
[19] Richard Rossmanith, Shortcut to conditional expectations, assignment report # 9, M.Sc.
Math. Finance, Kellogg College, University of Oxford, 27 May 2002.
[20] Finanzmathematik, Seminar IFF, IIR Deutschland GmbH.
[21] Daniel Schoch, Outlier Correction in Financial Market Data, thesis, Diploma Math.
Finance, Department for Continuing Education, University of Oxford, 16 Aug 2001.
[22] Shelemyahu Zacks, The Theory of Statistical Inference, John Wiley & Sons,
New York 1971.
For remarks on the Asset Control Formula Engine macro language, see Section 1.6. Contact
me if you want electronic copies of the listed routines.
A.1 forward.fe
Notice that there is also a similar function forward2par.fe for the inverse operation (input
forward rates, output par rates), which is not listed here for the sake of brevity.
// function forward
//
// Syntax:       forward(p)
//
// Description:
//
// Output:
//
// Example:      x=[5,5,5,6,6,6,7,7,7];
//               y=forward(x);
//               [4.87, 4.87, 4.87, 9.03, 5.82, 5.82, 14.08, 6.76, 6.76]
//               (result has been rounded/shortened here.)
//
// Date          Author   Description
// 2001-10-17    RR
// 2001-10-21    RR       changed forwards from annual to instantaneous
function forward(p)
{
  local i;
  local t;
  local s;
  local b;
  local bb;
  local f;
  local r;
  local n;
  local c;
  local unit;
  unit=100;        // if units are percent, divide by 100, otherwise set 1 here
  n=len(p);        // total number of coupons
  s=0;             // init sum of discount factors
  bb=1;            // init previous discount factor
  f=[];            // init list of forward rates
  for(i=0;i<n;i++) // loop over par rates
  {
    c=p[i]/unit;
    // t=i+1;
    b=(1-s*c)/(1+c);
    r=log(bb)-log(b);
    f+=[r*unit];
    bb=b;
    s+=bb;
  }
  return(f);
}
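For readers without access to the Formula Engine, the same bootstrap (and its inverse, cf. forward2par.fe) can be sketched in Python. The function names are ours, and the sketch assumes annual coupons, as above:

```python
import math

def par_to_forward(par, unit=100.0):
    """Bootstrap continuously compounded forward rates from par swap
    rates with annual coupons (as in forward.fe)."""
    s, b_prev, fwd = 0.0, 1.0, []     # sum of discount factors, previous factor
    for p in par:
        c = p / unit
        b = (1.0 - s * c) / (1.0 + c)              # next discount factor
        fwd.append((math.log(b_prev) - math.log(b)) * unit)
        b_prev = b
        s += b
    return fwd

def par_from_forward(fwd, unit=100.0):
    """Inverse operation: rebuild the par rates from the forward rates."""
    s, b_prev, par = 0.0, 1.0, []
    for f in fwd:
        b = b_prev * math.exp(-f / unit)
        s += b
        par.append(unit * (1.0 - b) / s)           # par rate of the annual swap
        b_prev = b
    return par
```

A round trip par → forward → par reproduces the input curve up to floating-point noise, which matches the example in the header comment.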
A.2 backextrapolate.fe

// function backextrapolate
//
// Syntax:       backextrapolate(f)
//
// Description:
//
// Output:
//
// Example:      x=[NA, 20, NA, 40, 50, NA, NA, 80, NA, NA];
//               y=backextrapolate(x);
//               print(y);
//               [20, 20, 40, 40, 50, 80, 80, 80, 80, 80]
//
// Date          Author   Description
// 2001-10-21    RR       annual to instantaneous
function backextrapolate(f)
{
  local c;
  c=inverse(f);
  c=fill(c);
  c=inverse(c);
  return(c);
}
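A self-contained Python sketch of the same right-to-left fill, with None playing the role of NA, written to match the example output in the header comment:

```python
def backextrapolate(xs):
    """Fill gaps from the right: every missing entry (None) takes the next
    available value; trailing gaps take the last available value."""
    out = list(xs)
    carry = None
    for i in range(len(out) - 1, -1, -1):   # right-to-left pass
        if out[i] is None:
            out[i] = carry
        else:
            carry = out[i]
    last = None
    for i in range(len(out)):               # left-to-right pass for trailing gaps
        if out[i] is not None:
            last = out[i]
        elif last is not None:
            out[i] = last
    return out
```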
A.3 extrapolation.fe

// function extrapolation
//
// Syntax:       extrapolation(a,b)
//
// Description:
//
// Output:       completed list
//
// Example:      x=[2,4,6,$NA,10,12,$NA,16,18,20];
//               y=[1,3,5,7,9,11,13,15,17,19];
//               print(x);
//               print(y);
//               z=extrapolation(x,y);
//               print(z);
//               [2, 4, 6, NA, 10, 12, NA, 16, 18, 20]
//               [1, 3, 5, 7, 9, 11, 13, 15, 17, 19]
//               [2, 4, 6, 7, 10, 12, 13, 16, 18, 20]
//
// Date          Author   Description
function extrapolation(a,b)
{
local c;
local i;
local n;
n=len(b);
c=list(n);
for (i=0;i<n;i++)
if (is_na(a[i]))
c[i]=b[i];
else
c[i]=a[i];
return(c);
}
A.4 covarmatrix.fe

// function covarmatrix
//
// Syntax:       covarmatrix(r,horizon,mode)
//
// Description:  estimates the covariance matrix for a list of data series
//               which are given as lists themselves
//
//               r       - list of lists of historical time series
//               horizon - should be 1 (daily vola) or 250 (vola p.a.),
//                         but can be any number on principle
//               mode    - string with values "EQUAL", "EXPONENTIAL"
//
// Output:
//
// Example:      x=[[1,3,5,7,9,8,8,8,8,8,8,8,8,8],
//                  [2,4,6,6,6,7,8,9,10,11,12,13,14,15],
//                  [3,5,7,5,3,6,8,10,12,17,16,18,20,22]];
//               y=covarmatrix(x,1,"EQUAL");
//
// Date          Author   Description
function covarmatrix(r,horizon,mode)
{
  local decay;
  local n;
  local i;
  local j;
  local d;
  local c;
  local cov;
  decay=0.94;
  n=len(r);
  d=list(n);
  c=list(n);
  if (mode == "EXPONENTIAL")           // exponential weighting
    for (i=0;i<n;i++) {                // loop over rows i=0,..,n-1
      d[i]=avg_ew(r[i],decay);         // Asset Control's sample drift, but notice
                                       // that *our* estimation is zero (see below)
      c[i]=list(n);                    // reserve memory space for row c[i]
      for(j=0;j<i+1;j++) {             // loop over cols j=0,..,i
        cov=covar_ew(r[i],r[j],decay); // calculate sample covariance
        cov=cov+d[i]*d[j];             // Add sample drift in order to obtain
                                       // *our* covariance estimation. Notice that
                                       // d[j] is already calculated, since j<=i.
        // Explanation: Following RiskMetrics, we estimate "a priori"
        // the drift to be zero; consequently Covar[r[i],r[j]]=E[r[i]*r[j]].
        // However, sample average and sample covariance in Asset Control use
        // the "ordinary" formulas, so we have to add the sample drift again,
        // according to the formula
        // Covar[r[i],r[j]] = E[r[i]*r[j]] - E[r[i]]*E[r[j]].
        c[i][j]=horizon*cov;           // scale with horizon
        c[j][i]=c[i][j];               // Fill matrix symmetrically. Notice
                                       // that row c[j] already exists, since j<=i.
      }
    }
  else
    for (i=0;i<n;i++) {
      d[i]=avg(r[i]);
      c[i]=list(n);
      for(j=0;j<i+1;j++) {
        cov=covar(r[i],r[j]);
        cov=cov+d[i]*d[j];
        c[i][j]=horizon*cov;
        c[j][i]=c[i][j];
      }
    }
  return(c);
}
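The same zero-drift estimator can be sketched directly with explicit numpy weights, instead of correcting Asset Control's sample statistics after the fact. The finite-sample weight normalisation below is our choice and may differ from the convention used by covar_ew:

```python
import numpy as np

def covarmatrix(r, horizon=1.0, lam=0.94):
    """Zero-drift EWMA covariance estimate, Cov[i,j] = E_w[r_i * r_j],
    scaled with the horizon (weights normalised over the finite sample)."""
    r = np.asarray(r, dtype=float)        # rows = time series
    t = r.shape[1]
    w = lam ** np.arange(t - 1, -1, -1)   # newest observation (last column) weighted highest
    w = w / w.sum()
    return horizon * (r * w) @ r.T        # symmetric by construction
```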
A.5 conditionaldistribution.fe

// function conditionaldistribution
//
// Syntax:       conditionaldistribution(x,m,v)
//
// Description:
//
// Output:
//
// Example:
//
// Author: Dr. R. Rossmanith, d-fine GmbH, Eschborn/Frankfurt
// Version: 1.0
// Date 2001-10-07
//
// Changes:
// Version   Date   Author   Description
function conditionaldistribution(x,m,v)
{
  local n;
  local i;
  local j;
  local y;
  local nx;
  local ny;
  local mx;
  local my;
  local vx;
  local vy;
  local vyx;
  local q;
  local w;
  n=len(m);
  nx=len(x);
  y=[];      // init vector y
  w=[];      // init matrix w
  ny=n-nx;   // determine dimension of unknown vector y
  if(ny>0)   // otherwise nothing to do
  {
    mx=elt(m,[0,nx-1]);                     // determine unconditional expectation of x
    my=elt(m,[nx,n-1]);                     // determine unconditional expectation of y
    vx=submatrixcorners(v,0,0,nx-1,nx-1);   // determine uncond covar of x
    vy=submatrixcorners(v,nx,nx,n-1,n-1);   // determine uncond covar of y
    vyx=submatrixcorners(v,nx,0,n-1,nx-1);  // determine uncond covar of y,x
    vx=matrix(vx);                          // technical change of data type
    vy=matrix(vy);                          // technical change of data type
    vyx=matrix(vyx);                        // technical change of data type
    q=matrix.inverse(vx);
    q=matrix.multiply(vyx,q,[]);            // construct transformation matrix
    y=seq(i=0;i<nx;i++) x[i]-mx[i];         // initialise y as central difference of x
    y=matrix([y]);                          // technical change of data type (row matrix)
    y=matrix.multiply(y,q,["transpose_second"]); // apply transformation matrix
    my=matrix([my]);                        // technical change of data type (row matrix)
    y=matrix.add(y,my,[]);                  // add uncond mean my to obtain cond mean y
    y=elt(y,0);                             // technical change of data type (back to lists)
    w=matrix.multiply(q,vyx,["transpose_second"]);
    w=matrix.add(vy,w,[1,-1]);              // subtract from uncond covar vy to obtain cond covar w
    w=elt(w,[0,ny-1]);                      // technical change of data type (back to lists)
  }
  return([y,w]);
}
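A numpy sketch of the same conditional normal parameters, using the standard formulas E[y|x] = m_y + V_yx V_xx^{-1} (x - m_x) and Cov[y|x] = V_yy - V_yx V_xx^{-1} V_xy; the function name is ours:

```python
import numpy as np

def conditional_distribution(x, m, v):
    """Conditional mean and covariance of the unknown tail y, given the
    observed head x, for a joint normal vector (x, y) ~ N(m, v)."""
    m = np.asarray(m, dtype=float)
    v = np.asarray(v, dtype=float)
    nx = len(x)
    mx, my = m[:nx], m[nx:]
    vxx, vyx, vyy = v[:nx, :nx], v[nx:, :nx], v[nx:, nx:]
    q = vyx @ np.linalg.inv(vxx)              # transformation matrix
    y = my + q @ (np.asarray(x, float) - mx)  # conditional expectation
    w = vyy - q @ vyx.T                       # conditional covariance
    return y, w
```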
A.6 pcaestimation.fe

The functions of the type matrix.xyz in the listing below are not (yet) reliable in Version 4.1 of the software (see also the remarks in Section 6.4). The code is listed for completeness.

// function pcaestimation
//
// Syntax:       pcaestimation(x,m,v)
//
// Description:
//
// Output:
//
// Example:
//
// Author: Dr. R. Rossmanith, d-fine GmbH, Eschborn/Frankfurt
// Version: 1.0
// Date 2001-10-21
//
// Changes:
// Version   Date   Author   Description

function pcaestimation(x,m,v)
{
  local n;
  local i;
  local j;
  local y;
  local nx;
  local ny;
  local mx;
  local my;
  local vf;
  local q;
  local qt;
  local qy;
  local qyx;
  local w;
  local ev;
  local h;
  local check1;
  local check2;
  n=len(m);                                 // determine overall dimension
  nx=len(x);                                // determine dimension of known vector x
  ny=n-nx;                                  // determine dimension of unknown vector y
  mx=elt(m,[0,nx-1]);                       // determine unconditional expectation of x
  my=elt(m,[nx,n-1]);                       // determine unconditional expectation of y
  vf=matrix(v);                             // technical change of data type (matrix of floats)
  h=matrix.eigenvectors(vf);
  q=first(h);                               // eigenvectors are in cols of q;
                                            // technical data format is matrix_float
  ev=second(h);                             // eigenvalues are sorted in descending order
  qt=matrix.transpose(q);                   // qt has eigenvectors in rows
  qt=elt(qt,[0,n-1]);
  qy=submatrixcorners(qt,nx,nx,n-1,n-1);    // lower/right partition of qt
  qyx=submatrixcorners(qt,nx,0,n-1,nx-1);   // lower/left partition of qt
  qy=matrix(qy);                            // technical change of data type
  qyx=matrix(qyx);                          // technical change of data type
  h=matrix.inverse(qy);                     // construct the ...
  h=matrix.multiply(h,qyx,[]);              // ... transformation matrix h
  y=seq(i=0;i<nx;i++) x[i]-mx[i];           // initialise y as central difference of x
  y=matrix([y]);                            // technical change of data type (row matrix)
  y=matrix.multiply(y,h,["transpose_second"]); // apply transformation matrix
  my=matrix([my]);                          // technical change of data type (row matrix)
  y=matrix.add(my,y,[1,-1]);                // subtract from mean my to obtain expectation y
  y=elt(y,0);                               // technical change of data type (back to lists)
  return(y);
}
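A numpy sketch of the same estimator: the unknown tail y is chosen so that the principal components belonging to the ny smallest eigenvalues vanish. Note that numpy's eigh sorts eigenvalues in ascending order, so those eigenvectors are the first columns rather than the last rows:

```python
import numpy as np

def pca_estimation(x, m, v):
    """Estimate the unknown tail y by forcing the principal components of
    (x, y) - m that belong to the smallest eigenvalues of v to zero."""
    m = np.asarray(m, dtype=float)
    v = np.asarray(v, dtype=float)
    n, nx = len(m), len(x)
    evals, evecs = np.linalg.eigh(v)   # eigenvalues in ascending order
    q_small = evecs[:, : n - nx].T     # rows: eigenvectors of the ny smallest eigenvalues
    qyx, qy = q_small[:, :nx], q_small[:, nx:]
    h = np.linalg.solve(qy, qyx)       # transformation matrix
    return m[nx:] - h @ (np.asarray(x, float) - m[:nx])
```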
A.7 complete.fe

// function complete
//
// Syntax:       complete(a,r,method)
//
// Description:
//
// Output:       completed list
//
// Example:
//
// Author: Dr. R. Rossmanith, d-fine GmbH, Eschborn/Frankfurt
// Version: 1.0
// Date 2001-10-07
//
// Changes:
// Version   Date   Author   Description
function complete(a,r,method)
{
  local c;
  local b;
  local n;
  local m;
  local i;
  local j;
  local s;
  local t;
  local ds;
  local dt;
  local p;
  local x;
  local y;
  local v;
  local it;
  local yold;
  local d;
  local e;
  local nx;
  local ny;
  switch (method)
  {
  // CASE STRUCT: extrapolate along structure (right to left).
  //              Here we ignore reference data.
  case "STRUCT":
    c=backextrapolate(a);
    return(c);                        // return completed list
    break;

  // CASE EXTRA: previous-day value extrapolation
  case "EXTRA":
    n=len(r);                         // dimension of problem
    c=list(n);                        // provide storage space for completed list
    for (i=0;i<n;i++)                 // loop through incomplete list
      if (is_na(a[i]))                // if entry missing..
        c[i]=r[i];                    // ..take value of reference vector
      else                            // if entry exists..
        c[i]=a[i];                    // ..take its entry
    return(c);                        // return completed list
    break;

  // CASE RELATIVE: extrapolation of relative changes w.r.t. previous day
  //                (equivalent to extrapolation of log-changes).
  case "RELATIVE":
    n=len(r);                         // dimension of problem
    c=list(n);                        // provide storage space for completed list
    s=seq(i=0;i<n;i++) a[i]/r[i];     // relative changes today/yesterday
    s=backextrapolate(s);             // extrapolate changes as in case STRUCT
    for (i=0;i<n;i++)                 // loop through incomplete list
      if (is_na(a[i]))                // if entry is missing ...
        c[i]=s[i]*r[i];               // ... apply multiplicative change to
                                      // reference value (this is equivalent to
                                      // additive log-change application).
      else                            // else if entry exists..
        c[i]=a[i];                    // ..take its entry.
    return(c);                        // return completed list
    break;

  // CASE CE: conditional expectation
  case "CE":
    n=len(r);                         // dimension of problem
    b=[];                             // initialise list of "existing" values
    x=[];                             // initialise list of "existing" indices
    y=[];                             // initialise list of "missing" indices
    for (i=0;i<n;i++)                 // loop through incomplete list
      if (is_na(a[i]))                // if entry is missing..
        y+=[i];                       // ..remember its index, ..
      else                            // ..else if it exists..
      {
        b+=[a[i]];                    // ..take its value and..
        x+=[i];                       // ..remember its index.
      }
    p=x+y;  // define permutation: existing entries first, then missing ones
            // values of y are not needed any further now.
    if(len(y)>0)                      // otherwise nothing to do (nothing missing)
    {
      s=list(n);                      // provide storage for permuted reference data
      for (i=0;i<n;i++)               // loop thru time series
        s[i]=r[p[i]];                 // permute reference data accordingly
      v=covarmatrix(s,1,"EXPONENTIAL"); // calculate covar on permuted data
      m=seq(i=0;i<n;i++) 0;           // initialise a priori drift as zero
      // for next step, overwrite y:
      y=conditionaldistribution(b,m,v); // calculate cond distr parameters
      y=first(y);                     // cond expectation (throw away cond covariance)
    }
    b+=y;                             // append estimated values to known values
    c=list(n);                        // provide storage for "completed" vector
    for(i=0;i<n;i++)                  // loop thru time series
      c[p[i]]=b[i];                   // permute completed values into original order
    return(c);                        // return result
    break;

  // CASE EM: expectation maximisation
  case "EM":
    n=len(r);                         // dimension of problem
    b=[];                             // initialise list of "existing" values
    x=[];                             // initialise list of "existing" indices
    y=[];                             // initialise list of "missing" indices
    for (i=0;i<n;i++)                 // loop through incomplete list
      if (is_na(a[i]))                // if entry is missing..
        y+=[i];                       // ..remember its index, ..
      else                            // ..else if it exists..
      {
        b+=[a[i]];                    // ..take its value and..
        x+=[i];                       // ..remember its index.
      }
    p=x+y;  // define permutation: existing entries first, then missing ones
            // values of x,y are not needed any further now.
    if(len(y)>0)                      // otherwise nothing to do (nothing missing)
    {
      s=list(n);                      // provide storage for permuted reference data
    }
    b+=y;
    c=list(n);
    for(i=0;i<n;i++)
      c[p[i]]=b[i];
    return(c);
    break;
  }
}
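The CE branch can be sketched in Python without the explicit permutation step, by indexing the covariance matrix directly. The function name and the None convention for missing values are ours:

```python
import numpy as np

def complete_ce(a, mean, cov):
    """Complete a curve a (None = missing) with the conditional expectation
    of the missing rates, given a drift vector and a covariance matrix."""
    a = list(a)
    known = [i for i, val in enumerate(a) if val is not None]
    missing = [i for i, val in enumerate(a) if val is None]
    if not missing:
        return a
    m = np.asarray(mean, dtype=float)
    v = np.asarray(cov, dtype=float)
    vxx = v[np.ix_(known, known)]              # covariance of the observed rates
    vyx = v[np.ix_(missing, known)]            # cross-covariance missing/observed
    dx = np.array([a[i] for i in known]) - m[known]
    y = m[missing] + vyx @ np.linalg.solve(vxx, dx)  # conditional expectation
    for i, val in zip(missing, y):
        a[i] = float(val)
    return a
```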
Appendix: Histograms
All histograms on the following pages have been truncated at (approximately) 50 bp.
B.1

B.1.1  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.1.2  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.1.3  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.2

B.2.1  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.2.2  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.2.3  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.3

B.3.1  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.3.2  [Four histograms: frequency of the completion errors (in bp), one panel per method.]

B.3.3  [Four histograms: frequency of the completion errors (in bp), one panel per method.]