
Considerations in the Design and Construction of Investment Real Estate Research Indices

Authors: David Geltner and David C. Ling

Abstract This paper surveys some of the major technical issues in the
design and construction of real estate research indices, both
appraisal-based and transactions-based. The paper considers
property sampling issues, differences between transaction prices,
market values, and appraised values, the trade-off between
random measurement error and temporal lag bias, optimal
reporting and property revaluation frequencies, and the uses and
limitations of modern statistical techniques. Although one of the
conclusions of the analysis is that most research questions are
best addressed with transactions-based, rather than appraisal-
based, indices in the United States, the paper suggests how
appraisal-based indices can still be useful for some research
purposes.

The past decade has seen technological and information advances that hold great potential for moving investment real estate performance measurement to a new level of accuracy and usefulness for the industry. In particular, the development of large-scale electronic databases of commercial property transaction prices, and the advance of econometric techniques honed by the academic real estate community, offer a tremendous new opportunity to advance the level of information and knowledge about the commercial real estate asset class.
Geltner and Ling (2001, 2006) conclude that the real estate investment industry's
needs for performance measurement, research, and decision support are too diverse
to be optimally met by a single index product or a single type of index. They
present arguments for creating two separate families of index products: one
focused on the asset class research support role, the other focused on the
evaluation benchmarking and performance attribution support role.
Although Geltner and Ling (2001, 2006) discuss the general characteristics of
benchmark and research indices, the present paper focuses more narrowly, but
more deeply, on some basic technical considerations associated with the design
and construction of commercial real estate return indices for asset class research

JRER Vol. 28, No. 4, 2006

support purposes. In particular, this paper discusses in detail property sampling


issues, differences between transaction price and appraisal-based indices, the
trade-off between random measurement error and temporal lag bias, optimal
reporting and property revaluation frequencies, and the uses and limitations of
some econometric methods of index construction developed over the past decade
in the real estate academic literature.
The paper proceeds as follows. First, the important statistical qualities of a real
estate return index are identified. Second, there is a discussion of the essential
differences among transaction prices, appraised values, and market value. Third,
a simple stylized model of the property valuation estimation process for asset
class research is presented, followed by a discussion of the practical implications
of the model for the construction of research-oriented return indices. Fourth, there
is a discussion of the optimal reporting and property revaluation frequencies for
appraisal-based indices, as well as property sampling issues in a transaction-based
research index. The paper closes with a summary of the findings and concluding
remarks.
It is worth noting that this paper is not a classical scientific research article in
that a hypothesis is not presented and tested. Rather, this article is meant to be
expository and demonstrative and to communicate some essential key points on
investment real estate index construction to a broader audience of both academics
and technical practitioners.

Statistical Qualities of a Real Estate Return Index


The dynamic statistical quality of a return index refers to the type of periodic
time-series statistics that can be computed from the index, as well as the quality
of those statistics. For example, how frequently can return statistics be calculated?
Does the index tend to be noisy? To what degree do its periodic returns exhibit
temporal lag bias? The four major attributes (or dimensions) of the index that
interact to determine its dynamic statistical quality include:
1. Index return reporting frequency;
2. Frequency of revaluation observations per property;
3. Number of properties in the underlying population tracked by the index;
and
4. Index construction technology, or methodology used to construct the
index from the underlying valuation observations.
Reporting frequency refers to the periodicity of reported returns or price changes in the index (e.g., annually, quarterly, or monthly). Labeling this frequency m per year, m = 1 for an annual index, m = 4 for a quarterly index, and m = 12 for a monthly index. Higher frequency reporting implies shorter individual return periods, as the period length is the inverse of the frequency.

Revaluation frequency refers to the average number (or fraction) of transaction price observations or (serious) reappraisals per year per property in the index population.1 Labeling this frequency λ per year, λ = 1 if each property is reappraised on average once per year, and λ = 0.2 if properties in the subject population transact (are bought and sold) on average once every five years.
The property population refers to the number of properties in the subject population whose performance is to be tracked over time by the research index. This population may be viewed as a market, an asset class, or a portfolio. Labeling this population p, it can be seen that the average valuation observation sample size per index reporting period, the sample density, labeled n, is defined as n = λp/m.
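As a concrete illustration of these definitions, the sample density can be computed directly. This is a minimal sketch; the function name and example values are our own, and it assumes the definition n = λp/m given above.

```python
# Sample density: average number of valuation observations per index
# reporting period, n = lambda * p / m (per the definitions above).
def sample_density(lam, p, m):
    """lam: revaluations per property per year; p: properties; m: periods/year."""
    return lam * p / m

# Illustrative example: a population of 1,000 properties, each transacting
# once every five years (lam = 0.2), tracked by a quarterly index (m = 4):
print(sample_density(lam=0.2, p=1000, m=4))  # -> 50.0 observations per quarter
```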
The nature of the underlying valuation data has an important impact on the
dynamic statistical nature and quality of the index. The fundamental source of
information about property values is transaction prices in the relevant population.
Appraisals are indirect indicators of value, which are based, in turn, on
observations of transaction prices. Thus, if the valuation observations used to
construct the index are appraisals, then the sample density n will actually be
based on a larger sample of underlying transaction observations (including all
transactions that influenced the appraisals), albeit a sample that likely spans more
than one index reporting period. To eliminate confusion and make this distinction
explicit, henceforth the label n will refer only to the sample of underlying
transaction price observations per index reporting period.
Finally, index construction technology refers to the procedure by which
individual disaggregate valuation observations are aggregated to compute the
reported index return each period. In the case of an appraisal-based index, this
may be as simple as averaging the constituent properties' appraised values and comparing this with the average of the same properties' valuations in the preceding period. However, modern statistics provides more sophisticated procedures for
producing transaction price-based indices and for dealing with problems such as
stale appraisal reports in appraisal-based indices.

Density of Transactions versus Reporting Frequency

For a given index reporting frequency, statistical quality improves the greater the
frequency of property revaluation, the greater the number of properties in the
population, and the more statistically efficient and effective is the index
construction methodology employed. Holding these characteristics constant, the
greater the index return reporting frequency, the lower the dynamic statistical
quality per reporting period (as the underlying sample density, n, shrinks
accordingly). Thus, broadly speaking, there is a trade-off between statistical
quality per period and the frequency of index reporting, holding constant the
overall quantity and quality of raw valuation data and index construction
methodology. This trade-off is depicted in Exhibit 1a.


Exhibit 1a: The General Trade-off between Index Statistical Quality per Period and the Index Reporting Frequency

(Figure: statistical quality per period on the vertical axis against return reporting frequency m on the horizontal axis.)

On the other hand, the usefulness of an index for research purposes clearly
increases the greater the frequency of reporting, holding statistical quality constant
(per period). Thus, the usefulness, or utility, of a research index is two-
dimensional. It is likely that there is diminishing marginal substitutability between
these two dimensions: statistical quality per period, and the frequency of periods.
Thus, utility isoquants are convex, as indicated in Exhibit 1b.
The result is that the optimal index reporting frequency, in general, may be either
ambiguous or not unique, as suggested in Exhibit 1c. Multiple indices optimally
designed for different reporting frequencies may be useful for representing and
studying the same underlying population of properties. For example, it might be
useful to have both a quarterly index, optimally designed as such, and a separate
annual index, optimally designed for that frequency, that is not simply a temporal
aggregate (such as compounded growth) of the quarterly returns from the quarterly
index.
The frequency of valuation observations per property is generally beyond the
control of the index producer in a transactions-based index. This is because

Exhibit 1b: Index Utility Isoquants. Index utility increases with both statistical quality per period and reporting frequency (U0 < U1 < U2), with diminishing marginal substitutability (convex isoquants)

(Figure: convex isoquants U0, U1, U2 plotted with statistical quality per period on the vertical axis and return reporting frequency m on the horizontal axis.)

properties are sold at the discretion of their owners, not for the sake of providing
input data for an index. As a result, in a transaction-based index, only three of
the four quality-determining attributes can be manipulated: return reporting
frequency, scope of the subject property population, and index construction
methodology. On the other hand, the frequency and timing of reappraisals might
be determined to some extent by policy set by the index producer in the case of
an appraisal-based index that tracks a population of properties that is regularly
marked to market. However, appraisals are costly (as is the collection of
transaction price data), so there are limits to how much the index producer can
control valuation observation frequency, even in the case of an appraisal-based
index.
In the remainder of this paper, it is generally assumed that the index reporting frequency (m) and the valuation observation frequency (λ) are determined
exogenously. The focus will be on the other two dimensions of index design: the
optimal index construction technology, and the optimal definition of property
population scope.


Exhibit 1c: Optimal Index Reporting Frequency is Ambiguous or Not Unique

(Figure: utility isoquants U0 < U1 < U2 plotted with statistical quality per period on the vertical axis and return reporting frequency m on the horizontal axis.)

Some Basic Concepts: Transaction Price, Appraised Value, and Market Value
Transaction prices and appraised valuations (or appraisals) are empirically
observable values, but occur and exist only when a property transacts or is
appraised. In contrast, market values are conceptual constructs, also referred to as
true value. Market value exists for each property at each point in time, although
any given propertys (or a portfolio of properties) market value generally changes
continuously through time because information arrives continually that is relevant
to property values.
The market value of a property is frequently defined in real estate as the most
likely (or the expected) transaction price of the property, as of a given point in
time. It may therefore be thought of as the mean of the ex ante transaction price
probability distribution as of the stated date. Market value is therefore the
opportunity cost of holding onto the property rather than selling.
Market value should also closely approximate the actual transaction price in a
highly liquid, dense market where homogeneous assets are frequently bought and
sold by numerous buyers and sellers, such as the trading of equity shares in the
stock market. In such an environment, market values represent market clearing
prices at which the number of buyers equals the number of sellers for
homogeneous, divisible assets.
In private real estate markets, market values are not empirically observable, unlike
transaction prices or appraised values. This is because whole assets must be bought and sold; these lumpy assets are unique, trade infrequently, and are exchanged in private transactions between two parties. The inability to observe
market values, however, does not imply they do not exist.
The classical model (whose roots go back at least to Plato) is that observable
transaction prices are individual draws from underlying probability distributions
that are centered around the unobservable true market values of the properties
being transacted. Transaction prices at any given point in time therefore exhibit
cross-sectional dispersion around the underlying (unobservable) true market values
as of that point in time. This difference between the observable transaction price
and the unobservable true value is often referred to as transaction price noise,
or transaction price error. Cross-sectional dispersion occurs because market
participants cannot observe the true market value of the property being traded, so
neither side in the negotiation knows exactly at what price the property should
trade. In a given transaction, one side or the other will typically end up getting
the better deal, although it will usually be impossible to know which side it is.
Either the buyer or seller may have better information, be under a little less
pressure to close the deal, or simply be more skillful or lucky in the negotiation.
Appraised values are also dispersed cross-sectionally around true market values,
as of any given point in time. If two appraisers are employed to value a property,
and they are prevented from communicating with each other, they will almost
certainly not arrive at the same estimate of market value. At least one of them
must be wrong in the sense that their valuation differs from the true market
value of the subject property. In fact, both appraisers are probably wrong in
this sense. The difference between a given appraised value and the (unobservable
true) market value is called appraisal error, although there is no implication that
the appraiser has exhibited any incompetence, negligence, or impropriety.
Although appraised values are dispersed around the underlying true values, unlike
transaction price dispersion, the appraisal value dispersion is not necessarily
centered on the true value. In other words, the ex ante mean appraised value may
not equal the true market value as of the same point in time. Such bias may result
from very rational behavior on the part of the appraiser given the nature of the
empirical information available in the real estate market. A major bias that is
likely to exist is that appraised values lag true contemporaneous market values.
This is referred to as temporal lag bias.2 Normative and behavioral models
consistent with the temporal lag bias hypothesis of appraisal have been put forth
in the academic literature (e.g., Quan and Quigley 1991; and Chinloy, Cho, and
Megbolugbe 1997).


A Simple Example of Temporal Lag Bias

While there may be more than one source or type of temporal lag bias in
appraisals, the following illustrates what is probably the major hypothesis for
commercial property indices. Suppose you hire an appraiser who faces the
following choice. The appraiser can give you a value estimate based on Method
A, which will be unbiased, but this value estimate has only a 50% chance of
being within 10% of the true market value. Alternatively, the appraiser can provide
a value estimate based on Method B, which is biased. In particular, the expected
value of the Method B appraisal equals the true market value from six months
ago, rather than the true value of today. Thus, if true market values were 1% lower
six months ago, Method B has a 1% bias on the low side. However, there is a
75% chance that the Method B appraisal will be within, say, 5% of the true market
value of the property. Furthermore, Method B will provide you with more solid
historical evidence explicitly documenting the estimated value (e.g., more
comps). The appraiser says he will charge you the same price for either method.
Which would you prefer?
The most typical answer is Method B, because it will probably give you a more
useful estimate of value, provided you are concerned about documenting the value
of the individual property that is the subject of the appraisal. Said differently, you
are probably willing to accept a little bit of temporal lag bias in order to get a
more precise and well-documented estimate of market value. Of course, if you
own the property and are in the process of negotiating a sale price, you would
probably hesitate to sell for a price equal to the appraised value because you
would probably be aware, in this example, that market values have been increasing
over the past half year and, furthermore, that the appraisal likely contains some
error. You would try to sell for as high a price as you could, above the appraisal
if possible.
However, if you are not concerned about accurately estimating the value of a
single property, but rather primarily interested in obtaining an up-to-date estimate
of the aggregate value of a large portfolio of properties, then you should prefer
the result of Method A for the individual property in question. This is because
the purely random valuation error in Method A will be diversified away at the
portfolio level, thanks to what is known in basic statistics as the Square Root of
N Rule. In contrast, the temporal lag bias in Method B is systematic, and will
not be diversified away at the portfolio level. But the standard practices and
procedures of the appraisal profession in the United States, and the typical method
of employing and using appraisers in the institutional real estate investment
industry in the country, probably result in appraisal methods more similar to Method B than to Method A.

A Simple Model of Property Value Estimation for Asset Class Research
A simple stylized model of the property value estimation process as viewed from
the perspective of asset class research follows. The objective of this process is to
estimate the market value (as defined in the previous section) of a specified
population of properties (the market or asset class of interest) or,
equivalently, the market value of a representative property within that population,3
as of each period of time, t. True market value is defined as V_t. Suppose that within each period of time, n properties are sold with observable transaction prices. These transaction prices are labeled P_{i,t}, for i = 1, 2, . . . , n.4 Each
transacting property is an equally valid representative of the type or class of
properties in the population of interest. As described in the preceding section,
these n observable transaction prices are dispersed randomly around the true
market value:

P_{i,t} = V_t + ε_{i,t},    (1)

where ε_{i,t} is the random error or noise term, which is iid normal with zero mean and standard deviation σ_ε.5
Now consider the arithmetic average across N of the transaction prices, starting with transactions that are among the n that occur within period t. This estimator is V̄_t[N]:

V̄_t[N] = (1/N) Σ_{i=1}^{N} P_{i,t}.    (2)

Basic statistics relates that V̄_t[N] will have a purely random standard deviation (or standard error) of σ_ε/√N. That is, the purely random error (or noise) component in the market value estimate is reduced according to the Square Root of N Rule. Furthermore, since each transacting property is an equally valid representative of the population, if the N transaction observations all occur within period t (which is only possible if N ≤ n), then V̄_t[N] will also have a mean of V_t. In this case, V̄_t[N] will be an unbiased estimator of the current (period t) market value of the population.6
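The Square Root of N Rule can be checked with a small Monte Carlo sketch. All parameter values below are illustrative assumptions, not figures from the paper.

```python
import random
import statistics

random.seed(42)
V_t = 100.0      # true (unobservable) market value
sigma_e = 10.0   # cross-sectional transaction price dispersion
N = 25           # sample size for the average in Equation (2)
trials = 20000

estimates = []
for _ in range(trials):
    # Equation (1): each observed price is the true value plus iid noise.
    prices = [random.gauss(V_t, sigma_e) for _ in range(N)]
    # Equation (2): the simple average estimator V_bar_t[N].
    estimates.append(sum(prices) / N)

# Empirical standard error of the estimator across simulated samples:
se = statistics.stdev(estimates)
print(round(se, 2))  # close to sigma_e / sqrt(N) = 2.0
```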
Now consider what happens if the sample size N is greater than n, the number of transaction observations in any one period t. The random error magnitude of the estimator will still equal σ_ε/√N, but the mean will no longer be V_t. Some of the N transaction price observations will have to be drawn from one or more earlier periods in time; for example, from t-1, or perhaps even from t-2, and t-3, and

so on. This will cause the mean of V̄_t[N] to be some average of V_t and previous market values of the property population: V_{t-1}, and perhaps V_{t-2}, V_{t-3}, and so on. This will induce temporal lag bias into the V̄_t[N] estimator.
Given a population transaction density of n per period, the policy question becomes: how shall N be selected so as to develop the estimate of V_t each period? Clearly there is a trade-off between the two types of error that will exist in the estimates. The larger the N, the smaller will be the purely random error, whose magnitude is σ_ε/√N. However, increasing N beyond n will add temporal lag bias. The ratio N/n determines how far back in time to reach for the estimation sample. If N = 2n, the transaction observations are drawn from two periods of time, t and t-1. To simplify the analysis of the policy question without loss of generality, assume a sample size would never be chosen that is not a whole multiple of n. That is, if the decision is to go back to a given period of time in order to increase the transaction sample size, all transactions within that period will be included. Thus, the ratio N/n will be an integer, and the maximum lag in the estimation sample, L, will be one less than the N/n ratio: L = N/n - 1. The general expectation of V̄_t[N] can now be specified as:

V
N / n1 L
1 1
E[V t[N]] Vts . (3)
N/n s0 L1 s0
ts

This implies that there will be L/2, or (N/n - 1)/2, periods of lag bias in the estimator.
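The L/2-period bias can be seen in a tiny numerical sketch: for a market value that trends steadily, the equal-weighted average of the last L+1 period values (the mean of the estimator per Equation (3)) equals the true value from L/2 periods earlier. The numbers below are illustrative.

```python
L = 4                                       # maximum lag in the estimation sample
V = [100.0 + t for t in range(50)]          # true values rising 1.0 per period
t = 49                                      # current period

# Mean of the estimator per Equation (3): average of V_t ... V_{t-L}.
mean_estimate = sum(V[t - s] for s in range(L + 1)) / (L + 1)
print(mean_estimate, V[t - L // 2])         # 147.0 147.0: two periods of lag bias
```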
To use this model to decide the optimal sample size (and hence, the optimal lag
bias, if any), an objective function needs to be specified. A logical one often used
in statistics is to minimize the mean squared error (MSE) in the estimate, where
the error is the total error, including both the random noise and the lag bias. That
is, N is selected (or equivalently L) so as to minimize the expectation:

MSE ≡ E[(V̄_t[N] - V_t)²]

    = E[(1/N)(Σ_{i=1}^{n} (P_{i,t} - V_t) + Σ_{i=n+1}^{2n} (P_{i,t-1} - V_t) + . . . + Σ_{i=N-n+1}^{N} (P_{i,t-L} - V_t))]²    (4a)

    = (1/N²)[Σ_{i=1}^{N} σ_{ε,i}² + n² E(Σ_{s=1}^{N/n-1} (V_{t-s} - V_t))²]

    = (1/(n(L+1))²)[Σ_{i=1}^{n(L+1)} σ_{ε,i}² + n² E(Σ_{s=1}^{L} (V_{t-s} - V_t))²].

If, for illustrative purposes, the true market value follows a random walk, V_t = V_{t-1} + r_t, where r_t is iid with zero mean,7 and the true market volatility per period is σ_r,8 this objective function simplifies to:

MSE = VAR[ε_i]/N + (n² VAR[r_t]/N²) Σ_{s=1}^{N/n-1} s²

    = σ_ε²/N + (n² σ_r²/N²) Σ_{s=1}^{N/n-1} s²    (4b)

    = σ_ε²/(n(L+1)) + (σ_r²/(L+1)²) Σ_{s=1}^{L} s².

The MSE thus consists of two terms. The first term on the RHS of Equation (4b) is the purely random error effect, which consists of the cross-sectional price dispersion variance σ_ε² times a noise factor, 1/(n(L+1)), that is diminishing in the number of lags. The second term on the RHS of Equation (4b) is the lag bias error effect. It is the true market return volatility squared, σ_r², times a lag factor, which is the sum of the squared lags divided by (L+1)².
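The two-term MSE can be written as a short function. This is our transcription of Equation (4b) under the random walk assumption; the parameter values in the example are illustrative.

```python
# MSE(L) = sigma_e^2 / (n(L+1)) + sigma_r^2 * (sum of squared lags) / (L+1)^2
def mse(L, n, sigma_e, sigma_r):
    noise = sigma_e ** 2 / (n * (L + 1))                        # random error term
    lag = sigma_r ** 2 * sum(s * s for s in range(1, L + 1)) / (L + 1) ** 2
    return noise + lag

# With n = 1 and a dispersion ratio of 1, the noise term shrinks while the
# lag term grows as lags are added:
print([round(mse(L, n=1, sigma_e=1.0, sigma_r=1.0), 3) for L in range(4)])
# -> [1.0, 0.75, 0.889, 1.125]
```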
The objective is to find the value of L that minimizes the MSE. Clearly the noise factor in the overall error diminishes with every increment in L (and more so the larger is n, the transaction density per period). If purely random error were the only type of error, as large a lag as possible could be selected (i.e., all of the historical price data could be used just to estimate the current value of V_t). However, the lag factor component in the overall error term increases with each increment in L, thus presenting a trade-off. As lags are added, the noise factor is reduced by a diminishing increment, while the lag factor in the MSE is increased by an increasing increment.9 Thus, if the MSE is not already minimized at L = 0, then it will be minimized at some finite value of L as we step up through the integers: L = 1, 2, . . . .
The optimal lag can be characterized by examining the finite difference of the MSE as a function of L:

ΔMSE/ΔL = σ_r² [Σ_{s=1}^{L+1} s²/(L+2)² - Σ_{s=1}^{L} s²/(L+1)²] - σ_ε²/(n(L+1)(L+2)).    (5)

Setting this incremental error equal to zero generates the following optimal lag criterion:10


ΔMSE/ΔL = 0,

σ_r² [Σ_{s=1}^{L+1} s²/(L+2)² - Σ_{s=1}^{L} s²/(L+1)²] = σ_ε²/(n(L+1)(L+2)),    (6)

(L+1)(L+2) [Σ_{s=1}^{L+1} s²/(L+2)² - Σ_{s=1}^{L} s²/(L+1)²] = σ_ε²/(n σ_r²).

As the LHS of Equation (6) is clearly an increasing function of L, the optimal lag is an increasing function of the price dispersion σ_ε, and a decreasing function of both the transaction density n and the market return volatility σ_r, all of which agrees with common sense.11 The model also shows that the optimal lag is a function only of the ratio of the two dispersion parameters, cross-sectional divided by longitudinal standard deviation (σ_ε/σ_r), not of the absolute values of these parameters individually. To determine the optimal lag, a numerical algorithm is used that starts with L = 0 and increments one lag at a time as long as the MSE is decreasing.
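The search just described can be sketched as a greedy loop. This is our implementation of the Equation (4b) MSE and the stopping rule, not the authors' code.

```python
def optimal_lag(n, sigma_e, sigma_r):
    """Increment L from 0 while the Equation (4b) MSE keeps falling."""
    def mse(L):
        noise = sigma_e ** 2 / (n * (L + 1))
        lag = sigma_r ** 2 * sum(s * s for s in range(1, L + 1)) / (L + 1) ** 2
        return noise + lag

    L = 0
    while mse(L + 1) < mse(L):
        L += 1
    return L

# One transaction per period and a dispersion ratio sigma_e/sigma_r of 4:
print(optimal_lag(n=1, sigma_e=4.0, sigma_r=1.0))  # -> 6
```

With two transactions per period and the same dispersion ratio, the loop returns 4, and with a dense market (n = 35) it returns 0, reproducing the pattern the paper tabulates in Exhibit 2.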

Practical Implications for Research Index Specification

Exhibit 2 shows the optimal maximum lag, L, measured in the number of index
return reporting periods (t), as a function of the transaction density per period, n,
and the ratio of the cross-sectional to longitudinal standard deviation (σ_ε/σ_r).
Thus, for example, if there are two transaction observations per period, and the
cross-sectional standard error in the transaction price observations is four times

Exhibit 2: Optimal Lag (L) as a Function of Dispersion Ratio and Transaction Density

                        Dispersion Ratio (σ_ε/σ_r)
Transaction Density      4       2       1       0.5
n = 1                    6       3       1       0
n = 2                    4       2       0       0
n = 10                   1       0       0       0
n = 35                   0       0       0       0

Note: Optimal lag in number of periods.


the longitudinal volatility per period in the true market returns, then the optimal lag is four periods. That is, the MSE of the value estimate each period will be minimized with a sample size of N = 10, which will include two observations from the current period t, plus eight additional observations, two from each period going back through t-4. Such an index estimation methodology will result in L/2 = 4/2 = 2 periods of temporal lag bias (E[V̄_t[N]] = V_{t-2}). The purely random component of the total error in the index (log) value levels will have a standard deviation of σ_ε/√N = σ_ε/√10 = σ_ε/3.16.
Note that the optimal lag diminishes rapidly as the transaction density increases
and as the dispersion ratio decreases. The optimal lag is zero (i.e., the population
value estimation uses contemporaneous transaction observations only) for
transaction densities as low as two transaction observations per period with the
dispersion ratio as high as unity. With this in mind, it is important to note that
the dispersion ratio decreases with the length of the index reporting periods (i.e.,
with the inverse of the index reporting frequency). For serially independent market
returns, the dispersion ratio is exactly proportional to the inverse of the square
root of the length of time in each index reporting period. Thus, for example, the
dispersion ratio in a quarterly index would be twice that of an annual index of
the same market.
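These scaling relationships are easy to verify numerically. The sketch below assumes σ_ε = 10% and σ_r = 10% per year with serially independent returns; the values are illustrative.

```python
sigma_e = 0.10            # cross-sectional price dispersion (frequency-invariant)
sigma_r_annual = 0.10     # annual market return volatility

ratios = []
for m in (1, 4, 12):      # annual, quarterly, monthly reporting
    sigma_r = sigma_r_annual / m ** 0.5   # per-period volatility scales as 1/sqrt(m)
    ratios.append(round(sigma_e / sigma_r, 2))

print(ratios)  # -> [1.0, 2.0, 3.46]: the quarterly ratio is twice the annual
```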
On the other hand, the transaction density is a direct linear function of the length
of time in each reporting period, such that n for a quarterly index is one-fourth
that for an annual index of the same property population. The combination of
these two relationships keeps the optimal lag measured in calendar time
approximately the same, for a given market (that is, for a fixed population of
properties characterized by a given transaction density per unit of calendar time).
There is considerable empirical evidence that typical values of σ_ε and σ_r range from 5% to 10% per year.12 Thus a dispersion ratio of about unity, with a range of between 1/2 and 2, is probably realistic for annual frequency periods, implying a dispersion ratio between 1 and 4 would be plausible for quarterly periods. Assuming σ_ε = 10%, Exhibit 3 shows the root MSE (RMSE),13 as well as the optimal lag (in number of periods) and the optimal sample size (N). The RMSEs are based on the optimal lag methodology, under three different periodic volatility assumptions that correspond to three different index reporting frequencies (annual, quarterly, and monthly). It is assumed that σ_r = 10% per year.14
Exhibit 3 reveals the types of trade-offs a research index designer faces. In
principle, the index designer can control the reporting frequency and the
estimation sample size for each periodic value estimate. However, the designer
cannot control the underlying transaction density per unit of calendar time, the underlying market price dispersion (σ_ε), or the volatility per unit of calendar time (σ_r). Thus, if the underlying transaction density per month is one, the index will face a transaction density per index reporting period of n = 1, n = 3, or n = 12, depending on whether the reporting frequency is set at the monthly, quarterly, or annual frequency. Similarly, if the underlying market volatility is 10%


Exhibit 3: RMSE at Optimal Lag (L, and Optimal Sample Size N) as a Function of Transaction Density and Market Volatility (both per period)

                                      Volatility per Period, σ_r
Transaction Density    Monthly Index          Quarterly Index        Annual Index
per Period             2.9% = 10%/√12         5.0% = 10%/√4          10.0% = 10%/√1
n = 1                  5.42% (L=5, N=6)       6.85% (L=3, N=4)       8.66% (L=1, N=2)
n = 3                  3.95% (L=3, N=12)      4.79% (L=1, N=6)       5.77% (L=0, N=3)
n = 12                 2.50% (L=1, N=24)      2.89% (L=0, N=12)      2.89% (L=0, N=12)

Note: Table assumes σ_ε = 10%, σ_r = 10% / year.

per year, then the index will face an underlying true volatility per period of σ_r = 2.9%, σ_r = 5%, or σ_r = 10%, depending on whether the index is to be monthly, quarterly, or annual, respectively.
In this case, the index designer faces the possibilities indicated in the cells on the
diagonal in the table in Exhibit 3. The designer can improve the accuracy of the
index as measured by its RMSE by reducing the index reporting frequency
(increasing the period length), moving from a RMSE of 5.42% for a monthly
index to as little as 2.89% error in an annual index. However, the usefulness of
an index for research purposes is not purely a function of its RMSE. For many
important research questions, a higher frequency index would be more useful, and
for some questions the random noise component of the RMSE would not matter
as much as the lag bias component.
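The diagonal cells of Exhibit 3 can be reproduced by combining the optimal-lag search with the Equation (4b) MSE. This is our transcription of the paper's formula, assuming σ_ε = 10%, σ_r = 10% per year, and one transaction per month.

```python
def rmse_at_optimal_lag(n, sigma_e, sigma_r):
    """RMSE at the MSE-minimizing lag, per Equation (4b)."""
    def mse(L):
        noise = sigma_e ** 2 / (n * (L + 1))
        lag = sigma_r ** 2 * sum(s * s for s in range(1, L + 1)) / (L + 1) ** 2
        return noise + lag

    L = 0
    while mse(L + 1) < mse(L):
        L += 1
    return mse(L) ** 0.5

rmses = []
for n, m in ((1, 12), (3, 4), (12, 1)):        # monthly, quarterly, annual
    sigma_r = 0.10 / m ** 0.5                  # per-period market volatility
    rmses.append(round(100 * rmse_at_optimal_lag(n, 0.10, sigma_r), 2))

print(rmses)  # -> [5.42, 4.79, 2.89], matching Exhibit 3's diagonal
```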
Conceivably, the index designer might attempt an alternative strategy to improve
the accuracy of the index. Instead of reducing the reporting frequency (aggregating
across time), the index designer might attempt to broaden the underlying
population of properties covered by the index, thereby increasing the underlying
transaction density per unit of calendar time. This would increase the transaction
density n for any given period length (i.e., without reducing index reporting
frequency). With this type of strategy (aggregating across space or across property
market segments), the movement down within the columns of Exhibit 3 reveals the
kind of improvements in valuation RMSE that can be obtained, first from tripling
the size of the covered population of properties, then from quadrupling it again. For
example, viewing all properties as representative of the target market, the top
row might refer to an index covering 1,000 properties, the middle row to an index
covering 3,000 properties, and the bottom row to an index covering 12,000
properties. However, the trade-off here may be that as one expands the cross-
sectional scope of the covered population of properties, one loses precision in the
definition of the actual real estate market that is represented by the index.

For example, instead of an index of Boston office properties, one might be dealing
with an index of all office properties in the Northeast, or in the entire U.S.
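The noise-versus-lag trade-off summarized in Exhibit 3 can be reproduced with a short sketch. The closed-form error expression below — random variance σ²/N plus a lag-bias variance of σr²·Σm²/(L+1)² for comps spread over lags 0 through L — is a reconstruction that matches the exhibit's cells, not an equation quoted from the paper:

```python
import math

def rmse(n, sigma, sigma_r, L):
    """Stylized RMSE of a value estimate built from N = n*(L+1) comps,
    with price noise sigma, per-period volatility sigma_r, and max lag L."""
    N = n * (L + 1)
    noise_var = sigma ** 2 / N                 # purely random error shrinks with N
    lag_var = sigma_r ** 2 * sum(m ** 2 for m in range(1, L + 1)) / (L + 1) ** 2
    return math.sqrt(noise_var + lag_var)      # total error (noise + lag bias)

def optimal_lag(n, sigma, sigma_r, max_lag=60):
    """Maximum comp lag L that minimizes RMSE for a given transaction density n."""
    return min(range(max_lag + 1), key=lambda L: rmse(n, sigma, sigma_r, L))

# Reproduce Exhibit 3: sigma = 10%, sigma_r = 10% / sqrt(periods per year)
for n in (1, 3, 12):
    for periods_per_year in (12, 4, 1):
        sr = 0.10 / math.sqrt(periods_per_year)
        L = optimal_lag(n, 0.10, sr)
        print(f"n={n:2d}, {periods_per_year:2d} periods/yr: "
              f"L={L}, N={n * (L + 1)}, RMSE={rmse(n, 0.10, sr, L):.2%}")
```

For example, with n = 1 at the monthly frequency the minimum sits at L = 5 and N = 6, reproducing the 5.42% cell in the exhibit's first row.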

Implications for Appraisal-Based and Transactions-Based Indices
In this section, the simple property value estimation model is used to examine
whether it would be better for a real estate research index to be based on traditional
appraisals or on direct transaction prices. First consider Exhibit 4a, which presents
a qualitative picture of the general noise-versus-lag trade-off and property value
estimation methodology optimization modeled in the previous section.
In Exhibit 4a, the vertical axis quantifies how up-to-date the value estimate is;
points farther up the axis represent less temporal lag bias in the value estimate
(and hence in any index based on such estimates). The horizontal axis quantifies
how precise is the value estimate; points farther to the right on the axis have less
purely random error (noise). The dashed curves represent user utility isoquants,
reflecting the utility (usefulness) of the value estimates for those using them. Such
utility increases from southwest to northeast: U0 < U1 < U2. The isoquants

[Exhibit 4a: The Noise vs. Lag Trade-off Frontier and Optimal Property Value Estimation Methodology. Figure: the horizontal axis is reduced random error and the vertical axis is reduced temporal lag bias; the trade-off frontier T is tangent to isoquant U1 at point A, which has lag bias LA/2 and random error variance σ²/NA; isoquants U0 < U1 < U2.]


are convex because of diminishing marginal utility in either type of value
estimation improvement. Once a value estimate has little temporal lag bias, it is
preferable to reduce purely random error if such error is relatively large. And once
a value estimation process has very little purely random error, a bigger increase
in utility is realized by reducing temporal lag bias than by further reducing random
error if the temporal bias is relatively large.
The concave curve labeled "T" is the lag-versus-noise trade-off frontier afforded
by a particular property value estimation technology (or methodology). This
trade-off frontier is concave because of the Square Root of N Rule, while the
sample size is directly proportional to the temporal lag in the value estimation
methodology (recall that L = N/n − 1). Points northeast of the trade-off frontier
are not technically feasible.
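To see why the frontier is concave, hold the transaction density n fixed and trace noise against lag as the comp sample grows; the per-period figures below are illustrative assumptions, not values taken from the paper:

```python
import math

sigma, n = 0.10, 1          # illustrative: 10% price noise, 1 transaction/period
for N in (1, 2, 4, 7, 14):  # growing comp sample
    noise = sigma / math.sqrt(N)   # random error falls only with sqrt(N)...
    avg_lag = (N / n - 1) / 2      # ...while mean lag L/2 grows linearly with N
    print(f"N={N:2d}: noise={noise:.3f}, average lag={avg_lag:.1f} periods")
```

Each doubling of N buys a smaller noise reduction but costs proportionally more lag, which is exactly the concavity of T.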
In the present context, the term "value estimation technology" refers to whether
the value estimate is appraisal-based or transactions-based, the latter being a
regression-based procedure such as hedonic or repeat-sale models (sometimes
referred to as "mass appraisal" or "automated valuation" models). Of course,
both traditional appraisals and transactions-based regression models are based
on transaction price observations, so this labeling is somewhat imprecise.
Nevertheless, in the context of commercial property index design, these labels are
helpful, as indices can be based either on the aggregation of individual traditional
appraisals within the subject population of properties, or based on transactions-
based regression models of the population's aggregate value. The key difference
between appraisal-based models and regression-based models is in how many
transaction observations are used, and how they are used.
For a given reporting frequency and a given definition of the property market
represented by the index (i.e., for a given underlying population of properties),
the optimal transaction sample size and implied optimal lag for the indicated index
technology is found by the point of tangency between the noise-versus-lag trade-
off frontier and the index utility isoquants, as indicated at point "A" in Exhibit
4a. This is the point that maximizes the utility of the index. This implies an
optimal estimation sample size of NA, which results in purely random variance in
the index value estimates of σ²/NA, and a maximum lag in the transaction
observation sample of LA = NA/n − 1, with a resulting lag bias of LA/2 periods.
Not all research applications of indices place the same relative importance on
minimizing lag bias versus random error. For example, a study that is primarily
interested in the long-term mean investment return achieved by the subject
population of properties might not be very sensitive to a few periods of temporal
lag bias. In contrast, a study of high frequency correlation between real estate and
stock market returns would be very sensitive to even a small amount of temporal
lag bias, but would be less affected by random error.15 For this reason, the
objective function for the index may differ according to usage. As an example,
Exhibit 4b displays the same index construction technology as in Exhibit 4a, but
depicts an index usage utility function that places greater value on minimizing
[Exhibit 4b: Greater Utility for an Up-to-Date Valuation. Figure: the same trade-off frontier T, now tangent to more horizontally tilted isoquants U0 < U1 < U2 at point B, which has lag bias LB/2 and random error variance σ²/NB.]

temporal lag bias. The isoquants are now shallower, tilted more horizontally to
skew the trade-off preference toward minimizing temporal lag bias. The result is
a different optimal point on the trade-off frontier, point "B", that specifies a
smaller optimal estimation sample size, NB < NA, resulting in less average lag,
LB/2 = (NB/n − 1)/2 < (NA/n − 1)/2, but with more purely random error.
The minimization of the MSE performed in the previous section may be viewed
as a particular utility function, which reflects a particular weighting of lag bias
error and purely random error. In particular, it weights these two forms of error
equally. Minimization of the MSE (or, equivalently, of the RMSE) is not
necessarily the most appropriate objective function for all uses of a research index,
but it is a widely used objective function in statistics. Exhibit 5 quantifies a
particular optimization based on the minimum-MSE objective function and the
stylized model of the previous section. Recall from Exhibit 2 that with one
transaction price observation available per period (n = 1) and a cross-sectional-
to-longitudinal dispersion ratio of (σ/σr) = 4, the optimal maximum lag in the
sample is LA = 6 (first row, first column in Exhibit 2). This, in turn, implies an
optimal transaction sample size of NA = n(LA + 1) = 1(6 + 1) = 7. In this situation,
if it is assumed that the random error standard deviation is σ = 10%, then the
RMSE (or standard total error percent) is 5.09% based on Equation 4b.16
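The 5.09% figure can be verified directly. The exact form of Equation 4b is not shown in this excerpt, so the error decomposition below is a reconstruction consistent with its reported values (random variance σ²/NA plus the lag-bias variance of comps spread over lags 0 through LA), with σr = σ/4 = 2.5% per quarter:

```latex
\mathrm{RMSE}
  = \sqrt{\frac{\sigma^2}{N_A} + \sigma_r^2\,\frac{\sum_{m=1}^{L_A} m^2}{(L_A+1)^2}}
  = \sqrt{\frac{(0.10)^2}{7} + (0.025)^2 \cdot \frac{91}{49}}
  = \sqrt{0.00143 + 0.00116}
  \approx 5.09\%
```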


[Exhibit 5: Optimal Valuation Methodology for Traditional Appraisal. U = MIN[MSE], n = 1, (σ/σr) = 4: LA = 6, NA = 7. E[lag] = 3 periods; Random Error Variance = σ²/7. RMSE = 5.09% if σ = 10%. Figure: lag bias in periods against random error (as a fraction of σ², running from 1 through 1/2 down to 1/7); the optimum is at tangency point A, with a lag bias of 3 periods and U = MIN[RMSE] = 5.09%.]

Now consider the difference between an index based on the stylized appraisal-
based value estimation technology versus one based on the stylized
transactions-based technology. Traditional appraisals of commercial properties
consider only a few comparable sales transactions (comps), spanning perhaps
a year or more in time. In contrast, transactions-based index regressions typically
employ sample sizes of at least a couple dozen transaction price observations
within each index reporting period. By taking as the objective the valuation of a
given subject property, the appraiser limits the relevant transaction sample to only
those properties that are very similar in structural and locational characteristics to
the subject property. The result is a sample density, labeled n (the number of
available relevant transaction observations per unit of time) that is inevitably very
small. In contrast, by taking as the objective the valuation of an aggregate subject
population of properties (the market or asset class that the index is meant to
track), the regression (mass appraisal) methodology is able to expand the
relevant transaction sample density many-fold, to a much larger value of n.
For example, a traditional appraisal of a specific Class A office building in
downtown Boston might be able to find, say, one comp per calendar quarter. Thus,
nA = 1. As seen in Exhibit 5, if the ratio of cross-sectional to (quarterly)
longitudinal dispersion is 4, then an appraiser seeking to minimize the MSE of a
valuation would select an optimal sample size of NA = 7 comps, spanning seven
quarters, including one comp from six quarters ago (LA = 6). These selections
would give the appraisal a temporal lag bias of LA/2 = 6/2 = 3 quarters. As
previously noted, if the cross-sectional transaction price dispersion is σ = 10%,
this will give the appraisal an RMSE of 5.09%.17 This methodology may indeed
be optimal for the valuation of a single individual office property in Boston.
However, if the "market" (the property population) of interest is not just that of
a single office building in Boston, but rather that of all of the downtown Class A
office buildings in the 35 largest U.S. cities (roughly, the "major league" cities
in which most traditional core institutional investment has occurred), then
instead of n = 1, n = 35.
One way to estimate the value of such a population would be to construct an
appraisal-based index, in which the traditional appraisals of the Boston property
are averaged with 34 other similar traditional appraisals of 34 other similar
properties in the other 34 cities. The result would be, effectively, a sample size
of 245 (35 × 7). The purely random component of the valuation error would be
reduced by a factor of the square root of 35, from σ/√7 to σ/√245, which is
indicated in Exhibit 6. Constructing such an appraisal-based index of U.S. Class
A downtown office property values is represented in Exhibit 6 by the movement
from point "A" to point "B". Assuming σ = 10%, the purely random error
component will be reduced from 3.77% (10%/√7) in the individual appraisals
down to 0.64% (10%/√245) in the aggregate index. This reduction in random
noise is obtained automatically by virtue of the aggregation of the subject
population of properties being valued. However, the index will still have the
typical appraisal lag of three quarters. Given the assumptions of a cross-sectional/
longitudinal dispersion ratio of 4 and cross-sectional dispersion of σ = 10%, the
market volatility is σr = 2.5% per quarter, which would cause the three-quarter lag
bias to induce a lag error component of 3.41% in the value estimation.18 This
gives the appraisal-based index a standard error of 3.47% (RMSE =
√(0.0064² + 0.0341²)). Although this is better than the standard error of 5.09% in
the individual appraisals, it is not optimal by the MSE-minimization criterion.
Referring back to Exhibit 2, the optimal lag with n = 35 is not six quarters, but
rather zero. Moreover, the optimal sample size is 35, not 245. Such an optimal
valuation procedure is represented by point "C" in Exhibit 6. This would be the
type of valuation methodology typically employed in a transactions-based index
regression model, such as a hedonic or repeat-sales model. By expanding the
cross-sectional scope of the population of interest (from just Boston to the whole
country) within the valuation exercise itself, the valuation procedure can be
tailored to such an aggregate valuation problem, enabling the regression-based
methodology to eliminate completely the lag bias. The standard error of such a
transactions-based index, based on the MSE-minimizing optimal procedure of
point "C", is 1.69% (10%/√35), all of which is purely random noise. Thus, a


[Exhibit 6: Optimal Methodology for a Transactions-Based Index. U = MIN[MSE], n = 35, (σ/σr) = 4: LC = 0, NC = 35. E[lag] = 0 periods; Random Error Variance = σ²/35. RMSE = 1.69% if σ = 10%. Figure: on the aggregate frontier TAgg the optimum is a corner solution at zero lag, point C (UC: RMSE = 1.69%); the disaggregate frontier TDis contains point A (UA: RMSE = 5.09%), and averaging the individual appraisals moves the index to point B (UB: RMSE = 3.47%) at a lag bias of 3 periods.]

quantitative comparison of the appraisal-based versus the transactions-based index
construction methodologies in this example is as shown in Exhibit 7.
Note that while the transactions-based index is clearly superior in both the total
error and the lag bias dimensions, the appraisal-based index is superior in the

Exhibit 7: Example Comparison of Appraisal-Based vs. Transactions-Based Aggregate Property Value Index

Error Type               Appraisal-Based Index   Transactions-Based Index
Purely Random (noise)    0.64%                   1.69%
Temporal Lag Bias        3.41%                   0
Total                    3.47%                   1.69%

Note: Error components measured in square root of variance.
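The entries in Exhibit 7 follow from the same stylized error model; a quick numerical check (the lag-error formula here is a reconstruction that matches the paper's 3.41% figure, not an equation quoted from it):

```python
import math

sigma, sigma_r = 0.10, 0.025      # 10% price noise; 2.5% quarterly volatility
n, L = 35, 6                      # appraisal-based: 35 comps/quarter, lags 0..6
N = n * (L + 1)                   # 245 observations in the stale average

noise = sigma / math.sqrt(N)                                             # 0.64%
lag = sigma_r * math.sqrt(sum(m**2 for m in range(1, L + 1))) / (L + 1)  # 3.41%
appraisal_total = math.sqrt(noise**2 + lag**2)                           # 3.47%
transactions_total = sigma / math.sqrt(n)                    # zero lag: 1.69%

print(f"{noise:.2%} {lag:.2%} {appraisal_total:.2%} {transactions_total:.2%}")
```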


random error dimension. This is because in the present model, the individual
appraisals have less noise than individual transaction prices.19 In a population
where the properties are reappraised more frequently than they tend to transact,
another reason for such a result would be the larger cross-sectional sample size
of appraisals compared to transactions during each index reporting period.
Although the advantage of an appraisal-based index in this regard should not be
ignored, it is important to realize that many research questions are more sensitive
to temporal lag bias than to purely random error, particularly when the random
error is already low, as would be the case when the underlying property population
is sufficiently large to provide a large transaction sample each period.
Exhibit 6 reveals that the valuation procedure for index construction can be
conceptualized as an expansion of the noise-versus-lag trade-off frontier, from the
traditional appraisal-based frontier oriented toward disaggregate valuation, labeled
"TDis", to the transactions-based regression procedure frontier aimed at aggregate
valuation, labeled "TAgg". Indeed, this frontier is not only expanded, but its shape
is changed. For estimating aggregate market value, the feasible noise-versus-lag
trade-off frontier becomes much steeper and straighter as it is pushed out to the
right toward the σ² = 0 boundary in the noise dimension (as it is impossible to
have a variance less than zero). Although this frontier is still concave in shape, it
is effectively kinked at the lag = 0 boundary in the vertical dimension. The
result is that, even with the MSE-minimization utility function (which does not
value lag bias error minimization any more than random error minimization), the
optimum in aggregate valuation will often be a corner solution at zero lag. This
important general result is shown in Exhibit 8.20

Summarizing the Model's Implications
The stylized model described in the preceding two sections is clearly a
simplification of reality. For example, appraisers do not mechanically estimate
values by taking a simple average of the prices of the comps they use, and they
do not ignore changes in the market since the times when the comps transacted.
Nevertheless, appraisers tend to be conservative (and usually rightfully so) when
deviating from the direct market evidence presented by sales of comparable
properties. The model captures the essence of the distinction between basing a
real estate performance index on the aggregation of traditional individual property
appraisals versus constructing the index directly from transaction prices in a
regression-based procedure. With this in mind, the key policy implications from
the preceding analysis can be summarized as follows.
Consider a population of properties that are regularly appraised (marked to
market) and among which there are transactions in every period. However, the
transactions are less frequent than the appraisals; for example, the properties might
be reappraised annually, but transact on average only every 5 to 10 years. For a
research index meant to track the investment performance of such a population
over time, the following general conclusions apply:


[Exhibit 8: Noise-vs.-Lag Trade-off Frontiers with Disaggregate (Traditional Appraisal) and Aggregate (Transactions-Based Regression) Valuation Methodologies. Figure: the aggregate frontier TAgg lies outside the disaggregate frontier TDis and is kinked at the lag = 0 boundary, producing a corner-solution optimum at zero lag, point C; points A and B lie at a lag bias of LA/2, with random error (as a fraction of σ²) of 1/NA, 1/NC, and 1/(Q·NA), respectively.
NA = Disaggregate (Appraisal) Optimal Sample Size (# comps) = nA(LA + 1), nA = comps density per period;
NC = Aggregate (Transactions-Based Regression) Optimal Sample Size (obs. per period);
Q = Number of Properties (or market segments) in the Index Population.]

- For research that is highly sensitive to temporal lag bias, but less sensitive to random error in the index returns, transactions-based indices are preferable to appraisal-based indices, because of the temporal lag bias inherent in appraisal-based returns.
- For research that is highly sensitive to random error, but not very sensitive to temporal lag bias, an appraisal-based index may be preferable to a transactions-based index because of the greater frequency of the appraisals in the population and (possibly) because the appraisals may be less noisy than the transaction prices.
- For research that is equally sensitive to lag bias error and random error (e.g., as represented by the MSE-minimization objective for the index), transactions-based indices are preferable to appraisal-based indices. Except for indices tracking small populations of properties where transaction density is less than two or three dozen observations per index reporting period, transactions-based indices minimizing the MSE criterion can be produced with no temporal lag bias.

In addition to the above general conclusions, the analysis here suggests an
intriguing possibility. A non-traditional appraisal-based index can be imagined
that might include the major advantages of both the traditional appraisal-based
index and the transactions-based index. If the regular property appraisals in the
population could be made independent of any lagged property market information,
they would not have the lag bias problem, yet their greater frequency than the
transactions within the subject population of properties would increase the
effective sample size available to construct the index. To accomplish such a result,
new appraisers would probably have to be hired for each reappraisal, and not
provided with the previous appraisal report, such that each appraisal would be
truly independent of any prior appraisal.21 Furthermore, the appraisers would have
to be instructed to only use comparable sales from the current index reporting
period. Such appraisals would violate traditional appraisal practice guidelines, and
would no doubt not be optimal for individual property value estimation, but when
aggregated into an index, they would provide a more effective source of data for
estimating the population market value.22

Reporting and Property Revaluation Frequency for Appraisal-Based Indices
Although the above analysis suggests that transactions-based indices will most
often be optimal for research purposes, appraisal-based indices will likely remain
an important source of research information for property populations that are
regularly marked to market. There is, however, an important aspect of appraisal-
based indices that affects their temporal lag bias and index reporting frequency in
a manner not considered in the previous model. In particular, appraisal-based
indices present the question of the reappraisal timing policy within the index
subject property population. In general, it is not necessary that the index reporting
frequency in an appraisal-based index must equal the property revaluation
frequency of the constituent properties in the population. For example, an index
could report quarterly returns when its properties are being revalued annually.
However, there are some important trade-offs and methodological issues that
should be considered in addressing this question.
Suppose you have a population of properties, each of which is reappraised once
per year. If all properties are reappraised as of the same time (e.g., as of December
31), you can produce an annual index that will report up-to-date annual returns
based on the entire population of property revaluations. That is, there will be no
lagging or smoothing in the index relative to contemporaneous appraised values.23
With this annual-common-date appraisal procedure, you will not be able to
produce a higher frequency index because no intermittent valuation observations
will be available.
Assume, on the other hand, that appraisals are still performed once-per-year at
the individual property level, but staggered throughout the year in the population


of properties. For example, some properties are on a January reappraisal cycle,
some are on a February cycle, and so forth. In this case, an index with greater
than annual frequency could be produced. However, if such an index is constructed
as a simple aggregate of all the properties' reported values each quarter, the index
will suffer from a temporal lag bias due to a large proportion of effectively stale
valuation reports in each quarter. Properties reappraised in January will still be
entering the index the following December at values equal to their previous
January valuation. This stale appraisal effect would result in temporal lag bias in
the index even if the individual appraisals had no lag bias when they were
conducted on each property. The lag bias in the index would be greater, measured
in average number of reporting periods of lag, the greater the reporting frequency.
For example, with annual property-level reappraisals, the average stale appraisal
lag in an annual index would be one-half period, two periods in a quarterly index,
and six periods in a monthly index (i.e., one-half year in all cases).
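The half-year figure follows from simple averaging: with reappraisal dates staggered uniformly through the year, the most recent appraisal is on average about six months old at any report date. A minimal check:

```python
# Staggered annual reappraisals: the latest appraisal is ~6 months old on average,
# so the stale-appraisal lag in reporting periods depends only on the period length.
mean_age_months = 6.0
for periods_per_year, label in ((1, "annual"), (4, "quarterly"), (12, "monthly")):
    months_per_period = 12 / periods_per_year
    lag_periods = mean_age_months / months_per_period
    print(f"{label:9s} index: average stale-appraisal lag = {lag_periods} periods")
```

This prints 0.5, 2.0, and 6.0 periods, matching the half-period, two-period, and six-period figures above.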

Mitigating the Stale Appraisal Problem

The problem of stale appraisals may not be solved by simply requiring all data-
contributors to fully reappraise their properties each period. In practice, such brute-
force techniques may work fairly well at the annual frequency, for example, in
Britain with the IPD Annual Index. But there is some evidence that the brute-
force technique does not work well at quarterly or greater frequencies. For
example, the IPD Monthly Index in Great Britain shows signs of temporal lag
bias of the type expected from a stale appraisal effect. The brute-force technique
has trouble working at quarterly or monthly frequencies for two reasons: expense,
and lack of sufficient current real estate market information.
Data contributors to appraisal-based indices often try to mitigate the stale appraisal
problem by updating their valuation reports between major annual reappraisals.
But in order for an appraisal update to be temporally independent, it must be
completely independent of the previous value report (or any prior value report)
for the subject property. This is not merely an independent valuation report in
the legalistic sense that an external fee appraiser signs off on the update. Temporal
independence for the purpose of avoiding the stale appraisal effect requires that
the appraiser estimating the updated value be entirely unaware of any previous
value estimates for the subject property. Superficial updating of the value reports
(so-called "desktop appraisals") between the annual full (or "clean-slate")
reappraisals will be inadequate, because they will inevitably rely heavily on prior
valuation reports.
Fundamentally, there is often too little purely contemporaneous valuation
information, within the current quarter or month, for the appraiser to get a
sufficiently precise indication of the subject property's value. Unless the property
is a very common or homogeneous type that trades frequently, there may be little
or no transaction price evidence available purely from the current period. This
makes it extremely difficult for the appraiser to avoid the use of stale information.
This is a bigger problem the shorter is the time interval between valuation updates.

It is possible to produce a return index whose reporting frequency is greater than
its reappraisal frequency without suffering from the lagging caused by the stale
appraisal effect. The most widely-used procedure to accomplish this is the
appraisal effect. The most widely-used procedure to accomplish this is the
repeated-measures regression (RMR). This technique uses the entire available
sample of serious reappraisals in the index population. The RMR procedure
enables all the properties in a regularly marked-to-market population to be
reappraised, say, annually (or even less frequently), yet the index could be reported
on a quarterly (or more frequent) basis and not suffer from a stale appraisal effect,
as long as each index reporting period contains a sufficient number of serious
reappraisals. Typically, a few dozen repeat-valuation observations per index
reporting period are sufficient. Under some traditional statistical assumptions, it
can be shown that the RMR is the most powerful method for making such
inferences (see Bailey, Muth, and Nourse, 1963).
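A minimal sketch of the repeated-measures idea, in the spirit of Bailey, Muth, and Nourse (the data, function name, and period structure here are illustrative assumptions, not the estimator any index provider actually uses): each pair of serious valuations of the same property contributes one observation, the log value ratio, which is regressed on dummies for the periods it spans.

```python
import numpy as np

def rmr_returns(pairs, num_periods):
    """Repeated-measures regression: each pair is (first_period, second_period,
    log_value_ratio); returns the estimated log return for each period."""
    X = np.zeros((len(pairs), num_periods))
    y = np.zeros(len(pairs))
    for i, (t1, t2, log_ratio) in enumerate(pairs):
        X[i, t1:t2] = 1.0            # dummy = 1 for each period the pair spans
        y[i] = log_ratio
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta                      # per-period log returns

# Illustrative noiseless pairs consistent with true returns [1%, 2%, -1%, 3%]
pairs = [(0, 1, 0.01), (0, 2, 0.03), (1, 3, 0.01),
         (2, 4, 0.02), (0, 4, 0.05), (3, 4, 0.03)]
returns = rmr_returns(pairs, 4)
```

Because each pair needs only two valuations, annual reappraisals staggered across properties can still identify quarterly returns, provided enough pairs span each period.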

Indexes to Track Different Types of Property Populations
The above analysis is oriented to the type of index design problem where the
index is meant to track a population of properties that is owned by a particular
class of investors who require that the properties be regularly marked to market.
This is true of the NCREIF Index in the U.S. and the IPD Index in Great Britain.
In such populations, appraisals are available for index construction, and they are
more frequent than transaction prices. However, if we take the population of all
commercial properties of a given type and/or geographic location, appraisals will
often not be available for most properties in the population, and/or appraisals will
often be less frequent than transactions. For indices designed to track such
populations or markets, transactions-based indices have even more advantages.
Indeed, in such circumstances, transactions-based indices may be the only
solution, and the emphasis should be on using statistically rigorous techniques
such as hedonic or repeat-sales regressions rather than simple median price indices
that do not control for quality differences in the properties that transact each
period.
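To illustrate why a regression controls for quality where a simple median cannot, consider a toy hedonic example (all names and numbers here are illustrative assumptions): regress log price on a quality attribute plus time dummies, and read the constant-quality index off the dummies.

```python
import numpy as np

# Toy data: log price = 0.9*log_size + the market price level for the period
log_size  = np.array([7.0, 8.0, 7.5, 8.5, 7.2, 8.2])
period    = np.array([0,   0,   1,   1,   2,   2])
log_price = 0.9 * log_size + np.array([0.00, 0.00, 0.05, 0.05, 0.12, 0.12])

num_periods = 3
X = np.zeros((len(period), 1 + num_periods))
X[:, 0] = log_size                             # quality covariate
X[np.arange(len(period)), 1 + period] = 1.0    # time dummies (period intercepts)
beta, *_ = np.linalg.lstsq(X, log_price, rcond=None)
index = beta[1:] - beta[1]                     # cumulative log index, base period 0
```

The recovered constant-quality index tracks the true market levels even though the mix of property sizes transacting each period changes; a median of raw prices would confound the market movement with that changing mix.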
Another type of aggregate property value estimation process is represented by the
estimation of net asset value (NAV) for real estate investment trust (REIT)
properties. The process of regular estimation of REIT NAV by stock market
analysts who follow the REIT industry is akin to the process involved in
transactions-based index estimation in that NAV estimates are aggregate (or mass
appraisal) valuations. In other words, NAV estimation is oriented toward
optimizing the value estimate of a portfolio of properties, namely all those owned
by a given REIT. However, the actual NAV estimation process is less formal and
less based on econometric techniques than transactions-based index estimation.
An intriguing possibility in NAV estimation is the role played by the subject
REIT's share price on the stock exchange. There is considerable evidence that
REIT share prices lead the property market in time.24 The NAV analysts attempt
to apply cap rate information from relevant property markets to NOI information


from the REIT's own reported accounts. However, in the numerous judgments that
must be made in the NAV valuation process, it is likely that analysts are influenced
by the availability of a prominent, very appropriate "comp", i.e., the REIT's
current stock price. The influence of the REIT share price, as a leading indicator
of the actual relevant property market value, would act to make the NAV estimate
less lagged than would be a typical property appraisal. This is combined with the
fact that, as noted in the previous analysis, optimal mass appraisals (which is what
NAV estimates essentially are) have less lag than traditional individual property
appraisal. The implied result is that NAV estimates, though informal and therefore
difficult to rigorously compare to econometrically-based valuation methods, may
provide a rather accurate and up-to-date estimate of their subject property portfolio
values. Clayton (2004) provides some evidence that this is indeed the case.

Property Sampling Issues in a Transactions-Based Research Index
Indices of property populations that are regularly marked to market can be quite
useful for research purposes if optimally designed for research, as described above.
Nevertheless, it is also desirable to develop indices of broader populations of
properties, and for specific commercial property markets. Therefore, this section
presents some basic considerations in the design of such indices. In particular,
what is the optimal population size and scope for a transactions-based research
index?
Statistical theory holds that the power and usefulness of a sample drawn from
a large population is a function of three characteristics:
1. How much random dispersion exists in the underlying population;
2. The absolute size of the sample (not the percentage of the population
included in the sample); and
3. The representativeness of the sample (i.e., the lack of bias in its
collection).
As noted previously, the effect of sample size on statistical power is a function
of the square root of the number of independent observations in the sample.
Because the square root is a concave function, declining benefits to scale exist for
sample size increments, other things being equal. However, there may also be
economies of scale in collecting data when tracking commercial property markets,
at least up to a point. Thus, the benefit/cost ratio of expanding the sample size
may be more favorable than implied by the Square Root of N Rule.
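The diminishing benefit of sample size is easy to see numerically (assuming, as elsewhere in the discussion, 10% dispersion in individual observations):

```python
import math

sigma = 0.10                      # dispersion of individual observations
for N in (25, 100, 400, 1600):
    print(f"N={N:5d}: standard error = {sigma / math.sqrt(N):.3%}")
```

Each quadrupling of the sample only halves the standard error (2%, 1%, 0.5%, 0.25%), which is the declining-benefits-to-scale property of the Square Root of N Rule.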
Because of declining returns to scale, it may be wasteful to attempt to include all
or even a large fraction of a large population in a statistical sample to be used for
research purposes. Rather, if the population is diverse and complex, it may be
more efficient to stratify the sample. Sample stratification refers to the
identification of strata or "cells" consisting of sub-populations or segments of
the overall population that are relatively homogeneous. For example, one cell or
stratum could be office properties in Florida. It is necessary to sample from each
cell, but within each cell the Square Root of N Rule applies.
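As a sketch of what stratification implies for sample design (the cell names and within-cell dispersion figure here are hypothetical, chosen only to mirror the stylized numbers used in this paper):

```python
import math

sigma = 0.10        # assumed within-cell transaction price dispersion
target_se = 0.025   # assumed acceptable standard error per cell estimate

# The Square Root of N Rule applies within each cell, so each cell needs
# enough observations on its own: sigma / sqrt(n) <= target_se.
n_min = math.ceil((sigma / target_se) ** 2)

cells = ["office_FL", "retail_FL", "office_TX"]  # hypothetical strata
plan = {cell: n_min for cell in cells}
print(plan)  # each cell must be sampled independently of the others
```

The point of the sketch is that the minimum sample size is a per-cell requirement; a large aggregate sample cannot compensate for an empty stratum.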
How big a property sample would be necessary for a transactions-based research
index of U.S. commercial property investment performance? The answer depends
on a number of considerations. At a general level, the most important include:
1. What types of research questions will the index be optimized to address?
2. How much random dispersion exists in the underlying population
regarding the variables that are important for addressing the target
research questions?
3. How much standard error (e.g., in terms of a 95% confidence interval
around the best estimate) is acceptable in the index reports?
As suggested previously, there is evidence that cell sample sizes as low as two
dozen transaction observations per index period, and sometimes even less, are
sufficient to produce useful return indices in a commercial property market when
regression analysis is combined with noise-filtering techniques such as the
Bayesian use of ridge regression (see, for example, Geltner, 1997a; Gatzlaff and
Geltner, 1998; and Geltner and Goetzmann, 2000).
To investigate how to determine the minimum necessary cell sample size, consider
a simple stylized example, building on the previous model. Assume as before that
the random noise in transaction prices (ε) has a standard deviation (σε) of 10%.
Now assume that the true market return volatility (σr) is 5% per index reporting period.
Assume also that the research index periodic return will have a standard error
equal to no more than one-half the true periodic return volatility. Thus, the sample
needs to be designed so that the index returns will have a standard error of 2.5%
each.25
As before, the standard error of the index value level is σε/√NC, where NC is the
transaction sample size within each cell.26 Because returns are first differences of
value levels, the index return standard error is equal to the square root of twice
the square of the index value level standard error; that is, √[2(σε/√NC)²]. Setting
this equal to the 2.5% target and solving for NC, the equation becomes:

NC = 2σε²/[(λσr)²/2] = 2(10%)²/[(2.5%)²/2] = 64,   (7)

where λ is the fraction of the true return volatility within which the index return
standard error should fall. Said differently, the index requires a cell sample size of 64
transactions per period per cell. This result holds regardless of how many

JRER, Vol. 28, No. 4, 2006
properties are in the underlying population within the cell and would apply with
classical regression technology.27 The Bayesian noise-filtering technique noted
previously can often reduce this sample size requirement by half or more.28
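The arithmetic of Equation 7 can be checked directly (a sketch using the stylized parameter values in the text; the variable names are ours):

```python
sigma_eps = 0.10  # std. dev. of random transaction price noise
sigma_r = 0.05    # true market return volatility per index period
lam = 0.5         # index return standard error as a fraction of sigma_r

# Equation 7 in the text: N_C = 2*sigma_eps^2 / ((lam*sigma_r)^2 / 2)
n_c = 2 * sigma_eps ** 2 / ((lam * sigma_r) ** 2 / 2)
print(round(n_c))  # 64 transactions needed per period per cell
```

Halving the tolerated standard error fraction (lam) quadruples the required cell sample size, reflecting the Square Root of N Rule.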
This type of analysis can be used to determine how many cells, or property market
segments, on which the index can report; that is, how finely the overall
commercial property asset class can be carved up for research purposes. For
example, suppose the total population of U.S. commercial properties consists of
1,000,000 properties, which transact, on average, once every five years. This
provides an average of 200,000 transactions per year, or 50,000 per quarter. If a
minimum cell size is NC = 50 (which is a bit conservative), this implies that the
overall commercial real estate market (or asset class) in the U.S. can be broken
into 1,000 cells, or market segments, that can be included in a quarterly index.
For example, assume the overall market can be usefully classified into, say, five
property sectors (office, retail, etc.), with an average of three size or quality
categories per sector (e.g., Class A, B, etc., or, alternatively, below $5 million;
$5–$50 million; and greater than $50 million). Assume also that there are 35
metropolitan areas or geographic regions of interest. This would imply a total of
525 cells (5 × 3 × 35), well within the 1,000-cell limit imposed by the accuracy
requirements for the index. However, some gerrymandering of cell definitions
would normally be necessary to achieve sufficient sample size in all cells (e.g.,
requiring the combination of some smaller market segments into composite cells).
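A back-of-the-envelope sketch of the cell-count arithmetic just described:

```python
population = 1_000_000   # stylized count of U.S. commercial properties
years_between_sales = 5  # average holding period between transactions
min_cell_size = 50       # minimum transactions per cell per quarter

sales_per_year = population / years_between_sales   # 200,000
sales_per_quarter = sales_per_year / 4              # 50,000
max_cells = sales_per_quarter / min_cell_size       # 1,000 cells supportable

sectors, size_classes, metros = 5, 3, 35
proposed_cells = sectors * size_classes * metros    # 525 proposed cells
print(max_cells, proposed_cells, proposed_cells <= max_cells)  # → 1000.0 525 True
```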
Moving up the strata to define, for example, national property sector indices, the
relevant sample size increases, leading to greater statistical power and less need
to gerrymander the cells. In the previous example with 525 cells of 50 observations
each, the resulting 26,250 (525 × 50) overall transaction sample per quarter would
include an average of 5,250 (26,250/5) observations per quarter, per property type
sector. This far exceeds the number of observations necessary to provide a highly
accurate return index. In order to aggregate properly across cells, however, the
relative sizes of each market segment must be known as a proportion of the total
population. Thus, scientific sample stratification requires not only a determination
of the minimum sample size in each cell, but the relative magnitude of the
population within each cell. For example, we would like to know the relative
market value of each MSA, each property type within each MSA, and each sub-
type or size category within each type. In general, it is considered acceptable to
estimate the relative cell sizes without recourse to an entire population census.
Nevertheless, some method of surveying or otherwise estimating the relative cell
magnitudes would need to be applied.
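A minimal sketch of value-weighted aggregation across cells (the cell labels, returns, and weights are entirely hypothetical; in practice the weights would come from the survey-based estimates of relative cell magnitudes just described):

```python
# Hypothetical quarterly returns estimated for three cells of one sector:
cell_returns = {"office_FL": 0.021, "office_TX": 0.014, "office_CA": -0.003}

# Hypothetical shares of total population market value (must sum to 1):
value_weights = {"office_FL": 0.55, "office_TX": 0.30, "office_CA": 0.15}

# Value-weighted sector return: each cell contributes in proportion to
# its share of the population's market value, not its sample size.
sector_return = sum(value_weights[c] * cell_returns[c] for c in cell_returns)
print(f"{sector_return:.4%}")  # → 1.5300%
```

Without the population value weights, an equally weighted average of the cell returns would misstate the sector return whenever cell sizes differ.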
Such a stratified index can be designed to be far more useful than an index with
a larger overall sample size but where the observations are drawn randomly from
the population, with over-representation from some cells and no representation at
all from others.
Conclusion
Since the founding of NCREIF almost three decades ago, statistical methodologies
useful for investment performance index construction have advanced dramatically,
notably including developments such as the repeated-measures regression (RMR)
and related noise-filtering techniques. In recent years, electronic databases of
commercial property prices have been developed that far exceed the quality and
coverage of those databases available only a few years ago. Together, these two
developments offer capabilities for transactions-based indices, mass appraisal, and
other tools that could be very useful for improving real estate research indices.
Real estate indices developed or improved going forward should strive to
incorporate these modern standard statistical methodologies and databases.29
This paper has surveyed some of the major technical issues in the design and
construction of such modern real estate research indices, both appraisal-based and
transactions-based. It has discussed property sampling issues, differences between
transaction prices, market values, and appraised values, the trade-off between
random measurement error and temporal lag bias, optimal reporting and property
revaluation frequencies, and the uses and limitations of modern statistical theory.
Although one of the conclusions suggested by the analysis is that most research
questions are best addressed with transactions-based, rather than appraisal-based,
indices in the U.S. (largely because of the temporal lag bias problem in appraisal-
based indices), this paper discusses how appraisal-based indices can still be useful
for some research purposes, especially if they are optimized for such purposes.
Clearly, great benefit can be obtained from expanding the property universe
coverage in developing new research-oriented indices going forward. In this
regard, the paper discusses how stratified statistical sampling techniques can be
used to facilitate the most efficient and effective expansion of asset class coverage.
Finally, as the U.S. real estate investment industry continues to improve the
quantity and quality of indexing/benchmarking information products, it should
carefully guard against the use of, or reliance on, "black boxes." To increase
the credibility of the private real estate asset class, and to ensure continued
improvement of the index products, it is vitally important that proprietary
production techniques and methodologies used in the development and production
of index products be fully available to public and academic scrutiny and criticism
(while nevertheless protecting the confidentiality of proprietary data).

Endnotes
1
In an appraisal-based index, revaluation frequency refers to how often the properties
composing the index are reappraised. In practice, reappraisal has multiple definitions,
ranging from superficial desk-top appraisal updates to full independent fee
appraisals. For the purpose of determining or understanding the important dynamic
characteristics of an index, superficial reappraisals do not have much effect. As a result,

the term revaluation or reappraisal frequency will be used to refer to serious
reappraisals, which generally means more than just a cursory update of value.
2
Early treatments of appraisal lag (or smoothing) include Brown (1985) and Blundell
and Ward (1987). For articles describing empirical or clinical evidence of temporal lag
bias, see Chaplin (1997), Chinloy, Cho, and Megbolugbe (1997), Diaz and Wolverton
(1998), Hamilton and Clayton (1999), Fisher and Geltner (2000), and Fu (2003). A
theoretical challenge to the appraisal lag bias hypothesis was presented by Lai and Wang
(1998).
3
A representative property is defined as a property whose market value is a constant
fraction of the population market value.
4
For now, assume that market value is constant within any period t, or the only interest
is the average market value within each period (and that transaction observations are
uniformly likely at any time during each period).
5
Think of all prices and values in logs, so that the additive error here is multiplicative
in straight levels.
6
Strictly speaking, per the previous note, this will be an estimate of the log value, but
this will not affect the key insights available from this model.
7
That is, the log value increments are iid with zero mean. Again, this assumption
simplifies the math without altering the essence of the insight we obtain from the model.
8
That is, the longitudinal standard deviation per period in the true market returns is σr,
or: σr² = VAR[rt].
9
For example, with L = 1, the noise factor is 1/(2n) and the lag factor is 1/4. With
L = 2, the noise factor is 1/(3n), or two-thirds of the noise factor with L = 1 (with the
actual value of this incremental reduction in the factor being inversely proportional to
n), while the lag factor in the overall error grows to (1²+2²)/(2+1)² = (1+4)/3² =
5/9, which is more than twice the factor with L = 1 and an absolute increment of
0.556−0.250 = 0.306. With L = 3, the noise factor of 1/(4n) is a shrinkage only down
to 3/4 of what it was at L = 2, while the lag factor, now at (1²+2²+3²)/(3+1)² =
14/16 = 0.875, includes a larger increment of 0.875−0.556 = 0.319 over its L = 2
value. This pattern is general, and continues, guaranteeing the existence of a unique,
optimal lag value at some L ≥ 0.
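This pattern can also be verified numerically (a sketch; the values σε = 10%, σr = 2.5%, and n = 1 follow the stylized parameters used in the surrounding endnotes):

```python
# MSE of the lagged value estimate: a noise component that shrinks with
# the averaging window L+1, plus a lag-bias component that grows with L.
sigma_eps, sigma_r, n = 0.10, 0.025, 1

def mse(L):
    noise = sigma_eps ** 2 / ((L + 1) * n)
    lag_bias = sum((s * sigma_r / (L + 1)) ** 2 for s in range(1, L + 1))
    return noise + lag_bias

best_L = min(range(20), key=mse)
print(best_L, round(mse(best_L) ** 0.5, 4))  # → 6 0.0509
```

The interior optimum at L = 6 and the resulting root-MSE of about 5.09% are consistent with the figures cited in endnotes 16 and 17.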
10
The differential in Equation 5 will never exactly equal zero because the domain of
Equation 5 is discrete valued. But it will be negative for small values of L, and then
turn positive at some larger value of L. The optimal (MSE minimizing) value of L is
on the boundary between the last negative and first positive value of the differential (i.e.,
it is one or the other of those two values). The nature of this optimum can be explored
by notionally setting the value of the differential in Equation 5 equal to zero.
11
This model is essentially a discrete time version of Quan and Quigley's (1991)
continuous time optimal appraisal model, except that the Quan-Quigley model was
conditioned on the existence of a prior value estimate by the same appraiser (optimal
updating), whereas the present model is an unconditional optimization, not assuming
any prior value estimation. By working in discrete time, the important index policy
issues can be addressed, such as index reporting frequency and transaction price sample
size. The model here is also very similar to, and the results here are consistent with,
those in Giaccotto and Clapp (1992) and Geltner (1997b). Giaccotto and Clapp explored
a broader range of appraisal rules and allowed for serial correlation in the market value
returns. Geltner (1997b) employed a numerical simulation rather than an analytical
model, had declining appraiser weights on comps with the age of the comp, and also
relaxed the random walk assumption of the underlying true real estate returns. [Also,
Geltner (1997b) considered the use of REIT returns to improve appraisal accuracy.]
12
For the evidence on σε, see: Goetzmann (1993), Diaz and Wolverton (1998), Geltner,
Young, and Graff (1994), Geltner (1998), and Crosby, Lavers, and Murdoch (1998). For
the evidence on σr, see: Fisher, Geltner, and Webb (1994), Fisher, Gatzlaff, Geltner, and
Haurin (2003), and Fisher, Geltner, and Pollakowski (2007).
13
Recall that the model is using log values, so the value error units equate to percentage
errors in straight levels.
14
Note that return standard errors are greater than valuation standard errors: the noise
variance component doubles in the returns compared to the log value estimates, as the
returns are the first differences of the log value levels across time.
15
The covariance between the real estate returns and the stock market, or the beta of
real estate, will be unaffected by purely random error. However, the correlation
coefficient between the two returns series, and the precision in the estimation of the
beta, will be reduced by the presence of noise in the real estate returns, while the real
estate volatility will be increased by purely random error.
16
With (σε/σr) = 4 and σε = 10%, σr = 2.5%. Plugging these values (and N = 7) into
Equation 4b gives:

5.09% = √[(.10)²/7 + Σ_{s=1}^{6} (s(.025)/7)²].
17
Empirical evidence from the NCREIF Index suggests that this is, roughly, a typical
amount of lag and of error in such an appraisal (see Fisher and Geltner, 2000; and
Geltner and Goetzmann, 2000).
18
Substituting σr = 2.5% and LA = 6 into the lag bias error component of Equation 4b,
the root of the lag bias error component becomes:

√[Σ_{s=1}^{6} (s(.025)/7)²] = √[(.00357)²(91)] = 0.0341.

19
In the real world, there is some question whether this is in fact the case. For example,
the studies of appraisal accuracy cited in the earlier endnote suggest typical random
appraisal error magnitudes at least in the 5% to 10% range, which is not much, if any,
below the transaction dispersion magnitude found in repeat-sales regression models of
housing prices (see, e.g., Goetzmann, 1993).
20
Giaccotto and Clapp (1992) found a similar result. However, they characterized the
different appraisal objectives as that between estimating returns versus estimating
values, whereas the focus here is on the difference between estimating values (or,
hence, returns) for individual properties versus aggregate portfolios or indexes of many
properties, or in effect, using transactions-based versus traditional appraisal-based
indices for tracking changes in values.

21
Perhaps this could be accomplished by a rotation of appraisers.
22
Ling, Naranjo, and Nimalendran (2000) propose the use of a latent variable model for
the construction of total return indices that makes use of both appraisal-based and
transaction-based returns.
23
The tendency of the appraised values themselves to exhibit temporal lag bias was
addressed in the previous model.
24
See, for example, Gyourko and Keim (1992), Barkham and Geltner (1995), Giliberto
and Mengden (1996), and Geltner and Goetzmann (2000).
25
With normally distributed errors, this would mean that each periodic return reported by
the index would have about a two-thirds chance of being within 250 basis points of the
true return that period.
26
Note that according to the stylized model discussed previously, NC = nC; the observation
sample size used to construct the index will equal the cell population's available
transaction density per period, as transactions-based indices generally have an optimal
lag of zero. Exceptions to this rule might occur, for example, with a very small
underlying property population size, or with a very short index reporting period, or a
research purpose that places extreme importance on reducing random error while not
caring much about temporal lag bias error.
27
This assumes the cell population is large. If the cell population is very small, then the
necessary 2.5% standard error might be achieved with a smaller sample size.
28
Other recently developed econometric techniques may also be useful in this regard, such
as spatial autocorrelation models.
29
A recent example is the transactions-based version of the NCREIF Index developed and
published by the MIT Center for Real Estate. [See: Fisher, Geltner, and Pollakowski
(2007).]

References
Bailey, M., R. Muth, and H. Nourse. A Regression Method for Real Estate Price Index
Construction. Journal of the American Statistical Association, 1963, 58, 933–42.
Barkham, R. and D. Geltner. Price Discovery in American and British Property Markets.
Real Estate Economics, 1995, 23:1, 21–44.
Blundell, G.F. and C.W.R. Ward. Property Portfolio Allocation: A Multi-Factor Model. Land
Development Studies, 1987, 4:2, 145–56.
Brown, G. The Information Content of Property Valuations. Journal of Valuation, 1985, 3,
350–62.
Chaplin, R. Unsmoothing Valuation-based Indices Using Multiple Regimes. Journal of
Property Research, 1997, 14:3, 189–210.
Chinloy, P., M. Cho, and I.F. Megbolugbe. Appraisal, Transaction Incentives, and
Smoothing. Journal of Real Estate Finance and Economics, 1997, 14, 89–112.
Clayton, J. Tracking Value in the Private Real Estate Market: A Comparison of Green
Street NAV and NCREIF Value Indices. Working Paper, University of Cincinnati, 2004.
Crosby, N., A. Lavers, and J. Murdoch. Property Valuation Variation and the Margin of
Error in the UK. Journal of Property Research, 1998, 15:4, 305–30.
Diaz, J. III and M.L. Wolverton. A Longitudinal Examination of the Appraisal Smoothing
Process. Real Estate Economics, 1998, 26, 349–58.
Fisher, J. and D. Geltner. De-Lagging the NCREIF Index: Transaction Prices and Reverse
Engineering. Real Estate Finance, 2000, 17, 7–22.
Fisher, J., D. Gatzlaff, D. Geltner, and D. Haurin. Controlling for the Impact of Variable
Liquidity in Commercial Real Estate Price Indices. Real Estate Economics, 2003, 31:2,
269–303.
Fisher, J., D. Geltner, and H. Pollakowski. A Quarterly Transactions-Based Index of
Institutional Real Estate Investment Performance and Movements in Supply and Demand.
Journal of Real Estate Finance & Economics, 2007.
Fisher, J., D. Geltner, and R.B. Webb. Value Indices of Commercial Real Estate: A
Comparison of Index Construction Methods. Journal of Real Estate Finance & Economics,
1994, 9:2, 137–64.
Fu, Y. Estimating the Lagging Error in Real Estate Price Indices. Real Estate Economics,
2003, 31:1, 75–98.
Gatzlaff, D. and D. Geltner. A Transaction-Based Index of Commercial Property and its
Comparison to the NCREIF Index. Real Estate Finance, 1998, 15, 7–22.
Geltner, D. Bias & Precision of Estimates of Housing Investment Risk Based on Repeat-
Sales Indexes: A Simulation Analysis. Journal of Real Estate Finance & Economics, 1997a,
14:1/2, 155–72.
Geltner, D. The Use of Appraisals in Portfolio Valuation and Index Construction. Journal
of Property Valuation & Investment, 1997b, 15:5, 423–48.
Geltner, D. How Accurate is the NCREIF Index as a Benchmark, and Who Cares? Real Estate
Finance, 1998, 14:4, 25–38.
Geltner, D. and W. Goetzmann. Two Decades of Commercial Property Returns: A
Repeated-Measures Regression-Based Version of the NCREIF Index. Journal of Real Estate
Finance and Economics, 2000, 21, 5–22.
Geltner, D. and D.C. Ling. Ideal Research and Benchmark Indexes in Private Real Estate:
Some Conclusions from the RERI/PREA Technical Report. Real Estate Finance, 2001,
17:4, 1–12.
Geltner, D. and D.C. Ling. Benchmarking and Return Performance in the Private Commercial
Real Estate Market. International Real Estate Review, forthcoming 2006.
Geltner, D., M. Young, and R. Graff. Random Disaggregate Appraisal Error in Commercial
Property: Evidence from the Russell-NCREIF Database. Journal of Real Estate Research,
1994, 9:4, 403–19.
Giaccotto, C. and J. Clapp. Appraisal-Based Real Estate Returns under Alternative Market
Regimes. Journal of the American Real Estate and Urban Economics Association, 1992,
20:1, 1–24.
Giliberto, S.M. and A. Mengden. REITs and Real Estate: Two Markets Reexamined. Real
Estate Finance, 1996, 13:1, 56–60.
Goetzmann, W. The Single Family Home in the Investment Portfolio. Journal of Real
Estate Finance & Economics, 1993, 6:3, 201–22.
Gyourko, J. and D. Keim. What Does the Stock Market Tell Us About Real Estate Returns?
Real Estate Economics, 1992, 20:3, 457–86.
Hamilton, S.W. and J. Clayton. Smoothing in Commercial Property Valuations: Evidence
from the Trenches. Real Estate Finance, 1999, 16, 16–26.
Lai, T-Y. and K. Wang. Appraisal Smoothing: The Other Side of the Story. Real Estate
Economics, 1998, 26, 511–36.

Ling, D., A. Naranjo, and M. Nimalendran. Estimating Returns on Commercial Real Estate:
A New Methodology Using Latent Variable Regression. Real Estate Economics, 2000,
28:2, 205–31.
Quan, D.C. and John M. Quigley. Price Formation and the Appraisal Function in Real
Estate Markets. Journal of Real Estate Finance and Economics, 1991, 4, 127–46.

The authors thank the Pension Real Estate Association and the Real Estate Research
Institute for providing partial funding for this research. However, all statements and
assertions are the opinions of the authors. They also thank Ko Wang, the editor, and
several anonymous referees for helpful comments and suggestions.

David Geltner, Massachusetts Institute of Technology, Cambridge, MA 02139 or
dgeltner@mit.edu.
David C. Ling, University of Florida, Gainesville, FL 32611-7160 or ling@ufl.edu.
