
Vol. 100, No. 914 The American Naturalist September-October, 1966

MEASUREMENT OF "OVERLAP" IN COMPARATIVE ECOLOGICAL STUDIES

HENRY S. HORN

Department of Zoology, University of Washington, Seattle*

Many animal ecologists are currently engaged in comparative studies of
diet, habitat preference, seasonal patterns of abundance, and faunal lists.
Quantitative analysis of such problems requires an objective measure of
the amount of overlap between samples of species, time, or energy, dis-
tributed proportionally into various qualitative categories.
A number of indices are available for measuring this overlap. Those that
compare the relative occurrences of pairs of species have been reviewed by
Morisita (1959b), who also proposed an index of similarity between com-
munities. Indices with similar characteristics have been derived from in-
formation theory by Margalef (1956) and MacArthur (1965). It is the purpose
of the present paper to review and interpret those indices of overlap that
are sensitive to varying proportional compositions of the samples compared,
to present them in the simplest general form that is consistent with their
use as empirical measures of overlap, and to give some indication of the
conditions under which each is most appropriate.
The indices presented below are intended to serve as empirical measures
and should not be interpreted as estimates of statistical parameters of the
populations from which the samples are drawn, or as "tests" for hetero-
geneity. The distinction between a measure and a parameter or a test has
been discussed with reference to ecological data by Cole (1949) and by
Lloyd and Ghelardi (1964).

THE INDICES OF OVERLAP

We have characteristic samples $\{X_0\}$ and $\{Y_0\}$ from populations X and Y,
respectively. Out of a total of S species in both samples, species i is
represented $x_i$ times in $\{X_0\}$ and $y_i$ times in $\{Y_0\}$:

$$X = \sum_{i=1}^{S} x_i; \qquad Y = \sum_{i=1}^{S} y_i.$$

Morisita (1959b) presents the following index of overlap:

$$C_\lambda = \frac{2 \sum_{i=1}^{S} x_i y_i}{(\lambda_x + \lambda_y)\, X Y}$$

*Present address: Department of Biology, Princeton University, Princeton, New
Jersey.


This content downloaded from 131.252.125.148 on Fri, 02 Dec 2016 00:31:54 UTC
All use subject to http://about.jstor.org/terms

where $\lambda$ is Simpson's (1949) index of diversity:

$$\lambda_x = \frac{\sum_{i=1}^{S} x_i (x_i - 1)}{X (X - 1)}; \qquad \lambda_y = \frac{\sum_{i=1}^{S} y_i (y_i - 1)}{Y (Y - 1)}.$$
$C_\lambda$ is interpreted as the probability that two individuals drawn randomly,
one from each of populations X and Y, will both belong to the same species,
relative to the probability of randomly drawing two individuals of the same
species from X or Y alone. $C_\lambda$ varies from 0 when the samples are completely
distinct (containing no species in common) to about 1 (Morisita 1959b, p.
69, 75) when the samples are identical with respect to proportional species
composition.
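
As a minimal sketch of the calculation described above, the following Python computes Simpson's index for each sample and then Morisita's index of overlap. The function names and the example counts are illustrative assumptions, not part of the original presentation; the two samples are assumed to be given as aligned lists of counts over the same S species.

```python
def simpson_lambda(counts):
    """Simpson's (1949) index: probability that two individuals drawn
    without replacement from one sample belong to the same species."""
    total = sum(counts)
    return sum(n * (n - 1) for n in counts) / (total * (total - 1))


def morisita_overlap(x, y):
    """Morisita's (1959b) overlap index for two aligned count vectors."""
    if len(x) != len(y):
        raise ValueError("x and y must list the same S species in the same order")
    big_x, big_y = sum(x), sum(y)
    cross = sum(xi * yi for xi, yi in zip(x, y))
    return 2 * cross / ((simpson_lambda(x) + simpson_lambda(y)) * big_x * big_y)


# Hypothetical counts for S = 4 species (illustrative data only).
sample_x = [10, 20, 30, 40]
sample_y = [10, 20, 30, 40]
disjoint_x = [5, 5, 0, 0]
disjoint_y = [0, 0, 7, 3]

identical = morisita_overlap(sample_x, sample_y)    # slightly above 1
distinct = morisita_overlap(disjoint_x, disjoint_y)  # no shared species
```

For these hypothetical data the identical samples give a value slightly above 1 and the disjoint samples give 0, which is consistent with the statement that the upper limit is only "about 1".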
Morisita's formula can be simplified by use of an estimate of $\lambda$ appropriate
for a model of sampling with replacement, rather than that given by
Simpson. This simpler index is useful as an empirical measure, though its
probability interpretation is only rigorous when all $x_i$ and $y_i$ are very large.
The new estimates of $\lambda$ are:

$$\lambda_x = \frac{\sum_{i=1}^{S} x_i^2}{X^2}; \qquad \lambda_y = \frac{\sum_{i=1}^{S} y_i^2}{Y^2}.$$
The formula for $C_\lambda$ is simplified further when the sample sizes X and Y
are equal:

$$C_\lambda = \frac{2 \sum_{i=1}^{S} x_i y_i}{\sum_{i=1}^{S} x_i^2 + \sum_{i=1}^{S} y_i^2}$$

This last formula is also appropriate where the data are expressed as the
proportions xi and yi of the respective samples composed of species i. As
an empirical measure, this last formula has an advantage over that pre-
sented by Morisita, since its upper limit is exactly 1.
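
The two simplified forms can be sketched in Python as follows. This is an illustration under stated assumptions: the function names and the counts are hypothetical, and the samples are aligned lists over the same species. The first function uses the sampling-with-replacement estimate of the diversity index; the second is the further simplified form, applied here to proportions.

```python
def overlap_with_replacement(x, y):
    """Simplified overlap index using lambda = sum(x_i^2) / X^2."""
    big_x, big_y = sum(x), sum(y)
    lam_x = sum(xi * xi for xi in x) / big_x**2
    lam_y = sum(yi * yi for yi in y) / big_y**2
    cross = sum(xi * yi for xi, yi in zip(x, y))
    return 2 * cross / ((lam_x + lam_y) * big_x * big_y)


def overlap_equal_sizes(x, y):
    """Further simplified form, for X == Y or for x, y given as proportions."""
    cross = sum(xi * yi for xi, yi in zip(x, y))
    return 2 * cross / (sum(xi * xi for xi in x) + sum(yi * yi for yi in y))


# Hypothetical counts; both samples contain 100 individuals.
counts_x = [10, 20, 30, 40]
counts_y = [40, 30, 20, 10]
props_x = [n / sum(counts_x) for n in counts_x]
props_y = [n / sum(counts_y) for n in counts_y]

from_counts = overlap_with_replacement(counts_x, counts_y)
from_props = overlap_equal_sizes(props_x, props_y)
self_overlap = overlap_equal_sizes(props_x, props_x)  # identical composition
```

With these data the two functions agree, and a sample compared with itself gives 1, in line with the fixed upper limit noted above.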
An index of overlap with similar characteristics may be derived from information
measures. Again we have samples $\{X_0\}$ and $\{Y_0\}$, and the notation
is as before. From these samples we can directly calculate the following
Shannon-Wiener measures of information:

$$H(X) = \sum_{i=1}^{S} \frac{x_i}{X} \log \frac{X}{x_i}$$


$$H(Y) = \sum_{i=1}^{S} \frac{y_i}{Y} \log \frac{Y}{y_i}$$

$$H(X + Y) = \sum_{i=1}^{S} \frac{x_i + y_i}{X + Y} \log \frac{X + Y}{x_i + y_i}$$

Since we shall later be interested only in the ratios of these measures of
information rather than in the values of the measures themselves, the
logarithms may be taken to any convenient base.
If the samples $\{X_0\}$ and $\{Y_0\}$ contained no species in common, H(X + Y)
would attain its maximum value:

$$H_{max} = \sum_{i=1}^{S} \left[ \frac{x_i}{X + Y} \log \frac{X + Y}{x_i} + \frac{y_i}{X + Y} \log \frac{X + Y}{y_i} \right].$$

If the samples $\{X_0\}$ and $\{Y_0\}$ contained the same species in identical
proportions, H(X + Y) would equal H(X) or H(Y). Since H(X) will generally not
equal H(Y), the best estimate of H(X + Y) in this case would be its minimum
value:

$$H_{min} = \frac{X}{X + Y}\, H(X) + \frac{Y}{X + Y}\, H(Y).$$

The value of H(X + Y) actually observed is of course:

$$H_{obs} = H(X + Y).$$


From the maximum, minimum, and observed values of H(X + Y), we can
construct an index of overlap that varies from 0 when the samples are completely
distinct (containing no species in common), to 1 when the samples
are identical with respect to proportional species composition:

$$\text{overlap} = R_o = \frac{H_{max} - H_{obs}}{H_{max} - H_{min}}$$
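
The construction of the overlap index from the maximum, minimum, and observed information measures can be sketched directly in Python. This is a minimal illustration, assuming aligned lists of counts and the usual convention that absent species contribute nothing to the sums; the function names and data are hypothetical.

```python
import math


def shannon(counts):
    """Shannon-Wiener information of a list of counts (natural logarithms).
    Zero counts are skipped, i.e. 0 * log(1/0) is taken as 0."""
    total = sum(counts)
    return sum((n / total) * math.log(total / n) for n in counts if n > 0)


def overlap_from_definitions(x, y):
    """Overlap index computed from H_max, H_min and H_obs."""
    big_x, big_y = sum(x), sum(y)
    total = big_x + big_y
    # Observed: species counts pooled across the two samples.
    h_obs = shannon([xi + yi for xi, yi in zip(x, y)])
    # Maximum: every entry of x and y treated as a separate species.
    h_max = shannon(list(x) + list(y))
    # Minimum: size-weighted mean of the two within-sample measures.
    h_min = (big_x / total) * shannon(x) + (big_y / total) * shannon(y)
    return (h_max - h_obs) / (h_max - h_min)


# Hypothetical samples (illustrative data only).
same = overlap_from_definitions([10, 20, 30, 40], [10, 20, 30, 40])
none = overlap_from_definitions([5, 5, 0, 0], [0, 0, 7, 3])
half = overlap_from_definitions([1, 1, 0], [0, 1, 1])
```

Because only ratios of the information measures enter the index, replacing `math.log` with a logarithm to another base would leave the results unchanged.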

An index of "heterogeneity" that varies from 0 when the samples are
identical with respect to proportional species composition, to 1 when the
samples are completely distinct follows directly:

$$\text{heterogeneity} = R_h = 1 - R_o = \frac{H_{obs} - H_{min}}{H_{max} - H_{min}}$$

Under the assumption that the samples $\{X_0\}$ and $\{Y_0\}$ characterize the
populations X and Y, $R_h$ can be interpreted as the redundancy (Shannon and
Weaver, 1949, p. 25) of the information gained by considering these
populations to be distinct rather than identical. This interpretation is
consistent with the proposed function of $R_h$ as an index of heterogeneity.
The weighted form of the indices, presented above, is appropriate where


comparable sampling effort leads to consistent differences in sample size.
The weighted form can be calculated directly from abundances rather than
frequencies. For example, the formula for $R_o$ can be expanded into the
following computational formula:

$$R_o = \frac{\sum_{i=1}^{S} (x_i + y_i) \log (x_i + y_i) - \sum_{i=1}^{S} x_i \log x_i - \sum_{i=1}^{S} y_i \log y_i}{(X + Y) \log (X + Y) - X \log X - Y \log Y}$$
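
The expansion is not written out step by step. One way to see it, assuming the convention $0 \log 0 = 0$ for species absent from a sample (an assumption added here for the sketch), is to multiply each information measure by $(X + Y)$ and take differences:

```latex
\begin{align*}
(X+Y)\,H_{max} &= (X+Y)\log(X+Y) - \sum_i x_i \log x_i - \sum_i y_i \log y_i \\
(X+Y)\,H_{obs} &= (X+Y)\log(X+Y) - \sum_i (x_i+y_i)\log(x_i+y_i) \\
(X+Y)\,H_{min} &= X\log X + Y\log Y - \sum_i x_i \log x_i - \sum_i y_i \log y_i \\[6pt]
(X+Y)\,(H_{max} - H_{obs}) &= \sum_i (x_i+y_i)\log(x_i+y_i) - \sum_i x_i \log x_i - \sum_i y_i \log y_i \\
(X+Y)\,(H_{max} - H_{min}) &= (X+Y)\log(X+Y) - X\log X - Y\log Y
\end{align*}
```

The common factor $(X + Y)$ cancels in the ratio, which leaves the computational formula.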

Both $R_o$ and $R_h$ can be calculated in an unweighted form for most cases
where the data are already in the form of frequencies. Simply set X = Y = 1
and let $x_i$ and $y_i$ represent the proportions of the respective samples
composed of species i in the above formulae.
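
A short Python sketch of the computational formula, used first in its weighted form (raw abundances) and then in its unweighted form (proportions, so that X = Y = 1), is given below. The function names and the data are hypothetical, and the convention that 0 log 0 equals 0 is an assumption made for the sketch.

```python
import math


def xlogx(value):
    """Return value * log(value), using the convention 0 * log(0) = 0."""
    return value * math.log(value) if value > 0 else 0.0


def overlap_ro(x, y):
    """Overlap index from the computational formula; x, y are aligned."""
    big_x, big_y = sum(x), sum(y)
    numerator = sum(xlogx(xi + yi) - xlogx(xi) - xlogx(yi)
                    for xi, yi in zip(x, y))
    denominator = xlogx(big_x + big_y) - xlogx(big_x) - xlogx(big_y)
    return numerator / denominator


def to_proportions(counts):
    """Convert counts to proportions of the sample total."""
    total = sum(counts)
    return [n / total for n in counts]


# Hypothetical samples of very different size (illustrative data only).
large = [60, 30, 10, 0]
small = [1, 2, 3, 4]

weighted = overlap_ro(large, small)
unweighted = overlap_ro(to_proportions(large), to_proportions(small))
heterogeneity = 1 - unweighted
half_shared = overlap_ro([1, 1, 0], [0, 1, 1])
```

When the sample sizes differ, as in this example, the weighted and unweighted values will generally not coincide, so the choice between them should follow the sampling considerations described in the text.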
The numerator of $R_h$ is similar to an index of heterogeneity used by Kohn
(1959) and King (1962), after Margalef (1956). The use of the Shannon-
Wiener measure of information (Shannon and Weaver, 1949, p. 19) rather than
that derived from Brillouin (1951, p. 339), and comparable weighting in the
calculation of $H_{obs}$ and $H_{min}$, gives $(H_{obs} - H_{min})$ certain advantages
over the index of Margalef: it is easier to calculate; it is unaffected by
sample size per se; and it has a readily interpreted, fixed lower bound and a
readily interpreted, calculable upper bound. Division by this upper bound
gives an index with a uniform interpretation, which allows meaningful
comparisons between different sets of data.
MacArthur (1965) has presented an index of heterogeneity, $e^{(H_T - \bar{H})}$ [in
the present notation $e^{(H_{obs} - H_{min})}$, where $H_{obs}$ and $H_{min}$ are calculated in
the unweighted form using $\log_e$], which has the same advantages over those
previously in use. It is interpreted as the ratio of the number of equally
distributed species in the sum of the samples to the number of equally
distributed species in the individual samples for the same values of the
information measures. $e^{(H_T - \bar{H})}$ can be shown to equal $2^{R_h}$, where $R_h$ is
calculated in the unweighted form. The indices may be compared by considering
two equivalent samples with even distributions of equal numbers
of species and by varying the number of species held in common. $R_h$ is
then equal to the proportion of each sample which consists of species not
represented in the other sample. $e^{(H_T - \bar{H})}$ is an exponential function of
this proportion; but, since its maximum departure from linearity is only 9%
of its range, the two indices give comparable results. Which index is
appropriate in a given situation will depend on the interpretation to be
placed on its value.
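
The relation between the two indices, and the size of the departure from linearity, can be checked with a short derivation. It assumes the unweighted form (X = Y = 1, natural logarithms) and takes "linearity" to mean the straight line joining the end points of the exponential index, which is an interpretation made here rather than a statement from the text:

```latex
\begin{align*}
\text{With } X = Y = 1:\quad H_{max} - H_{min} &= \tfrac{1}{2}\,(2\log_e 2 - 0 - 0) = \log_e 2, \\
R_h &= \frac{H_{obs} - H_{min}}{\log_e 2}, \\
e^{(H_{obs} - H_{min})} &= e^{R_h \log_e 2} = 2^{R_h}. \\[6pt]
\text{Departure from the line } 1 + R_h:\quad d(R_h) &= 1 + R_h - 2^{R_h}, \\
d'(R_h) = 0 &\iff 2^{R_h} = \frac{1}{\log_e 2} \iff R_h \approx 0.529, \\
d(0.529) &\approx 0.086, \text{ about } 9\% \text{ of the range } 2^{1} - 2^{0} = 1.
\end{align*}
```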

DISCUSSION

Simpson's diversity index, from which Morisita's overlap index is derived,
is one of probability and, thus, is a measure of the availability of
items within certain categories; successive choices are combined multi-
plicatively. The Shannon-Wiener expression, from which the information
theory indices are derived, is designed as a measure of the choices which
can be made among items in certain categories; successive choices are


combined additively. Thus either $C_\lambda$ or $R_o$ is the more appropriate index
depending on whether the differential availability of items or the
differential choice among items, respectively, is of primary importance.
For example, there are two possible approaches to measuring the amount
of overlap between the diets of two animals. If we are interested in the
overlap in foraging habitat, the diet data could be expressed as the propor-
tions of the total diet taken from various portions of the habitat, and the in-
formation theory measure of overlap would be more appropriate. Alterna-
tively, if we are interested in the overlap in exploitation of alternative food
sources from within the same habitat, the diet data could be expressed as
the proportions of the total diet taken from various taxonomic categories,
and the overlap measure of Morisita would be more appropriate.
The problem of judging the statistical significance of trends shown by
these indices should be mentioned. Statistical tests of limited hypotheses
equivalent to requiring $R_o$ or $C_\lambda$ to be equal to zero or one could be
developed after the methods of McGill (1954) or Morisita (1959a),
respectively, but tests comparing different values of the indices would be more
valuable.
The indices outlined above are only appropriate in situations in which
there is implicit confidence that the proportions of items in each category
are adequately characterized. Given the varying degrees of contagion which
occur in the distributions of animals in space and time, tests of this as-
sumption should depend less on the characteristics of the index under the
assumption of randomness than on the uniformity of values of the index for
replicated samples. Thus, at the present state of our knowledge of the
distribution patterns of organisms, standard statistical techniques applied
to replicated simple measures may be of more value than exact tests based
on the theoretical variation of a more complex index.
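
The suggestion of applying standard techniques to replicated simple measures might look like the following sketch, which computes a simple overlap measure for each replicate pair of samples and then summarizes the values. Everything here is illustrative: the replicate counts are invented, the function names are hypothetical, and the mean and standard deviation stand in for whatever standard technique suits the study.

```python
from statistics import mean, stdev


def proportions(counts):
    """Convert counts to proportions of the sample total."""
    total = sum(counts)
    return [n / total for n in counts]


def simple_overlap(x, y):
    """Simplified Morisita-type overlap, computed on proportions."""
    p, q = proportions(x), proportions(y)
    cross = sum(pi * qi for pi, qi in zip(p, q))
    return 2 * cross / (sum(pi * pi for pi in p) + sum(qi * qi for qi in q))


# Hypothetical replicate pairs of samples over the same three categories.
replicates = [
    ([12, 30, 8], [10, 25, 15]),
    ([15, 28, 7], [9, 31, 10]),
    ([11, 33, 6], [14, 22, 14]),
]

values = [simple_overlap(x, y) for x, y in replicates]
average = mean(values)
spread = stdev(values)  # sample standard deviation across replicates
```

A small spread relative to the differences between study groups would suggest that the proportions are adequately characterized; a large spread would argue for caution before interpreting any single value.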

SUMMARY

Objective, empirical measures of overlap between samples of items
distributed proportionally into various qualitative categories are presented and
reviewed. These indices of overlap, derived from either probability or in-
formation theory, should prove useful to the ecologist in comparative stud-
ies of diet, habitat preference, seasonal patterns of abundance, faunal lists,
or similar data.

ACKNOWLEDGMENTS

This paper is a result of discussions with E. R. Pianka, R. T. Paine,
and G. H. Orians. Critical comments by R. H. MacArthur were very helpful.
While developing this paper, I have had the support of a National Science
Foundation Predoctoral Fellowship.

LITERATURE CITED

Brillouin, L. 1951. Physical entropy and information. II. J. Appl. Phys.
22:338-343.


Cole, L. C. 1949. The measurement of interspecific association. Ecology
30:411-424.
King, C. E. 1962. Some aspects of the ecology of psammolittoral nematodes
in the northeastern Gulf of Mexico. Ecology 43:515-523.
Kohn, A. J. 1959. The ecology of Conus in Hawaii. Ecol. Monogr. 29:47-90.
Lloyd, M., and R. J. Ghelardi. 1964. A table for calculating the
'equitability' component of species diversity. J. Anim. Ecol. 33:217-225.
MacArthur, R. H. 1965. Patterns of species diversity. Biol. Rev. 40:510-533.
Margalef, R. 1956. Informacion y diversidad especifica en las comunidades
de organismos. Invest. Pesquera (Barcelona) 3:99-106.
McGill, W. J. 1954. Multivariate information transmission. Psychometrika
19:97-116.
Morisita, M. 1959a. Measuring of the dispersion of individuals and analysis
of the distributional patterns. Memoirs of the Faculty of Science,
Kyushu Univ., Series E (Biology) 2:215-235.
Morisita, M. 1959b. Measuring of interspecific association and similarity
between communities. Memoirs of the Faculty of Science, Kyushu
Univ., Series E (Biology) 3:65-80.
Shannon, C. E., and W. Weaver. 1949. The mathematical theory of com-
munication. Univ. Illinois Press, Urbana.
Simpson, E. H. 1949. Measurement of diversity. Nature 163:688.
