Lexical Replacement and Cognate Equilibrium in Australia

This article was downloaded by: [University of Auckland Library]
On: 05 January 2014, At: 20:54

Publisher: Routledge
Australian Journal of Linguistics
Lexical replacement and cognate equilibrium in Australia
Barry Alpher (a) & David Nash (b)
(a) American University, 3218 Wisconsin Ave NW, Apt B2, Washington DC, 20016, USA
(b) ANU, AIATSIS
Published online: 14 Aug 2008.
To cite this article: Barry Alpher & David Nash (1999) Lexical replacement and cognate
equilibrium in Australia, Australian Journal of Linguistics, 19:1, 5-56
To link to this article: http://dx.doi.org/10.1080/07268609908599573
Australian Journal of Linguistics, Vol. 19, No. 1, 1999
Lexical Replacement and Cognate
Equilibrium in Australia
BARRY ALPHER AND DAVID NASH
We estimate the degree to which languages resort to borrowing as a means of lexical
replacement, within a group of neighbouring languages of southwestern Cape York
Peninsula, using several methods: (1) sound-correspondences and correspondence-mimicry;
(2) the proportion of 'local' words in single-language lists; and (3) the creation of the
vocabulary of special registers. We find that borrowing accounts for at most half of lexical
replacement in these languages, and most usually is well below half. We demonstrate that
this rate is crucial in the prediction of what fraction of vocabulary might in the long term
be common to two neighbouring languages (the 'equilibrium percentage') in a model of
lexical similarity that does not distinguish borrowings from common retentions. We then
apply these findings to the case study, and compare determinations by lexicostatistical
subgrouping (with and without recognition of loans), with results from classification by
classical means. We find substantial agreement, and that the effect of 'borrowing to
equilibrium' on lexicostatistical subgrouping is tolerably small.
Introduction¹
Loanwords, both detected and undetected, pose problems for the genetic
classification and subclassification of languages. In some cases, notably languages
with longstanding traditions of writing, like those of the Germanic family, borrowings
over a period of a thousand years and more can be identified with precision and
dated. From cases like these, it is demonstrable that languages have incorporated
loanwords into their vocabularies at different rates (the rates differ from one
language to another over a given period and within the same language at different
periods). Whatever the case may be as to the limits and distribution of variation of
rates of lexical replacement in a given list of 'meanings', it is clear that the fraction
of lexical replacement that is accomplished by borrowing from other languages has
a frequency distribution. Among the Germanic languages, this rate, calculated with
respect to a standard 200-word list, has varied from 2% (Transylvanian Saxon, since
the time of Old High German) to 48% (Faroese, since the time of Old Icelandic),
and the rate for English (45%, since the time of Old English) during the last 1,200
years or so is among the highest of those surveyed (Embleton 1986: 99; see our
Table 2).
¹ Our collaboration on this paper began upon Alpher's presentation 'Borrowing and non-borrowing'
to the Seminar on Australian Aboriginal Languages at the Massachusetts Institute of Technology,
8 January 1979, which included the key ideas of Sections 2.2 and 3.1.2. We are grateful to Paul Black
for providing us with a copy of Black (1997) in preliminary version (and his 1979 paper referred to
therein) and for other useful comments, most recently in his capacity as a referee. We thank Graeme
Williams and Adolfo Constenla for kindly running J. Guy's SIMULA program and the SPSS cluster
analysis program, respectively, on our data sheets. Numerous others helped us with one or another
aspect of the work; we thank among them Marcia and Robert Ascher, Gavan Breen, Sheila Embleton,
Philip Hamilton, John Haviland, Kenneth Hale, Harold Koch, Johanna Nichols, and Bruce Rigsby.
We are grateful for extensive comments from an anonymous referee. The 1982 version of this paper
was pre-circulated and presented at an annual meeting of the Australian Linguistics Society (Alice
Springs, August 1984) and was further revised and expanded in December 1985 (Alpher) and
February 1986. Nash's work was funded in part by National Science Foundation grant number
BNS-7913950 and by grants from the Australian Institute of Aboriginal Studies and the Australian
Research Council (No. A58932251).
0726-8602/99/010005-52 1999 The Australian Linguistic Society
Where there is no time-depth to the written corpus and where the normal
methods of identifying borrowings fail to give revealing results (a problem long
recognized; see for example Hale 1962; Dyen 1963: 61; Embleton 1986: 126), the
rate at which the lexicon is replaced by borrowing seems currently to be left out of
consideration or to be a matter of guesswork. These conditions are notoriously the
case with the languages of the Aboriginal peoples of Australia. Although related to
one another in various degrees, many of these languages are phonologically conservative,
making the use of shared phonological innovations as a tool in language
classification and the identification of loanwords highly problematic. The total
populations of speakers of these languages are rarely over 1,000 and frequently
under 500; marriage with a speaker of another language, and hence bi- and
multilingualism, is in some areas the norm (Sutton 1978: 106-113; Heath 1978a:
14-21; to our knowledge, the first report in the literature of an extensive system of
this kind is that of Sorensen (1967, with regard to peoples of the Amazonian
rainforest)), and extensive borrowing of bound forms as well as free-form lexical
items has been shown to occur (in northern Australia, for example, by Heath
(1978a)). Nowhere to our knowledge has a quantitative characterization of borrowing
under such circumstances appeared, but the general impression given by writings on
the subject is that it is considerably more frequent than in other parts of the world.
As well, the practice, common among speakers of Australian languages, of proscribing
the use of the name of a recently deceased person and words that resemble it has
been advanced as a cause of an unusually high rate of lexical replacement (whether
by borrowing or other means), again without quantitative data but with presumably
drastic effects on the usefulness of quantitative methods and indeed of the comparative
method itself insofar as it relies on the comparison of 'cognate' forms.
The problem is of interest in areas of study other than Australia, notably with
regard to the assessment of language relationships of such great time-depth that the
method of comparative reconstruction becomes difficult because of the degree of
attrition of common vocabulary. Such is the case with the indigenous languages of
the Americas, where the time-depths at which the comparative method becomes
unworkable are of the same order as the time-depth of the beginnings of agriculture.
Before agriculture, the sizes of populations of speakers, the affinal relationships
among them, and the extent of bi- and multilingualism might very well have
resembled those observed in parts of Australia at present. There is another area of
long-range comparison to which prior multilingualism is considered to have been of
relevance, in the production of unidentifiable borrowing in presumably massive
amounts (Austerlitz 1991: 363): this is the internal and external relations of the
Uralic languages (whose common ancestor is presumed to have been the language
of people with a hunting-and-gathering economy (Austerlitz 1991: 355)). Another
contemporary region for which a situation of this kind has been observed is
Papua New Guinea (Ross 1996: 202; the groups in question in this study
practise agriculture). Even the wildest estimate of the rate of lexical replacement by
borrowing in a case like the Australian one might prove of some use in such cases.
An estimate of the degree to which lexical replacement is not totally accomplished
by borrowing in areas like Aboriginal Australia is also of relevance for general models
of vocabulary distribution over space. If, as runs a tacit assumption of one of the
quantitative models (Dixon's 1970 and 1972) of the distribution of sets of lexical
'cognates' in neighbouring Australian languages, all lexical replacement is by borrowing
and the origin of a word as a retention or a loan is always opaque,
vocabularies of adjoining groups (at least, where each has just two neighbours) will
resemble each other to an extent that approximates 50% over the long run. We show
below that it is unlikely that more than half, and likely that far less than half, of
lexical replacement in these languages is accomplished by borrowing and thus that
the 50% equilibrium rate is probably never a concern in language classification. We
emphasize here that the assumption that all lexical replacement is by borrowing is an
implicit premise of Dixon's model and is not one subscribed to by its proponent; it
is in fact one Dixon (1980: 28, 99) has explicitly denied. Our interest here is in
obtaining quantitative data and feeding it into an improved model.
How often do languages resort to borrowing as a means of lexical replacement?
We attempt below, by various direct and indirect methods, to estimate this rate for
a group of neighbouring languages of southwestern Cape York Peninsula. We
evaluate one of these methods, that of 'local words' (Section 2.2.2), against the
otherwise inexplicably high² lexicostatistical count of 'sames' between two languages
of this group, Yir-Yoront and Kuuk-Thaayorre. These two languages we believe to
be in a Sprachbund relationship but not to be especially closely related genetically:
the genetic interrelationships of these languages are highly problematic. Using the
data obtained through the local-words method, we compare the lexicostatistically
generated subgrouping with subgrouping inferred on other grounds. There are
general implications for the use of lexicostatistical methods in genetic subgrouping,
and in addition, we believe that the methodology of estimating rates of borrowing
has an intrinsic interest.
² In hypothesizing genetic subgroups, we attach much more importance to shared morphological
paradigms, particularly irregular ones, than to information of other types (shared phonological
innovations, shared vocabulary retentions).
Among other conclusions, we find that the expected 'equilibrium' figures are in
general low enough that language classification can proceed using lexicostatistics as
a pointer to a first approximation, and with one eye kept on relevant geographical
facts, without undue worry about undetected borrowings. We consider the types of
situations where lexicostatistics gives anomalous results in language subgrouping,
examining in detail a case from Cape York Peninsula, and in so doing unearth some
curious and at present mysterious facts with regard to unexpectedly high rates of
lexical sameness in neighbouring languages. We believe that our findings apply to
the question of geographical boundaries across which languages are related only
distantly, with particular regard to the reality of the proposed 'Pama-Nyungan'
genetic subgroup, but we do not pursue this matter in this paper.
1. Replacement Strategies
Australian Aboriginal languages, like all languages, replace vocabulary items with new
words, either shifting the meaning of the old form, limiting it to specialized uses, or
forgetting it. An example from English is animal, replacing the older and now
specialized form deer (as retained in modern Dutch dier, German Tier 'animal'). The
reasons for such replacement are numerous; one suspects that semantic specialization
of the old form often triggers it. With Aboriginal languages, however, another
phenomenon is of relevance: the death-tabooing of words. The Australian Aborigines
avoid saying the name of a recently deceased person, and avoid other words that
sound similar or otherwise recall a proscribed name. This custom has been remarked
upon by many observers for over a century (see Black 1997 for references). It has
repeatedly been alleged as the driving force behind lexical replacement in Australian
languages, often to the exclusion of any other mechanisms. We are aware at present,
however, of only a few recorded examples of the name of a deceased together with
the words avoided after this person's death, and, in the nature of the case, even
though a particular lexical item in a language may be identifiable as a loan, it is
impossible to tell after three or four generations have elapsed whether the motive for
the lexical replacement was death-taboo or something else.
The usual technique for avoiding a tabooed word is to say another word in its
stead. The expressions chosen to replace words homophonous with the name
(or totem) of a recently deceased person are, as far as we can tell, of the following
types: (a) the temporary use of a synonym, from any of the languages in the
repertoire of the particular community (thus perhaps from the same language or
perhaps from a neighbouring language or English), (b) compounds or other new
formations, (c) use of widened or metonymically or metaphorically shifted meanings
of existing (and untabooed) words (such as widening the term for 'blood' to include
'water', 'liquid'), (d) use of the corresponding term from the lexicon of the auxiliary
'respect language' (which, of course, has the requisite index of politeness), (e) the
use of a hand-sign, or gesture, to serve for the proscribed term, (f) the use of a
hypernym (a more general term), with the extreme being a 'whatsaname' word (such
as Warlpiri nganayi), (g) use of a particular word reserved just for the purpose of
substituting for any tabooed name,³ and (h) borrowing.
³ Strategies (a)-(d) are mentioned by Dixon (1980: 99), (f) is discussed by Nash and Simpson
(1981), and (g) by Alpher (1991: and individual entries).
Methods (b) and (c) are of course normal in all languages; that they (as well as
borrowing) are well attested in Australian Aboriginal languages as means of lexical
replacement is not disputed. For exemplification of the others, see Black (1997:
56-58), who shows in some detail how tabooed words often find their way back into
the vocabulary after a suitable period of mourning has passed: instances of tabooing
of which we are in a position to check the history have proved to be temporary. We
are therefore skeptical that death-taboo rather than other motivations has been
responsible for lexical replacement in any large proportion of the cases.
2. Borrowing and Replacement Rates
Dixon (1972: 331) hypothesized that the increased pressure to borrow a word from
a neighbouring language in the context of the Australian practice of name-avoidance
(under strategy 1(a)) causes the vocabulary replacement rate for the typical
Australian language to be higher, ceteris paribus, than that of other languages of the
world. In his 1980 book surveying the Australian language family, he does not make
as strong a claim, but still sees the death-taboo as the most significant force in lexical
replacement:
The social custom of name taboo, and the associated proscription on
lexical words that have similar form, is of the utmost significance for
understanding one of the ways in which Australian vocabularies change.
Sometimes a proscribed noun or verb may come back into use within a
matter of months, but it is more normal for it to be some years before the
taboo is relaxed. Often it may be replaced by a synonym from within the
language ... or by a newly coined compound; but more often a new word
will be borrowed from the language of a neighbouring tribe. Even when this
has happened, the original lexeme may return to everyday use ... but in
other cases it will have been completely eliminated from the language, its
place being taken by the borrowed item. (Dixon 1980: 28)
The crucial question is: just how common is this last possibility? This is the only
point in this otherwise unexceptionable summary which does not receive
confirmation in the writings of other researchers.
There are three respects in which Dixon's hypothesis can be examined:
(i) observation of particular instances of replacement (Section 2.1);
(ii) estimation of replacement rates in Australia (Section 2.2);
(iii) estimation of the proportion of shared vocabulary between neighbouring
languages presumed to be 'in equilibrium' (Section 3).
2.1. Estimates of Replacement Rate
Dixon's (1972) hypothesis is, of course, difficult to test directly, for to do so would
require observing a group of languages over a long period of time, of the order of
centuries. To test it indirectly, we need an estimate of the time-depth of the
break-up of a genetically related group of languages independent of glottochronology.
Neither of these tests has been carried out in Australia, and it is inappropriate
to put the hypothesis in such definite terms as the following:
Frequently, when an ordinary word must be dropped from speech, people
borrow the equivalent word from a neighbouring dialect or language – a
practice that accounts in part for the relatively large amount of shared
vocabulary between Australian languages. (Haviland 1979a: 210)
[...] it is certainly the case that sporadic tabooing will have been a factor in
the development and reorientation of pronominal systems in a number of
Australian languages. (Dixon 1980: 351)
What actual evidence is there on replacement rates in Australia? Consider the
following commentary in O'Grady et al. (1966: 26):
Many investigators have felt that the lexical retention rate in Australia is
very low – lower than that of languages in the world generally. But Hale
and O'Grady obtained a test list in Parnkalla in 1960 and checked it
against a vocabulary published by Schurmann in 1844. The two lists
showed almost total agreement (and the few disagreements may well be
due to the fact that the Parnkalla language recorded in 1844 was representative
of one dialect while the Parnkalla language recorded in 1960 was
representative of a slightly different dialect).
Of course, the time allowed for change in the Parnkalla vocabulary in the cited test
probably amounts to a century at most, since the language and culture were not
functioning in the traditional manner for very long this century. Furthermore,
glottochronologists hypothesized a retention rate of around 81% per millennium for
the Swadesh 200-word list (see Swadesh (1950, 1952, 1955) and Lees (1953)), and
thus almost 98% per century. Note that findings of retention rates differ according
to variations in the sizes of lists and by variations in the composition of same-sized
lists. For the Swadesh (1955) 100-word list, for example, findings of retention rates
cluster around 86% per millennium. On the assumption that the list used for
Parnkalla in 1960 was the O'Grady 100-word list (used in this paper and reprinted
in the Appendix) – and the published source does not tell us – then it should be
noted that the O'Grady list and the Hale 100-word list (also in the Appendix),
although modelled on the Swadesh list, differ from it in various ways that improve
its usefulness in Australia ('spear [N]' is in, 'ice' is out, for example) but introduce
unknown deviations in the retention rate. If, however, the predicted retention rate
for the Swadesh 100-word list of 98.5% per 100 years is anywhere near the mark for
the O'Grady and Hale lists as used in Australia, it predicts a change in about
one or two words in the 100-word list in a century,⁴ and the error term in
the prediction would usually make observations over a period as small as a century
fairly inconclusive.
⁴ An interval of 10 millennia at this rate corresponds to a shared vocabulary of about 10% (i.e. about
the amount shared by two nearby non-Pama-Nyungan languages), and 50 millennia to a negligible
fraction of a per cent, so there is no need to postulate a lower retention rate just to account for the
sort of diversity observed in Australia.
On the other hand, if the retention rate were much lower, say 50% per millennium,
there would on average be a change in seven words⁵ in the 100-word list
observed at an interval of a century, or 13 words after an interval of 200 years, the
longest period we can hope to investigate in Australia.
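The conversions in the two preceding paragraphs can be reproduced with a few lines of Python. This is an illustrative sketch rather than part of the original analysis: it assumes, as glottochronology does, a constant retention rate compounded over time, and the function names are hypothetical labels chosen for this example.

```python
import math

def retention_over(rate_per_millennium: float, years: float) -> float:
    """Fraction of a word list expected to survive `years` years,
    assuming a constant retention rate compounded over time."""
    return math.exp(math.log(rate_per_millennium) * years / 1000.0)

def expected_changes(rate_per_millennium: float, years: float,
                     list_size: int = 100) -> float:
    """Expected number of replaced items in a list of `list_size` words."""
    return list_size * (1.0 - retention_over(rate_per_millennium, years))

# Rates quoted in the text: 81% (200-word list), 86% (100-word list),
# and the hypothetical low rate of 50% per millennium.
for rate in (0.81, 0.86, 0.50):
    per_century = retention_over(rate, 100)
    changed = expected_changes(rate, 100)
    print(f"{rate:.0%} per millennium -> {per_century:.1%} per century, "
          f"about {changed:.1f} words changed per 100 in a century")
```

Under these assumptions the 86% rate corresponds to about 98.5% per century, and the 50% rate to roughly seven changed words per century and about 13 after 200 years, in line with the figures given in the text.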
The maximal period of 200 years is available to us in just one language: Guugu
Yimidhirr. Haviland gives a detailed comparison of Cook's and Banks' 1770 lists
with the modern language, and finds that '[m]ost of Cook's words are completely
recognisable today' (1974: 231), including the one personal name Cook recorded
(in use in 1901, and remembered in Haviland's time; 1974: 229). Alpher (1997)
has compared the 77 most reliable of Cook's and Banks' items with their modern
GYim equivalents and found a retention rate of 67/77 (87%) for the 200-year
period. The corresponding items in the English of 1770 are retained at the rate of
72/77, or 94%, over the same period.⁶ These translate to 50% and 73% per
millennium. To these figures and to the difference between them we can attach
virtually no statistical significance, since they are based on a single 77-word sample
over a single 200-year period.⁷ Note that the drastically lower figures derived from
the Cook-Banks list for both languages likely result from differences from the
Swadesh lists in both size and composition. The lack of certain stable items like
pronouns, among other things, is bound to make them less conservative. The same
is doubtless true of the standard lists we cite in the Appendix, although for these we
lack an absolute time scale.
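The extrapolation from a 200-year observation to a per-millennium rate can be sketched as follows. The sketch assumes a constant retention rate, so that five successive 200-year intervals compound to one millennium; the function name is a hypothetical label for this example. One observation of ours from the arithmetic, not a claim made in the text: the 73% figure for English is reproduced when the rounded 94% is compounded, whereas the unrounded fraction 72/77 gives a value nearer 71%.

```python
def per_millennium(retained: int, total: int, years: float) -> float:
    """Extrapolate an observed retention fraction to a 1,000-year rate,
    assuming a constant rate compounded over the interval."""
    return (retained / total) ** (1000.0 / years)

# Counts from the Cook-Banks comparison (77 items over 200 years).
guugu_yimidhirr = per_millennium(67, 77, 200)
english = per_millennium(72, 77, 200)

# The same extrapolation starting from the rounded percentage for English.
english_from_rounded = 0.94 ** 5

print(f"Guugu Yimidhirr: {guugu_yimidhirr:.3f}")
print(f"English (72/77): {english:.3f}")
print(f"English (0.94):  {english_from_rounded:.3f}")
```

Given how small the sample is, the difference between the two English values is well within the uncertainty the text already attaches to these figures.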
If borrowing is high in Australia, we should recognize items in Cook's list as being
subsequently replaced with loans from the neighbouring languages. Of the 10 items
changed in the version of the Cook-Banks list used above, a maximum of three can
be construed as loans (all from Kuku-Yalanji).
⁵ antilog((log 0.5)/10) = 0.93, where log is the natural logarithm.
⁶ The English items that we count as different between the Cook-Banks list and current usage are
beard for contemporary barb, brow for forehead, ham for thigh, lance for spear, and throttle for throat.
In making these determinations, we have deliberately turned our back on philological knowledge that
we possess for English, for which we have more than a thousand years of written texts and sources
like the Oxford English Dictionary, on the grounds of the shallowness of our knowledge of the history
of Guugu-Yimidhirr. We grant that forehead is attested much earlier than brow as a term for 'forehead'
and that spear has a far longer history in English than lance (possibly the prototypical military analogue
in Banks' time), and we concede that Banks' and Cook's speech varieties were not necessarily directly
ancestral to modern colloquial Australian. But we lack knowledge of Guugu-Yimidhirr at this depth.
Bergsland and Vogt (1962: 116) noted similar problems with other languages with long literary
histories as a serious obstacle.
⁷ We acknowledge, however, the possibility that these figures reflect a genuine imbalance in the
retention rates for the two languages.
2.2. Estimate of Replacement due to Borrowing
2.2.1. Estimation of Rate of Borrowing using Sound-correspondences as Tracers.
2.2.1.1. Anomalous Sound-correspondences in Two-language Lists. Investigators normally
identify as loanwords those apparent cognates that have failed to undergo
expected sound-changes or which attest the sound-changes proper to another
language's history. Among the languages for which we have reasonably rich data,
Yir-Yoront (YY) and Kuuk-Thaayorre (KTh)⁸ are an appropriate pair for using
sound-correspondences to test for loanwords in this way, in order to study the extent
of borrowing. A comparison of verb paradigms argues against an especially close
genetic relationship, but their rate of vocabulary-sharing is unexpectedly high (see
Section 3.2) and they have in common certain innovations in morphophonemic
alternations (one, for example, apparently due to borrowing of a stress pattern and
reduction of unstressed vowels). The speakers of these languages are documented as
having been in intense contact before the days of continuous European presence
(Sharp 1958). Yet a number of sound-changes, particularly those in Yir-Yoront,
have caused these languages to sound conspicuously different, even to unpractised
ears, and provide a means to identify loans.
A detailed comparison of Yir-Yoront and Kuuk-Thaayorre phonological processes
and vocabulary is in process (Alpher 1998). Here are the results for one set of
sound-correspondences. YY has lost nasals that preceded a homorganic stop; KTh
has retained them. Twenty-six forms exhibit this correspondence. An example is YY
puth, KTh punhth 'arm' (PP *punyja). YY unexpectedly attests a nasal in six
known forms of this type; an example is YY pornt, KTh punt 'elbow' (PP *punti).
We accordingly identify the latter as a likely loanword in YY. The other five
members of our putative list of loanwords contain an nhth cluster in YY, like wanhth
'sickness', KTh wanych (PP *wanyji).⁹ The six anomalous pairs out of a total of 32
in this set (a ratio of 19%) are by our hypothesis loanwords.
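The test described in this paragraph can be sketched in Python using the three example pairs given above. This is a rough illustration only: the nasal and stop inventories are a simplified reading of the practical orthography, the pattern does not check that the nasal and stop are homorganic, and the function names are hypothetical labels for this example.

```python
import re

# Simplified inventories (an assumption for this sketch); digraphs are
# listed before single letters so that they are tried first.
NASAL = r"(?:ng|nh|ny|m|n)"
STOP = r"(?:th|ch|p|t|j|k)"
NASAL_STOP = re.compile(NASAL + STOP)

def has_nasal_stop_cluster(form: str) -> bool:
    """True if the form contains a nasal immediately followed by a stop."""
    return NASAL_STOP.search(form) is not None

def looks_borrowed_into_yy(yy: str, kth: str) -> bool:
    """Flag a YY form as a likely loan if it keeps a nasal+stop cluster
    that the regular YY sound change would have reduced."""
    return has_nasal_stop_cluster(kth) and has_nasal_stop_cluster(yy)

# Example pairs from the text: (YY form, KTh form, gloss).
pairs = [("puth", "punhth", "arm"),
         ("pornt", "punt", "elbow"),
         ("wanhth", "wanych", "sickness")]
for yy, kth, gloss in pairs:
    status = "likely loan" if looks_borrowed_into_yy(yy, kth) else "regular"
    print(f"{gloss}: YY {yy}, KTh {kth} -> {status}")

# Counts reported in the text: 26 regular pairs and 6 anomalous ones.
REGULAR, ANOMALOUS = 26, 6
share = ANOMALOUS / (REGULAR + ANOMALOUS)
print(f"anomalous share: {share:.0%}")
```

A fuller treatment would work from a phonemic segmentation rather than spelling, and would need to handle cases such as those raised in footnote 9, where the nasal may not originally have been homorganic.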
⁸ Kuuk-Thaayorre (KTh) data are mostly from Hall (1968, 1972); Yir-Yoront (YY) data are from
Alpher (1991 and fieldnotes). Transcription of examples in all languages is in a practical orthography,
with lamino-palatal stop j, trilled or tapped liquid rr, continuant r, glottal stop '. For Yir-Yoront, ch
and g represent a lamino-alveopalatal stop and a glottal stop, respectively, and v represents a
mid-central vowel, shwa. Transcriptions of the form 'karr, V' for YY verbs display the imperative form
of the verb followed by the ablaut-vowel that replaces the original vowel in certain tense-forms (in
this example, karr 'look (imperative)', kin 'looked'). YY, YTh, and KTh transcriptions of the form
'kolhth, i' for nouns display the citation (absolutive) form followed by the thematic vowel which
follows it in the oblique cases, and which, if it is i or u, conditions umlaut in YY (in this example,
kolhth 'tail', dative kilhthi). Other abbreviations are L, local (in 'local word'); PP, Proto-Paman;
PPNy, Proto-Pama-Nyungan; and as on the map (Figure 1).
⁹ For some of these forms, there is at least a suspicion that the nasal in YY escaped reduction because
it was not originally homorganic: compare YY kunhthn, KTh kunychn 'pandanus' (Local *kunyjan)
with Djambarrpuyngu gundjak, Gupapuyngu gundjalk 'pandanus' (Zorc 1986). Both of these
Northeast Arnhem Land languages contrast ndj with nydj. The possibility cannot be discounted that
NE Arnhem gundjak was borrowed from a language with the commonly attested class prefix *gun:
*gun-djak; but compare Ngandi ma-gunjak 'pandanus', Nunggubuyu maguj (Heath 1978b, 1982).
If the Arnhem Land forms are true cognates, the other four forms fall under suspicion, leaving us
with just one undoubted loanword out of 32.
We can treat a list like this as a random collection of 'sames' from a two-language
lexical sample in which the direction of the borrowing and something about the
provenance of all the words can be known. Taking, as an example, the group of 26
words discussed above, we identify in it some eight words that are 'Old' under the
criteria set out in Section 2.2.3. The remaining 18 can be taken to be 'sames' either
as a result of retention from as yet unsubstantiated 'Old' forms, or as borrowings
that preceded the sound-shift, or as common innovations that are no longer attested
in the other (southern) members of the subgroup (see Section 3.2). We have, then,
24 new 'sames' (these 18 plus the six known loanwords), of which six, or 25%, can
be unequivocally shown by their sounds to have become 'same' by borrowing from
Kuuk-Thaayorre to Yir-Yoront. From such a list (that is, without looking at other
languages) nothing can be learned about the sources of words that are 'New' in both
languages but are not 'sames'.
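The count in this paragraph can be laid out step by step. The following Python sketch simply restates the figures given above; the variable names are hypothetical labels for this example.

```python
regular_sames = 26     # pairs showing the regular YY nasal-loss correspondence
old_among_regular = 8  # of these, 'Old' under the Section 2.2.3 criteria
known_loans = 6        # anomalous pairs identified as loans into YY

# 'New' sames: regular pairs not shown to be 'Old', plus the known loans.
new_sames = (regular_sames - old_among_regular) + known_loans
loan_share = known_loans / new_sames

print(f"new sames: {new_sames}, of which loans: {loan_share:.0%}")
```

The 25% is a share of new 'sames' only; as the text notes, it says nothing about words that are 'New' in both languages but are not 'sames'.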
In lists of this type (counts of items shared in two languages, with the ratio of
number same by borrowing to total number of sames in the lists; see for example
Blake (1979: 117-131)) where no judgement is given or possible concerning the
directions of borrowings and the sources of other 'sames', still less can be learned
of the degree to which innovation is accomplished by borrowing in any single
language. To see why this is so, consider, for example, Heath's (1978a: 29-30; see
also Heath 1979: 405) finding, in a list of 70 nouns in Ngandi and Ritharrngu with
meanings in the 'body-part and -product' domain, of 18 'sames'. Heath claims
that all (or at the very least all but one) of these are borrowings. The time-depth
of the genetic split between Ngandi (non-Pama-Nyungan) and Ritharrngu (Pama-Nyungan)
is very great; hence it can be postulated that of the 52 'unlikes', at most
a very small number are retentions from the protolanguage in either Ngandi or
Ritharrngu. Either language can have in this domain, then, as many as 52 (from
the 'unlikes') plus 18 (from the 'sames', if one language was universally the donor
and the other universally the borrower), or in other words the entire 70, as
innovations by means other than borrowing. In this limiting case, the other
language of the pair has produced 18/70, or 26%, of its lexical replacements by
borrowing. Split the difference, on the assumption that borrowing went in both
directions, and the result is 13%. However, an unknown number of the 52
'unlikes' in either language are themselves the results of borrowing from still other
languages (at most only a very few, by supposition, are retentions from the
protolanguage; in the limiting case of two unrelated or very distantly related
languages, all the 'sames' will be loans); the effect of these, if known, would be to
raise the 13% estimate by an unknown amount. On the other hand, with languages
more closely related, such as the Cape York languages considered below, the
number of 'sames' will include a significantly large fraction that are 'same' not by
virtue of borrowing but by common retention; the grand total of innovations
becomes proportionately smaller, with unpredictable effects on the ratio of innova-
tions by borrowing to all innovations. As can be seen from the above, if all that is
given is a pair of word lists of equal size and of the same 'meanings' in two
languages, together with the number of 'sames' and the number judged 'same' by
virtue of borrowing, it is not possible to infer within any narrow limits the fraction
of lexical replacement that is accomplished by borrowing. We may take our figure
of 13% (above), however, as not incompatible with the estimated maximum of
50% obtained by other means below.
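
The arithmetic of the Ngandi-Ritharrngu example can be retraced in a few lines. The sketch below is illustrative only: the function name and its parameters are hypothetical, and it assumes, as in the limiting case described above, that every item in the list is an innovation in both languages.

```python
from fractions import Fraction

def borrowing_share_bounds(list_size, sames_by_borrowing):
    """Bounds on the share of one language's innovations that are loans.

    Assumes (limiting case in the text) that all items in the list are
    innovations in both languages, with no retentions from the protolanguage.
    """
    lowest = Fraction(0, list_size)                    # this language was always the donor
    highest = Fraction(sames_by_borrowing, list_size)  # this language was always the borrower
    midpoint = (lowest + highest) / 2                  # borrowing went equally in both directions
    return lowest, highest, midpoint

# Heath's figures as cited above: 70 nouns, 18 'sames', all taken as loans.
low, high, mid = borrowing_share_bounds(list_size=70, sames_by_borrowing=18)
print(f"{float(low):.0%} to {float(high):.0%}, midpoint {float(mid):.0%}")
```

The midpoint corresponds to the 'split the difference' figure; loans from third languages among the 52 'unlikes' would raise it by an amount that the list alone cannot reveal.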
Figure 1. "Some possible many-to-one regular sound correspondences". [Diagram not reproduced: Kuuk-Thaayorre verb shapes CV:th, CVmhth, CVth, and CVnhth set against the Yir-Yoront verb shape CVy.]
2.2.1.2. Correspondence-mimicry. We have, in the reckoning above, counted as
loanwords those forms that have not undergone the sound change in question. The
possibility exists, however, that borrowed forms are reshaped to conform with
known correspondences between the two languages. We (e.g. Alpher 1992; Koch
1997; Nash 1997) call this phenomenon CORRESPONDENCE-MIMICRY, or MIMICRY.
As a process it is distinct from the normal assimilation of loanwords (avoidance of
impossible sound-sequences and sound-types).
An example from the area under discussion is Uw-Oykangand adjun 'tail',
evidently borrowed from a language that continued *tjuni (cf. Proto Central New
South Wales *dhun 'tail' (Austin 1997: #40, 29); Black (1980: 233) gives
*tju(:)n(V)). UO has lost all protolanguage initial consonants, as for example atjin
'yamstick' < PP *katjin. Familiar with a-initial words as a salient feature that
distinguishes their language from those of their neighbours, the UO have assimilated
*tjuni as a loan by adding initial a (rather than by the equally plausible, but as yet
unattested, method of dropping the initial consonant to produce *uni). Because,
and only because, in opting for initial a they have chosen the 'wrong' form, we are
able to recognize the result as a loan.
Although UO is forced to perform one of the above options to assimilate a
consonant-initial loan to its vowel-initial canon, pronounceability is not in general
the major motivation for correspondence-mimicry. Consider the form akwertengerle
(Alpher's Eastern Arrernte notes) 'a group which holds 'managerial' responsibilities
for country' in some Eastern Arrernte dialects (pronounced (a'kurtV|ngurlv);
kwertengerle in other dialects). The form is clearly a loan from Warlpiri kurdungurlu
(Nash 1982). Since Arrernte in general permits consonant-initial words, pronounce-
ability cannot be the motivation for prefixing the a-. For a full treatment of
correspondence-mimicry, see Alpher (1992).
The frequently many-to-one nature of KTh to YY sound-correspondences sug-
gests another way to estimate the contribution of borrowing to lexical sameness,
should the loan have gone from YY to KTh. If KTh borrows a YY form and subjects
its shape to mimicry, there are frequently several ways (depending on the phonolog-
ical category) in which the 'wrong' form will result. Because, for example, YY has
lost nasals homorganic with following stops, lenited stops in the second syllables
of unstressed forms (as, routinely, in verbs, see Alpher 1988), and lost distinctive
vowel length, correspondences like those diagrammed in Figure 1 can result
(C = any consonant, V = any vowel):
To take a KTh word and mimic a YY word, it is a simple enough matter to
perform the necessary changes (the word, especially if a verb or a generic noun, will
sound foreign in YY without them), and the result will usually be indistinguishable
from a common retention. But to move in the reverse direction is quite another
matter.
Where correspondences of this type exist, and where there is independent
comparative evidence against which to check the 'correctness' of the KTh forms,
one could proceed as follows: (i) identify 'bad' forms by checking a KTh form
against the protoform and noting whether it is a regular development; (ii) count the
'bad' correspondences and ascertain that they are evenly distributed among
the possible types.
One might assume for a rough calculation that Kuuk-Thaayorre in borrowing
Yir-Yoront forms mimics them always and at random. The frequency of such forms
is then a basis for an estimate of how much of KTh's vocabulary is borrowed from
YY. Suppose, for example, that there are 100 YY-KTh pairs of words of the
sound-class illustrated above (the YY form terminating in a vowel followed by y) for
which the Proto-Paman etymology is known. Of these, suppose that 25 of the KTh
forms exhibit the 'wrong' correspondence. But one in six of the YY forms in this
sound-class that are borrowed by KTh will, on the assumption of correspondence-
mimicry at random, exhibit the 'right' correspondence (for the wrong reason), and
our figure of 25 would then represent only 5/6 of the actual number of borrowings,
which will have been closer to 30 (out of 100). So this hypothetical case leads to an
estimate that 30% of the 'sames' in a YY-KTh list would be the result of borrowing
from Yir-Yoront into Kuuk-Thaayorre. The estimate could be checked by
examining correspondences in other sound-classes.
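
The correction applied in this hypothetical case can be written out as a short calculation. The sketch below is illustrative only; the function and parameter names are hypothetical, and it builds in the assumption stated above that mimicry picks at random among six possible source shapes.

```python
from fractions import Fraction

def corrected_borrowings(wrong_forms, candidate_shapes):
    """Scale up the visibly 'wrong' forms to allow for loans mimicked 'right' by chance."""
    # With a random choice among `candidate_shapes` shapes, 1/candidate_shapes
    # of the loans show the 'right' correspondence and so escape detection.
    visible_fraction = 1 - Fraction(1, candidate_shapes)
    return wrong_forms / visible_fraction

pairs = 100  # hypothetical number of YY-KTh pairs with known etymology
borrowed = corrected_borrowings(wrong_forms=25, candidate_shapes=6)
print(borrowed, f"{float(borrowed / pairs):.0%}")
```

The result depends entirely on the two assumptions (that mimicry always applies, and applies at random), which the following paragraph goes on to question.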
In the real world, however, the question is academic. We must suppose that
correspondence-mimicry takes place in only a minority of instances, and in fact we
have to date found no instance of a 'mistake' generated by correspondence-mimicry
in a KTh form known to be borrowed from YY. We conclude on this criterion that
if there has been massive borrowing between these two languages, it has been
unidirectional, from KTh to YY, where its traces are largely invisible.
2.2.1.3. Phonological Conservatism. There remains the very real possibility that a
great amount of borrowing took place before, not after, the diagnostic sound-
changes. This is, of course, the situation that the investigator must suspect in many
other regions of Australia, where the languages are phonologically conservative. In
order to estimate borrowing rates, if possible, despite this handicap, we use the
concept of the LOCAL WORD, as discussed below (Section 2.2.2).
2.2.2. Estimate Based on 'Local' Words and Single-language Lists.
Frequently the researcher encounters a group of cognates that are attested in a group
of more-or-less contiguous languages which do not form a subgroup, such that the
attestation of these cognates is not widespread enough to suggest that the words
continue a protoform of the next-larger subgrouping. We designate such forms
LOCAL WORDS. The pseudo-protoform reconstructable from them we write prefixed
with 'Local *'. Cognate sets with a significantly larger distribution, and their
associated protoforms, we designate OLD WORDS. If a word's distribution is 'local',
we suspect that it has been borrowed into one or more of the languages that attest
it, although it is not clear which, or which language is the source. We designate a
word (etymon) whose attestation (as currently known) is limited to a single language
as UNIQUE.
We are conscious that these definitions are dependent on a genetic tree model of
language differentiation, and also are relative to the level of (sub-)grouping under
consideration.
The notion generalizes the situation considered by Guy (1980a: 26). Guy consid-
ers four languages that share forms for two different meanings in the following
pattern:
language     meaning 1    meaning 2
ALPHA        X            Z
BRAVO        X            W
CHARLIE      Y            Z
DELTA        Y            W
On the assumption that a daughter language makes lexical innovations indepen-
dently, 'and that no case of an innovation resembling an already existing form may
ever occur', then in a situation such as the above (where four forms X, Y, Z, W are
shared in the given pattern), there must have been two instances of borrowing
(though of which form by which language we cannot immediately tell).
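
Why a pattern of this kind cannot arise from inheritance and independent innovation alone can be checked mechanically. The sketch below is not Guy's procedure: it uses a substitute technique, the four-combination (so-called four-gamete) test from phylogenetics, and it is restricted to two meanings with exactly two forms each. The function name and the dictionaries are hypothetical.

```python
def conflicts_with_any_tree(meaning_1, meaning_2):
    """Four-combination test for two meanings that each have exactly two forms.

    If all four pairings of forms occur across the languages, no single tree
    can account for both meanings without borrowing or a repeated innovation.
    """
    if len(set(meaning_1.values())) != 2 or len(set(meaning_2.values())) != 2:
        raise ValueError("this sketch handles exactly two forms per meaning")
    languages = meaning_1.keys() & meaning_2.keys()
    pairings = {(meaning_1[lang], meaning_2[lang]) for lang in languages}
    return len(pairings) == 4

# The pattern discussed above.
meaning_1 = {"ALPHA": "X", "BRAVO": "X", "CHARLIE": "Y", "DELTA": "Y"}
meaning_2 = {"ALPHA": "Z", "BRAVO": "W", "CHARLIE": "Z", "DELTA": "W"}
print(conflicts_with_any_tree(meaning_1, meaning_2))
```

The test reports only that some borrowing must have occurred; like the pattern itself, it does not identify which form was borrowed or by which language.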
The number of 'local words' shared by two languages appears to decrease as the
two languages are separated by one or more intervening languages.
It is possible to produce an estimate of how much lexical replacement is due to
borrowing by counting the numbers of words in the local, old, and other categories
in a list. Such an estimate involves the heuristic assumptions that 'local' words are
in fact borrowed into at least some of the languages where they occur and that 'old'
words are present in their languages as retentions rather than borrowings. We
emphasize that these and other assumptions made below are heuristic. We are
interested in the amount of the contribution of borrowing to lexical replacement,
and we prefer to overestimate rather than underestimate. Of course, there are ways
in which a word may be falsely identified as 'old' or 'local'. A word may owe its
group-wide distribution to recent diffusion and look like an 'old' word, or conversely
an 'old word' may have been retained only in one area and thus be a 'local word'.
Alternatively, a word may appear to be 'local' simply because its cognates in distant
places have not come to our attention. Because we have had in the course of our
investigation to revise a number of 'local' words into the 'old' category, we suspect
that the latter is the case far more frequently.
Applying the notions 'local' and 'old' to a group of languages centred approxi-
mately around the mouths of the Mitchell River in southwestern Cape York
Figure 2. Approximate relative location of languages mentioned. [Map not reproduced; labels
recoverable from it include Northern Territory, South Australia, New South Wales,
Adnyamathanha, and Gumbaynggir.] Abbreviations: A, Ayapathu; AL,
Aghu-Laya (Kuku-Thaypan); Alm, Almura; BP, Barrow Point; Djab, Djabugay; Djam, Djambar-
rpuyngu; FI, Flinders Island; Gup, Gupapuyngu; GYim, Guugu-Yimidhirr; HR, Hann River
Aghu-Tharnggala; Kaa, Kaantju; KN, Kok-Narr; KB, Koko-Bera (Koko-Pera, Kok-Kaper); KTh,
Kuuk-Thaayorre; KYak, Kuuk-Yak; KYa'u, Kuuku-Ya'u; MKul, Mayi-Kulan; MKut, Mayi-
Kutuna; MTh, Mayi-Thakurti; MY, Mayi-Yapi; NP, Northern Paman; Nung, Nunggubuyu; Olk,
Olkola (Olkol, Olgol); ON, Ogo-Nyjan (Ogunjan); Pak, Pakanh (Bakanh); R, Rimanggudinhma;
Ump, Umpila; UO, Uw-Oykangand; WM, Wik-Mungknh (Wik-Mungkanh); Wme, Wik-Me'nh;
WNg, Wik-Nganyjirr; YTh, Yirrk-Thangalkl (Yirr-Thangell); YY, Yir-Yoront (YirrqYorront).
Note: Muluridji and China Camp (CC) are co-dialectal with KYal and south of it. Koko-Babongk
(KBab) is on the coast between KB and KN.
Peninsula (see Figure 2 for locations and Figure 3 for putative subgrouping), we
have used the following working criteria:
(i) A form is 'local' if (a) it is exclusively a west-coast word (Northern Pama to
Kukatj, or roughly from the tip of Cape York to Normanton, and extending no
further inland than the first tier of inland languages, e.g. Ogo-Ndjan); (b) its
Figure 3. Putative subgrouping of languages of southwestern Cape York Peninsula on the basis of
shared innovations. [Tree diagram not reproduced; leaf labels, left to right: WM, Pak, ON, AL,
Kurrtj, KYak, KTh, UO, Olk, YY, YTh, KP, KBab, KN.]
distribution is west-to-east only, in a fairly narrow band; or (c) its distribution
extends up the Mitchell River but not over the Dividing Range.
(ii) A form is 'old' if (a) it is attested beyond the Mitchell headwaters (Djabugay,
Yidiny, and beyond); (b) its distribution extends both north (where north
includes Northern Pama: Ura and Ngkoth on the map) and east; (c) its
distribution extends both well to the east and well to the south; (d) it is attested
in the Maric languages (a subgroup of Pama-Maric extending south to NSW);
(e) it is attested in other groups south and west from Normanton, like Mayi
(MKul, MKut, MTh, and MY on the map); or (f) it occurs well outside the
region: Centralia, NSW, WA, etc. These criteria should be considered rules of
thumb, and they are not mutually exclusive.
We are, in adopting these criteria, deliberately casting the net wide, in order not
to underestimate borrowing by defining too few words as 'local'. Generalizing these
criteria to something that is replicable in other regions, while bearing in mind the
rudimentary nature of our knowledge of Cape York subgrouping, our working
procedure amounts to the following: 'old' forms are those which (1) occur outside
the larger subgroup; (2) occur in widely separate regions of the larger subgroup's
area; or (3) if they occur in a contiguous but restricted area, are attested in more
than four subgroups (out of some 10-20 in this case; see Section 3.2) of the larger
subgroup, which do not lie along some known or plausible route of communication.
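
The generalized working procedure can be sketched as a small decision function. This is an illustration under stated assumptions, not the authors' implementation: the record type and its field names are hypothetical, the 'unique' label follows the definition given earlier in this section, and the 'subgroup' label follows the notes to Table 1 (forms found exclusively in a single subgroup).

```python
from dataclasses import dataclass

@dataclass
class Attestation:
    """Hypothetical summary of where one cognate set is attested."""
    languages: int                 # number of languages attesting the form
    subgroups: int                 # number of subgroups of the larger subgroup attesting it
    outside_larger_subgroup: bool  # rule (1): attested outside the larger subgroup
    widely_separated: bool         # rule (2): attested in widely separate regions
    along_known_route: bool        # attesting subgroups lie along a plausible route

def classify(a: Attestation) -> str:
    if a.languages == 1:
        return "unique"
    if a.outside_larger_subgroup or a.widely_separated:
        return "old"
    if a.subgroups == 1:
        return "subgroup"
    # Rule (3): contiguous but restricted area, more than four subgroups,
    # not strung along a known or plausible route of communication.
    if a.subgroups > 4 and not a.along_known_route:
        return "old"
    return "local"

# Invented attestation profiles, for illustration only.
print(classify(Attestation(5, 3, False, False, True)))
print(classify(Attestation(12, 6, False, False, False)))
```

Because the rules are checked in a fixed order, a form that satisfies more than one of them receives the first matching label; the criteria themselves are rules of thumb and are not mutually exclusive.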
Examples of the application of these criteria to deduce both 'old' and 'local' (with
YY at the notional centre) forms follow:
LOCAL
Criterion (a), west coast only:
*nyapin 'egg' in Wik languages, KYak, KTh, YY, YTh;
*punti 'elbow' in Wik-Muminh, KTh, KYak; YY, YTh (in this case phonological
criteria identify the YY and YTh forms as loans, confirming the inference made
on the basis of local-ness);
Criterion (b), west-to-east only:
*ngata 'fish' in Wik languages, KTh, KYak, YY, YTh, Flinders Island, and Barrow
Point (replacing *kuyu);
*pirra 'leaf' in KB, GYim, Mul, CC, and KYal (Oates & Oates 1964);
*walmpi 'possum' in YTh, KB, UO, and Flinders Island;
*patin(a) 'skin' in Wik languages, KYak, KTh, YY, YTh, UO, Olk, ON, HR, Laya;
*jamal 'foot' in YY, YTh, KB, UO, Olk, ON, HR, Laya, GYim, Rimanggudinhma,
Almura;
*yirrka 'speak' in YY, YTh, UO, Olk, ON, HR, GYim, CC; 'speech, language' in
YY, YTh.
Criterion (c), west-to-east (criterion b) together with north, but not including
Northern Pama:
*kalu 'take, carry' in Wik languages, Umpila (Harris & O'Grady 1976), KYak, KTh;
*punyja 'arm' in Wik languages, Kaantju-Umpila, KYak, KTh, YY, YTh; 'elbow' in
KB;
*kulan 'possum' in KYak, KTh, GYim, HR, ON, Wik-Me'nh, KYa'u (Thompson
1976);
*yangan 'hair' in Wik languages, Kaantju-Umpila, KYak, KTh, Ayapathu;
Criterion (d), up the Mitchell River and contiguous languages in a generally
eastwards direction, but not to the east coast or southeast over the Dividing Range:
*warri 'where' in YY, YTh, KB, UO, Olk, ON, HR.
Forms with very widespread cognates sometimes occur in a 'local' region with a
specialized sense. We recognize here the borrowing of meanings and categorize the
form as LOCAL SEMANTIC-SHIFTED, although it is in most cases premature to judge
which is the original and which the modified sense. Examples:
*yampa 'leaf' locally in YY yap, YTh yap, Ngkoth ambamb, Uradhi yamba; 'ear' in
Umpila; 'place, camp' in Bidjara and other Maric yampa. Compare *pina whose
reflex pin in YY has the senses 'ear', 'leaf', and 'site'.
*kamu 'blood' locally in YY, YTh, KTh, KYak kam; Wik-Muminh, Kaantju-
Umpila kamu; 'nectar' in Yidiny gamu; 'water' in Bidjara, other Maric languages,
and the Mayi languages Ngawun, Mayi-Kulan, and Mayi-Yapi (Breen 1981b:
190) kamu; 'tear' in Wik-Munkan mee'-kam. For an illustration of parallel
semantic shifting, compare YY and YTh kawn 'water'; KNarr kawung, Yugarabal
(basin of Brisbane and Caboolture Rivers; Watson 1944) 'kaoun' 'blood',
Nyungar (of Narrogin, WA; Douglas 1976) kawun 'wine'.
OLD:
*pungku 'knee' in Northern Pama, Wik languages, Kaantju-Umpila, KTh, KYak,
YY, YTh, UO, KNarr, GYim, Warungu; possibly in Mayi punkul (Breen 1981b:
108); possibly in Maranungku pingkarra (Tryon 1970);
*jipa 'liver' in Northern Pama (except Ngkoth), Kaantju-Umpila (O'Grady 1976),
Almura, KTh, KYak, YY, YTh, KNarr, UO, Olk, Kurrtjar, GYim, Flinders
Island, Gunya (Maric); 'stomach' in Ngkoth and Mayi-Kutuna; 'heart' in
Djabugay (Hale 1961 and nd b). We assume that the 'liver' sense is primary;
*yaka~yaki 'cut' in KTh, KYak, YY, YTh, KB, UO, Olk, ON, HR, AL, Mul, CC,
KYal (Hershberger 1964); 'split' in Djirbal; poss. ake- 'cut' in Arrernte.
We consider reflexes of *jangkar 'to laugh', which would otherwise appear to be
local to Northern Pama, the Wik languages, KTh, KYak, YY, YTh, and UO, as
'old' on the strength of Djaru (WA) jingkiri 'laughing', a noun-like preverb. But we
persist in listing *yangkarV 'shin' as 'local' to Wik-Munkan, KTh, KYak, YY, and
YTh, because of the uncertainty of the semantic connection of Western Desert
yangkarl-pa in meanings like 'hip, hip bone, thigh, buttock area'. By the same token,
YY ngelqer, YTh ngelker, and KYak ngalkar 'tongue' are 'local', because the semantic
connection of Proto-Ngayarda *ngalka.ri 'liver' remains unestablished.
We are aware that there are, on the other hand, a few extremely widespread
cognate-sets with meanings in the religious or abstract area, which for all the
perfection or near-perfection in their sound-correspondences may well be recent
(and ubiquitous) loans. A potential case is *pijarr(a), widely continued in Cape York
as 'dream' (N) and as far south as Gumbaynggir, with bijaarr 'name, language'
(Eades 1979: 349). We are confident, however, that such instances are far outnum-
bered by the unrecognized cognates in distant places of 'local' and, most especially,
'unique' forms.
10
10
An example of the kind of reshuffling that follows from the recognition of previously unrecognized
cognates is that of the words for 'fire' in UO, Olgol, KTh, and KYak. Here, the pseudo-reconstruction
*paathu accounts for KTh paath (ergative paathu + n) and KYak paath, and, with an otherwise
unattested sound-correspondence, for UO alh (ergative alhu +I), Olgol alh. The form *paathu is
(wrongly, in light of data presented below) classed as 'subgroup' in the counts that produced the
figures cited below, because (by assumption) these four languages constitute a subgroup (C.2, as
listed in Section 3.2.1; for readers who do not accept C.2 as a subgroup, this form is, wrongly in
light of data presented below, classed 'local'). But, subsequently to making these counts, the
existence of the Adnyamathanha (South Australia) form yalhu 'flame' (McEntee & McKenzie 1992:
100; O'Grady 1979: 120) came to the authors' attention. This form strongly suggests that UO and
Olgol alh continue a PPNy form *yalhu (and therefore belong in the 'old' category) and throws the
KTh and KYak forms into the 'subgroup' list (i.e. 'unique' to subgroup C.2.a.i, a putative dialect-set,
as listed in Section 3.2.1; for readers who do not accept this as a subgroup this cognate set becomes
'local'). Parenthetically, with regard to *yalhu, the existence of a laminal lateral (lh) in an eastern
The category UNIQUE can at present be said merely to reflect the ignorance of the
researchers; the less well known a language is to us, the more inflated are the figures
for this category. It is likely, of course, that further research will shift a number of
our 'unique' forms into the 'local' category rather than the 'old', although we cannot
at present predict how many will go each way. And it is of course possible that some
of our 'old' forms are in fact 'pseudo-old', since 'old' words are as borrowable as any
others.
In the face of such uncertainties, we have striven to use methods that will not
produce underestimates of the rate of borrowing.
The eight languages in the area in question for which we have anything approach-
ing adequate data in this regard are Wik-Munkan (WM), Kuuk-Thaayorre (KTh),
Kuuk-Yak (KYak), Yir-Yoront (YY), Yirrk-Thangedl (YTh), Koko-Bera (KB),
Uw-Oykangand (UO), and Ogo-Nyjan (ON). For each of these languages we have
assembled lexicostatistical lists of 100, 120, and 151 words, each inclusive of the last
(see Appendix for the lists and sources).
11
We have counted the forms in each list in
each of the categories in the left margin of Table 1 and calculated the fraction in
each category.
The weighted figures in Table 1 are calculated by taking, for each 'local' word, the number n of
languages in which it is attested, subtracting 1, and dividing the result by
the original number: (n − 1)/n; then summing these figures to give a total for the
'local' category (compare in this regard Gleason's weighting system for numerical
comparison, the 'characteristic vocabulary index' (Gleason 1962: 28-29)). We are
interested in these weighted figures not with regard to the individual languages, but
only with regard to producing an average for all the languages. The reasoning is that,
if a certain number of languages share a form, at least one of them must have
originated it and not borrowed it. If we knew more about the relevant subgroupings
and sound-correspondences, we could weight the figures by subgroup rather
than individual languages; but rating by individual languages errs in the preferred
direction of overestimation.
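
The weighting can be illustrated with a short calculation. The sketch below is a minimal illustration; the function names are hypothetical and the attestation counts are invented for the example, not taken from the lists.

```python
from fractions import Fraction

def local_weight(n_languages):
    """Weight (n - 1)/n for a 'local' word attested in n languages."""
    return Fraction(n_languages - 1, n_languages)

def weighted_local_total(attestation_counts):
    """Sum the weights over all 'local' words in a list."""
    return sum((local_weight(n) for n in attestation_counts), Fraction(0))

counts = [2, 5, 8]  # hypothetical: three 'local' words attested in 2, 5, and 8 languages
total = weighted_local_total(counts)
print(total, float(total))
```

A word shared by only two languages thus counts as half a loan, while one shared by all eight counts as seven-eighths of a loan, reflecting the reasoning that at least one of the sharing languages must have originated the form.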
The subgroups recognized are KTh-KYak, YY-YTh-KB-Koko-Babongk, the
Wik group, UO-Olk, and ON-Hann River Aghu Tharrnggala-Aghu Laya. We have
not included a Koko-Babongk (sister-dialect to KB) list in our count; hence the
'unique' figures are greatly inflated (at the expense of the 'subgroup' figures) for KB,
which is only much more distantly related to YY and YTh. They are similarly
inflated for UO, because we have had access only to a 100-word list of its
sister-dialect Olk; similarly again for Aghu Tharrnggala and Aghu Laya, which are
sister-dialects to each other and rather more distantly related to Ogo-Nyjan (ON).
footnote 10 continued
language (UO) with a phonologically and semantically transparently cognate form in a south-central
language (Adnyamathanha) virtually settles the issue of the Proto-Pama-Nyungan depth of
the apical-laminal contrast for laterals (see Dixon (1980: 485) for a discussion of the imperfect set
of putative cognates for 'earth, ground, soil' in forms like walya, yalya, and, it might be added, YY
yulh, a; cf. also the discussion in Alpher (1991: 301) of the YY form melh 'lily seed').
11
Copies of the lists, which incorporate the judgements summarized here with respect to cognation
and 'old'-ness, are available from the authors at http://www.anu.edu.au/linguistics/nash/aust/
Table 1. Composition of word lists in eight languages (all entries are percentages)

                             Size of list   WM  KTh   KY   YY  YTh   KP   UO   ON  Mean
Retentions
1. OLD                           100        42   44   40   42   43   41   38   31   40
                                 120        41   42   39   39   42   38   35   31   38
                                 151        36   38   36   35   37   34   33   27   35
Innovations
2. SUBGROUP                      100        20    8    8   21   21    3   15   10   13
                                 120        19    9   10   22   22    3   14   10   14
                                 151        17   13   14   23   23    3   11    8   14
3. LOCAL                         100        27   40   41   31   32   28   25   28   32
                                 120        27   38   39   29   31   25   26   27   30
                                 151        28   40   39   28   33   28   27   27   31
4. IDENTIFIABLE BORROWING        100         0    0    2    1    1    2    1    0    1
                                 120         0    0    2    1    1    2    1    0    1
                                 151         0    0    1    4    2    1    1    0    1
5. UNIQUE                        100        11    8    9    4    2   25   22   32   14
                                 120        13   11   10    8    4   33   26   32   17
                                 151        19   10    9   11    5   34   23   38   19
6. Innovations (SUBGROUP + LOCAL + BORROWING + UNIQUE)
                                 100        58   56   60   58   57   59   64   69   60
                                 120        59   58   61   61   58   62   67   69   62
                                 151        64   62   64   65   63   66   63   73   65
7. Fraction of innovations that are due to borrowing (LOCAL + BORROWING)
                                 100        47   71   72   56   59   52   41   40   55
                                 120        45   66   67   50   54   43   40   39   51
                                 151        44   63   64   49   56   44   45   36   50
8. Weighted fraction of innovations that are due to borrowing
                                 100        35   53   55   44   46   33   28   26   40
                                 120        35   49   51   39   42   28   26   25   37
                                 151        34   48   49   40   43   28   29   24   37
Notes:
SUBGROUP subsumes forms found exclusively in a single subgroup, whether they are single
morphemes, new-formations, calques with parts not cognate, or semantic-shifted. LOCAL
correspondingly subsumes morphemes, new-formations, calques of all kinds, and semantic-shifted
forms; and UNIQUE subsumes single morphemes, new-formations, and semantic-shifted forms.
The figures in rows 1-5 were obtained by totalling up for each list the number of forms in each
category and dividing by the total number of 'meanings' for which there were entries. Row 6 was
obtained by summing the figures as indicated in the margin and dividing as above. Row 7 was
obtained by summing in the categories indicated and dividing by the figure used as the numerator
to obtain row 6; row 8 was obtained as row 7 was but with figures weighted as explained above.
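
The derivation of rows 6 and 7 can be retraced from rows 2-5. The sketch below is illustrative only: the function name is hypothetical, and it works from the published percentages for the 100-word lists rather than from the underlying counts, so results for other columns may differ from the table by a point because of rounding.

```python
def innovation_rows(subgroup, local, borrowing, unique):
    """Return (row 6, row 7) from the row 2-5 figures of one list."""
    innovations = subgroup + local + borrowing + unique   # row 6: all innovations
    borrowed_share = (local + borrowing) / innovations    # row 7: LOCAL + BORROWING over row 6
    return innovations, round(100 * borrowed_share)

# 100-word list figures for WM and KTh as given in Table 1.
print(innovation_rows(subgroup=20, local=27, borrowing=0, unique=11))  # WM
print(innovation_rows(subgroup=8, local=40, borrowing=0, unique=8))    # KTh
```

Row 8 follows the same pattern but replaces each 'local' word's contribution of 1 with its weight, which requires the per-word attestation counts and so cannot be recomputed from the table alone.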
We defend these subgroupings below (Section 3.2), but we are confident that
different subgroupings would give figures that are in general comparable to those in
the table. Where subgrouping does make a difference is in the grouping of YY and
YTh with KB rather than with KTh and KYak. To reclassify in this way has the
result of shifting a large number of words common to YY, YTh, KTh, and KYak
from the 'local' class to the 'subgroup' class and shifting the small number of words
which KB shares with YY and YTh from the 'subgroup' to the 'local' class. In other
words, re-subgrouping in this case would lower the estimate of the rate of replace-
ment by borrowing. Questions of the correctness of subgroupings aside, we prefer
not to underestimate this figure. Estimates of the fraction of lexical replacement
accomplished by means of borrowing (Table 1, row 8) range from 55% to 26% for
the 100-word list, 51% to 25% for the 120-word list, and 49% to 24% for the
151-word list. We note again that these figures overestimate the fraction due to
borrowing. It is extremely unlikely that this quantity exceeds 50% in actuality in any
of these cases, and it is quite possible that the actual fractions have been in the
neighbourhood of, or below, the lowest figures given.
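
The ranges just quoted can be read off row 8 of Table 1 directly. The sketch below is a simple check; the variable names are hypothetical, and the values are the row 8 entries for the eight languages in the table's column order (WM, KTh, KY, YY, YTh, KP, UO, ON), with the Mean column omitted.

```python
# Row 8 of Table 1 (weighted fraction of innovations due to borrowing, in percent),
# keyed by list size; one value per language.
row_8 = {
    100: [35, 53, 55, 44, 46, 33, 28, 26],
    120: [35, 49, 51, 39, 42, 28, 26, 25],
    151: [34, 48, 49, 40, 43, 28, 29, 24],
}

# Highest and lowest estimate for each list size.
ranges = {size: (max(values), min(values)) for size, values in row_8.items()}
for size, (highest, lowest) in ranges.items():
    print(f"{size}-word list: {highest}% to {lowest}%")
```

In each list the highest figure comes from the KY column and the lowest from ON.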
2.2.3. Estimate Based on Respect Vocabulary.
There is a part of the lexicons of Australian Aboriginal languages that provides
another means to count instances of lexical addition and their types. In the respect
(or 'avoidance') vocabulary and speech-style, a special word replaces one or more
ordinary words in conversation concerning (or in the presence of, or with) certain
relatives. Examination of such items yields information on the types and frequency
of processes of vocabulary-creation. Consider, for example, the respect vocabulary
of Yir-Yoront and its etymological origins (this material, together with relevant
forms from neighbouring languages, is listed in full as Table 12, pp. 105-107, of
Alpher (1991)). Among the 45 respect vocabulary items recorded for Yir-Yoront, we
find some 29 forms for which the Kuuk-Thaayorre respect vocabulary equivalent is
recorded; some 14-16 of these, or about 50%, are 'local' (where 'local' includes
words recorded only in YY and KTh); by hypothesis, these are loanwords in one or
the other or both languages, and for the purposes of this particular reckoning, all are
counted as loans in YY. Examples include YY lirrch, KTh rich 'spear' ( < L *tirrcha;
the ordinary terms are YY kalq, KTh kirk, both < *kalka); YY nhag, KTh nhangk,
WMng nhengk 'meat, animal' ( < L *nhangki; the ordinary terms are YY minh,
KTh minh, WMng minh, all < *minya). One term clearly identifies itself as a loan
from YY respect vocabulary to KTh respect vocabulary: YY yapm, KTh yapim,
Bakanh yaampany
12
'eye' (L *yaampany). There is one correspondence of a YY
respect term with a KTh ordinary one: YY (respect) wal 'ear' (ordinary term pin),
KTh (ordinary) waal 'ear'. The rest are constructed of YY resources on one or
another metaphor: larr = ma 'go' (ordinary terms larr 'ground', ma 'tread'),
latr=olhth 'lie, sit; fall' (ordinary tholhth 'fall'), thorrchonh 'dog', 'yam' (ordinary
thorrchn 'hair'), nhin 'smell (N)' (ordinary motr-nhin 'sweat', ngulhthrr 'smell'); or
they are apparently reversals: yiwn 'speak' (ordinary 'tell a lie'); or they include
12
Pakanh yaampany 'eye' is erroneously spelled 'yaapany' in Alpher (1991: 107).
reversals within their semantic range: wallon 'put, leave it', 'hit', 'carry' (ordinary
wa\l + on 'cause to go'); or are apparently unique (mur 'house', ngerrcher 'hair,
head'). Other instances that we are aware of in this region of a respect-vocabulary
item that corresponds to another language's ordinary-vocabulary item are KTh
(respect) ngoolpng 'blood' (Bakanh ngolpong, UO olbong 'blood', both ordinary), UO
(respect) ormyjogh 'hair' (YY ordinary thorrchn 'hair'), UO (respect) owilam 'blood'
(HR ordinary vghwilvni).
Comparable data exist for another part of northeast Australia, in Dixon's (1990:
50) study of the origins of items of respect vocabulary in Yidiny and in the Dyirbal
languages. Like us (see below in this section), Dixon has concluded that elaborated
respect vocabularies are quite recent in the area, and like us he has used
Guugu-Yimidhirr as a more distantly related point of comparison (1990: 8-9, 18,
21), finding that its respect vocabulary is related not at all to those of Yidiny and
Dyirbal.
13
For Yidiny respect-vocabulary roots, Dixon finds that those with cognates
in other registers and/or in other languages constitute 112 of a total of 191, or
59%.
14
The comparable figure for Dyirbal is 245 of 622, or 39%. These figures
average to 49%, remarkably close to our estimate of 50% for southwestern Cape
York Peninsula (and based on a much larger corpus), and we will accept them at
that nominal value for the sake of our argument.
15
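
The percentages cited from Dixon's counts can be recomputed as follows. This is a minimal sketch; the helper name is hypothetical, and the average is taken over the two rounded percentages, as the text appears to do.

```python
def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100 * part / whole

yidiny = percent(112, 191)    # Yidiny respect-vocabulary roots with cognates
dyirbal = percent(245, 622)   # the comparable Dyirbal figure
average = (round(yidiny) + round(dyirbal)) / 2

print(round(yidiny), round(dyirbal), average)
```

This is an unweighted mean of the two languages' percentages; it gives the smaller Yidiny corpus the same weight as the larger Dyirbal one.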
It is of course to be expected that cognates will turn up for a number of words now
considered unique. And, since in this area no respect-vocabulary items have been
identified to date for which a reconstruction is possible for any subgroup of greater
time-depth than the dialect-group, newly-identified 'sames' will in all probability
increase the numbers of 'local' words, i.e. of putative loanwords. Clearly, however,
borrowing from other languages' ordinary vocabulary is not a statistically major
source of respect vocabulary in southwestern Cape York Peninsula. What is needed
if it is to be shown for other areas (the Wik area to the north is a possibility; see
Johnson 1990: 429 [1991: 213]) is a demonstration not just that it has occurred in
a number of words but also that these words constitute a large proportion of some
such fixed and short list of 'meanings' as those used here. No quantitative data are
as yet available, however.
We note in passing that the only language of this area in which anything
approaching the replacement of every ordinary lexical item takes place is UO. In YY,
by contrast, many words have no separate respect equivalent: waql 'crayfish', pothn
'prawn', thortm 'sand', kun-kolhth 'tail', kaqar 'moon', thaw 'mouth', etc. These
forms are used in the respect register as is, either preceded by the generic wangal
('hand [respect]' in isolation; wangal-pothn would be 'prawn [respect]') or in
sentences containing wangal as a particle. KTh apparently does the same.
13
A number of GYim respect-vocabulary items correspond to ordinary vocabulary items in Yidiny.
14
All categories except 'derived from the everyday style of the same dialect by phonological
deformation' (three of 191 in Yidiny; 48 of 622 in Dyirbal) and 'no cognate known' (78 of 191 in
Yidiny, 329 of 622 in Dyirbal).
15
Overestimating, if anything, the contribution of borrowing to vocabulary elaboration and
replacement. Note that extending Dixon's counts to include polymorphemic items would probably
increase the ratio of items derived language internally to those not so derived.
YY in this matter as in others apparently participates in a respect-vocabulary
'pool' which centres to the north, in Sharp's (1939: 256) area V. We know of just one
respect item (possibly) shared with a language to the south or east of this area: yulh
'tree' (< *yuunyja), UO onjchar. In the larger area that includes both YY and GYim,
there is apparently no sharing of respect items: GYim, a language for which
Haviland (1979a and b) has provided an extensive respect lexicon, shares 23% of
ordinary vocabulary
16
with YY but shares not one respect item with it. It is apparent
from this and other facts that respect vocabularies are a relatively recent creation in
this part of Australia. It is, then, proper to treat the creation of respect vocabularies
as a task of wholesale generation of new lexical items.
2.2.4. Conclusion: the contribution of borrowing to lexical replacement in these languages
is probably less than 50% at a maximum and most usually well below 50%.
Estimates of the extent of the contribution of borrowing to lexical replacement range
from 0% (based on lack of correspondence-mimicry between Yir-Yoront and
Kuuk-Thaayorre; Section 2.2.1.2) to 25% (based on anomalous sound-correspondences
between Yir-Yoront and Kuuk-Thaayorre; Section 2.2.1.1) to the somewhat
higher figures 50% (based on counts of local words in respect vocabulary, and likely
an anomalously high case; Section 2.2.3) and the following (based on counts of local
words in standard lexicostatistical lists of ordinary vocabulary; Table 1):

Estimates of fraction of replacement due to borrowing (percentages)

Size of list    100 words    120 words    151 words
High            55           51           49
Low             26           25           24
Mean            40           37           37
We conclude that a maximum of about 50% is a fair guess for lists of about this
order of magnitude. We can use this estimate in the equations for equilibrium
percentage presented in the next section. For Sankoff's (1973: 103) general expression
$\gamma + \theta - \gamma\theta$, when $\theta \le 0.5$ then the equilibrium is at most $0.5 - \gamma/2$ (where $\gamma$
is the 'probability of chance recurrent cognation'), i.e. less than 50%. In our
refinement of Dixon's particular model of languages along a strip (under the
assumption that loans are never distinguished analytically from common retentions;
see Section 3.1 and especially Section 3.1.2) the equilibrium figure is, in this range,
approximately one-half of the fraction of replacement due to borrowing, that is, a
maximum of about 25%. We stress that these figures are maximum estimates. With
these compare the figures for Germanic languages in Table 2. Here the highest
figures are 48% for Faroese and 45% for English, very much comparable with our
presumably overestimated maximum of 50%. The mean for the Germanic figures is
19%, lower than all of our positive estimates for Australian Aboriginal languages,
but not much lower. Note that the extreme estimate of 0% is based on lack of
16
Of the items in Hale's 100-word list, 23% are shared; 20% are shared and 'old'. The comparable
figures for O'Grady and Klokeid's list are 22% and 19%, respectively, and for our 120-word list, 21%
and 19%, respectively.
Table 2. Fraction of lexical replacements (Swadesh 200-word list; see Lees 1953) that are loans in
Germanic languages (abstracted from Embleton 1986: 79, 99-101)

Language (1)   Ancestor (2)   Time (3)   N (4)   C (5)   Rp (6)   L (7)   RpB (8)
English        OE             0.98       200     131     69       31      0.4493
German         OHG            1.13       200     159     41       3       0.0732
LS             OSax           1.13       177     140     37       3       0.0811
TrS            OHG            1.13       200     148     52       1       0.0192
Yiddish        OHG            1.13       200     138     62       12      0.1935
PennG          OHG            1.13       200     144     56       4       0.0714
Icelandic      OIcel          1.08       200     167     33       1       0.0303
Norwegian      OIcel          1.08       200     145     55       11      0.2000
Danish         OIcel          1.08       200     142     58       14      0.2414
Swedish        OIcel          1.08       200     146     54       10      0.1852
Frisian        OFris          0.63       163     106     57       17      0.2982
Faroese        OIcel          1.08       200     144     56       27      0.4821
Mean                                                                      0.1937

1. Modern languages: LS, Hamburg Low Saxon; TrS, Transylvanian Saxon; PennG, Pennsylvania
German.
2. OE, Old English; OHG, Old High German; OSax, Old Saxon; OIcel, Old Icelandic; OFris, Old
Frisian.
3. In millennia.
4. N, size of list (attestations from the Swadesh 200-word list).
5. C, number of true cognates (loanwords excluded).
6. Rp, number of replacements: N - C.
7. L, number of loans.
8. RpB, fraction of replacements attributable to borrowing: L/Rp.
attestation from correspondence mimicry, and is not to be trusted because of the
speculative nature of the assumptions made and because of the preliminary nature of
the data.
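The derived columns of Table 2 follow mechanically from N, C, and L, so they can be recomputed as a check. The sketch below is a minimal Python illustration; the names `TABLE_2` and `borrowing_fraction` are ours and hypothetical, and the figures are simply those printed in the table.

```python
# Rows of Table 2: language -> (N, C, L), where N is the list size, C the
# number of true cognates, and L the number of loans (figures as printed).
TABLE_2 = {
    "English": (200, 131, 31),
    "German": (200, 159, 3),
    "LS": (177, 140, 3),
    "TrS": (200, 148, 1),
    "Yiddish": (200, 138, 12),
    "PennG": (200, 144, 4),
    "Icelandic": (200, 167, 1),
    "Norwegian": (200, 145, 11),
    "Danish": (200, 142, 14),
    "Swedish": (200, 146, 10),
    "Frisian": (163, 106, 17),
    "Faroese": (200, 144, 27),
}


def borrowing_fraction(n, c, loans):
    """Return RpB = L / Rp, where Rp = N - C is the number of replacements."""
    replacements = n - c
    return loans / replacements


# Fraction of replacements attributable to borrowing, per language.
rpb = {lang: borrowing_fraction(*row) for lang, row in TABLE_2.items()}

# Unweighted mean over the twelve languages, as in the last row of the table.
mean_rpb = sum(rpb.values()) / len(rpb)

for lang, value in rpb.items():
    print(f"{lang:<10} {value:.4f}")
print(f"{'Mean':<10} {mean_rpb:.4f}")
```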
3. Equilibrium Percentages
The classifier of Australian languages often has to estimate the degree of relationship
of two neighbouring languages that have apparently undergone little or no differentiation
by sound change. Suppose the two languages have a word in common,
similar or identical in sound and meaning. It is impossible to tell whether this
correspondence results from common retention from a protolanguage or is a
borrowing in one or both languages. Certain rough-and-ready methods of language
classification, such as lexicostatistics as usually practised, become difficult to apply
in such cases, because they are designed on the assumption that common retentions
are counted and borrowings are excluded from the count.
This problem prompted the innovation (Dixon 1970, 1972: 330-337) of a
lexicostatistical model with a new assumption: for a given pair of languages, all
words that are similar in sound and meaning are counted in a single category
('cognate'), and no attempt is made to weed out borrowings. The interesting result
follows that two languages in contact over a long period will approach, in their
vocabulary lists (whether 100-word or the entire vocabulary), an equilibrium frequency
of 'cognate' words, in which losses of vocabulary items from one or the other
language are balanced by items that become similar because one language has
borrowed them from the other. Sankoff (1972, 1973: 103) and Embleton (1986: 66)
touch on the same long-run equilibrium situation. Certain consequences of ethnohistorical
interest obtain: if two languages have matching vocabulary items at
significantly higher than the equilibrium rate, they have not been separate languages
for very long; if they have matching vocabulary items at a significantly lower rate than
the equilibrium, they have not been in contact for long. Dixon (1972: 331-336) is
the only author to have nominated a specific percentage for the equilibrium, and
puts it this way:
If the two dialects have been contiguous for a long enough time, they will
have about 50% vocabulary in common ... Fifty percent is an 'ideal'
equilibrium figure. We would expect in practice in two contiguous dialects,
that had been borrowing back and forth for sufficient time, to have between
40% and 60% common vocabulary ... Considering the amount of time that
aboriginal languages have been occupying Australia we should expect most
dialects to show common vocabulary percentages within these ranges; it
seems that very many do so.
(It is clear that 'contiguous dialects' as used here does not mean that the two dialects
belong to the one language.)
On the other hand, if the map of the lexicostatistical classification of O'Grady,
Wurm and Hale shows nothing else, it provides numerous instances of adjacent
languages which do share less than 40% or more than 60% common vocabulary. For
instance, Hale calculated that the adjacent western Gulf languages Mara and
Yanyuwa share around 2%, and O'Grady and Klokeid (1969: 309) give 90% for the
Western Desert dialects Antikirrinya and Yankuntjatjarra.
Dixon's hypothesis has an answer for such cases, of course, namely that the time
of contiguity is too short for the equilibrium to have been established (though there
is little indication of what sort of times are required to reach equilibrium). Thus,
because of 'the rather low lexical score between Yidiny and Ngadjan, 29%', Dixon
suggests that these two north Queensland languages 'had been in contiguity for a
relatively short time' (1977: 8), or that '[m]utual unintelligibility, coupled with
traditional pride and intertribal antagonism, is likely to produce a situation that
tends to increase, rather than reduce, linguistic differences' (1976: 218).
To what extent, then, is the '40%-60% equilibrium' model applicable, given the
host of other factors acknowledged to be at work? What proportion of the contiguous
languages in Australia with lexical scores outside the '40%-60%' range are to be
explained by factors other than length of time in contiguity, such as have been
advanced for Yidiny and Ngadjan?
The possibility that we investigate below is that equilibrium (with a definition of
'cognate' that includes loanwords) can occur, but that the equilibrium rate is not
nearly as high as 50%. The consequence for ethnohistory is that neighbouring
languages which share significantly less than 40-60% of their vocabulary can
nevertheless have been spoken in contiguity (in their present locations, say) during
the greater part of the period of their linguistic separation. It is not necessarily the
case that two such languages have only recently become neighbours.
The estimate we reach, based on the data summarized in Section 2.2.4, is that the
equilibrium rate is very unlikely to exceed 25%, and quite likely to have been less
than this. Consequently, where Dixon's figure of 50% (±10%) leads to the conclusion
that the per cent 'cognate' between geographically contiguous languages
cannot fall below 40%, our figure of 25% (±5%) indicates that this figure cannot fall
below 20% (maximum estimate). Therefore, geographically contiguous languages
with 'cognation' percentages in the 20-40% range need not be assumed to have
developed apart and come into contact only recently, and claims as to the distance
of their relationship can be made (admittedly tentatively) on lexicostatistical grounds
alone.
We are also of the belief, although we cannot demonstrate it from currently available
figures, that language pairs like the geographically contiguous Mara and Yanyuwa, with
2% common vocabulary, as mentioned above, can well have been neighbours for quite
a long time and possibly for the entire period of their linguistic separation (and are
related only very distantly, since the lexical retention rate, as discussed in Section 2.1,
seems not to be to any great degree lower than that found elsewhere in the world).
Such claims constitute a large part of the original (O'Grady et al. 1966) postulation of
the 'Pama-Nyungan' and other subgroups in Australia. We believe this (i) because our
equilibrium figure, as we have been at pains to point out, is an overestimate, possibly
by a considerable amount, and (ii) because the wholesale migration of linguistically
defined groups (as postulated in Dixon 1972: 331, 332, 1980: 239, 1991: 5-7 (mass
migration of Yidiny)) is inconsistent with current anthropological understanding of the
normal Aboriginal relationship to land. Language shift by a group who maintain their
geographical situation and land-tenure cannot of course be ruled out (and Dixon's
(1991: 7) hypothesis of Yidiny intermarriage with 'Pygmoid tribes' and the adoption of
the Yidiny language by these latter not only does not rule it out but postulates it). But
mass migration or language shift in situ is, we believe, by no means an assumption that
is necessary to account for very low cognate percentages, and language shift in situ
would (all things being equal) be expected to produce an increase of shared vocabulary
(persisting via a substrate effect) rather than a decrease.
3.1. Models of Borrowing and Lexical Change
Advances in lexicostatistical models have been concerned with refinements such as
the computation of error limits, allowing for variable retention rates (both with time,
and across the vocabulary),
17
and allowing for chance recurrent cognation; see, for
instance, Dyen et al. (1967), Sankoff (1973), and Embleton (1986).
17
Any respectable lexicostatistical method must be controlled for the interlocking effects of list size
and 'basicness' of vocabulary. This paper (aside from the obviously less 'basic' nature of the items
in our list numbered 121 and higher; see the Appendix) does not deal with these effects. We
emphasize, however, that they are indeed important, and we recommend Breen (1990: 154-163) and
Black (1997: 62 et passim) for the Australian situation.
As far as we know, Sankoff (1972) and Dixon (1970) were the first to model the
effect of borrowing on the proportion of shared vocabulary between languages in the
same region; other lexicostatistical models assume that loan words have been
identified and 'weeded out' from calculation of cognate percentages. We know of
these subsequent treatments: Guy (1980a: 26-27) has some advice on the treatment
of loanwords within lexicostatistical analysis; Embleton (1981, 1986) corrects,
condenses, and runs some simulation tests of Sankoff's (1972) model of borrowing;
and Embleton (1986) runs models incorporating a treatment of loanwords on the
Germanic, Romance, and Wakashan families to compare the fidelity of lexicostatistically
generated trees with trees generated on the basis of other evidence (for
Germanic and Romance) and to explore a high-borrowing situation (Wakashan).
Dixon's treatment remains the only one in which the equilibrium possibility is
focused on and a percentage proposed.
Sankoff (1973: 103), building on his 1969 dissertation, shows a general expression
for the expected value of the proportion of unaffected meanings after time t, and
remarks that 'As t gets very large, the lexicostatistic relationship approaches an
equilibrium value $\gamma + \theta - \gamma\theta$, dependent only on borrowing and chance cognation.'
(The two quantities are $\gamma$, the probability of chance recurrent cognation, and $\theta$, the
'borrowing probability' or the fraction of replacement attributable to borrowing
(Sankoff 1973: 100), identical to our L below.) Sankoff's equilibrium expression is
endorsed by Embleton (1986: 66).
We need then to explore where Dixon's reasoning differs from the general
lexicostatistical model, beginning with the assumptions made in Dixon's (1972)
model. First, we list assumptions that we too are prepared to entertain:
(i) the word lists used in the comparison are of the same size; in the limit, the
assumption is that the two languages' lexicons are of the same size;
(ii) both languages replace the same fraction by borrowing, and the same fraction
by other means;
(iii) items that become similar because both languages make identical new-
formations with old cognate items are not represented in the model (ignoring
this probably has statistically minor consequences);
(iv) borrowings, even if identifiable as such by sound-changes, are counted as
'cognates' along with retentions (note that the opposite has been explicitly the
procedure since the earliest literature on the subject; Lees (1953: 115) and
Black (1997: 62) for example count loans as noncognate; Guy (1980a: 27) feels
they should be ignored; identifiable loans comprise at any rate a statistically
minor part of our own data); and
(v) vocabulary is replaced at the same rate in both languages (the assumption
common to most lexicostatistical models).
These assumptions, however much they oversimplify, are consistent with the use of
the model as a 'rough-and-ready' method of language subclassification.
Now, consider Dixon's model. We write p(i,j) for what Dixon writes as $p_{ij}$, viz.
the proportion of vocabulary that languages I and J have in common, and reconsider
his situation of four languages A, B, C, D in a coastal strip so that B has just A and
C as neighbours, and C has just B and D as neighbours. (Dixon also has an E but
never uses it. Note that a plausible geographic setting also is along a peninsula.)
3.1.1. Preferential Borrowing. The model assumes that languages B and C are
replacing their vocabulary over time, and are doing so by loans only from the two
adjacent languages (A and C for B, B and D for C). Dixon first assumes that a
language will borrow with equal probability from both adjacent languages, and then
devotes a page to investigating that assumption.
Note that both sets of Dixon's calculations assume that p(b,d) = 0, since he
assumes that the amount of vocabulary shared by C with either B or D is simply the
sum of the amounts shared with B and D, respectively. This goes against his
observation (1972: 335) that two languages with just one language separating them
would in 'equilibrium' share about 20% vocabulary. This figure of 20% is also a
logically necessary minimum in the case where language C shares 70% with B and
50% with D. Dixon's model does not specify for the minimum of 20% which all
three of B, C, D have in common how a form is to be replaced. By hypothesis, it is
replaced by borrowing from B or D, but this is not possible.
In general, a reasonable estimate for p(b,d) is the product of p(b,c) and p(c,d):
$$p(b,d) = p(b,c) \cdot p(c,d)$$
In the 50% 'equilibrium' situation, then, p(b,d) would be 25%, and would range up
to 35% and down to 10% in the two non-equilibrium cases considered by Dixon.
Thus the expressions for borrowing trends need to be amended. Let C(b) [resp.
C(d)] be the probability that C copies a replacement term from language B [resp.
D]. First, consider how to calculate C(b). The replacement term must perforce
come from the vocabulary of B not shared with C; the absolute chance of this is
[1 - p(b,c)]. The alternative to the replacement being from B is that it is from D
(which has chance [1 - p(c,d)]), and to be clearly from B it cannot be in the
common vocabulary of B and D (which has chance p(b,d)). Parallel reasoning for
C(d) gives the pair of formulae:
$$C(b) = \frac{1 - p(b,c)}{[1 - p(b,c)] + [1 - p(c,d)] - p(b,d)}$$
$$C(d) = \frac{1 - p(c,d)}{[1 - p(b,c)] + [1 - p(c,d)] - p(b,d)}$$
Consider the examples investigated by Dixon, and the effect of assuming p(b,d) is
20% rather than zero:

(1) p(b,c)   (2) p(c,d)   (3) p(b,d)   (4) C(b)   (5) C(d)
0.2          0.5          0            0.62       0.38
0.7          0.5          0            0.37       0.63
0.2          0.5          0.2          0.73       0.45
0.7          0.5          0.2          0.5        0.84

Note: Column 3 controls the two cases considered; columns 4 and 5 are calculated
from columns 1-3 by the above formulae.
These figures support our adoption of this particular simplification of Dixon's, as
C(b) still exceeds C(d) when p(b,c) is low, and vice versa.
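The figures in columns 4 and 5 can be reproduced directly from the two formulae. The following Python sketch is offered only as an illustration: the function names `copy_probabilities` and `estimate_p_bd` are hypothetical, and the inputs are the four cases tabulated above.

```python
def estimate_p_bd(p_bc, p_cd):
    """Product estimate for the vocabulary shared by non-neighbours B and D."""
    return p_bc * p_cd


def copy_probabilities(p_bc, p_cd, p_bd):
    """Return (C(b), C(d)) following the pair of formulae in the text.

    p_bc, p_cd, p_bd are the shared-vocabulary fractions p(b,c), p(c,d),
    and p(b,d).
    """
    denominator = (1 - p_bc) + (1 - p_cd) - p_bd
    c_b = (1 - p_bc) / denominator
    c_d = (1 - p_cd) / denominator
    return c_b, c_d


# The four cases of the table: (p(b,c), p(c,d), p(b,d)).
cases = [(0.2, 0.5, 0.0), (0.7, 0.5, 0.0), (0.2, 0.5, 0.2), (0.7, 0.5, 0.2)]

for p_bc, p_cd, p_bd in cases:
    c_b, c_d = copy_probabilities(p_bc, p_cd, p_bd)
    print(f"p(b,c)={p_bc} p(c,d)={p_cd} p(b,d)={p_bd} "
          f"C(b)={c_b:.3f} C(d)={c_d:.3f}")

# Product estimates of p(b,d) for the non-equilibrium cases in the text.
print(estimate_p_bd(0.7, 0.5), estimate_p_bd(0.2, 0.5))
```

The computed values agree with the tabulated ones to within rounding in the second decimal place.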
3.1.2. Reduced Borrowing. There is another effect on the equilibrium proportion of
cognates, to which Dixon's model is quite sensitive. Dixon's model assumes that a
word is always replaced by borrowing (rather than by resources within the language;
see the possibilities listed in Section 1), whereas it is more realistic to assume
that borrowing is resorted to in fraction L of cases (with L < 1, whereas Dixon's
model has L = 1; our L is Sankoff's $\theta$). Keeping Dixon's other assumptions, we can
deduce that the equilibrium cognate proportion is L/2 (consistent with Dixon's
50%). In other words, the equilibrium rate is half of the fraction of lexical replacement
that is accomplished by borrowing.
We argue that we cannot apply the '40%-60% equilibrium' rule of thumb if L is
significantly less than 0.8, that is, if borrowing is not the highly dominant replacement
strategy. (On the basis of the data from some Cape York languages considered
elsewhere in this paper, it appears that L may be somewhere between 0.25 and 0.5.
Embleton (1981: 112, 1986: 79) modelled situations with Lr [her b] ranging from
0 to 0.3.)
The model with L < 1 proceeds as follows: Let p'(b,c) be the fraction of B's
vocabulary that is also in language C after the lapse of a fixed period of time in which
a fraction r of the vocabulary of each language has been replaced. Since the lexicons
are assumed to be equal in size (assumption (i)), p'(b,c) = p'(c,b), etc. Now,
Dixon's (1972: 332, with p(b), etc. assumed to be 0.5) model may be recast as the
statement that:
$$p'(b,c) = p(b,c) - r\,p(b,c) - r\,p(b,c) + r/2 + r/2 = p(b,c) + r\,(1 - 2p(b,c))$$
Here the new shared fraction p'(b,c) equals, term by term: the old shared fraction,
minus the fraction lost by C, minus the fraction lost by B, plus the fraction
borrowed by C from B, plus the fraction borrowed by B from C.
Equilibrium, at which p'(b,c) = p(b,c), obtains when p(b,c) = 1/2, as Dixon shows.
18
18
This calculation appears to assume that no word borrowed by C can be identical to a word of B
unless it was borrowed from B (and similarly for B); in other words, that 50% of the borrowed
vocabulary of each language happens to be identical to vocabulary in the other language whether or
not the other language is the proximate source of the borrowing. Further, as Paul Black (p.c., April
1981) brought to our attention, the calculation also appears to assume that B and C do not lose the
same words. If word replacement is independent in the two languages, and also independent of
whether or not the word is shared, the term $2r\,p(b,c)$ is too large, and should be reduced by the
proportion of common replaced vocabulary, $r^{2}\,p(b,c)$. By the same token, the fractions B or C appear
to have borrowed from each other are less than r/2, as some items borrowed by B from C (say) may
be lost by C before the end of the time period under consideration. Our tentative understanding is
that the amendments effectively cancel each other out and the equation above is not affected.
Now, consider the modification of the above equation when a borrowing proportion
19
L is incorporated:
$$\begin{aligned}
p'(b,c) &= p(b,c) - r\,p(b,c) - r\,p(b,c) + Lr/2 + Lr/2 \\
        &= p(b,c) - 2r\,p(b,c) + Lr \\
        &= p(b,c) + r\,(L - 2p(b,c))
\end{aligned}$$
Equilibrium, defined as before as the state in which p'(b,c) = p(b,c), occurs when
p(b,c) = L/2. And, just as in Dixon's model, we can see that if p(b,c) < L/2, then
p'(b,c) > p(b,c); conversely, if p(b,c) > L/2, p'(b,c) < p(b,c); in other words, the
passage of time brings the cognate percentages towards the equilibrium figure of
(L/2) × 100%, that is, an equilibrium in which the contiguous languages share the
fraction L/2 of their vocabulary.
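The approach to equilibrium can be illustrated by iterating the recurrence p'(b,c) = p(b,c) + r(L - 2p(b,c)) over successive periods. The Python sketch below is a minimal illustration only: the replacement fraction r = 0.1 per period, the starting values, and the function names are arbitrary assumptions of ours, not estimates from the data.

```python
def step(p, r, loan_fraction):
    """One period of the recurrence p' = p + r * (L - 2p)."""
    return p + r * (loan_fraction - 2 * p)


def iterate(p0, r, loan_fraction, periods):
    """Apply the recurrence repeatedly, starting from shared fraction p0."""
    p = p0
    for _ in range(periods):
        p = step(p, r, loan_fraction)
    return p


# Hypothetical settings: 10% of vocabulary replaced per period, 200 periods.
r = 0.1
periods = 200

for loan_fraction in (1.0, 0.5, 0.25):
    # Start once well above and once well below the predicted equilibrium.
    from_above = iterate(0.9, r, loan_fraction, periods)
    from_below = iterate(0.05, r, loan_fraction, periods)
    print(f"L={loan_fraction}: from above {from_above:.4f}, "
          f"from below {from_below:.4f}, L/2 = {loan_fraction / 2:.4f}")
```

With L = 1 the iteration settles at Dixon's 50%; with smaller L it settles at L/2 regardless of the starting point. The choice of r affects only the speed of convergence (for r below 0.5), not the equilibrium value.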
A similar result is obtained by using the somewhat more sophisticated model of
Sankoff (1972), as refined by Embleton (1981, 1986). First, we need to clarify the
different usage of symbols for the parameters involved:
PARAMETER          ALPHER and NASH         SANKOFF, EMBLETON
cognate fraction   p(x,y)                  $S_{XY}$
replacement rate   r (including loans)     r (excluding loans)
borrowing rate     Lr                      b, $\theta(r + b)$
Note that we use r for the quantity that Sankoff and Embleton break up into their
(r + b); i.e. our L is their b/(r + b).
Now the general equation (Embleton 1981: 107, [14], 1986: 73, equation 3-11),
for a model incorporating borrowing between neighbouring languages, allows for
any number of languages, each with a different number of neighbours. If we take the
case of their equation which applies to our example of the four languages A, B, C,
D, and use our symbols for the quantities involved, the following expression results
for the time-derivative of p(b,c):
$$\begin{aligned}
&-2(r - Lr)\,p(b,c) + (Lr/2)\,[2 - 4p(b,c) + p(a,c) + p(b,d)] \\
&\quad = r\,(L - 2p(b,c)) + (Lr/2)\,[p(a,c) + p(b,d)]
\end{aligned}$$
At equilibrium, the derivative (rate of change of p(b,c) with time) is zero, and thus:
$$p(b,c) = (L/2)\,(1 + [p(a,c) + p(b,d)]/2)$$
which is approximated by our result, viz. p(b,c) = L/2, when the languages that do
not border (A and C, B and D) have relatively low cognate percentages, but even
were p(a,c) = p(b,d) = 1, at equilibrium p(b,c) would not exceed L. We can more
realistically assume that the equation for p(b,c) holds for the long-run equilibrium
p between any adjacent pair of languages along the modelled strip. We can
reasonably assume independence between the languages and so two non-neighbouring
languages separated by a third language have at equilibrium a common vocabulary
fraction $p^2$; i.e. p(a,c) = p(b,d) = $p^2$ at equilibrium. Thus at equilibrium the
19
That is, vocabulary replacements which result in shared forms between B and C, whether because
of borrowing or because of identical new formations using available cognate forms.
Figure 4. Graphs showing two models of dependency of equilibrium fraction (p) on fraction of
replacement due to borrowing (L); $p = (1 - \sqrt{1 - L^2})/L$ compared with p = L/2.
equation may be simplified to $p = (L/2)(1 + p^2)$. When solved for p, this is equivalent
to $p = (1 - \sqrt{1 - L^2})/L$. For L < 0.6, L/2 is a good approximation to equilibrium
p, as can be seen from the graph of the two quantities in Figure 4. As the graph
shows, when L = 1 (all replacement is by borrowing) this model (and Sankoff's
1973: 103) predicts p = 1 at equilibrium, namely merger of the erstwhile neighbouring
languages across the entire lexicon.
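The two curves of Figure 4 can be tabulated numerically. The Python sketch below is a minimal illustration; the function names are hypothetical, and the check simply confirms that the closed form satisfies the fixed-point equation p = (L/2)(1 + p^2).

```python
import math


def equilibrium_exact(loan_fraction):
    """Closed-form equilibrium p = (1 - sqrt(1 - L^2)) / L, for 0 < L <= 1."""
    return (1 - math.sqrt(1 - loan_fraction ** 2)) / loan_fraction


def equilibrium_approx(loan_fraction):
    """Simpler estimate p = L / 2."""
    return loan_fraction / 2


for tenth in range(1, 11):
    loan_fraction = tenth / 10
    p = equilibrium_exact(loan_fraction)
    # The closed form should satisfy p = (L/2) * (1 + p^2).
    residual = p - (loan_fraction / 2) * (1 + p * p)
    print(f"L={loan_fraction:.1f}  exact={p:.3f}  "
          f"L/2={equilibrium_approx(loan_fraction):.3f}  "
          f"residual={residual:.1e}")
```

In this tabulation the two estimates stay within a few hundredths of each other up to about L = 0.6 and diverge above that, with the exact form reaching 1 at L = 1.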
As we stated, this model is of languages with two neighbours, as along a strip.
When there are more than two neighbours of each language, the implication of the
Sankoff-Embleton general model is that equilibrium p between neighbours will be
less than in the two-neighbour model. In other words, we take equilibrium p to be
usually less than L/2, especially when L < 0.6.
We emphasize that this finding is more than a suggestion (contra Dixon (1997:
27, fn. 14) in his comment on a draft of this paper). So long as it is recognized that
not all lexical replacement is by borrowing (and Dixon has repeatedly concurred in
this judgement), the conclusion that the equilibrium rate must be less than 50% is
compelling. Furthermore, if it is accepted that estimates of the amount of resort to
borrowing that are based upon consideration of the origins of respect vocabulary are
applicable to ordinary vocabulary as well (Dixon's and ours are about 50%; see
above), then the conclusion is compelling that the equilibrium rate is not above
25%. Moreover, the other methods of estimation that we have employed suggest
that the rate can be well under 25%.
3.2. Lexicostatistical Subgrouping with Loans Taken into Account: the Southwestern
Paman Case
It is instructive to perform an exercise in lexicostatistical subgrouping with an eye
towards the effects of recognizing or not recognizing 'local' forms among the 'sames'.
This we do below for the southwestern Cape York languages studied above (Section
2.2.2), with the addition of Kok-Narr (Breen 1972, 1976a,b). These are languages
for which we possess the relevant data on word distribution, and whose subgrouping
can be hypothesized on independent grounds: the usual criteria of shared innovations
in morphology and phonology. The lack of fit between any of the several
possible lexicostatistically generated subgroupings of these languages and the subgrouping(s)
based on shared innovations constitutes a suitable 'worst-case' test for
lexicostatistics. To the extent that subgroupings to be presented in this section, which
are arrived at by nonlexicostatistical means, are in future refined in the direction
indicated by lexicostatistical results, our general point is strengthened.
20
The classification of these languages on the basis of shared innovations is approximately
as follows (based on Alpher 1972 and 1976, with certain modifications as
suggested by Black 1980 and p.c., and with other modifications formulated for the
first time here). A graphic version of this classification is given as Figure 3. We must
emphasize that there is very little about this classification that is not controversial to
some degree or another (the list of criterial features below is not, however, to be
taken as exhaustive). But there are no variants except the lexicostatistically generated
ones themselves that do not come into more-or-less the same degree of conflict with
the lexicostatistically generated classifications. Furthermore, the claim most relevant
to the present discussion seems fairly solid, that Kuuk-Thaayorre and Yir-Yoront do
not, lexicostatistical evidence notwithstanding, constitute a genetic subgroup.
Pama-Maric subgroup (Hale 1964, 1976a,b,c, nd a,b,c).
A. Wik languages: Wik-Mungkanh, Pakanh, and others (Hale 1976c, nd a).
B. Inland Pama: based on the imperative suffix -ng, certain vocabulary not recorded
elsewhere, like *ata- 'to look, see', and on highly complex vowel inventories
(whose elaboration may or may not post-date the language split).
21
1. Ogo-Nyjan, e.g. atvng 'look (imp)'.
2. Aghu-Laya (Kuku-Thaypan, Aghu-Tharnggala), e.g. tang 'look (imp)'.
20
Studies relating lexicostatistically determined subgroupings to those determined by other means
have been done since the beginning of lexicostatistics in the 1950s. We do not review these here,
but we call readers' attention to two recent studies of Indo-European language classification which
have obtained excellent degrees of fit between subgroupings made according to the two different
methods: Dyen et al. (1992) and Embleton (1986).
21
Group B (Inland Pama) is problematic because Olgol (of Chillagoe; Philip Hamilton, p.c., 1996),
which is in the area but not in this putative group, attests an imperative in -ng and because there is
attestation elsewhere in this area (Yir-Yoront for one; see Alpher 1991: 17) of the addition of
word-final ng as a strictly phonological process. This grouping does however receive some
lexicostatistical support (from Hale 1961 and nd a); it is the same as the 'Southern Pama' of O'Grady
et al. (1966: 54).
C. Southwest Pama: based on the past perfective suffix -l in the intransitive
conjugation(s) and in contrast with a past imperfective tense-aspect.
22
1. Norman Pama (Kurrtjar, Kuthant, and Walangama; see Black 1980 and nd b).
2. RR languages: based on a past perfective suffix -rr in the transitive conjugation(s)
and in contrast with a past imperfective tense-aspect (see notes 24 and 26 and
remarks below on -rr).
a. M languages: past perfective tense-form of 'to see' a reflex of *nhaawa + l;
common retention of a past imperfective suffix -(n)m in the transitive (L)
conjugation in contrast with a past perfective tense-aspect.
23
i. Kuuk-Thaayorre (past perfective of 'see' nhaaw + r; r in this context descends
regularly from *l) and Kuuk-Yak; both languages with past imperfective -m.
24
ii. Uw-Oykangand (past perfective of 'see' ewa + l) and Olkola past imperfective in
-nm (Alpher's notes, Sommer 1969 and 1972, and p.c. from P. Hamilton).
3. NT languages: past perfective suffix -nt, possibly an innovation consisting in the
addition of -t to the existing (conservative) *-n.
25
a. Yir-Yoront and Yirrk-Thangalkl: loss of the nasal in a homorganic nasal-stop
cluster.
b. Koko-Bera and Koko-Babongk (Alpher's notes, Black nd a).
22
Group C (Southwestern Pama; proposed by Alpher 1972 and with the addition of Norman Pama
as suggested by Black 1980: 1930) is problematic. See also the remarks on Kok-Narr in notes 23,
24, and 26.
23
Group C.2.a (Kuuk-Thaayorre and Uw-Oykangand, etc.) is among the least compelling of these
postulated subgroups. Their common past imperfective in -(n)m now appears (despite Alpher 1972:
80; cf. Alpher 1990: 157) to be a common retention rather than an innovation; but in this regard
note that Kok-Narr imperfectives in -nh and -ny are also conservative, and hence their relationship
with the putatively conservative *-(n)m is problematic.
24
Grouping C.2.a.i (Kuuk-Thaayorre and Kuuk-Yak) is problematic because tense-paradigm data
for the latter are almost unavailable and what data there are do not reflect some of the criterial
innovations. The past tense-forms of 'see', 'fall', 'grow', and 'bite' in Kuuk-Yak are, respectively,
nhakvn (KTh nhaawr), wontvn (KTh wontr or wantr, Koko-Bera wantal), piinhthvn (KTh piinhthitr),
and pathvn (KTh patharr). The complete absence from available Kuuk-Yak data (Alpher's notes; Hall
1968) of past perfectives in -rr (an innovation in this area) or -r (< *-l; conservative in this area) in
favour of -n (which is conservative in Pama-Nyungan, although not in the paradigms of 'see' and
'fall'; see Alpher 1990: 162-164; it apparently does not continue at all in KTh) would be difficult,
though not impossible, to explain away. Alternatively, it could force us to recognize a spectacular
case of massive borrowing, bringing a distantly-related language into dialect-like similarity with
Kuuk-Thaayorre. In support of the latter possibility are the inflated numbers of 'local' words in
Kuuk-Thaayorre and Kuuk-Yak (see Table 1).
25
This grouping (Yir-Yoront and Koko-Bera) is a matter of disagreement; cf. Alpher (1972: 76-78)
on the one hand and Black (1980: 193) on the other. Note, however, that the arguments given in
Alpher (1972) include not just (i) the *-nt past ending but also (ii) the existence of 'strong' versus
'weak' verb conjugations (the former with disyllabic forms of the root alternating with disyllabic ones)
with corresponding cognate verbs and (iii) lowering of sequences of two high vowels to mid in
corresponding contexts. See Alpher (1972) for details and more explication; the author stands by
his conclusions.
4. Other: Kok-Narr: past perfective in -rr contrasting with conservative past
imperfective in -nh or -ny.
26
The groupings that are nonproblematic here are C.1 (Norman Pama; fide Black
1980: 191-194), C.2.a.ii (Uw-Oykangand and Olgol; dialects of a single language),
C.3.a (Yir-Yoront and Yirrk-Thangalkl; dialects), and C.3.b (Koko-Bera and
Koko-Babongk; dialects). With regard to problems with various less-compelling
subgroupings, see the appropriate notes. Most problematic of all are groups C.2.a
(Kuuk-Thaayorre and Uw-Oykangand; see note 23) and C.2 (see remarks on -rr
below).
The Pama-Maric grouping itself has never received published justification. We
take the composition of Pama-Maric, for the purposes of the argument, to be the
same as that listed in O'Grady et al. (1966: 51-54), with the exception (exclusion)
of the 'Yara' subgroup (Dyirbal, Warrgamay, Nyawaygi, and other languages of the
southern rainforest area), and with 'Gulf Pama' defined as 'Norman Pama' as above
(group C.1). The positions of Kukatj and the Mayi languages, spoken to the
southwest of Normanton, remain problematic but do not affect present conclusions.
Subgroups A-C of Pama-Maric as postulated here are three out of an estimated
10-25 subgroups within this larger group. The exact number and composition of
these subgroups have not been the subject of an exhaustive piece of research, but we
are assuming that it will turn out to be of this general order of magnitude; the
number is of relevance in our working definition of 'local words' (Section 2.2.2).
The *-l and *-rr past perfective endings on the basis of which groups C and C.2
(respectively) are hypothesized have apparent cognates outside their respective
groups. There is an -l imperfective in Kukatj contrasting with perfectives in -n and
-nh (Breen 1976a: 159); there is a -la, designated pluperfect and in contrast with
pasts in -na and -nya, in Umpila (O'Grady 1976: 193, 199); and there is a -la past
in the Maric languages Warungu (Tsunoda 1974), Bidjara (Breen 1973), and
Margany-Gunya (Breen 1981a). By the same token, Umbuygamu (Sommer 1976:
21) contrasts a tense in -rr designated pluperfect with a past in -n; note also the
problems with Kuk-Narr -rr mentioned above (and it is quite possible that *-rr is the
conservative ending in languages of this area, in contrast to an innovative *-nt). But
the criterial value of *-l and *-rr for distinguishing C and C.2 as subgroups lies in
their being conjugation-specific and having past perfective value in contrast with
past imperfective.
Although it has been claimed (Merlan 1982: xii-xiii; Thomason & Kaufman
1988/1991: 18 and Section 2.1 passim) that verb paradigms, and irregular ones at
that, are not immune from borrowing between fairly distantly-related languages, we
26
Logically, this belongs with the 'RR' languages, but there are problems with the interpretation of
the -rr endings in the several languages in which they occur (see below), and for that matter there
is a problem in the apparent absence of a past perfective in *-l in Kok-Narr; the past perfective -ng
is an apparent innovation which is assumed to have replaced it. Kok-Narr appears to group with the
Southwest Paman languages on other grounds, but it is here placed with them in a 'flat' grouping
as a matter of caution.
WM KTh KYak UO YY YTh KP KN ON
Figure 5. Subtree from Figure 3, with languages of Table 3.
believe that these particular suffix distributions are not susceptible of this
interpretation. Take as an example the Yir-Yoront past perfective in the L conjugation, which
is zero accompanied by vowel ablaut in certain verbs, like puy 'bit' (nonpast pay + l).
This can be shown (see Alpher 1972, 1989, 1990, 1991: 119) to have replaced an
earlier *paya + r, itself cognate with Yirrk-Thangalkl payd + l and Koko-Bera
pathent, both of which continue Proto-Pama-Nyungan *patja 'bite' and the past
perfective ending *-nt(V). But the modern reflexes of *-nt(V) are phonetically
sufficiently different to make borrowing a highly unlikely explanation. Note that
social closeness including a Sprachbund relationship, whose importance in the
borrowing of morphological systems is stressed by Thomason and Kaufman (1988/
1991: 13-34), is attested between Yir-Yoront and Kuuk-Thaayorre (Sharp 1934,
1958) but not between Yir-Yoront and Koko-Bera.
We now pass to the classifications suggested by lexicostatistics. The material for
lexicostatistical subgrouping is the numerical information contained in Table 3.
The 'correct' subgrouping of these nine languages, based on the 'Pama-Maric
subgroup' as detailed above, is as shown in Figure 5 (a subset of the tree shown in
Figure 3).
We applied lexicostatistical methods, such as the n-way splitting algorithm
applied to unrecomputed linear-correlation coefficients favoured by Guy (1980a:
38), to the data of Table 3 to produce subgrouping trees. A subgrouping produced
by this algorithm and based on a count of all lexical 'sames' in two-language lists,
regardless of source (i.e. derived from the figures in the right-hand column of
Table 3), yields the classification shown in Figure 6.
A subgrouping made by the same algorithm but based on a count which excludes
putative borrowings (derived in this case from the figures in the middle column
['old' + 'subgroup'] in Table 3) yields a slightly different grouping (Figure 7).
From both calculations there emerges a close relationship between Yir-Yoront
and Yirrk-Thangalkl on the one hand and Kuuk-Thaayorre and Kuuk-Yak on the
Table 3. Fraction cognate. In each cell, left column: per cent of sames retained in the OLD category; middle column: per cent of sames retained in the OLD and
SUBGROUP categories; right column: per cent of sames in list regardless of source

      List  KTh       KY        YY        YTh       KB        UO        ON        KN
WM    100   30 30 40  27 27 37  27 27 34  28 28 33  25 25 25  25 25 27  19 19 22  19 19 19
      120   29 29 41  27 27 38  25 25 34  26 26 33  24 24 24  21 21 26  19 19 22  20 20 21
      151   25 25 38  24 24 36  22 22 31  24 24 31  22 22 22  21 21 24  16 16 19  17 17 19
KTh   100             37 45 80  33 33 54  32 32 51  25 25 28  30 30 34  18 18 21  19 19 20
      120             36 45 78  30 30 50  30 30 48  24 24 27  28 28 31  18 18 21  19 19 21
      151             34 46 81  26 26 46  27 27 48  22 22 25  25 25 29  16 16 19  18 18 20
KY    100                       31 31 54  32 32 54  22 22 25  30 30 33  18 18 21  17 17 18
      120                       29 29 51  30 30 52  22 22 25  28 28 32  18 18 22  18 18 20
      151                       26 26 50  28 28 51  21 21 24  26 26 30  16 16 20  17 17 10
YY    100                                 44 63 96  26 28 32  29 29 34  23 23 27  20 20 22
      120                                 41 61 92  24 25 29  26 26 31  21 21 26  19 19 22
      151                                 35 57 89  20 22 26  23 23 27  18 18 21  16 16 18
YTh   100                                           26 27 33  29 29 35  22 22 28  19 19 21
      120                                           25 26 31  27 27 33  22 22 28  19 19 22
      151                                           21 23 28  24 24 30  19 19 23  16 16 20
KB    100                                                     25 25 28  20 20 24  24 24 38
      120                                                     24 24 27  19 19 23  23 23 35
      151                                                     23 23 29  18 18 22  21 21 34
UO    100                                                               19 19 36  17 17 17
      120                                                               19 19 34  17 17 18
      151                                                               18 18 33  17 17 18
ON    100                                                                         12 12 13
      120                                                                         13 13 15
      151                                                                         13 13 16
WM KTh KYak YY YTh KP UO ON KN
Figure 6. Subgrouping based on a count of all lexical "sames" in two-language lists, regardless of
source (i.e. derived from the figures in the right-hand column of Table 3).
other, a relationship which finds no support in the comparison of verb paradigms.
From neither calculation does the close relationship between Yir-Yoront and Yirrk-
Thangalkl on the one hand and Koko-Bera on the other emerge. This linkage of
YY-YTh with KTh-KYak (with the exclusion of KB) is in fact the subgrouping of
WM KTh KYak YY YTh KP UO ON KN
Figure 7. Subgrouping based on a count that excludes putative borrowings.
these three sets of languages that was postulated in Sommer (1969) and again
in Wurm (1972). Exclusion of putative borrowings (the second calculation) has
a few desirable effects, however: it eliminates pseudo-subgroups linking
Wik-Mungkan with Kuuk-Thaayorre and other languages to the south, and
linking Uw-Oykangand with Ogo-Nyjan; and it (correctly, as we believe) suggests
that Koko-Bera and Uw-Oykangand group with Kuuk-Thaayorre, Kuuk-Yak,
Yir-Yoront, and Yirrk-Thangalkl but not with Wik-Mungkan.
Taken as a whole, none the less, neither of these lexicostatistically-derived
subgroupings is conspicuously successful. With regard to inferences made from the
data in Table 3, it is of course possible to use algorithms other than the ones (Guy's
LXSTAT and SIMULA) used to produce the trees shown above. We were able to
obtain cluster analyses of the nine lists of percentages shown in Table 3 (SPSS-X
Release 3.1, run 17 May 1989 on an IBM 3081KX computer); some of the resulting
trees are shown as Table 4.
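As an illustration of the kind of cluster analysis referred to here, the following Python sketch applies a generic average-linkage (between-groups) procedure to the 120-word 'old' + 'subgroup' percentages of Table 3. It is not the SPSS-X routine: the names LANGS, ROWS, build_sim and average_linkage are illustrative, ties are broken simply by list order, and the resulting tree need not coincide with any dendrogram in Table 4.

```python
from itertools import combinations

LANGS = ["WM", "KTh", "KYak", "YY", "YTh", "KB", "UO", "ON", "KN"]

# Upper triangle of Table 3, 120-word lists, middle ('old' + 'subgroup') column.
# Each row gives one language against all languages that follow it in LANGS.
ROWS = [
    [29, 27, 25, 26, 24, 21, 19, 20],  # WM
    [45, 30, 30, 24, 28, 18, 19],      # KTh
    [29, 30, 22, 28, 18, 18],          # KYak
    [61, 25, 26, 21, 19],              # YY
    [26, 27, 22, 19],                  # YTh
    [24, 19, 23],                      # KB
    [19, 17],                          # UO
    [13],                              # ON
]


def build_sim():
    """Map each unordered language pair to its percentage of 'sames'."""
    sim = {}
    for i, row in enumerate(ROWS):
        for j, pct in enumerate(row, start=i + 1):
            sim[frozenset((LANGS[i], LANGS[j]))] = pct
    return sim


def mean_similarity(members_a, members_b, sim):
    """Average pairwise similarity between two clusters (average linkage)."""
    total = sum(sim[frozenset((a, b))] for a in members_a for b in members_b)
    return total / (len(members_a) * len(members_b))


def average_linkage(langs, sim):
    """Repeatedly merge the two clusters with the highest mean similarity."""
    clusters = [(lang, [lang]) for lang in langs]  # (subtree, member list)
    merges = []
    while len(clusters) > 1:
        first, second = max(
            combinations(clusters, 2),
            key=lambda pair: mean_similarity(pair[0][1], pair[1][1], sim),
        )
        clusters.remove(first)
        clusters.remove(second)
        clusters.append(((first[0], second[0]), first[1] + second[1]))
        merges.append(first[1] + second[1])
    return clusters[0][0], merges


tree, merges = average_linkage(LANGS, build_sim())
print(tree)
for step, members in enumerate(merges, start=1):
    print(step, members)
```

The merge list records the order in which clusters are joined; the mean similarity at each merge would supply the branch lengths that a full dendrogram displays.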
With nine languages, there are 84 possible sets that consist of three languages
each, and within each of these sets the three languages have some Stammbaum
relationship. Each such group of three is susceptible of four different subgroupings:
(AB)C, A(BC), B(AC), or the 'flat' (ABC). We compared the lexicostatistically
generated subgrouping (calculated here on the 120-word lists), for each of these
triplets, with the subgrouping hypothesized on the basis of shared innovations (by
assumption the 'correct' one), and scored the match as correct or incorrect. The
resulting 'fidelity' scores, expressed as the number 'correct' out of 84 (by definition,
84 for the 'correct' subgrouping),
27
are 32 and 45, respectively, for the 'all sames'
and 'old + subgroup' groupings produced by Guy's algorithm (above) and are as
given in Table 4 for those produced by cluster analysis from the 120-word lists. The
highest score is the 45 (54%) for 'old + subgroup' (Table 4C) by cluster analysis,
identical to the 45 (54%) obtained with Guy's algorithm for the same data.
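The triad comparison just described can be sketched in Python as follows. The tree CORRECT is one reading of the shared-innovation classification given above (Wik, Inland Pama, and Southwest Pama as three coordinate branches, with Kok-Narr placed 'flat' within Southwest Pama); the tree AREAL is purely hypothetical, invented for comparison, and is not a transcription of any tree in Table 4. All function names are illustrative.

```python
from itertools import combinations

# One reading of the 'correct' subgrouping for the nine sample languages.
CORRECT = (
    "WM",
    "ON",
    ((("KTh", "KYak"), "UO"), (("YY", "YTh"), "KB"), "KN"),
)

# Hypothetical north/south tree, for illustration only.
AREAL = (
    ("WM", (("YY", "YTh"), ("KTh", "KYak"))),
    (("UO", "ON"), ("KB", "KN")),
)


def leaves(tree):
    """Return the set of language labels under a (nested tuple) tree."""
    if isinstance(tree, str):
        return {tree}
    found = set()
    for child in tree:
        found |= leaves(child)
    return found


def triad_grouping(tree, triad):
    """Return the pair grouped first within the triad, or None if 'flat'."""
    wanted = set(triad)
    node = tree
    while True:
        # Descend for as long as a single child still contains all three.
        for child in node:
            if not isinstance(child, str) and wanted <= leaves(child):
                node = child
                break
        else:
            break
    for child in node:
        inside = wanted & leaves(child)
        if len(inside) == 2:
            return frozenset(inside)
    return None


def fidelity(reference, candidate):
    """Count the triads on which two trees imply the same subgrouping."""
    langs = sorted(leaves(reference))
    return sum(
        triad_grouping(reference, t) == triad_grouping(candidate, t)
        for t in combinations(langs, 3)
    )


print(len(list(combinations(sorted(leaves(CORRECT)), 3))))  # number of triads
print(fidelity(CORRECT, CORRECT))
print(fidelity(CORRECT, AREAL))
```

Because a strictly binary tree never yields a 'flat' triad, any such tree is penalized on every triad that the reference tree treats as flat.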
If, however, Wik-Mungkan is removed from consideration, the apparent relative
merits of the methods change considerably. Out of 56 possible, the scores for the
two subgroupings obtained by Guy's algorithm and the three in Table 4A-C are,
respectively, 19, 29, 18, 31, and 40 (71%). This last is significantly better than 29
(52%), the highest obtained with the other algorithm. And it is reasonable to
exclude Wik-Mungkan, on the grounds that it is a dialectally complex language and
its word list has been assembled from different sources, and that furthermore more
language territories intervene between it and its nearest neighbour in the sample
(KTh) than between any other 'adjacent' members of the sample. The exclusion of
Wik-Mungkan also results in an improvement of fidelity scores calculated by a
second method, which is described below.
For a statistical method applied to a known 'worst case', 71% is perhaps not too
bad. Note however that this is by virtue of considerable manipulation; the method
is hardly a 'rough and ready' one. The scores obtained with both algorithms show
27
This is itself a rough-and-ready measure of the similarity of trees, highly impracticable beyond
seven languages without a computer. For a more sophisticated measure of tree similarity, which takes
into account genetic distance as well as tree topology, see Embleton (1986: 80-93).
Table 4. Subgroupings generated by cluster analysis: dendrograms using average
linkage between groups.
a
The 'fidelity' to the presumed 'correct' subgrouping is
expressed as the number of triads 'correct' (out of a possible 84)
A. 120-word list, all sames (right columns of Table 3); fidelity = 28
   Order of leaves: YY, YTh, KTh, KY, WM, KB, KN, UO, ON
B. 120-word list, 'old' only (left columns); fidelity = 41
   Order of leaves: YY, YTh, KTh, KY, UO, WM, KB, ON, KN
C. 120-word list, 'old' and 'subgroup' (middle columns); fidelity = 45
   Order of leaves: YY, YTh, KTh, KY, UO, WM, KB, KN, ON
D. 151-word list, 'old' and 'subgroup' (middle columns)
   Order of leaves: YY, YTh, KTh, KY, UO, WM, KB, KN, ON
[Dendrogram branching not recoverable from this copy; only the order of the leaves survives.]
a
The program produced dendrograms of three other types as well: (i) average linkage
within groups (generally less felicitous than the others); (ii) single linkage; and (iii)
complete linkage. The representations here do not show an important part of these
dendrograms, the genetic distance (as shown by length of tree branches).
Table 5. 120-word list, per cent of 'sames' in the OLD and SUBGROUP
categories (subsets of Table 3: the figures in the centre of each cell)
a. YIR-YORONT 61 25
YIRR-THANGALKL 26
KOKO-BERA
'Correct' grouping: (YY YTh) KB.
Score: Correct; statistics decisive.
b. KUUK-THAAYORRE 30 24
YIR-YORONT 25
KOKO-BERA
'Correct' grouping: KTh (YY KB).
Score: Incorrect; statistics moderately decisive in their incorrectness.
c. WIK-MUNGKAN 19 20
OGO-NYJAN 13
KUK-NARR
'Correct' grouping: (WMng ON KN).
Score: Correct; statistics given the benefit of the doubt.
significant improvement with an increase of stringency of conditions, and therefore
of labour-intensiveness, for counting a pair as 'cognate'.
But in this gradient lies another point of interest (and here, the discrepancies
between the lists appear more orderly under cluster analysis than they are under
Guy's algorithms). The subgrouping based on data collected under the least
stringent conditions (Table 4A) is patently 'areal' in nature: KB, KN, UO, and ON are
southern languages and the rest are northern. Within the southern group, UO and
ON are inland and KB and KN are coastal. Within the northern group, the major
split is between WM, which is inland, and the rest, whose territories are coastal or
estuarial.
28
A point should be noted with regard to the 151-word subgrouping
(Table 4D): although its placement of Wik-Mungkan (not as a sister of KTh-KYak-
UO but as a sister of YY-YTh as well as KTh-KYak-UO in a three-way split) is
marginally better than the one based on the comparable 120-word list (is less
'areal'), it is known in general (Black 1997; Embleton 1986: 53 and references) that
beyond some (as-yet-undetermined) maximum longer word lists are more likely to
show an 'areal' effect than shorter ones (other things being equal).
One virtue of all the algorithmic methods whose results are shown above and in
Section 3.2.2 is that they adjust statistical anomalies among various triads as judged
one-by-one for a 'best fit'. We did, however, attempt a test of fidelity to the 'correct'
subgrouping by comparison of the 'correct' grouping of each triad with the grouping
as read directly from the percentages in a matrix for each set of three. In judgements of
correctness (in the absence of a convincing test of statistical significance), the
following rules of thumb were taken as indicating a 'flat' grouping: (i) if the total
point spread was 10 or less and there was no pattern of two low (and similar) and
28
These areal dichotomies are in part, but only in part, artifacts of the criteria for local and old words
set forth in Section 2.2, notably the west coastal versus inland criterion (a).
Table 6. 'Sames' among Yir-Yoront, Yidiny, and
Dyirbal. Hale's 100-word list; no weeding out of
putative borrowings; all figures are percentages
YIR-YORONT 17 15
YIDINY 18
DYIRBAL
one high, and (ii) if two scores were high and one low (one language appears to be
very closely related to each of two others, but these two others do not appear to
be closely related between themselves). Although this procedure is in principle
constructible as an algorithm, in actual practice in the absence of a computerized
routine there is a vulnerability to subjective judgements, and the data below should
be read with this problem in mind. Table 5 displays some examples of the individual
judgements.
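The rules of thumb just stated can be approximated by a small routine such as the following Python sketch. The function name read_triad and, in particular, the tolerance tol (how close two percentages must be to count as 'similar') are assumptions introduced for illustration; the text gives only the 10-point spread, so the judgements actually made by hand may differ in borderline cases.

```python
def read_triad(ab, ac, bc, tol=2, max_flat_spread=10):
    """Read a subgrouping off the three pairwise percentages of a triad.

    Returns "AB", "AC" or "BC" for the pair grouped together, or "flat".
    tol is an assumed threshold for treating two scores as similar.
    """
    scores = {"AB": ab, "AC": ac, "BC": bc}
    ranked = sorted(scores.items(), key=lambda item: item[1])
    (_, low), (_, mid), (top_pair, high) = ranked

    two_high_one_low = (high - mid) <= tol and (mid - low) > tol
    two_low_one_high = (mid - low) <= tol and (high - mid) > tol

    # Rule (ii): one language equally close to two others that are not
    # close to each other is read as a flat grouping.
    if two_high_one_low:
        return "flat"
    # Rule (i): a small total spread with no clear 'two low, one high'
    # pattern is also read as flat.
    if (high - low) <= max_flat_spread and not two_low_one_high:
        return "flat"
    return top_pair


# The three examples of Table 5 (A, B, C in the order listed there).
print(read_triad(61, 25, 26))  # YY, YTh, KB
print(read_triad(30, 24, 25))  # KTh, YY, KB
print(read_triad(19, 20, 13))  # WM, ON, KN
```

With the assumed tolerance of 2, the three calls reproduce the readings reported in Table 5: YY with YTh, KTh with YY (the 'incorrect' reading), and a flat grouping.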
The results of the survey of all 84 such triplets judged against the trees computed
with Guy's algorithm are as follows:
(i) Comparison counting all 'sames' regardless of origin (the right-hand column of
Table 3): 32 of 84 (38%; 20/56 or 36% if Wik-Mungkan is excluded) correct if
Kuk-Narr is classed with Southwest Pama as above; 38/84 correct if Kuk-Narr
is assumed to comprise a separate subgroup within Pama-Maric; 38/84 correct
if Kuk-Narr is assumed to comprise a separate subgroup within Pama-Maric
and if Kuuk-Thaayorre and Uw-Oykangand are assumed not to constitute a
subgroup.
(ii) Comparison counting 'sames' from which putative borrowings have been
eliminated (the centre column of Table 3): 40 of 84 (47%; 28/56 or 50% if
Wik-Mungkan is excluded) correct if Kuk-Narr is classed with Southwest Pama
as above; 54/84 correct if Kuk-Narr is assumed to comprise a separate subgroup
within Pama-Maric; 58/84 correct if Kuk-Narr is assumed to comprise a
separate subgroup within Pama-Maric and if Kuuk-Thaayorre and Uw-Oykangand
are assumed not to constitute a subgroup. The greatest success rate among
all these options is 58/84 or 69%, not an encouraging score in light of the
number of 'correct' subgrouping assumptions that had to be dropped.
The corresponding calculations, run against the findings by cluster analysis shown
in Table 4, are (i) for the 'all sames' column of Table 3, 31/84 (37%; 20/56 or 36%
if Wik-Mungkan is excluded), and (ii) for the 'old' + 'subgroup' column, 43/84
(51%; 38/56 or 68% if Wik-Mungkan is excluded).
Let us have a brief look at where the method succeeds and where it fails. It succeeds where
two of the three languages compared are related as close or distant dialects and the
third is not. It fails at the intermediate level where groups hypothesized on other
grounds tend to be controversial; to be sure, one of the reasons why these hypothesized
groupings are controversial is that they are in conflict with lexicostatistical
results. The method also fails in at least certain cases where 'sames' are counted
across major genetic boundaries; here too the possible statistical anomalies invited
by the use of individual triads of languages are a problem, but the comparisons can
none the less be of interest. Failures of this kind have been observed and commented
on for some time (for example, Embleton 1986: 94-95).
Consider the comparison displayed in Table 6 among Yir-Yoront, Yidiny, and
Dyirbal, for which grammatical considerations
29
suggest that the first two are related
to each other more closely than either is to Dyirbal, although the last two are spoken
in neighbouring countries, and numerous intervening languages and great distance
separate both from Yir-Yoront.
There is no way to read statistical significance into the difference between 17%
(Yir-Yoront/Yidiny) and 15% (Yir-Yoront/Dyirbal). All of these shared items are
(almost by definition) in the 'old' category.
Two facts of interest emerge here. The first is that what Yir-Yoront shares with
Yidiny does not significantly exceed what it shares with the putatively less closely-
related Dyirbal. For this fact, no obvious interpretation suggests itself other than
a reevaluation of the degree of relationship of Dyirbal to the Paman languages.
The second fact of interest that emerges from this comparison is that 17 of the 18
items shared by Yidiny and Dyirbal are in the 'old' category. They are reflexes of
*tirra 'tooth', *jarra 'thigh', *pungku 'knee', *jina 'foot', *jana 'stand', *nyina 'sit',
*paja 'bite', *pa(r)na 'water', *kuta(ka) 'dog', *yuku 'tree', *kungkarr 'north',
*wanyu 'who', *wanyja 'where', *ngayu/*ngaju 'I', *nyuntu 'you (sg)', *ngali 'we
(2 incl)', and *nyupul(a) 'you (2)'. To be sure, a few of these 'old' pairs, such as
the first and second person singular pronouns (Yidiny ngaya, Dyirbal ngaja 'I';
Yidiny nyuntu, Dyirbal nginda 'you') count as 'cognate' by virtue of certain
arbitrary assumptions, although they in fact contain noncognate morphology and
their sameness cannot in all probability be a matter of borrowing. But most of
these 'old' pairs do not differ from each other enough to rule out a claim that they
are loanwords in one or both of the languages. Only one 'local' word is common
to both lists: kupu 'leaf'.
The implications are either (i) that at least some of these 17 'sames' are only
'pseudo-old' (see Section 2.2.2), that is, that they do in this case result from borrowing,
evidence for their very wide distribution notwithstanding, or (ii) that there is a
process at work in situations of prolonged language contact that works against
differential loss. This latter possibility is an attractive one in accounting for the
anomalously high rate of sharing between Yir-Yoront and Kuuk-Thaayorre. One
might hypothesize it to work as follows, for languages A and B which belong to
29
Here are a couple of examples. The Dyirbal dialects Jirrbal and Mamu have a future-tense ending
-ny that has apparent cognates in coastal languages to the south but not in ('Paman') languages to
the north and west; and a third dialect of Dyirbal, Girramay, has a future ending -djay with no
apparent cognates in languages to the north and west (Dixon 1972: 55, 1977: 207). The Dyirbal
dialects have an elaborated set of demonstratives differentiated for four noun classes; the absolutive
case-forms of these have no apparent cognates in languages to the north and west (Dixon 1972:
44-47, 1977: 186-194). Of course, more evidence is needed, especially of innovations in Paman,
but we are operating here on the basis of Dixon's conclusion: 'Dyirbal and Yidiny are totally
different languages. Although both belong to the Pama-Nyungan subgroup of the Australian family,
in their grammars they are as different as any two Pama-Nyungan languages.'
Table 7. 'Sames' among Yir-Yoront, Margany, and Wankumara
(Galali). Hale's 100-word list; no weeding out of putative
borrowings; all figures are percentages
YIR-YORONT 16 8
MARGANY 18
WANKUMARA (GALALI)
adjoining countries, whose speakers regularly intermarry, and for which there are
numerous bilinguals. Languages A and B share form X; an event, possibly a death,
occurs in group A such that X will not be spoken. X is differentially avoided in
groups A and B (it is used more frequently in B, in certain social contexts). Speakers
of A who are fully bilingual in A and B never forget X, think of it as a word that is
the same in both languages, and reinstate it at the end of the mourning period
(something like this, with a standard language like High German or Latin as the
basis of continuity, is discussed under the term 'prevented loss' by Embleton (1986:
102, 140, 146); Bergsland and Vogt's discussion (1962: 128-129) of the lowered
retention rates for one of the various eastern dialects of Eskimo (all groups of whom
practise death-taboo) that was spoken and transmitted in isolation from the others
would appear to support this reasoning). Alternatively, X is replaced in both
languages with a similar calque, with cognate words semantic-shifted in the same
way, or with the same form borrowed from a particular third language, the latter
case giving rise to an inflated count of 'local' words.
It is clear that problems of this kind have been a serious and explicit concern of
glottochronologists from the beginning. Here is Swadesh (1950: 159-160):
... if two languages have been in contact for all or part of the time that has
elapsed since their common period, they will have exerted an influence on
each other. If one of the two languages displaces a word of the original
stock, the second language may imitate the displacement or it may eventu-
ally cause the first language to return to the original form. Since these
influences may be either in the direction of promoting or of retarding
change, the trends may cancel each other out. The total percentage of
change may be the same as in the case of a single language out of contact
with related languages, but the two languages will tend to stay together. In
consequence they will diverge from each other less than the maximum
probable amount under the formula as we have given it.
As a second case of three languages separated by either great geographical
distance or a major genetic boundary, consider the relationship of Yir-Yoront, its
southernmost relative in the Pama-Maric family Margany, and Margany's distantly-
related southern neighbour Wangkumara (Garlali) (McDonald and Wurm 1979).
The statistics are displayed in Table 7.
Here the relative amounts Yir-Yoront shares with the other two languages are as
expected, but the tally for the neighbouring Margany and Wangkumara is inflated.
The figure of 18% represents 16 items out of the 87 in Hale's 100-item list for which
entries were recorded in both languages. Of these 16, some 10 are 'old': *jina 'foot',
*taaku 'ground, earth' (questionable), *mara 'hand', *nga- 'I', *jaa- 'mouth', *nyaa-
'see', *jalany- 'tongue', *nyi- 'you (Sg)', *nyu-/*nyi- 'he', and *ngali 'we (Du Incl)'.
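The arithmetic behind these figures is worth making explicit, since the denominator is the number of items attested in both lists rather than the nominal length of the list. A minimal Python sketch (the function name percent_sames is illustrative):

```python
def percent_sames(sames, jointly_attested):
    """Percentage of 'sames' among the items recorded in both lists."""
    if jointly_attested <= 0:
        raise ValueError("no jointly attested items")
    return 100 * sames / jointly_attested


# Margany and Wangkumara: 16 'sames' among 87 jointly attested items.
print(round(percent_sames(16, 87)))  # 18

# Share of those 16 'sames' that fall in the 'old' category (10 items).
print(percent_sames(10, 16))
```

Computed against the nominal 100 items instead, the same 16 'sames' would give only 16%, which is why gaps in the lists need to be tracked when different pairs are compared.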
Whatever the explanation for the inflated 'sames' counts, the user of lexicostatis-
tical subgrouping as a rough-and-ready method needs to beware. Certain rules-of-
thumb suggest themselves as ways to avoid some of the pitfalls. If for example (as
in the cases illustrated in Tables 6 and 7) three languages A, B, and C all share
equally and in the low range, and if B adjoins C while A is geographically distant
from both, then the 'sames' count for AB and for AC will be given greater weight
than that for BC. A more generalized use of this rule is the recognition of chains of
geographically contiguous languages with cognate densities that gradually decrease
with the number of intervening languages; this procedure is stated explicitly in
O'Grady's treatment of the Ngayarda subgroup (1966: 73-74); see also Swadesh's
discussion of the mesh principle (1962: 7-14). Recall in this regard that the figures
for any given single triad of languages are apt to represent a statistical 'bump' and
that a sample of more languages, especially where gradients of cognate density with
geographical distance are encountered, is desirable. A second rule of thumb is, of
course, to weed out obvious borrowings. Another principle is to avoid the use of
different lists and of different-sized lists, or to make allowances for the discrepancies
they introduce. A cursory look at morphology and phonology can help to settle some
of the remaining problems.
3.2.2. Past Practice of Classification by Lexicostatistics of Australian Languages. The
use of lexicostatistics for subgrouping was at its peak in the 1960s, when it was
applied to word lists of a few hundred words in various language families with many
present-day languages in which very little historical-comparative work had been
done. These families included Austronesian and Australian. For many reasons, the
method in its rough-and-ready form proved to be unreliable in anything but its gross
subgrouping. Subsequently, Guy (1980a, b) proposed some improved methods of
subgrouping, some of which still use the table of cognate percentages as the basic data.
Further, some attention has been given to the effect that borrowing has, with a view
to overcoming Dixon's (1972) dismissal, at least for Australian languages, of the
method, even at the coarser levels of subgrouping.
Dixon (1980: xiv) advances 'the beginnings of a proof that all the languages of
Australia (except two or three northern tongues such as Tiwi and Djingili) are
genetically related', even though 'present knowledge of the relationships between
languages is not sufficient to justify any sort of fully articulated 'family tree' model'
(1980: 264-265). He proceeds by considering the systems of phonology, nominal
case inflection, pronouns and verbal conjugations, usefully complementing previous
results based on word lists. His claim that 'little should be inferred about the
subgrouping of languages in Australia from vocabulary comparisons' (1980: 254) is,
we believe, overstated, and is subjected to empirical evaluation above.
It seems evident that a number of considerations in addition to uninterpreted raw
lexicostatistical data went into the classification proposed by O'Grady et al. (1966;
see especially pp. 22-25; see also O'Grady & Klokeid 1969: 301-302 and 308-309
and O'Grady 1966: 73-74). Although it wrongly included Dyirbal and certain of its
neighbours in the Pama-Maric group, it did (correctly, we believe) group Koko-Bera
with Yir-Yoront, as 'Western Pama', a result not achievable with raw lexicostatistics.
Note also O'Grady et al.'s disregard of raw lexicostatistical data (1966: 114 and
119) in their classing of N(h)anda with the Kardu languages. Whatever the correct
grouping of N(h)anda may be (Blevins (forthcoming) groups it yet a third
way), what is under consideration here is the methods actually used by O'Grady
et al. And we must emphasize that the Pama-Maric case illustrated in Tables 3
and 5 is indeed a 'worst case'. Numerous other cases give the 'correct' results in a
straightforward manner, notably where the genetic boundaries in question are not
shallow ones. Such a case is that of the Arandic languages (Hale 1962).
4. Conclusion
Languages in contact spoken by small populations for whom marriage with speakers
of other languages has been frequent and even normal pose special problems for
language-historical inference, especially in the area of loanwords. With the aims of
quantifying the contribution of borrowing to lexical change in a situation of this kind
and of estimating the significance of shared vocabulary in cases where loanwords
cannot be identified with certainty, we have examined a group of Australian
Aboriginal languages. We have estimated the contribution of borrowing to lexical
change at a maximum of 50%, comparable to the rates found for languages known
to have borrowed heavily over a long period, for example English, with 45% of
lexical replacement accomplished by borrowing largely from French and from
Scandinavian languages. We infer that the problem of unidentified borrowings in
vocabulary shared by languages whose relationship, if a genetic one at all, is distant
enough to preclude use of the normal comparative method and remote enough in
time to belong to the period of hunting-and-gathering and the population sizes and
relationships postulated for this era is, while certainly a real one, not a crippling one,
provided that certain safeguards are applied.
We have tested, by lexicostatistical means, our indirect method ('local words')
for identification and quantification of borrowings, against a group of languages
(southwestern Paman, in Australia) for which relevant facts about subgrouping
can be inferred (nonstatistically) from shared innovations in phonology and
morphology. Finding that exclusion of loans 'identified' by this method does in fact
improve the performance of lexicostatistical methods of subgrouping, we go on to
a more general consideration of lexicostatistics in historical inference. There are
general reasons why glottochronology must be used with much circumspection,
and which considerably limit its usefulness, but these apply the world over. In
particular, it is known that the retention rate may be considerably higher than 80%
per millennium: O'Neil (1964) shows a case where the 100-item list shows 94%
common vocabulary after 1,000 years (and the 200-item list 93%). To the extent
that the retention rates vary between different languages and between different
groups, the value of lexicostatistics (despite the fact that it differs from
glottochronology in not concerning itself with computing time-depths in absolute
units of time) in determining genetic subgrouping of languages is vitiated in an
absolute sense (a fact long explicitly recognized; see Bergsland & Vogt 1962: 126;
Gudschinsky 1955: 149). However, even an imperfect lexicostatistics is a useful
source of hypotheses about subgrouping of languages and dialects. And, given an
assumption that the retention rate is fairly constant across the particular languages
under study, whether this rate is relatively high or low on a world-wide scale is
irrelevant to its use in subgrouping.
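The point about absolute versus relative use of the retention rate can be illustrated with a short sketch. It uses the standard glottochronological formula t = ln(c) / (2 ln r), as found in the general literature (for example Lees 1953), and is not a computation reported by the authors. The rates 0.80 and 0.94 are the figures mentioned in the text; the pair labels and shared fractions are invented.

```python
import math


def time_depth(shared, retention):
    """Estimated millennia since two languages split.

    Standard formula t = ln(c) / (2 ln r), where c is the shared-cognate
    fraction and r the assumed per-millennium retention rate.
    """
    return math.log(shared) / (2 * math.log(retention))


# Invented shared-cognate fractions for three hypothetical language pairs.
pairs = {"A-B": 0.64, "A-C": 0.45, "B-C": 0.40}

# Absolute dates depend heavily on the assumed rate ...
depth_low = time_depth(0.64, 0.80)   # rate of roughly 80% per millennium
depth_high = time_depth(0.64, 0.94)  # rate as high as the O'Neil case


# ... but the ranking of pairs by closeness, which is what subgrouping
# relies on, is the same for any single rate applied to all pairs.
def ranking(retention):
    return sorted(pairs, key=lambda p: time_depth(pairs[p], retention))


same_order = ranking(0.80) == ranking(0.94)
```

With a common rate the same shared percentage yields very different dates, yet the ordering of the pairs is unchanged, which is the sense in which the absolute level of the rate is irrelevant to subgrouping.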
Although the evidence is slim, it does not support the particular inapplicability in
Australia of the retention rate minimum (roughly 80% per millennium) proposed
from data elsewhere in the world. The main impetus for questioning its applicability
came from noticing the widespread word-tabooing practices in Australia on a scale
unknown in Indo-European. However, because a 'borrowed' word is usually
'returned' (both in the observed cases and because of the recycling of names), and
since borrowing is just one of many replacement strategies, the taboo practice does
not lead to a higher replacement rate.
The estimates of the fraction of lexical replacement that is attributable to
borrowing (however crude), and the estimated equilibrium rate that follows from this, are
low enough to suggest that lexicostatistics, as a rough-and-ready method of language
subgrouping, can proceed without undue concern for the effects of borrowing. In
particular, there appears to be no reason from this quarter to question the
lexicostatistical evidence for the 'Pama-Nyungan' subgroup suggested by O'Grady et al.
This evidence amounts to a transcontinental linguistic boundary across which the
percentage of items shared in a 100 or 200 (or 500) word list hovers around 10%.
This is well within the equilibrium range suggested above for languages that have
been in long contact, and it is not possible to argue that unrelated or equally
distantly-related languages will acquire a much higher percentage of items shared
from borrowing over a long period of time. There is always the possibility, of course,
that one or more languages may be misclassified by this rough-and-ready method.
But the above evidence, together with evidence of a different kind from Merlan
(1979), Blake (1988), Evans (1988), Alpher (1990), and Heath (1990) suggests that
the general outlines of the Pama-Nyungan subgroup, as proposed in the literature,
are tolerably accurate.
With regard to the possibility of fine-tuning lexicostatistical methods to move them
beyond the 'rough-and-ready' stage, we cite Embleton (1986: 92): 'lexicostatistics is
best used for the construction of provisional family trees only; the slightly improved
accuracy for N = 500 over N = 200 is not worthwhile for what should be merely
provisional results anyway'. Here she is speaking of list size. We extend this thinking
to problems of retention rates and borrowing rates. It seems likely that both of these
rates follow a normal distribution. From estimates of these rates and their distributions
can come a lexicostatistics and glottochronology that is a blunt but useful instrument.
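As a rough illustration of what such a 'blunt instrument' might look like, the following sketch simulates how variation in retention rates spreads out the expected shared percentage. Everything numerical here is an assumption made for the example: the mean of 0.80 echoes the rough figure discussed above, the standard deviation of 0.03 is invented, the two languages' rates are drawn independently, and each rate is held constant over the interval. Borrowing is ignored.

```python
import random
import statistics


def simulate_shared(t_millennia, mean_r=0.80, sd_r=0.03, n=10_000, seed=0):
    """Simulated shared-cognate fractions for two languages after a split.

    Each language's per-millennium retention rate is drawn from a normal
    distribution (hypothetical parameters) and clipped to a sensible range.
    """
    rng = random.Random(seed)

    def draw():
        return min(max(rng.gauss(mean_r, sd_r), 0.01), 0.99)

    # An item survives in both languages with probability (r1 * r2) ** t.
    return [(draw() * draw()) ** t_millennia for _ in range(n)]


shares = simulate_shared(1.0)
centre = statistics.mean(shares)   # typical shared fraction after 1,000 years
spread = statistics.stdev(shares)  # how far individual pairs may deviate
```

Under these assumed parameters the shared fraction after one millennium centres near 0.64 with a spread of a few percentage points, which gives a sense of how much of an observed difference between pairs could arise from rate variation alone.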
References
Abbreviations
AIAS Australian Institute of Aboriginal Studies (now the Australian Institute of Aboriginal
and Torres Strait Islander Studies, AIATSIS), Canberra and, for publications
mid-1976 to circa 1984, Humanities Press, Atlantic Highlands, New Jersey.
ANU Australian National University.
GCAL Grammatical Categories in Australian Languages, RMW Dixon (ed.), AIAS, 1976.
HAL Handbook of Australian Languages, RMW Dixon & Barry J Blake (eds), Australian
National University Press, Canberra: Vol. I, 1979; Vol. II, 1981; Vol. III, 1983.
LCY Languages of Cape York, Peter Sutton (ed.), AIAS, 1976 (Australian Aboriginal Studies
Research and Regional Studies No. 6).
PAL 13 Papers in Australian Linguistics No. 13: Contributions to Australian Linguistics (Pacific
Linguistics Series A, No. 59), Research School of Pacific Studies, Australian National
University, Canberra, 1980.
Alpher B 1972 'On the genetic subgrouping of the languages of southwestern Cape York
Peninsula, Australia' Oceanic Linguistics 11: 67-87.
Alpher B 1976 'Some linguistic innovations in Cape York and their sociocultural correlates' in
LCY: 84-101.
Alpher B 1988 'Formalizing Yir-Yoront lenition' Aboriginal Linguistics 1: 188-197.
Alpher B 1989 'The origin of ablaut as a grammatical process in Yir-Yoront' Unpublished MS.
Alpher B 1990 'Some Proto-Pama-Nyungan verb paradigms: a verb in the hand is worth two in
the phylum' in GN O'Grady & D Tryon (eds).
Alpher B 1991 Yir-Yoront Lexicon: sketch and dictionary of an Australian language Trends in
Linguistics Documentation 6 Mouton de Gruyter Berlin.
Alpher B 1992 'Correspondence mimicry' Draft.
Alpher B 1997 'The Banks-Cook wordlist of 1770: can it be adapted as a lexicostatistical tool?'
Studies in Comparative Pama-Nyungan Pacific Linguistics Series C No. 111 Research School
of Pacific Studies ANU Canberra: 155-171.
Alpher B 1998 'Yir-Yoront and Kuuk-Thaayorre: a Sprachbund' Draft.
Austerlitz R 1991 'Alternatives in long-range comparison' in Sydney M Lamb & E Douglas
Mitchell (eds) Sprung from some Common Source: investigations into the prehistory of languages
Stanford University Press Stanford California: 353-364.
Austin P 1981a 'Proto-Kanyara and Proto-Mantharta historical phonology' Lingua 54: 295-333.
Austin P 1981b A Grammar of Diyari, South Australia Cambridge University Press Cambridge.
Austin P 1992a A Dictionary of Payungu, Western Australia Department of Linguistics La Trobe
University Melbourne.
Austin P 1992b A Dictionary of Thalanyji Department of Linguistics La Trobe University
Melbourne.
Austin P 1997 'Proto Central New South Wales phonology' in D Tryon & M Walsh (eds)
Boundary Rider: essays in honour of Geoffrey O'Grady Pacific Linguistics C-136 Australian
National University Canberra: 21-49.
Bergsland K & H Vogt 1962 'On the validity of glottochronology' Current Anthropology 3(2):
115-153.
Black P 1980 'Norman Pama historical phonology' in B Rigsby & P Sutton (eds) PAL 13:
181-239.
Black P 1997 'Lexicostatistics and Australian languages: problems and prospects' in D Tryon &
M Walsh (eds) Boundary Rider: essays in honour of Geoffrey O'Grady Pacific Linguistics
C-136 Australian National University Canberra: 51-69.
Black P nd a 'Lexicostatistical lists: Kokaper (with Kok-Babonk)' Unpublished MS.
Black P nd b 'Kurtjar [on verb conjugations]' Unpublished MS.
Blake BJ 1979 A Kalkatungu Grammar Pacific Linguistics Series B No. 57 Research School of
Pacific Studies ANU Canberra.
Blake BJ 1988 'Redefining Pama-Nyungan: Towards the prehistory of Australian languages'
Aboriginal Linguistics 1: 1-90.
Blevins J forthcoming 'Nhanda and its position within Pama-Nyungan' in M Laughren &
P McConvell (eds) The Origins of the Western Desert Languages.
Breen JG 1972 'Kok-Nar fieldnotes' MS.
Breen JG 1973 Bidyara and Gungabula Grammar and Vocabulary Monash University Melbourne.
Breen JG 1976a 'Gugadj and Gog-Nar: a contrastive sketch' in LCY: 151-164.
Breen JG 1976b 'An introduction to Gog-Nar' in LCY: 243-259.
Breen JG 1981a 'Margany and Gunya' in HAL II: 274-393.
Breen JG 1981b The Mayi Languages of the Queensland Gulf Country AIAS Canberra.
Breen JG 1990 Salvage Studies of Western Queensland Aboriginal Languages Pacific Linguistics
B-105 Australian National University Canberra.
Dixon RMW 1970 'Languages of the Cairns rain forest region' in SA Wurm & DC Laycock (eds)
Pacific Linguistics Studies in Honour of Arthur Capell Pacific Linguistics Canberra: 651-687.
Dixon RMW 1972 The Dyirbal Language of North Queensland Cambridge Studies in Linguistics 9
Cambridge University Press Cambridge.
Dixon RMW 1976 'Tribes, languages and other boundaries in northeast Queensland' in Nicolas
Peterson (ed.) Tribes and Boundaries in Australia Social Anthropology Series No. 10 AIAS
Canberra: 207-238.
Dixon RMW 1977 A Grammar of Yidiny Cambridge Studies in Linguistics 19 Cambridge
University Press Cambridge.
Dixon RMW 1980 The Languages of Australia Cambridge Language Surveys Cambridge
University Press Cambridge.
Dixon RMW 1990 'The origin of 'mother-in-law vocabulary' in two Australian languages'
Anthropological Linguistics 32: 1-56.
Dixon RMW 1991 Words of Our Country Queensland University Press St. Lucia.
Dixon RMW 1997 The Rise and Fall of Languages Cambridge University Press Cambridge.
Douglas WH 1976 The Aboriginal Languages of the South-west of Australia (2nd edn) Research and
Regional Studies 9 (1st edn 1968) AIAS Canberra.
Dyen I 1963 'Lexicostatistically determined borrowing and taboo' Language 39: 60-66.
Dyen I, AT James & JWL Cole 1967 'Language divergence and estimated word retention rate'
Language 43: 150-171.
Dyen I, JB Kruskal & P Black 1992 'An Indoeuropean classification: a lexicostatistical experiment'
Transactions of the American Philosophical Society 82(5) American Philosophical Society
Philadelphia.
Eades D 1979 'Gumbaynggir' in HAL I: 245-361.
Embleton SM 1981 'Lexicostatistical tree reconstruction incorporating borrowing' Toronto
Working Papers in Linguistics 2: 98-122 (republished 1982 Eighth Lacus Forum Hornbeam
Columbia SC: 265-272).
Embleton SM 1986 Statistics in Historical Linguistics Quantitative Linguistics vol. 30 Studienverlag
Brockmeyer Bochum.
Evans N 1988 'Arguments for Pama-Nyungan as a genetic subgroup, with particular reference to
initial laminalization' Aboriginal Linguistics 1: 91-110.
Gleason HA Jr 1962 'Counting and calculating for historical reconstruction' Anthropological
Linguistics 1(2): 22-32.
Gudschinsky SC 1955 'Lexico-statistical skewing from dialect borrowing' International Journal of
American Linguistics 21: 138-149.
Guy JBM 1980a Experimental Glottochronology: basic methods and results Pacific Linguistics Series
B No. 75 Research School of Pacific Studies ANU Canberra.
Guy JBM 1980b Glottochronology without Cognate Recognition Pacific Linguistics Series B No. 79
Research School of Pacific Studies ANU Canberra.
Hale KL 1961 'Attestations [of 100-word list in 33 Cape York languages; also exists under title
Vocabularies and Cognation Judgments for 30 Cape York Peninsula Languages]'
Unpublished MS dittoed.
Hale KL 1962 'Internal relationships in Arandic of Central Australia' in A Capell (ed.) Some
Linguistic Types in Australia Oceania Linguistic Monograph 7 Oceania (The University of
Sydney) Sydney: 171-183.
Hale KL 1964 'Classification of Northern Paman languages, Cape York Peninsula, Australia: a
research report' Oceanic Linguistics 3: 248-265.
Hale KL 1976a 'Phonological developments in particular Northern Paman languages' in LCY:
7-40.
Hale KL 1976b 'Phonological developments in a Northern Paman language: Uradhi' in LCY:
41-49.
Hale KL 1976c 'Wik reflections of Middle Paman Phonology' in LCY: 50-60.
Hale KL nd a 'CYP cognate density matrices [computed from Hale 1961]' Unpublished MS
dittoed.
Hale KL nd b 'Other Paman languages [companion to 1976a and 1976c]' Unpublished MS
dittoed.
Hale KL nd c 'Wik-Mungknh wordlist' Unpublished MS.
Hall AH 1968 A depth study of the Thaayorre language of the Edward River tribe Cape York Peninsula
Unpublished MA thesis University of Queensland.
Hall AH 1972 A study of the Thaayorre language of the Edward River tribe, Cape York Peninsula,
Queensland Unpublished PhD thesis University of Queensland.
Hall AH 1976a 'Methods of negation in Kuuk-Thaayorre' in LCY: 299-307.
Hall AH 1976b 'Morphological categories of nouns in Kuuk-Thaayorre' in LCY: 308-314.
Harris BP & GN O'Grady 1976 'An analysis of the progressive morpheme in Umpila verbs: a
revision of a former attempt' in LCY: 165-212.
Haviland JB 1974 'A last look at Cook's Guugu Yimidhirr word list' Oceania 44(3) (March):
216-232.
Haviland JB 1979a 'How to talk to your brother-in-law in Guugu-Yimidhirr' in Tim Shopen (ed.)
Languages and Their Speakers Winthrop Cambridge Massachusetts: 160-239.
Haviland JB 1979b 'Guugu Yimidhirr' in HAL I: 27-180.
Heath J 1978a Linguistic Diffusion in Arnhem Land AIAS Canberra.
Heath J 1978b Ngandi Grammar, Texts, and Dictionary AIAS Canberra.
Heath J 1979 'Diffusional linguistics in Australia: problems and prospects' in SA Wurm (ed.)
Australian Linguistic Studies Pacific Linguistics Series C No. 54 Research School of Pacific
Studies ANU Canberra: 395-418.
Heath J 1982 Nunggubuyu Dictionary AIAS Canberra.
Heath J 1990 'Verbal inflection and macro-subgroupings of Australian languages: The search for
conjugation markers in non-Pama-Nyungan' in Philip Baldi (ed.) Linguistic Change and
Reconstruction Methodology Mouton de Gruyter Berlin: 403-407.
Hershberger R 1964 'Notes on Gugu-Yalanji verbs' in R Pittman & H Kerr (eds) Papers on the
Languages of the Australian Aborigines Occasional Papers in Aboriginal Studies No. 1 AIAS
Canberra: 35-54.
Johnson S 1990 [1991] 'Social parameters of linguistic change in an unstratified Aboriginal society'
in P Baldi (ed.) Linguistic Change and Reconstruction Methodology Mouton de Gruyter Berlin:
419-433; reprinted in Philip Baldi (ed.) Patterns of Change, Change of Patterns: linguistic change
and reconstruction methodology Mouton de Gruyter Berlin 1991: 203-217.
Koch H 1997 'Comparative linguistics and Australian prehistory' in Patrick McConvell &
Nicholas Evans (eds) Archaeology and Linguistics: Aboriginal Australia in global perspective
Oxford University Press Melbourne: 27-43.
Lees RB 1953 'The basis of glottochronology' Language 29: 113-127.
McDonald M & SA Wurm 1979 Basic Materials in Wankumara (Galali): grammar, sentences, and
vocabulary Pacific Linguistics Series B No. 65 Research School of Pacific Studies ANU
Canberra.
McEntee J & P McKenzie 1992 Adynya-math-nha English Dictionary Revised May 1992 Adelaide
(ISBN 0959664432).
Merlan F 1979 'On the prehistory of some Australian verbs' Oceanic Linguistics 18(1): 33-112.
Merlan F 1982 Mangarayi (Lingua Descriptive Studies vol. 4) North-Holland Amsterdam.
Nash D 1982 'An etymological note on kurdungurtu' in J Heath et al. (eds) Languages of kinship in
Aboriginal Australia Oceanic Linguistic Monographs No. 24 Sydney: 141-159.
Nash D 1997 'Comparative flora terminology of the central Northern Territory' in Patrick
McConvell & Nicholas Evans (eds) Archaeology and Linguistics: Aboriginal Australia in global
perspective Oxford University Press Melbourne: 187-206.
Nash D & J Simpson 1981 "No-name' in central Australia' in Carrie S Masek, Roberta A
Hendrick & Mary Frances Miller (eds) Proceedings from the Parasession on Language and
Behavior Chicago Linguistics Society Chicago: 165-177.
Newton PJF 1980 'Lexicostatistics: a minor analytical tool for Australian historical linguistic
studies' Working Papers in Language and Linguistics II: 1-8.
Oates W & L Oates 1964 'Gugu-Yalanji vocabulary' in W Oates, L Oates, H Hershberger,
R Hershberger, B Sayers & M Godfrey (eds) Gugu-Yalanji and Wik-Munkan Language
Studies Occasional Papers in Aboriginal Studies No. 2 AIAS Canberra: 79-146.
O'Grady GN 1964 Nyangumata Grammar Oceanic Linguistic Monographs No. 9 University of
Sydney Sydney.
O'Grady GN 1966 'Proto-Ngayarda phonology' Oceanic Linguistics 5(2): 71-130.
O'Grady GN 1976 'Umpila historical phonology' in LCY: 61-67.
O'Grady GN 1979 'Preliminaries to a Proto Nuclear Pama-Nyungan stem list' in S Wurm (ed.):
107-139.
O'Grady GN & TJ Klokeid 1969 'Australian linguistic classification: A plea for coordination of
effort' Oceania 39: 298-311.
O'Grady GN, CF Voegelin & FM Voegelin 1966 'Languages of the world: Indo-Pacific fascicle
six' Anthropological Linguistics 8(2) (February): 1-197.
O'Neil WA 1964 'Problems in the lexicostatistic time depth of Modern Icelandic and Modern
Faroese' General Linguistics 6(1) (Spring): 27-37.
Ross MD 1996 'Contact-induced change and the comparative method: cases from Papua New
Guinea' in M Durie & M Ross (eds) Comparative Method Reviewed: regularity and irregularity
in language change Oxford University Press New York: 180-217.
Sankoff D 1972 'Reconstructing the history and geography of an evolutionary tree' American
Mathematical Monthly 79: 596-603.
Sankoff D 1973 'Mathematical developments in lexicostatistic theory' in TA Sebeok (ed.) Current
Trends in Linguistics vol. 11 Mouton The Hague: 93-113.
Sharp RL 1939 'Tribes and totemism in north-east Australia' Oceania 9.3: 254-275, 904: 19-42.
Sharp RL 1934 'The social organization of the Yir-Yoront tribe, Cape York Peninsula' Oceania
4: 404-431.
Sharp RL 1958 'People without politics' in VF Ray (ed.) Systems of Political Control and
Bureaucracy in Human Societies American Ethnological Society Seattle: 1-8.
Sommer BA 1969 Kunjen Phonology: synchronic and diachronic Pacific Linguistics Series B No. 11
ANU Canberra.
Sommer BA 1972 Kunjen Syntax: a generative view Australian Aboriginal Studies No. 5 AIAS
Canberra.
Sommer BA 1976 'Umbuygamu: the classification of a Cape York Peninsular Language' in JF
Kirton et al. (eds) Papers in Australian Linguistics No. 10 Pacific Linguistics Series A No.
47 Research School of Pacific Studies ANU Canberra: 13-31.
Sutton PJ 1978 Wik: Aboriginal society, territory and language at Cape Keerweer, Cape York
Peninsula, Australia Unpublished PhD thesis Department of Anthropology and Sociology
University of Queensland.
Swadesh M 1950 'Salish internal relationships' International Journal of American Linguistics 16:
157-167.
Swadesh M 1952 'Lexico-statistic dating of prehistoric ethnic contacts' Proceedings of the American
Philosophical Society 96(4): 452-463.
Swadesh M 1955 'Towards greater accuracy in lexicostatistic dating' International Journal of
American Linguistics 21: 121-137.
Swadesh M 1962 'The mesh principle in comparative linguistics' Anthropological Linguistics 1(2):
7-14.
Thomason SG & T Kaufman 1988 Language Contact, Creolization, and Genetic Linguistics
University of California Press Berkeley; paperback ed. 1991.
Thompson DA 1976 'A phonology of Kuuku-Ya'u' in LCY: 213-235.
Tryon DT 1970 An Introduction to Maranungku (Northern Australia) Pacific Linguistics Series B
No. 15 Linguistic Circle of Canberra Canberra.
Tsunoda TG 1974 A grammar of the Warungu language, North Queensland Unpublished MA thesis
Monash University Melbourne.
Watson FJ 1944 'Vocabularies of 4 representative tribes of south eastern Queensland' Supplement
to Journal of the Royal Geographical Society of Australasia (Queensland) No. 34 Vol. 48
Session 1943-1944.
Wurm SA 1972 Languages of Australia and Tasmania Mouton The Hague.
Zorc R. David (comp.) 1986 Yolngu-Matha Dictionary School of Australian Linguistics Darwin
Institute of Technology Batchelor NT.
Barry Alpher
American University
< alpher@ibm.net >
Address for correspondence:
3218 Wisconsin Ave NW, Apt B2
Washington DC 20016, USA
David Nash
ANU, AIATSIS
< david.nash@anu.edu.au >
Appendix: Lexicostatistical Wordlists Used with the Cape York Material
The first 100 words constitute the list of O'Grady & Klokeid (1969: 303-7); the next 20 are the
words in Hale's (1961) list that are not also in O'Grady and Klokeid's; the next 25 are words in
Black's (nd a) list that are not in either of the above; the last 6 words (relevant for the most part
in the monsoon tropics of Australia) are the authors' additions. 'H' cross-references to Hale's
numbering for the same word; 'B' does the same for Black's.
The list follows, in two forms. The first gives the list in numerical order (by reference
number). The second gives the list in alphabetical order of the English gloss, so that the
reference number for a particular gloss can be easily found. (Note with regard to certain items:
(i) for 'stomach' see 'belly'; (ii) 'to get, pick up' (item 34 in the O'Grady-Klokeid list) and 'to
take' (item 106, Hale's item 39, Black's 212) are listed as separate items here despite the fact that
they appear to work as near-synonyms in the elicitation of words in Aboriginal languages.)
Numerical order
No. Gloss
1. armpit (H11, B70)
2. ashes (H64, B109)
3. belly (H13, B47)
4. big (H85, B138)
5. bite (H44, B196)
6. black (H31, B135)
7. blood (H21, B78)
8. bone (H23)
9. breast (B43)
10. to burn (intr; H65, B108)
11. by-and-by (H80, B235)
12. chest (B42)
13. to climb (H38, B177)
14. to cry (H47, B191)
15. to cut (H45, B204)
16. dog (H69, B88)
17. down, below (H78, B231)
18. ear (H6, B30)
19. east (H73, B225)
20. to eat (H27, B197)
21. egg (H68, B86)
22. elbow (H15, B68)
23. excrement (B52)
24. eye (H4, B27)
25. to fall (H37, B178)
26. far (H87, B228)
27. fat, grease (H22, B79)
28. fingernail (B63)
29. fire (H62, B107)
30. fly (N; B98)
31. vegetable food (H72, B106)
32. foot (H20, B62)
33. forehead (H2, B26)
34. to get, pick up
35. to give (H42, B215)
36. to go (H35, B172)
37. ground (H55, B126)
38. hand (H16, B65)
39. head (H1, B24)
40. head hair (H25, B25)
41. hear (H30, B195)
42. heart (B44)
43. to hit (with hand; H43, B199)
44. hungry (H26, B161)
45. I (H96, B239)
46. knee (H18, B59)
47. leaf (B101)
48. to leave it (H40, B216)
49. liver (H12, B49)
50. long (H89, B140)
51. to be lying down (B187)
52. many (H84, B133)
53. meat, animal (H66, B84)
54. moon (H57, B122)
55. mouth (H7, B32)
56. name (B16)
57. nape (H3, B38)
58. north (H75, B223)
59. nose (H5, B28)
60. now, today (B234)
61. old man (B4)
62. one (H81, B130)
63. person, Aborigine (H51, B1)
64. rib (B46)
65. rotten (B153)
66. to see (H29, B195)
67. short (H90, B141)
68. to sit (H34, B185)
69. skin (H24, B71)
70. sky (B119)
71. small (H86, B139)
72. to smell it (B195)
73. smoke (H63, B111)
74. snake (B96)
75. south (H76, B224)
76. to speak (H32, B189)
77. spear (N; B20)
78. spit (N; B35)
79. to be standing (H33, B186)
80. star (H58, B121)
81. stone (H54, B127)
82. sun (H56, B123)
83. tail (H67, B85)
84. thigh (H17, B58)
85. this (H92, B227)
86. throat (B39)
87. tongue (H9, B34)
88. tooth (H8, B33)
89. tree (H70, B100)
90. two (H82, B131)
91. up (H77, B230)
92. urine (B51)
93. water (H60, B112)
94. west (H74, B226)
95. what? (H93, B249)
96. where? (H95, B251)
97. who? (H94, B250)
98. wind (H59, B120)
99. woman (H52, B2)
100. you (sg.) (H97, B240)
101. shoulder (H10, B40)
102. upper arm (H14, B69)
103. shin (H19, B60)
104. to die (H28, B198)
105. to run (H36, B174)
106. to take (H39, B212)
107. to throw (H41, B217)
108. to spear (H46, B201)
109. to laugh (H48, B192)
110. good (H49, B169)
111. bad (H50, B170)
112. to dig (H53, B202)
113. creek (H61)
114. tomorrow (H79, B238)
115. three (H83, B132)
116. near (H88, B229)
117. hard (H91, B144)
118. he (H98, B241)
119. we two inclusive (H99, B242)
120. you two (H100, B243)
121. language (B17)
122. dream (N; B18)
123. boomerang (B19)
124. shield (B20)
125. spearthrower (B21)
126. axe (B23)
127. chin (B37)
128. hip (B57)
129. wing feather (B87)
130. possum (B90)
131. pelican (B93)
132. fish (B97)
133. mosquito (B99)
134. grass (B105)
135. mud (B114)
136. cloud (B119)
137. shade (B125)
138. sand (B128)
139. hole (B129)
140. heavy (B144)
141. light (B145)
142. soft (B145)
143. dry (N, Adj; B158)
144. sickness (B82)
145. rain (B112)
146. yamstick
147. string
148. yam
149. goanna
150. pandanus
151. mangrove (Avicennia sp.)
Alphabetical order
No. Gloss
1. armpit (H11, B70)
2. ashes (H64, B109)
126. axe (B23)
111. bad (H50, B170)
3. belly (H13, B47)
4. big (H85, B138)
5. bite (H44, B196)
6. black (H31, B135)
7. blood (H21, B78)
8. bone (H23)
123. boomerang (B19)
9. breast (B43)
11. by-and-by (H80, B235)
12. chest (B42)
127. chin (B37)
136. cloud (B119)
113. creek (H61)
16. dog (H69, B88)
17. down, below (H78, B231)
122. dream (N; B18)
143. dry (N, Adj; B158)
18. ear (H6, B30)
19. east (H73, B225)
21. egg (H68, B86)
22. elbow (H15, B68)
23. excrement (B52)
24. eye (H4, B27)
26. far (H87, B228)
27. fat, grease (H22, B79)
28. fingernail (B63)
29. fire (H62, B107)
132. fish (B97)
30. fly (N; B98)
32. foot (H20, B62)
33. forehead (H2, B26)
149. goanna
110. good (H49, B169)
134. grass (B105)
37. ground (H55, B126)
38. hand (H16, B65)
117. hard (H91, B144)
118. he (H98, B241)
39. head (H1, B24)
40. head hair (H25, B25)
41. hear (H30, B195)
42. heart (B44)
140. heavy (B144)
128. hip (B57)
139. hole (B129)
44. hungry (H26, B161)
45. I (H96, B239)
46. knee (H18, B59)
121. language (B17)
47. leaf (B101)
141. light (B145)
49. liver (H12, B49)
50. long (H89, B140)
151. mangrove (Avicennia sp.)
52. many (H84, B133)
53. meat, animal (H66, B84)
54. moon (H57, B122)
133. mosquito (B99)
55. mouth (H7, B32)
135. mud (B114)
56. name (B16)
57. nape (H3, B38)
116. near (H88, B229)
58. north (H75, B223)
59. nose (H5, B28)
60. now, today (B234)
61. old man (B4)
62. one (H81, B130)
150. pandanus
131. pelican (B93)
63. person, Aborigine (H51, B1)
130. possum (B90)
145. rain (B112)
64. rib (B46)
65. rotten (B153)
138. sand (B128)
137. shade (B125)
124. shield (B20)
103. shin (H19, B60)
67. short (H90, B141)
101. shoulder (H10, B40)
144. sickness (B82)
69. skin (H24, B71)
70. sky (B119)
71. small (H86, B139)
73. smoke (H63, B111)
74. snake (B96)
142. soft (B145)
75. south (H76, B224)
77. spear (N; B20)
125. spearthrower (B21)
78. spit (N; B35)
80. star (H58, B121)
81. stone (H54, B127)
147. string
82. sun (H56, B123)
83. tail (H67, B85)
84. thigh (H17, B58)
85. this (H92, B227)
115. three (H83, B132)
86. throat (B39)
51. to be lying down (B187)
79. to be standing (H33, B186)
10. to burn (intr; H65, B108)
13. to climb (H38, B177)
14. to cry (H47, B191)
15. to cut (H45, B204)
104. to die (H28, B198)
112. to dig (H53, B202)
20. to eat (H27, B197)
25. to fall (H37, B178)
34. to get, pick up
35. to give (H42, B215)
36. to go (H35, B172)
43. to hit (with hand; H43, B199)
109. to laugh (H48, B192)
48. to leave it (H40, B216)
105. to run (H36, B174)
66. to see (H29, B195)
68. to sit (H34, B185)
72. to smell it (B195)
76. to speak (H32, B189)
108. to spear (H46, B201)
106. to take (H39, B212)
107. to throw (H41, B217)
114. tomorrow (H79, B238)
87. tongue (H9, B34)
88. tooth (H8, B33)
89. tree (H70, B100)
90. two (H82, B131)
91. up (H77, B230)
102. upper arm (H14, B69)
92. urine (B51)
31. vegetable food (H72, B106)
93. water (H60, B112)
119. we two inclusive (H99, B242)
94. west (H74, B226)
95. what? (H93, B249)
96. where? (H95, B251)
97. who? (H94, B250)
98. wind (H59, B120)
129. wing feather (B87)
99. woman (H52, B2)
148. yam
146. yamstick
100. you (sg.) (H97, B240)
120. you two (H100, B243)