Indirect anaphora
Testing the limits of corpus-based linguistics
Simon Philip Botley*

Mara University of Technology, Malaysia
This paper investigates indirect anaphora (IA) from a corpus-based linguistics
(CBL) perspective. Indirect anaphora involves backward-pointing links between
surface lexical items, such as demonstrative pronouns, and objects, situations
or concepts which are mentioned or hinted at in previous discourse, but which
are not linguistically encoded as surface grammatical elements such as nouns
or noun phrases. This paper reports an empirical study of three types of IA,
namely labelling (Francis 1986, 1989, 1994), situation reference (Fraurud
1992, 1992a) and text/discourse deixis (Lyons 1977, Levinson 1983). Although
the paper reveals some discernible patterns in the use of IA across different
genres, the study ultimately points to the challenges of identifying hard and
fast categories of IA in a corpus of real-life examples.
Keywords: indirect anaphora, corpus, labelling, situation reference, deixis
1. Introduction
Anaphora is a linguistic device whereby a speaker or writer can recall to the
consciousness of a hearer or reader entities or concepts that have already been
introduced into the discourse. In English, anaphora can be realised by many
different linguistic markers, such as pronouns or demonstratives, as we can see
from the following examples taken from the APHB¹ corpus:
(1) A tall woman in a long rustling gown appeared.
"Hotchkiss!" she said in a hushed but concerned voice.
(APHB Corpus, B002:17)²
(2) "It is no great matter to me," Hotchkiss concluded, "for I had only
the wages of my Portland engagement, and that was no great sum, I
assure you."
(APHB Corpus, B002:21)
International Journal of Corpus Linguistics 11:1 (2006), 73-112.
ISSN 1384-6655 / E-ISSN 1569-9811 © John Benjamins Publishing Company
In examples (1) and (2) above, the person or entity being referred to by the
pronoun, the antecedent, is easily recoverable from the preceding context.
These are therefore examples of what can be called direct anaphora, where the
anaphor and antecedent are coreferential. Here, a reader or hearer would have
little trouble identifying the antecedent, as the nature of the link between the
anaphor and antecedent is fairly straightforward. But let us see what happens
when we are faced with examples like (3) and (4):
(3) In 1973 the government met the premiers of the western provinces. Just
the other day we received copies of an update from the Prime Minister
addressed to Premier Barrett on the event of the recent conference of
western premiers. Some of that process is worthy of commendation,
which I sincerely extend to the Prime Minister.
(Hansard Corpus, H0205274-76)³
(4) Mary was fired.
a. That happened last week
b. That is true
c. That surprised me
(from Fraurud 1992a:2)
In both (3) and (4), the antecedent of 'that' is more difficult to define
directly, because the antecedent in these cases is not a surface noun or noun
phrase, and the link between them is not one of coreference. Also, the nature
of the anaphoric link in these cases means that a reader or hearer may have to
carry out a somewhat complex process of inference to arrive at the antecedent.
Therefore, these examples can be said to fall under indirect anaphora, or IA.⁴
This paper will investigate the role of the English demonstratives this, that,
these and those with respect to three types of IA, namely labelling (Francis
1986, 1989, 1994), situation reference (Fraurud 1992, 1992a) and text/discourse
deixis (Lyons 1977; Levinson 1983). The paper will focus on corpus samples
representing three written genres, and will argue that, despite the existence
of some genre-related patterns of IA involving demonstratives, IA still
provides some challenges for a CBL approach.
2. Three main types of Indirect Anaphora
In this section, the three types of IA considered in this paper will be discussed,
as a preliminary to the empirical study.
2.1 Labelling
Labelling has attracted considerable attention, most prominently from Francis
(1986, 1989, 1994).⁵ Francis argues that some anaphors, commonly demonstratives
and definite noun phrases, function to encapsulate, or 'label', stretches of
surrounding text constituting anything from a clause to a set of paragraphs.
Francis distinguishes between retrospective labels (henceforth RLs), which
function anaphorically, and advance labels (henceforth ALs), which function
cataphorically. An RL, according to Francis (1994:29):
...serves to encapsulate or package a stretch of preceding discourse by giving
it a name. There is no single nominal-group equivalent in the preceding dis-
course, hence the label indicates to the reader exactly how the discourse is to
be interpreted (Francis 1994:29).
Most RL cases occur in paragraph-initial clauses, and contribute to textual
cohesion by organising the topical and rhetorical structure of the previous
discourse.
Advance labels, on the other hand, point to, and introduce, stretches of
following text, performing a similar function to the RLs, but in reverse
order. Here are some examples of RLs and ALs taken from the Hansard corpus
sample used in this paper.
2.1.1 Retrospective labels
(5) Those who have lost patience and manifested that loss in this
demonstration were few in number on Monday, but that number could
grow rapidly if it appears the government has lost interest or is too
preoccupied with other needs even to include an expression of concern
for their interests in the Speech from the Throne. I wish the new minister
success in his difficult task. He will find an early opportunity, I trust in
this debate, to make up for this omission.
(Hansard corpus, H0205332-34)
Example (5) illustrates the naming and encapsulating function of RL cases.
The head noun of the expression carries the labelling function. This example
also shows the interpersonal meaning of some RLs: to express the
speaker/writer's attitude towards the previous context.
2.1.2 Advance labels

(6) A farmer, with whom I spent almost an hour speaking on the telephone
the other night, told me about costs and gave me this illustration. A 64-
inch plastic belt for his combine cost 79 cents in 1967, and cost 89 cents
in 1968. Last week the same belt cost $4.88 and the farmer had to buy
two the same day.
Here, in example (6), we see a subsequent stretch of discourse introduced and
pointed to for the first time. ALs appear to carry out similar functions to RLs,
but differ in their 'direction of reference'.
As well as the encapsulative role, Francis also points out (1994:92-93) that
labels can often function metalinguistically, labelling a given stretch of text as
a type of language, thus forcing an appropriate interpretation of that stretch of
text. Francis identifies several types of metalinguistic labels thus:
a. illocutionary nouns
These include such nominalisations of verbal processes as accusation,
allegation, answer, observation, statement, and can be used strategically by a
writer or speaker, as we will see below in Section 3.6. Here is an example from
the Hansard corpus to illustrate:
(7) However, according to the Evening Telegram of June 24, the Minister
of Regional Economic Expansion is reported to have said privately that
Canada would act unilaterally to take control of the continental shelf off
its coast should the current Law of the Sea Conference negotiations not
go mostly our way. In view of that statement, Mr. Speaker, I wonder what
is Canada's position on the continental shelf control issue.
b. language activity nouns

These are nouns referring to a language activity or the results of such an activ-
ity. Examples include account, debate, description, tale, version. Another Han-
sard example will illustrate:
(8) Mr. Speaker, in rising to take part in this debate I would like first of all to
compliment the hon. member for Assiniboia (Mr. Goodale) for rising to
make his maiden speech on such a contentious issue.
(Hansard corpus, H020488)
c. mental process nouns

These nouns refer to cognitive states and processes and the results of these.
Examples would be belief, opinion, thought, hypothesis, theory or view.⁶ An
example from the APHB sample will illustrate:
(9) After all, I reflected, I was like my neighbours; and then I smiled,
comparing myself with other men, comparing my active goodwill
with the lazy cruelty of their neglect. And at the very moment of that
vainglorious thought, a qualm came over me, a horrid nausea and the
most deadly shuddering.
(APHB corpus, B0100683-84)
d. text nouns
This class of nouns refers to the formal structure of the discourse. Their use
signals no interpretation, and merely functions to label the discourse. Such
nouns include phrase, sentence, word, page, excerpt, section. An example follows:
(10) I appreciate the fact that the minister is not able to be here tonight to
answer my question, but he is well represented by a new member in the
person of the hon. member for South Western Nova (Miss Campbell),
who comes into this House as the Parliamentary Secretary to the
Minister of National Health and Welfare and will shortly be making her
maiden speech. For once that phrase seems to apply, does it not?
(Hansard Corpus, H020612-13)
Anaphoric labelling is argued to be an important aspect of argumentative
narrative, due to its ability to objectify and organise complex, abstract
information, and because of its function in evaluating, categorising and
summarising the preceding co-text, perhaps for rhetorical reasons. This becomes
important when we consider genres such as political speeches and some newspaper
language, this last genre mirroring much of the data examined by Francis.
Furthermore, with the Hansard corpus examined below, there is a lot of
scope for encapsulation as a phenomenon which organises and characterises
stretches of discourse, especially where those stretches of discourse may be an
argument put forward by an opposing speaker in Parliament.
2.2 Situation reference
Fraurud (1992a) explored the phenomenon of situation reference, and gives the
following example, reproduced from (4) above:
(11) Mary was fired.
a. That happened last week
b. That is true
c. That surprised me
In (11a), the demonstrative can be said to refer to an event, namely the firing
of Mary. In (11b), the antecedent is a proposition, that the firing of Mary is
true. Finally, in (11c) the antecedent is a fact, the fact that Mary was fired.
Fraurud argues that we can distinguish examples such as (11) from ordinary
cases of direct anaphora by considering the semantic class of the antecedent
(whether the antecedent is an event, a proposition or an object), the
syntactic properties of the anaphor and antecedent (for instance whether the
antecedent is a sentence) and the mode of reference (whether the anaphor
functions deictically or anaphorically).⁷
Fraurud (1992:4) points out that situation reference is not homogeneous,
and cannot be easily distinguished from 'ordinary' coreferential anaphora,
which she terms 'object reference'. Fraurud defines her basic terminology thus:
'Situation' here is used as a cover term for, on the one hand, events, processes,
states and the like, and, on the other, entities like facts and propositions. When
I need to distinguish the two main kinds of situation entities, I will call the
former 'eventualities' [...] and refer to the latter by the analogous term
'factualities' [...]. The term 'situation anaphor' will be used for any
situation-referring expression with an antecedent in the text. The term
'antecedent' is used for any constituent, sentence or stretch of text by means
of which the entity is evoked, or from which the situation referent is derived...
(Fraurud 1992a:4)⁸
Fraurud (1992a) moves on to discuss the entities which represent situations in
a discourse model. Fraurud recalls the work of Reichenbach (1947), who
distinguished between 'thing type' entities, such as houses or human bodies
(described as 'aggregates of matter keeping together for a certain time'
(Reichenbach 1947:267)), and 'event type' entities, which are 'space-time
coincidences and do not endure' (ibid.). Examples of event type entities would
be 'a coronation', 'a fight' or 'a car crash'.
This distinction between 'things' and 'events' has been highly influential,
and informs much subsequent work on discourse models, which assumes that
in order to account for anaphoric reference to situations, a discourse model
must contain entities of some sort which represent situations. This is seen in
the work of Webber (1979, 1987, 1991), Kamp (1981, 1988), Schuster (1986)
and Bauerle (1988a, 1988b) as well as the work of the centering theorists.⁹,¹⁰
Now that we have considered the distinction between events and things, we
will consider the distinction Fraurud makes between events on the one hand,
and facts and propositions (or factualities in her terms) on the other. Many
situation anaphors such as 'it' (and the English demonstratives as we will see
below) refer to events but also to propositions, facts and assertions (and in
more traditional linguistics 'sentences').
Lyons (1977) approached this by means of his distinction between first-order
entities (equating approximately to objects), second-order entities (events),
and third-order entities (facts/propositions). Fraurud uses her own terms for
Lyons' second- and third-order entities, talking of 'second-order entities' as
'eventualities', and 'third order entities' as 'factualities'. Fraurud also points out
that 'eventualities' can be divided into such things as activities, accomplish-
ments, achievements and states (Vendler 1967) or events, processes and states
(Mourelatos 1981).
2.3 Textual/discourse deixis
Lyons (1977) proposed the term 'textual deixis', which he saw as a mid point
between the deictic and the anaphoric function of pronouns.¹¹ Lyons'
definition of deixis, which has remained more or less unquestioned,¹² assumes
that deixis links referring expressions to the 'spatio-temporal co-ordinates of
the act of utterance' (Lyons 1977:637).¹³
Lyons makes a distinction between 'pure textual deixis' and 'impure textual
deixis'. 'Pure textual deixis' describes cases where an anaphor would refer to a
linguistic entity, as in example (12):
(12) A: I've been to Mount Kinabalu.
B: How do you spell that?
Here, the demonstrative anaphor refers to the linguistic form 'Kinabalu' rather
than the referent of the noun phrase.¹⁴ On the other hand, 'impure textual
deixis' is for Lyons closer to some classes of indirect anaphora than 'pure
textual deixis' and can be seen in example (13):
(13) A: I don't know who you're talking about, Inspector!
B: That's a lie!
Here, the demonstrative refers, not to the linguistic form of A's utterance, but to
the proposition expressed by the sentence uttered by A. Lyons is in fact
including under 'impure textual deixis' a wide range of third-order entities
such as facts, propositions etc., which Fraurud terms 'factualities'.¹⁵
Levinson (1983), in a treatment of different kinds of deixis, discusses what
he terms (pure and impure) discourse deixis, which can be contrasted with Lyons'
notion of textual deixis. For Levinson, discourse deixis includes the following:
- Lyons' 'pure textual deixis', as in the Kinabalu example above (12) where
reference is made to the linguistic form of an utterance rather than its
referent¹⁶
- Cases where a demonstrative expression refers metalinguistically to a forth-
coming or preceding discourse segment, as in examples (14) and (15):
(14) I bet you haven't heard this story.¹⁷
(15) That was the best joke I have ever heard
- Other expressions which do not function referentially such as but, there-
fore, in conclusion, to the contrary, still, however, anyway, well, actually, all
in all, so, after all, etc.
These last examples show that Levinson's discourse deixis covers a wider range
of phenomena than is the case with Lyons' textual deixis, or with reference.
However, Levinson, as with Lyons, distinguishes discourse deixis from
anaphora by saying that anaphora is a relationship of coreference, whereas
discourse deixis is a relationship between a referring expression such as a
pronoun, and an actual linguistic expression or piece of discourse.
Despite these attempts to make a distinction between deixis and anaphora,
we will see in the empirical section of this paper that the distinction is often
blurred and difficult to establish clearly. There remain some cases of
anaphoric expressions which, although they seem to function anaphorically,
also in a sense function deictically.
3. Investigating Indirect Anaphora in three corpora
3.1 Introduction
Here, I will examine demonstratives with indirectly-recoverable antecedents
in three corpus samples, with a view to investigating genre-based distribution
patterns with respect to indirect anaphora. After providing general findings
from the three genres, I provide detailed findings concerning labelling,
situation reference and textual deixis in the three corpora. Finally, there will
be a consideration of several problematic and difficult-to-classify cases of IA,
which it is argued test the limits of CBL methodologies.
3.2 Methodology
In this study, I examine all cases of demonstratives with an indirect
relationship with their antecedents, from three corpus samples. The corpora
comprise three 100,000 word segments from the Associated Press corpus of American
newswire texts (AP), the Canadian Hansard proceedings from the Canadian
House of Commons, and the American Printing House for the Blind collection
of literary works and motivational narrative (APHB).
These three corpus samples were chosen for a number of reasons. Firstly,
they represent genres which are inherently different from one another: news
reportage, literature and transcribed spoken discourse. Therefore, it is
envisaged that these genres will provide many interesting features. Secondly, it
would be expected that the way in which anaphors function with relation to
their antecedents would differ in the three corpora. For instance, the Hansard
sample contains a number of exchanges between parliamentarians who
regularly make indirect (and direct) reference to each other's arguments, as well
as their own. Also, as the Hansard represents a spoken genre, it was expected
that there would be a great deal of non-phoric reference (Halliday & Hasan 1976)
and deixis. As is seen below, this proved to be the case.
A third justification for using the three corpus samples is related to the
issue of corpus design and sampling. Biber (1990) found that 1000-word text
samples within a corpus provided a relatively stable count for high-frequency
features in a text, which would presumably include anaphoric pronouns.¹⁸
However, because anaphora often functions over long stretches of text, and
in particular some types of IA involve long distances between anaphor and
antecedent, the use of 1000 word samples as Biber recommends may militate
against making interesting observations about such phenomena as anaphoric
chains, which may occur over stretches of text longer than 1000 words. For
this reason, the three corpus samples are 100,000 words each, and many of
the individual texts that comprise the samples, especially in the AP and APHB
corpora, are much longer than others (but mostly exceed 1000 words).
A final justification for the selection of the data is related to the question
of representativeness. On the face of it, the extent to which the three corpora
from which the samples are drawn are representative of the language in general
and of the three genres in particular, is an open question. As a consequence, the
generalisability of observations that are made from them is restricted. In any
case, as Biber et al. (1998:246) point out, little is known about the nature of
text types and about how many there are.¹⁹
However, Biber argues that one method of reaching some measure of
representativeness in a corpus sample is to include a wide variety of different
texts. In the three corpus samples used in this paper, it has not in every case
been possible to do this, because of the nature of the corpus from which the
samples were taken, especially with the Hansard sample, which is a continuous
record of a parliamentary session and is therefore not easily divided into
separate texts.
However, the AP sample does contain a wide variety of news stories on a
variety of topics, and the APHB corpus sample contains a range of narrative
texts which deal with different subject matter and include novels, short stories
and biographies. This textual variety does, I argue, follow Biber's method of
assuring the representativeness of samples taken from the original corpora.
All demonstrative pronouns in the three corpus samples were annotated
using the demonstrative feature scheme outlined in Botley and McEnery
(2001:8-10). This scheme is briefly described in Table 1, but see Appendix 1 for
some textual examples.
Table 1. The Demonstrative Feature Scheme of Botley and McEnery (2001:8-10)
Feature                       Value 1                   Value 2                     Value 3                                   Value 4             Value 5
Recoverability of Antecedent  D (directly recoverable)  I (indirectly recoverable)  N (non-recoverable)                       0 (non-applicable)  None
Direction of Reference        A (anaphoric)             C (cataphoric)              0 (not applicable: exophoric or deictic)  None                None
Phoric Type                   R (referential)           S (substitutional)          0 (not applicable/non-phoric)             None                None
Syntactic Function            M (noun modifier)         H (noun head)               0 (not applicable)                        None                None
Antecedent Type               N (nominal antecedent)    P (propositional/factual)   C (clausal)                               J (adjectival)      0 (no antecedent)
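The feature scheme in Table 1 lends itself to a simple one-record-per-demonstrative representation. The following Python sketch is illustrative only: the Annotation class, its field names and the three sample records are hypothetical and are not taken from Botley and McEnery (2001); only the single-character feature values follow Table 1. It shows one way the cases tagged with recoverability value I might be selected.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotated demonstrative; field names are illustrative assumptions."""
    form: str            # "this", "that", "these" or "those"
    recoverability: str  # D, I, N or 0
    direction: str       # A, C or 0
    phoric_type: str     # R, S or 0
    syntactic: str       # M, H or 0
    antecedent: str      # N, P, C, J or 0


def indirectly_recoverable(annotations):
    """Keep only the cases whose antecedent is indirectly recoverable (value I)."""
    return [a for a in annotations if a.recoverability == "I"]


# Invented sample records, for illustration only (not corpus data).
sample = [
    Annotation("this", "I", "A", "0", "M", "0"),
    Annotation("that", "D", "A", "R", "H", "N"),
    Annotation("these", "I", "C", "0", "M", "0"),
]

selected = indirectly_recoverable(sample)
print([a.form for a in selected])
```

A frozen dataclass is used here so that an annotation cannot be altered by accident once it has been recorded; any other record type would serve equally well.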
Using the above annotation scheme, this study considered only those cases
which had been identified as being Indirectly Recoverable, that is, those cases
that were tagged with the value I in Table 1 (also, see Appendix 1 for examples).
Once all cases had been identified in the corpus samples, frequency statistics
were obtained with the aid of concordances generated using the WordSmith
concordance tool (Scott 1996). The frequency data from the different corpora
were then subjected to a test of statistical significance, namely the
log-likelihood (LL) test.
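The paper does not spell out the formula behind its log-likelihood scores. The sketch below assumes the widely used corpus-comparison form of the statistic, LL = 2 Σ O_i ln(O_i / E_i), with expected frequencies E_i proportional to corpus size. The function name log_likelihood and its parameters are illustrative, not the author's. Assuming three equal samples of 100,000 words, this form gives approximately 148.77 for the frequencies of this reported in Table 2 (29, 129, 204), which agrees with the score printed there. The 5.99 cut-off is the critical value the paper cites for 2 degrees of freedom at the 95% level.

```python
import math


def log_likelihood(observed, corpus_sizes):
    """Log-likelihood (G2) for one feature's frequencies across several corpora.

    Assumed form: LL = 2 * sum(O_i * ln(O_i / E_i)), where
    E_i = N_i * sum(O) / sum(N). Zero counts contribute nothing to the sum.
    """
    total_observed = sum(observed)
    total_size = sum(corpus_sizes)
    ll = 0.0
    for o, n in zip(observed, corpus_sizes):
        expected = n * total_observed / total_size
        if o > 0:
            ll += o * math.log(o / expected)
    return 2 * ll


SIZES = [100_000, 100_000, 100_000]  # AP, APHB and Hansard samples
CRITICAL_2DF_95 = 5.99               # cut-off cited in the paper (2 d.f., 95%)

score_this = log_likelihood([29, 129, 204], SIZES)
score_those = log_likelihood([5, 2, 4], SIZES)
print(round(score_this, 2), score_this > CRITICAL_2DF_95)
print(round(score_those, 2), score_those > CRITICAL_2DF_95)
```

Under the same assumptions the those row of Table 2 (5, 2, 4) comes out at about 1.37, below the cut-off, again in line with the reported figure.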
3.3 Overall distributions of indirectly recoverable demonstratives
The following tables give the frequencies for all demonstratives identified as
indirectly recoverable across all three genres. Indirectly recoverable
demonstratives were so identified using the demonstrative annotation scheme
outlined in Botley and McEnery (2001:8-10) as well as Appendix 1 below.
The first table gives the overall distribution of demonstratives in the three
corpora, along with percentage figures. The figures in columns 3, 5 and 7 in
Table 2 are percentages of the total number of cases in each corpus, whereas
the figures in column 9 are percentages of the total number of cases in all three
corpora. In Table 3, distribution frequencies are given by demonstrative
feature values, e.g. DA = Direction, Anaphoric, SH = Syntactic Function, Head
etc. (Botley & McEnery 2001:8-10).
In this paper, descriptive statistics are enhanced by a significance test, the
log-likelihood measure (LL), which shows the statistical significance of
differences in the distribution of IA cases between the three genres being
considered.²⁰ Furthermore, all features whose distribution profiles are
statistically significant are given in bold type, and the cut-off point is marked
as an empty grey row, below which all figures are not significant at the
confidence level given.
Table 2. Overall distributions of indirectly recoverable cases in three genres, with
significance scores
Form   AP  %       APHB  %       Hansard  %       TOT.  %       LL
this   29  40.27   129   53.97   204      60.53   362   55.86   148.77
that   33  45.83   73    30.54   101      29.97   207   31.94   36.51
these  5   6.94    35    14.64   28       8.30    68    10.49   27.13
----------------------------------------------------------------------
those  5   6.94    2     0.83    4        1.18    11    1.69    1.37
TOT.   72  100     239   100     337      100     648   100
In Table 3, as well as Tables 4-7, two-character abbreviations are used to show
configurations of features and values. The first character gives the
demonstrative feature displayed by a given demonstrative and the second character
gives the feature's value. For instance, DA means Direction of Reference
Anaphoric, AO = Antecedent type zero, SM = Syntactic Function Modifier, etc.
Please refer to the legend below each table for full details.
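The two-character convention can be made concrete with a small lookup. The following Python sketch is illustrative: the mappings are transcribed from the legend to Table 3, while the function name decode and the dictionary layout are assumptions introduced here. It splits a row label such as this_SM into the demonstrative form, the feature and the value.

```python
# First character of the abbreviation: the feature.
FEATURES = {
    "A": "Antecedent Type",
    "D": "Direction",
    "P": "Phoric Type",
    "S": "Syntactic Function",
}

# Second character: the value, interpreted relative to the feature.
VALUES = {
    "A": {"O": "zero", "C": "Clausal", "P": "Propositional", "N": "Nominal"},
    "D": {"A": "Anaphoric", "C": "Cataphoric", "O": "zero"},
    "P": {"O": "zero", "S": "Substitution", "R": "Referential"},
    "S": {"M": "Modifier", "H": "Head", "O": "zero"},
}


def decode(label):
    """Split a row label such as 'this_SM' into (form, feature, value)."""
    form, code = label.split("_")
    feature, value = code[0], code[1]
    return form, FEATURES[feature], VALUES[feature][value]


print(decode("this_SM"))
print(decode("that_AO"))
```

Keeping the value table nested under the feature reflects the fact that the same letter can mean different things for different features: A is a feature in AO but a value in DA.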
Table 3. Distribution of Indirectly Recoverable Demonstratives in Three Genres
(3 x 100k Words), by Feature
Feature    AP   APHB  Hansard  LL
this_AO    26   129   199      152.34
this_DA    24   122   187      144.62
this_PO    29   119   196      139.24
this_SM    12   75    130      114.74
this_SH    16   54    74       41.63
that_DA    33   71    101      36.32
that_AO    33   71    101      36.32
that_PO    32   67    94       32.04
these_DA   4    34    26       28.59
that_SH    14   42    54       26.26
these_AO   5    32    28       24.65
these_SM   5    33    24       23.87
these_PO   5    32    26       23.71
this_PR    0    10    6        13.99
that_SM    19   31    47       13.47
this_DC    5    7     16       6.98
these_AC   0    3     0        6.59
-------------------------------------
that_PR    1    6     7        5.61
these_SH   0    2     4        5.55
this_PS    0    0     2        4.39
this_AP    0    0     2        4.39
that_AP    0    2     0        4.39
these_DO   0    0     2        4.39
these_PR   0    3     2        4.26
this_AN    2    0     1        2.77
this_AC    1    0     2        2.77
this_DO    0    0     1        2.20
this_SO    1    0     0        2.20
that_DC    0    1     0        2.20
that_DO    0    1     0        2.20
these_DC   1    1     0        1.62
those_SH   1    0     1        1.62
those_DA   5    2     4        1.37
those_PO   5    2     4        1.37
those_AO   5    2     4        1.37
those_SM   4    2     3        0.68
Legend: AO=Antecedent Type, zero; AC=Antecedent Type, Clausal; AP=Antecedent Type,
Propositional; AN=Antecedent Type, Nominal; DA=Direction, Anaphoric; DC=Direction,
Cataphoric; DO=Direction, zero; PO=Phoric Type, zero; PS=Phoric Type, Substitution;
PR=Phoric Type, Referential; SM=Syntactic Function, Modifier; SH=Syntactic Function,
Head; SO=Syntactic Function, zero.
(Refer to Table 1 above for further details).
In Table 2, as with all tables in this paper, log-likelihood scores are
significant at a 95% confidence level and with 2 degrees of freedom²¹ if they are
greater than 5.99. As we can see from the above tables, there are many
distributions which are statistically significant at this level, some of them
highly so, suggesting some genre-based differences.
For instance, in Table 3, the distribution figures for proximal this differ
highly significantly across the genres (>100) where the antecedent type is zero
(hardly surprising given the indirectly recoverable antecedents), where the
direction of reference is anaphoric, where the phoric type is zero (again following
from the indirectly recoverable antecedent where phoric type is not an issue),
and finally where the syntactic function of the anaphor is modifier (meaning
that the demonstrative pre-modifies a noun, as in 'this book').
These observations suggest that there are some differences between the
three genres with respect to antecedent type, syntactic function, phoric type
and direction of reference. As well as this, we can make some further general
observations, as follows:
1. In all three corpora, IA cases tend to be characterised by anaphoric
demonstratives functioning as modifier, with cases functioning as head being
the next most frequent pattern; this is borne out by the log-likelihood
scores discussed above. Interestingly, we can see that the log-likelihood
score for this functioning as head is much lower than that for this as
modifier, though it is the next most frequent.
2. Cases of cataphora are much rarer than cases of anaphora, in all three
corpora. Indeed, as we see from Table 3, the differences with relation to
cataphoric feature frequencies are hardly statistically significant for this and
not significant at all for the other demonstratives. This is perhaps not
surprising, considering that anaphoric cases generally outnumber cataphoric
cases in this corpus.
3. There is a much higher proportion of IA cases in the Hansard data than in
the other two corpora, especially functioning as modifier and head. This is
a fact which was discussed in detail in Botley and McEnery (2001:16-27)
as being characteristic of that particular corpus, and possibly of the genre
represented in that corpus.
4. Generally, proximal demonstratives have indirectly recoverable
antecedents more frequently than distal demonstratives, despite a slight reversal
of this pattern in the AP Corpus sample. This is borne out by the high
log-likelihood scores for this in Table 3.
5. Generally, singular demonstratives have indirectly recoverable antecedents
more frequently than plural ones. We can see that singular demonstratives
occupy the 8 highest log-likelihood scores (log-likelihood of 32.04 or
above), suggesting genre-based differences with regard to singular
demonstratives.
These are general observations concerning the distribution of indirectly
recoverable cases in the three genres. Despite the relatively small data samples, we
can observe some genre-based differences which pertain largely to (lexico-)
grammatical features of the language. These features may reveal something
about the surface features of discourse in three different genres, which are
relatively straightforward to delineate, but to look at the limits of discourse
anaphora involving indirect referents, we must examine further the functions
performed by anaphoric demonstratives in these three corpora.
To this end, I will now examine the three sub-types of IA in the data,
namely labelling, situation reference and textual/discourse deixis, with a view
to making some observations about genre-based distribution differences, and
to probing the limits of these proposed categories.
For all three corpus samples, each indirectly recoverable demonstrative
was examined according to its function and assigned to various categories.
Frequency statistics were obtained for these categories, which are discussed in
the following sections.
3.4 Labelling
The starting point was to investigate the labelling, or encapsulative function
of demonstrative anaphors with indirectly recoverable antecedents, across the
three genres analysed. This analysis follows Francis' (1989, 1994) distinction
between advance labels (ALs) and retrospective labels (RLs).
The frequencies of RLs and ALs were obtained from each of the three
corpus samples. The following tables show these frequencies in each corpus,
arranged according to demonstrative features.
Tables 4-6 reveal the following general patterns:
1. In all three corpora, RLs vastly outnumber ALs, as would be predicted by
the overall preponderance of anaphoric cases in the data.
2. Across the three samples, modifier demonstratives tend to function as RLs
more frequently than head demonstratives. This aspect will be discussed in
more detail below.
3. In the Hansard and APHB data, modifier and head demonstratives togeth-
er comprise the largest set of RL cases.
4. Generally, proximal demonstratives ('this', 'these') tend to function as RLs
and ALs more frequently than is the case with distal demonstratives ('that',
'those'). This reflects the observation that most of Francis' examples involve
proximal demonstratives, although she does not provide quantitative data
on demonstrative proximity in her work.
Table 4. Retrospective and Advance Labels by Demonstrative Feature, AP Corpus,
100k Words
Feature  this      that      these     those
         RL   AL   RL   AL   RL   AL   RL   AL   TOTALS
PO       5    0    5    0    3    1    1    0    15
AO       5    2    5    0    3    1    1    0    17
DA       5    0    5    0    3    0    1    0    14
SM       4    0    4    0    3    1    1    0    13
DC       0    5    0    0    0    1    0    0    6
SH       1    3    1    0    0    0    1    0    6
SO       0    1    0    0    0    0    0    0    1
AN       0    1    0    0    0    0    0    0    1
AC       0    1    0    0    0    0    0    0    1
(Refer to Table 1 above for further details).
Table 5. Retrospective and Advance Labels by Demonstrative Feature, APHB Corpus,
100k Words
Feature  this      that      these     those
         RL   AL   RL   AL   RL   AL   RL   AL   TOTALS
AO       58   7    10   0    20   1    1    0    97
PO       53   7    10   1    20   1    1    0    93
DA       58   0    10   0    23   0    1    0    92
SM       49   3    9    1    23   0    1    0    86
SH       9    4    1    0    0    1    0    0    15
DC       0    7    0    1    0    1    0    0    9
PR       5    0    0    0    3    0    0    0    8
AC       0    0    0    0    3    0    0    0    3
AP       0    0    0    1    0    0    0    0    1
Table 6. Retrospective and Advance Labels by Demonstrative Feature, Hansard
Corpus, 100k Words
Feature  this      that      these     those
         RL   AL   RL   AL   RL   AL   RL   AL   TOTALS
AO       124  15   45   0    19   0    1    0    204
DA       125  0    45   0    19   0    1    0    190
PO       122  18   0    0    18   0    1    0    159
SM       106  3    25   0    17   0    0    0    151
SH       19   15   20   0    2    0    1    0    57
DC       0    16   0    0    0    0    0    0    16
PR       3    0    5    0    1    0    0    0    9
AP       0    2    0    0    0    0    0    0    2
AC       1    0    0    0    0    0    0    0    1
However, as we see from Table 7 below, the Hansard sample displays a significant
number of proximal RLs compared to the other corpus samples. Table 7
gives the distributions and significance scores for RL cases across the three
genres. Because of their low frequency in the data, ALs are not considered. In
the rest of this paper, the emphasis will be on RLs, and the subtypes identified
by Francis.
From Table 7, we can see that the highest log-likelihood scores (>118) are
associated with singular proximal demonstratives with the features anaphoric,
zero-antecedent, modifier and zero phoric type, suggesting some genre-based
differences with respect to these demonstratives and features. This result main-
ly reflects the general pattern in Table 3 above.
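The log-likelihood scores reported in Table 7 can be checked with a short calculation. The sketch below assumes that the scores are the standard log-likelihood (G2) statistic commonly used for corpus comparison, LL = 2 * sum(O * ln(O / E)), with expected frequencies E derived from three equal-sized samples of 100,000 words each. The function name and the equal-size default are illustrative assumptions and are not taken from the original analysis.

```python
import math

def log_likelihood(observed, corpus_sizes=None):
    """Log-likelihood (G2) for one feature's frequencies across several corpora."""
    if corpus_sizes is None:
        # Assumption: every sample contains 100,000 words, as stated for this study.
        corpus_sizes = [100_000] * len(observed)
    total_obs = sum(observed)
    total_size = sum(corpus_sizes)
    ll = 0.0
    for obs, size in zip(observed, corpus_sizes):
        # Expected frequency if the feature were spread evenly by corpus size.
        expected = size * total_obs / total_size
        if obs > 0:  # zero counts contribute nothing to the sum
            ll += obs * math.log(obs / expected)
    return 2 * ll

# (AP, APHB, Hansard) counts copied from three rows of Table 7.
rows = {"this_DA": (5, 58, 125), "that_PR": (0, 0, 5), "those_DA": (1, 1, 1)}
for label, counts in rows.items():
    print(label, round(log_likelihood(counts), 2))
```

Under these assumptions the three rows come out at 138.36, 10.99 and 0.0, which agrees with the LL column of Table 7 and suggests that the published scores were computed over all three samples at once.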
3.5 Head demonstratives as labels
As can be seen from the above tables, I have included head demonstratives
in the figures for labels. This inclusion of head demonstratives in the label
class reflects the observation by Francis (1994:97) that the label sometimes
functions as the predicative complement of the demonstrative anaphor. Also,
from the literature outlined above, it would appear that many cases of antec-
edentless head demonstratives will function in terms of situation reference or
text deixis.
Table 7. Distributions of RL Cases in Three Genres, with Significance Scores

Feature AP APHB Hansard LL
this_DA 5 58 125 138.36
this_AO 5 58 124 136.98
this_PO 5 53 122 135.16
this_SM 4 49 106 118.59
that_DA 5 10 45 45.26
that_AO 5 10 45 45.26
that_SH 1 1 20 32.16
this_SH 1 9 19 19.86
these_DA 3 23 19 18.99
that_SM 4 9 25 18.62
these_SM 3 23 17 18.17
these_AO 3 20 19 16.63
these_PO 3 20 18 16.05
that_PO 5 10 0 13.86
that_PR 0 0 5 10.99
this_PR 0 5 3 6.99
these_AC 0 3 0 6.59
these_SH 0 0 2 4.39
these_PR 0 3 1 4.29
this_AC 0 0 1 2.20
those_SM 1 1 0 1.62
those_SH 1 0 1 1.62
those_DA 1 1 1 0.00
those_PO 1 1 1 0.00
those_AO 1 1 1 0.00
However, I argue that there seems to be some overlap between the situa-
tion-referential function and the labelling function with head demonstratives
in certain syntactic environments. For instance, the data revealed a number
of cases where a demonstrative head is in a copular relationship with a lexical
noun or noun phrase which itself labels or encapsulates a preceding stretch of
discourse, as in examples (16) and (17), with the relevant information given in
bold type:
(16) Given the support for consumer representation on CEMA by the
Consumers Association of Canada, the Food Prices Review Board,
the Forbes report and at least one of his cabinet colleagues, can the
Prime Minister say what steps are to be taken or are proposed to ensure
consumer representation on consumer marketing boards and will the
Consumer's Association of Canada be consulted with regard to any such
appointments? Mr. Speaker, this is a very sound principle and in general
terms it has the government's support.
(17) Sen. Edward M. Kennedy stepped off his campaign bus and out into the
street in Rochester, N.H., to be greeted by a swarm of youngsters and a
crowd of grownups hoping for a handshake and an autograph. Was this a
"spontaneous outpouring of public affection," as a candidate's aides like
to insist, usually trying to suppress a smile?
(AP corpus, A029:88-89)
In example (16), something in the previous discourse is being simultaneously
labelled as a 'principle' and characterised as 'very sound'. One could easily
paraphrase this as 'this very sound principle' with the same effect. Example (17)
proposes a characterisation of the situation described previously, but questions
this characterisation. The head noun phrase in cases like these functions
exactly like a retrospective label, with the same interpersonal functions posited
by Francis, such as encoding the speaker's or writer's attitude to the referent
being labelled.
It appears from the above that the category of label put forward by Francis
is not as clear-cut as it would seem to be. Therefore, the figures for ALs and RLs
in the above tables are for clear cases of labels only, which fulfil all of the
characteristics which Francis identifies in her data, as well as the additional aspect
I have identified above.^^
Although the data includes many clear-cut cases of labelling and encapsu-
lation, there were several cases which seem to function in a label-like manner,
encapsulating a stretch of previous discourse, but which do not fulfil all of the
labelling criteria. These cases were not included in the above figures, but are
worthy of separate consideration here. One example of this type of case was
identified in the AP Corpus sample:
(18) Morris has been writing for 45 of his 70 years. He has 29 books — the
latest is the novel "Plains Song" — to his credit. "Can it be!" the white-
haired writer wonders aloud. "Just think, 29 books! Maybe that's my
problem."
(AP corpus, A003:92-96)
In example (18), the demonstrative functions similarly to a label, but it is
unclear what is being labelled. It would seem that the writer is referring to the fact
that he has 29 books to his credit, what Fraurud (1992a) would refer to as a
'factuality'. However, there seems to be an overlap in this example between the
labelling and encapsulative functions explored by Francis and others.
In example (19), from the APHB corpus, the antecedent is even more dif-
ficult to identify clearly:
(19) "O, I know it's not evidence, Mr. Utterson; I'm book-learned enough for
that; but a man has his feelings; and I give you my bible-word it was Mr.
Hyde!" "Ay, ay," said the lawyer. "My fears incline to the same point. Evil,
I fear, founded — evil was sure to come — of that connection. Ay, truly,
I believe you; I believe poor Harry is killed; and I believe his murderer
(for what purpose, God alone can tell) is still lurking in his victim's
room. Well, let our name be vengeance. Call Bradshaw." The footman
came at the summons, very white and nervous. "Pull yourself together,
Bradshaw," said the lawyer. "This suspense, I know, is telling upon all of
you; but it is now our intention to make an end of it."
(APHB corpus, B0100793-B0100802)
In this extract from 'Dr Jekyll and Mr Hyde', the protagonists are engaged in a
hunt for the eponymous Doctor's dark nemesis. The lawyer refers to the sus-
pense felt by all in the situation, without that suspense overtly being mentioned
in the text. The antecedent is instead inferable from the actions and feelings of
the characters. One might ask whether this kind of case should be analysed as
'situation reference' but in the absence of a clear definition, this case remains
somewhat problematic.
3.6 Metalinguistic labels and general labels
Labels can be broadly subdivided into general labels (although this is not a term
Francis uses explicitly), and metalinguistic labels. General labels function to
label or package a stretch of discourse, characterising it or evaluating it in some
manner. Metalinguistic labels function similarly to Lyons' 'pure textual deixis'
in that they label a stretch of discourse as a linguistic entity.
In what follows, I explore the metalinguistic function performed by some
labels in the three genres under consideration, with examples and statistics.
To re-iterate, Francis (1989, 1994) identified four subtypes of metalinguistic
label:
- illocutionary nouns
- language activity nouns
- mental process nouns
- text nouns
Although the number of cases of metalinguistic label is small, we can see from
Table 8 below that for all three text genres studied, it is largely proximal
demonstratives labelling language activity and illocutionary entities which are the
most frequent, especially in the Hansard data and the APHB data. These have
the highest log-likelihood scores (16.64 and 12.78 respectively). However,
despite this pattern, the distribution of 'that' labelling an illocutionary entity is
also relatively significant (log-likelihood = 10.99).
Table 8. Distribution of RLs Referring Metalinguistically in Three Genres, ranked by LL Score.

Label                     AP   APHB   Hansard      LL
this_language act          0      6        12   16.64
this_illocutionary         1      1        10   12.78
that_illocutionary         0      0         5   10.99
this_mental process        1      8         5    6.23
these_illocutionary        0      1         4    5.98
that_language act          0      0         2    4.39
that_mental process        0      3         1    4.29
these_text noun            0      1         3    4.29
this_text noun             0      1         0    2.20
those_language act         1      0         0    2.20
these_mental process       1      1         0    1.62
TOTALS                     4     22        42
We can see that metalinguistic labels have both low absolute frequencies and
relatively low log-likelihood scores (compared to general labels). However, de-
spite this, we may hypothesise that the general preponderance of metalinguistic
cases in the APHB and Hansard samples stems from the possibility that Han-
sard and APHB both represent narrative genres, with a highly structured dis-
course and (in the case of Hansard especially) with a strong rhetorical aspect.
Therefore, it is not surprising that RLs in general, and metalinguistic labels in
particular, predominate in these genres.
Let us compare the figures for metalinguistic labels in Table 8 with those
for general labels in Table 9.
Table 9. General Labels in Three Genres (RLs Only)

AP APHB Hansard TOTAL
this 3 42 98 143
that 5 7 37 49
these 2 20 12 34
those 0 1 1 2
TOTAL 10 70 148 228
We see from Table 9 that general labels greatly outnumber metalinguistic la-
bels in the three text genres considered in this study (228 against 68). This can
be contrasted with the findings made by Petch-Tyson (2000), who found that
native speaker writers of narrative essays tended to make more use of metalin-
guistic labels than general labels, when compared to non-native writers with
different L1 (first language) backgrounds. The reasons for this would require a
great deal of further work to be carried out, but may provide further evidence
of genre-specific differences in the use of retrospective labels and metalinguis-
tic labels.
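The comparison between Tables 8 and 9 can be made concrete by setting the two sets of column totals side by side. The following sketch simply recomputes the overall totals and the share of metalinguistic RLs in each sample from the published counts. The variable and function names are illustrative, and the per-genre shares are a derived view that is not reported in the tables themselves.

```python
# Column totals of RLs per genre, copied from Table 8 (metalinguistic)
# and Table 9 (general).
metalinguistic = {"AP": 4, "APHB": 22, "Hansard": 42}
general = {"AP": 10, "APHB": 70, "Hansard": 148}

def metalinguistic_share(genre):
    """Percentage of a genre's RLs (general + metalinguistic) that are metalinguistic."""
    meta = metalinguistic[genre]
    return 100 * meta / (meta + general[genre])

total_meta = sum(metalinguistic.values())
total_general = sum(general.values())
print(f"general: {total_general}, metalinguistic: {total_meta}")
for genre in metalinguistic:
    print(f"{genre}: {metalinguistic_share(genre):.1f}% metalinguistic")
```

The totals reproduce the 228 against 68 quoted above. Because the absolute counts are small, particularly in the AP sample, the per-genre shares should be read as indicative only.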
Furthermore, it can be suggested that the high preponderance of metalin-
guistic labels in the Hansard sample reflects the fact that the Hansard sample
is composed of parliamentary debates with a heavy reliance upon rhetorical
devices. This may militate in favour of the use of labels which mark a speaker's
attitude to an utterance or a question from another speaker.
The above observations suggest that there is some genre-based difference
with respect to different types of label, although there is a small amount of
fuzziness in the categorisation of labels in the data analysed. Let us now look at
the other types of IA in this study — situation reference and textual deixis.
3.7 Situation reference and textual deixis
In the following table, I give frequency distributions for demonstratives which
appeared to be referring to situations, or which functioned as textual deictics.
Here, I concentrate on anaphoric cases only.
Here, I have adhered to Fraurud's broad classification which divides situa-
tion reference into reference to 'factualities' (propositions, statements, etc), and
'eventualities' (events, processes, etc). I have included 'textual deictics' for those
cases of 'pure textual deixis' which are not metalinguistic labels. I will discuss
each of these categories, with examples of each type for clarification.
Table 10. Situation-Referring and Textually-Deictic Anaphoric Demonstratives in Three Genres, with Significance Scores.

Demonstrative/situation type AP APHB Hansard LL
that_eventuality 11 0 12 18.69
this_eventuality 5 27 10 18.44
this_factuality 0 1 9 15.47
this_text deictic 1 0 7 11.55
that_factuality 1 0 5 7.78
that_text deictic 0 0 3 6.59
3.7.1 Eventualities
These make up the greatest proportion of situation-referring cases in the three
corpora, particularly in the Hansard and APHB samples. Also, as can be seen
from Table 10, the distribution figures for eventualities differ most significantly
across the three genres, although the statistical significance does not appear
great, due to the low absolute frequencies involved.
Here is an example from the Hansard corpus:
(20) No wonder Mr. C. W. Lewis, chairman of the bargaining committee of
Local 60, asked Mr. Trudeau on August 22 that the basic wage of grain
inspectors be "at least parity with all Vancouver grain workers." This
justifies the stand that the Perry report is just a springboard for increased
inflation across the country.
In example (20), the demonstrative is pointing to the situation being encoded
in the previous sentence. There is no specific linguistic antecedent apart from
the situation, the events being described. It is worth noting that almost all cases
of situation reference in the data involved head demonstratives.
Another noteworthy pattern, especially prevalent in the APHB corpus
sample, is a large number of cases of event-referring demonstratives which
seem to function in particular lexical idioms which in turn function as dis-
course markers, as we see in examples (21) and (22):
(21) These fleets came back empty-handed, because wherever they drew near
the fertile shore with its fine white buildings, regiments of the Shah's
army rode down to meet them. At this Stenka Razin and Filka put their
heads together to think what to do.
(APHB corpus, B0201382-83)
(22) When he faced them all Stenka Razin explained that henceforth they
must till the farmlands, and carry on trade up and down the river. They
must repair the damage to Astrakhan, and guard themselves from attack.
No longer would the Muscovites give them orders. In this way they
would rule themselves, Cossack fashion.
3.7.2 Factualities
References to factualities occur to a lesser extent than reference to 'eventuali-
ties', primarily in the Hansard sample, and with chiefly proximal demonstra-
tives, as in example (23):
(23) I am interested in having good food on the table, that is, in good train
service for transporting people and enough cars to carry our goods, so
that we can export manufactured products and create good service to
our ports. This holds true particularly for Vancouver and some parts of
eastern Canada.
Here, the demonstrative refers to the argument expressed indirectly in the pre-
vious paragraph. Another example follows:
(24) We Social Crediters have a solution that several still find funny, and
the majority of Canadians still do not accept it. We saw that at the last
election.
In example (24), the proposition 'that the majority of Canadians do not accept
it' is the antecedent. We see here that there is a certain amount of potential
overlap between the directly recoverable cases with propositional antecedents
and the cases of situation reference with 'factuality' antecedents. I argue that
the propositional antecedent cases are those with clear surface markers such as
'that-clauses' whereas the cases I point to above tend to be non-surface propo-
sitions or arguments encoded in a sentence and referred to as facts.
3.7.3 Textual deictics

This class of IA covers cases of 'pure textual deixis' where an utterance or lin-
guistic entity itself is being referred to. These are the least frequent cases in the
three samples, and also have low significance scores. Two examples are given
as (25) and (26):
(25) Mr. Speaker, I rise on a point of order. With respect, yesterday I wished
to ask a supplementary question but was overlooked, which does not
seem possible. With all due respect, and I will sit down after I have said
this, members on the front benches have a responsibility.
Here, the speaker is referring to his own utterance metalinguistically, yet there
is no labelling function, as the demonstrative is a head rather than a modifier.
In example (26), the reference is to the previous utterance, but there is an
element of event reference, providing evidence of overlap between the text de-
ictic and event-referential functions:
(26) Holding his hat in his tense hands as he had seen the others do, the
young Stenka explained. "Because, Father, their standard poles no longer
showed against the sky — because the sound of their wagons went away
toward the sea." At this the Ataman lifted his gray head with pride.
The observations made above suggest some genre-based differences in the dis-
tribution of IA cases referring to situations and functioning as textual deic-
tics, despite the relatively limited amount of data studied. These observations
also start to reveal more strikingly how problematic and fuzzy the categories of
situation reference, and textual deixis can be. Let us complete this analysis by
considering a few examples taken from the three corpora which provide some
serious challenges.
4. Some outlying cases
So far in this paper, I have considered only 71% (462 cases) of all indirectly re-
coverable demonstratives in the three corpora studied. Although we can place
these cases into fairly clearly defined categories, at this point, we are faced with
the remaining 29% (186 cases in all) which have proved difficult to classify eas-
ily as labels, situation reference or text deictics.
A subset of these cases (numbering some 100 cases in all, 15.43% of all indi-
rectly recoverable cases) do fall into small classes based on a number of semantic
or syntactic criteria, but the numbers involved are too small to propose any mean-
ingful cross-genre patterns at this stage. The classes can be given as follows:
Temporal reference
Idiomatic cases/discourse markers/genre-specific examples
Quantifiers
'Class membership' references
I will discuss each of these in turn, with examples.
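The proportions quoted in this section follow directly from the case counts, as the short check below illustrates. It uses only the figures given in the text (462 classified cases, 186 outlying cases and the subset of some 100 cases). The variable names and the helper function are illustrative.

```python
classified = 462     # labels, situation reference and text deictics
unclassified = 186   # cases that could not be classified easily
small_classes = 100  # the subset of "some 100 cases" falling into small classes

total = classified + unclassified  # all indirectly recoverable demonstratives

def pct(n, whole=total):
    """Share of `whole` represented by `n`, as a percentage to two decimals."""
    return round(100 * n / whole, 2)

print(total)
print(pct(classified), pct(unclassified), pct(small_classes))
print(unclassified - small_classes)  # cases left over after the small classes
```

On these figures the total is 648 cases, the two main groups account for roughly 71% and 29%, the subset of 100 amounts to 15.43%, and the remainder corresponds to the group of some 86 cases taken up in Section 5.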
4.1 Temporal reference
There are 12 cases where reference is made to a stretch of time or a temporal
situation which were not included in the 'situation reference' class above. Here
is one example:
(27) "I watched the shooting from the roof of my house," said Rauf, one of
four Afghans who led this reporter into rebel territory. "I also saw a
number of men slip away from the jirga and escape down the river bank."
By this time, 400-500 government troops were going through the town.
(AP corpus, A031:89-91)
Example (27) appears to be referring to the situation described previously,
which may suggest that what is going on here is a class of situation reference.
This points to a need to further examine the situation-reference category to
include the temporal aspects of situations.
4.2 Idiomatic cases/discourse markers/genre-specific examples
There are some cases where demonstratives are used in a way which does
not appear to be situation-referential, nor encapsulative. This is a loose class,
which, like the temporal examples above, requires further research on a larger
corpus in order to produce firmer conclusions. Some examples appear to be
genre-specific, as we see from the following example (28) from the Hansard
corpus sample:
(28) I solemnly believe that if we do not or cannot make that right a reality
during the life of this parliament with its commitment to bilingualism,
with the very emphatic words of the Leader of the Opposition in his
speech an hour ago, with the strong French-speaking representation in
government, then it will never be done and separatism will have proved
its point and Canadian unity will cease to have meaning for the majority
of Quebeckers. In that respect, we should be happy about the progress
made during these last five years.
Idioms like 'in that respect' (as well as 'in that regard') would be, according to
Francis, treated as neutral retrospective labels. However, I would argue that it
is difficult to see their encapsulative function, and instead they appear to be
'inherently unspecific' (Winter 1982) idiomatic expressions, which do not have
a specific antecedent, yet in a sense are linked to the preceding context.^^ An-
other noteworthy example comes from the Hansard corpus:
(29) He said he was having a legal opinion researched on the proper course
for censure in anticipation that a formal move would be made in that
direction.
(Hansard corpus, H0205276)
In example (29) again there is an unspecific reference but there is arguably a
surface antecedent here in the form of 'censure'. I would tentatively suggest that
labels of this kind are characteristic of the genre of parliamentary reporting,
with its preponderance of formal official language. Further work will need to
be done in order to confirm this hypothesis.
Another aspect of the loose idiomatic class is that some cases appear to
function as discourse markers. Here is example (30) to illustrate this:
(30) Meanwhile the Volga men put lighted fuses in the powder kegs and
hurled them up over the rail among the Persian soldiers, who were
shattered by the explosions. Then, throwing the torches before them, the
Cossacks swarmed up the sides. Their shout echoed over the still water
— "Sarin na kitchkou!" In this way, by attacking a few vessels at a time,
the Cossacks became masters of the enemy fleet except for some smaller
craft that rowed away in panic.
(APHB Corpus)
This example seems to function to link one discourse unit to the next, and does
not refer to any particular segment of the previous discourse. Its discourse-
linking function may be more important than any discourse-referring function
it may perform.
4.3 Cases functioning as quantifiers
In a small number of cases, the referential function is open to question, because
the demonstrative expression quantifies something mentioned in the previous
discourse. One
clear example from the AP corpus will illustrate this:
(31) In winning the honor for the fourth time this season, Sampson became
only the third player to earn rookie honors that many times in a single
season since the award began in 1970.
(AP corpus, A037:33)
In example (31), there is clearly some kind of anaphoric relationship, but it is
not clear what is being quantified. The antecedent has to be inferred from 'the
fourth time'.
4.4 'Class membership' references
There are a number of cases where a demonstrative refers to a stretch of
discourse as a 'kind of X' or a 'type of X'. Example (32) below appears to function
similarly to an RL, encapsulating what has gone before and naming it:
(32) The present basic wage is $4.96 an hour. Add: 87 cents wage increase,
December, 1973; 19 cents cost of living allowance, November, 1974; 65
cents wage increase, December, 1974; 50 cents cost of living allowance,
April, 1975; 21 cents cost of living allowance, November, 1975; 40
cents pension plan; 21 cents other fringe benefits. Total: $3.03 an hour.
Which is a 61 per cent increase, extreme by any standard. Does Ottawa
really mean to put its seal of approval on that kind of package, thus
establishing it as a target for other unions.
In example (33), there is an indirect link between the demonstrative expression
and an entity mentioned earlier:
(33) The Economic Council of Canada dealt with this settlement in 1966 in
its third annual review, assessing all the relevant facts. But large, highly
publicized settlements to which governments are parties inevitably
have somewhat more impact on the climate of collective bargaining
across the country. That is the crux of the problem. From the moment
the government embraces any figures or intervenes, the benchmark is
made. This is true whether you like it or not. The Economic Council of
Canada considered this sort of situation in 1966 and I have referred to
its conclusion.
Again, this example shows how a demonstrative expression may be referring to
an antecedent as an abstract class or entity, rather than a concrete entity.
5. Some noteworthy cases
Finally, I will discuss a group of some 86 noteworthy cases which were high-
ly challenging for the current analysis. This group seems to elude classifica-
tion for the time being, and may provide support for the claim that indirect
anaphora marks the limit of the corpus approach to anaphora, in that the an-
notation scheme used in this paper does not currently allow such cases to be
clearly delineated.
These cases, as far as it has been possible to discern, do not fit into clear
categories described in this section so far, although many of the cases raise
important research questions. I will select some notable examples and discuss
the issues that they raise.
Table 11 shows the overall distribution of these cases in the three corpora,
along with significance scores.
Table 11. Unclassified noteworthy cases in three genres.

Word AP APHB Hansard LL
this 7 10 27 14.94
these 0 5 4 7.41
that 8 9 15 2.57
those 3 1 2 1.05
As can be seen from Table 11, the majority of unclassified cases occur in the
Hansard and APHB corpora, suggesting some division along genre lines. How-
ever, we see that proximal demonstratives are the only significantly frequent
unclassified cases, albeit receiving relatively weak significance scores.
We can, however, discuss several issues in relation to these unclassified
cases. For instance, there were cases in the data where reference is made
to the topic of a segment of discourse. There were relatively few of these cases,
but here are some of them:
(34) Mr. Speaker, it must first be made clear that the Department of
Agriculture does not run CEMA. CEMA runs the department. CEMA
runs its own business. I will be saying more about this later today.
Example (34) is typical of several cases in the Hansard sample where a speaker
appears to be referring to the matter under discussion, or the topic that has just
been introduced. These are akin, I would speculate, to cases of 'this matter' or
'this topic' which would be included as neutral labels. Here is another example:
(35) We are talking about STOL. You are stalling. You don't know. I do
know, but I am not going to mention it, for obvious reasons. I am being
perfectly honest. I am telling members the plain truth, but I am not
going to reveal all the details. I repeat, the figure of $25 million which I
mentioned previously is the right one. If Mr. Sinclair contends the STOL
program has cost $135 million, let him prove what he says. I think he is
wrong. He should not consider himself as speaking for the government
of Canada. He is speaking as a citizen, and that is all. My interest in this
is no smaller than his.
In the exchange contained in example (35), the demonstrative highlighted in
bold can be said to refer to the local topic of the current exchange, which is the
STOL program introduced at the start of this extract. This shows that in order
to understand the reference of some demonstratives, we need to take into ac-
count the structure of topics, as is argued by workers such as Grosz and Sidner
(1986) as well as much current work on centering in dialogues.^'*
Finally, another area which provides some ambiguity is where a demon-
strative reference is made in direct speech to some antecedent which may not
be present in the context of the discourse, but would be present in the shared
knowledge or deictic context of the participants. Here is an example, (36):
(36) Marjory seemed glad to see him, and gave him her hand without
affectation or delay. "I have been thinking about this marriage," he
began.
(APHB corpus, B0101525-26)
6. Conclusions
This paper has provided some detailed empirical insights into three types of
indirect anaphora in three written English genres, despite a relatively small
data sample (300,000 words) which makes strong generalisations somewhat
premature.
Also, it must be stressed that this paper has taken a deliberately descrip-
tive and comparative approach to the three types of indirect anaphora under
analysis, and there has not been any attempt to speculate on how the cases of
IA might be resolved by humans. In particular, any serious attempt to explore
the role of indirect anaphora in, for instance, automatic anaphor resolution,
has been beyond the scope of this paper, even though the data discussed would
certainly serve as the basis for the development of automatic anaphor resolu-
tion systems.^^
We have instead made several observations concerning the distribution
of indirect anaphora involving demonstratives in parliamentary exchanges,
newswire stories and literary narrative. We offer support, for instance, for the
hypothesis that argumentative genres such as parliamentary proceedings make
much use of retrospective labels, especially those which characterise and eval-
uate a stretch of text. It would be worth analysing a larger stretch of similar data
or court proceedings to further explore this notion.
Another genre-based observation that deserves further investigation is a
seemingly higher proportion of proximal demonstratives functioning as RLs
in the Hansard data compared to the other corpus samples. This may have
a number of explanations, one of which might be that parliamentary speak-
ers often label utterances which have only just been said, either by themselves
or by other speakers; and another might be that parliamentarians make more
metalinguistic references than speakers in other genres.
However, despite the appearance of clear genre-based patterns, this paper
has explored a number of challenging issues. For instance, an underlying as-
sumption of modern corpus-based linguistics (CBL) is that empirical, quan-
titative descriptions of language should be at least on an equal footing with
rationalistic ones based on intuition (McEnery & Wilson 1996:16). One of
the corollaries of this is that a corpus-based study of language should be able
to provide observations which either confirm or deny rationalistic intuitions
about language.
This can only be achieved realistically if categories such as 'indirect anapho-
ra' or 'retrospective label' are easy to identify (using corpus annotation) and to
count. If this is not the case, then what can result is fuzziness and ambiguity.
This paper has shown that indirect anaphora definitely poses difficulties for
corpus-based linguistics, in that almost 30% of IA cases analysed were hard to
classify straightforwardly, whether or not an annotation scheme was used.
This is because some antecedents lack clear surface linguistic boundaries (as
in situation reference and some of the unclassified cases), because the inference
process for retrieving an antecedent can be complex or unclear (as in situation
reference and text deixis), and because overlapping definitions make it difficult
to make hard and fast analysis decisions (as is the case with most types of
indirect anaphora). Therefore, indirect anaphora reveals some limitations of
descriptive corpus-based linguistics.
Furthermore, although it has been possible to identify some patterns in the
corpora, especially in the area of retrospective labels and situation reference,
this has sometimes been challenging, because some cases could not be straight-
forwardly described, and therefore assigned an unambiguous annotation
symbol.
Even though Botley and McEnery's (2001) demonstrative feature annota-
tion scheme was applied mostly successfully, the information contained in the
feature tags is limited — we can at best retrieve information about the sur-
face syntactic function of the anaphors and their antecedents. Full information
concerning the referential and discourse functions of demonstratives, as well
as the inferential complexity associated with some indirect anaphors, is more
difficult to obtain using the scheme as it currently stands. Therefore, further
development is required before a corpus annotation scheme can provide richer
information about indirect anaphora and its various complexities.
Also, an issue that needs to be addressed is the reliability of the annotation
scheme used, and the extent of agreement between annotators as to the
application of tags to particular cases of IA. In this study, all of the annotation was
carried out by one analyst — which to an extent side-steps the issue of agree-
ment between annotators.
However, the reliability of the annotation process still remains an issue to
be considered in future work, given that there is plenty of evidence in the lit-
erature that users of annotation schemes do not always agree on how to apply
them (see Baker 1997 and Poesio & Vieira 1998).
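One common way to quantify agreement between two annotators is Cohen's kappa,
which corrects raw agreement for the agreement expected by chance. The following
is a minimal sketch only and was not part of this study: the function name
cohen_kappa and the two short tag sequences are hypothetical, and the tags simply
reuse the recoverability codes (D, I, N) of the annotation scheme.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who tagged the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must tag the same non-empty item list")
    n = len(labels_a)
    # Proportion of items on which the two annotators chose the same tag.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's own tag frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[tag] * freq_b[tag] for tag in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical tag throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical recoverability tags for six demonstratives.
    annotator_1 = ["D", "D", "I", "I", "N", "D"]
    annotator_2 = ["D", "I", "I", "I", "N", "D"]
    print(round(cohen_kappa(annotator_1, annotator_2), 3))
```

A value near 1 indicates agreement well above chance and a value near 0 indicates
agreement no better than chance. Applying such a measure to IA would still require
a second annotator working independently with the same scheme.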
Despite the challenges inherent in this work, the methodological issues
raised have a positive outcome, because they force us to re-evaluate existing
analytical categories, such as label or situation reference, reinforcing the value
of naturally-occurring language data in helping us to provide a rigorous and
complete description of English discourse.
Notes
* The author was formerly in the Department of Linguistics and Modern English Language
at Lancaster University, UK. He is now an Associate Professor in the Language Department,
MARA University of Technology, Kota Samarahan, Malaysia.
1. American Printing House for the Blind Treebank, which contains mostly literary extracts
and motivational stories. Total size: 200,000 words. Available from
http://www.comp.lancs.ac.uk/computing/research/ucrel/corpora.html. A 100,000 word sample
of the corpus was used for this paper.
2. All corpus examples are given codes like this, which are references to the computer file
name plus the line numbers in which the examples are found.
3. The Hansard Corpus contains proceedings from the Canadian House of Commons
throughout the 1970s. Its total size is 750,000 words and it is available from
http://www.comp.lancs.ac.uk/computing/research/ucrel/corpora.html. A 100,000 word sample
was used for this paper.
4. It is worth mentioning here the phenomenon of 'bridging reference' (Clark 1977), also
known as associative anaphora (Hawkins 1978), where the link between a definite
description and its 'anchor' has to be inferred. One example might be: "I have just decorated
my house. The door is now red". In this paper, I have left bridging reference out of the scope
of indirect anaphora primarily because bridging references tend to involve definite NPs rather
than demonstratives, which are the focus of this paper.
5. Though see Halliday and Hasan (1976), Krenn (1985), d'Addio (1988, 1990) and Conte
(1980, 1981, 1996) for other perspectives on the encapsulative function of NPs.
6. Francis admits that there is some overlap between the illocutionary type and the
language activity type. It is possible to identify a broad continuum from purely mental
processes to purely verbal ones (ibid.: 92-93).
7. See Fraurud (1992a) for a detailed treatment of these aspects of situation reference in the
literature.
8. This definition is generously inclusive, covering as it does a wide variety of different re-
ferring phenomena. Many of the cases of indirectly recoverable anaphora examined below
fall into these categories and also some aspects of the directly recoverable cases examined
in Botley and McEnery (2001) are covered by this term, a fact which may cause confusion
which is worth addressing. Botley and McEnery (2001) particularly refer to the propositional
antecedent type cases, which would presumably be included as 'factualities' by Fraurud.
However, these cases are treated here as examples of anaphora with directly recoverable
surface antecedents in the form of 'that-clauses' rather than as cases of indirect anaphora.
9. See Grosz, Joshi and Weinstein (1995) and Walker et al. (1998).

10. On the surface, it would appear that 'things' and 'events' equate to 'objects' and 'situa-
tions' and that the distinction between the two types of entity is clear-cut. This is not the
case, as we see in Fraurud (1992a:6) and Vendler (1967).
11. Lyons argued that deixis was more basic than anaphora.
12. Though see Gundel et al (1988), Ehlich (1982) and Bosch (1983) for different views of
the definition of deixis and anaphora.
13. C. Lyons (1999) makes a distinction between deixis meaning 'closeness to or association
with some centre (typically the speaker and the moment of utterance)' and deixis in the
sense of directing a hearer's attention toward a referent (Lyons 1999:160). C. Lyons points
out that J. Lyons (1977) uses the term 'deixis' in both senses, and therefore reserves the term
'ostention' for the second of the above two senses of the term.
14. A dormant volcano in Northern Borneo.

15. Fraurud notes that Lyons seems to omit to include second-order entities — her eventu-
alities — among the class of referents involved in impure textual deixis. The reasons for this
omission are not clear.
16. As Fraurud notes, Levinson refers to cases like this as 'mention' or 'quotation'.
17. These are akin to Francis's (1989, 1994) category of advance labels.
18. Although for lower-frequency features, such as subject that-clauses, such small samples
may not provide enough examples.
19. See Biber (1993), McEnery and Wilson (1996:63-65), and Aston and Burnard (1998:21-
40) for discussions of issues of corpus design, sampling and representativeness.
20. With thanks to Dr. Paul Rayson at Lancaster University for providing guidance in im-
plementing the significance test.
21. 2 degrees of freedom are considered normal for frequency profiles with three columns,
as is the case with all tables in this paper.
22. The data used in Francis' study was primarily newspaper material.
23. However, if we paraphrase these examples so that the demonstrative is rescued from
the idiom, these cases become much more specific in reference, as in 'We have not met our
commitment with respect to that'.
24. See Grosz, Joshi and Weinstein (1995) and Walker et al. (1998).
25. But see the work of Poesio, Vieira and Teufel (1997), Poesio and Vieira (1998) and Byron
(2002) in this regard.
References
Aston, G. & Burnard, L. (1998). The BNC Handbook: Exploring the British National Corpus
with SARA. Edinburgh: Edinburgh University Press.
Bäuerle, R. (1988a). Ereignisse und Repräsentationen. LILOG-Report 43. Frankfurt: IBM
Germany.
Bäuerle, R. (1988b). Aspects of Anaphoric Reference to Events and Propositions in German.
Unpublished Ms.
Baker, J. P. (1997). Consistency and accuracy in correcting automatically tagged data. In R.
Garside, G. Leech & A. McEnery (Eds.), Corpus Annotation (pp. 243-250). London and
New York: Longman.
Biber, D. (1993). Representativeness in Corpus Design. Literary and Linguistic Computing
8 (4), 243-57.
Biber, D. (1990). Methodological Issues Regarding Corpus-based Analyses of Linguistic
Variation. Literary and Linguistic Computing 5, 257-269.
Biber, D., Conrad, S. & Reppen, R. (1998). Corpus Linguistics: Investigating Language Struc-
ture and Use. Cambridge: Cambridge University Press.
Bosch, P. (1983). Agreement and Anaphora. New York: Academic Press.
Botley, S. P. & McEnery, A. M. (2001). Demonstratives in English: a Corpus-based Study.
Journal of English Linguistics 29, 7-33.
Byron, D. (2002). Resolving pronominal reference to abstract entities. In Proceedings of the
40th Annual Meeting, Association of Computational Linguistics (pp. 80-87). University
of Pennsylvania.
Clark, H. H. (1977). Bridging. In P. N. Johnson-Laird & P. C. Wason (Eds.), Thinking: Readings
in Cognitive Science. Cambridge: Cambridge University Press.
Conte, M-E. (1996). Anaphoric Encapsulation. Belgian Journal of Linguistics 10, 1-10.
Conte, M-E. (1981). Textdeixis und Anapher. Kodikas/Code 3, 121-132.
Conte, M-E. (1980). Coerenza Testuale. Lingua e Stile 15, 135-154.
D'Addio, W. (1990). Tre capsule anaforiche e sinonimi contestuali, aspetti testuali del lessico.
Linguistica Selecta I (Pubblicazioni del Dipartamento di Scienze del Linguaggio
dell'Università di Roma "La Sapienza") (pp. 6-32). Roma: Bagatto Libri.
D'Addio, W. (1988). Nominali anaforici incapsulatori: un aspetto della coesione lessicale.
Dalla Parte Del Ricevente: Percezione, Comprensione, Interpretazione. Atti del XIX Congresso
Internazionale della Società di Linguistica Italiana (Roma, 1985) (pp. 143-151).
Roma: Bulzoni.
Ehlich, K. (1982). Anaphora and deixis: same, similar or different. In R. Jarvella & W. Klein
(Eds.), Speech, Place, Action (pp. 315-338). Chichester: John Wiley and Sons.
Francis, G. (1994). Labelling discourse: an aspect of nominal-group lexical cohesion. In M.
Coulthard (Ed.), Advances in Written Text Analysis (pp. 83-101). London: Routledge.
Francis, G. (1989). Aspects of Nominal Group Lexical Cohesion. Interface. Journal of Ap-
plied Linguistics 4, 27-53.
Francis, G. (1986). Anaphoric Nouns. Discourse Analysis Monograph 11. Birmingham: Eng-
lish Language Research.
Fraurud, K. (1992). Processing Noun Phrases in Natural Discourse. PhD Thesis (by Publica-
tions), Department of Linguistics, Stockholm University.
Fraurud, K. (1992a). Situation Reference: What Does 'It' Refer To? GAP Working Paper No.
24, Fachbereich Informatik, Universität Hamburg.
Grosz, B. J., Joshi, A. K. & Weinstein, S. (1995). Towards a Computational Theory of Discourse
Interpretation. Computational Linguistics 21 (2), 203-225.
Grosz, B. J. & Sidner, C. L. (1986). Attention, Intentions and the Structure of Discourse.
Computational Linguistics 12, 175-204.
Gundel, J. K., Hedberg, N. & Zacharski, R. (1988). The generation and interpretation of de-
monstrative expressions. In Proceedings from the 12th International Conference on Com-
putational Linguistics, COLING 1988 (pp. 216-221). Budapest: University of Budapest.
Halliday, M. A. K. & Hasan, R. (1976). Cohesion in English. London: Longman.
Hawkins, J. A. (1978). Definiteness and Indefiniteness. London: Croom Helm.
Kamp, H. (1988). Discourse representation theory: what it is and where it ought to go. In
A. Blaser (Ed.), Natural Language and the Computer (pp. 34-52). Heidelberg: Springer
Verlag.
Kamp, H. (1981). A theory of truth and semantic representation. In J. Groenendijk, T. M.
V. Janssen & M. Stokhof (Eds.), Formal Methods in the Study of Language (pp. 66-83).
Amsterdam: Mathematische Centrum.
Krenn, M. (1985). Probleme der Diskursanalyse im Englischen. Verweise mit this, that, it und
Verwandtes. Tübingen: Narr.
Levinson, S. C. (1983). Pragmatics. Cambridge: Cambridge University Press.
Lyons, C. (1999). Definiteness. Cambridge: Cambridge University Press.
Lyons, J. (1977). Semantics. Volume 1 and 2. Cambridge: Cambridge University Press.
McEnery, A. M. & Wilson, A. (1996). Corpus Linguistics. Edinburgh: Edinburgh University
Press.
Mourelatos, A. P. D. (1981). Events, processes and states. In P. Tedeschi & A. Zaenen (Eds.),
Syntax and Semantics Volume 14: Tense and Aspect (pp. 191-212). London: Academic
Press.
Petch-Tyson, S. (2000). Demonstrative expressions in argumentative discourse: a computer
corpus-based comparison of non-native and native English. In S. P. Botley & A. M.
McEnery (Eds.), Corpus-Based and Computational Approaches to Discourse Anaphora
(pp. 46-66). Amsterdam: John Benjamins.
Poesio, M. & Vieira, M. (1998). A Corpus-based Investigation of Definite Description Use.
Computational Linguistics 24 (2), 183-216.
Poesio, M., Vieira, M. & Teufel, S. (1997). Resolving bridging references in unrestricted text.
In Proceedings of the ACL Workshop on Operational Factors in Robust Anaphor Resolution
(pp. 1-6). Madrid, Spain, July 1997.
Reichenbach, H. (1947). Elements of Symbolic Logic. Toronto: Macmillan.
Schuster, E. (1986). Towards a Computational Model of Anaphora in Discourse: Reference to
Events and Actions. MS-CIS-86-34, LINC LAB 17, University of Philadelphia.
Scott, M. R. (1996). WordSmith Tools. Oxford: Oxford University Press.
Vendler, Z. (1967). Linguistics and Philosophy. Ithaca: Cornell University Press.
Walker, M. A., Joshi, A. K. & Prince, E. F. (Eds.). (1998). Centering Theory in Discourse.
Oxford: Clarendon Press.
Webber, B. L. (1991). Structure and Ostention in the Interpretation of Discourse Deixis.
Language and Cognitive Processes 6 (2), 107-135.
Webber, B. L. (1987). Two Steps Closer to Event Reference. MS-CIS-86-74, LINC LAB 159,
University of Philadelphia.
Webber, B. L. (1979). A Formal Approach to Discourse Anaphora. PhD Thesis, Harvard
University. New York: Garland.
Winter, E. O. (1982). Towards a Contextual Grammar of English. London: Allen and Unwin.
Author's address
Simon Botley
Language Department
MARA University of Technology (UiTM)
Jalan Meranek, Kota Samarahan
Sarawak, East Malaysia
Phone: (6) 82 677593
Email: spbotley@sarawak.uitm.edu.my
Appendix 1. Botley and McEnery's Demonstrative Annotation Scheme, with Corpus Examples and Analysis
1. Recoverability — Directly Recoverable:

(1) "The company said Saturday that it will increase gasoline allocations from 50 percent to
60 percent through February. The 35 percent allocation was set by Phillips last week. More
than 90 percent of the company's total supply of that fuel was produced by the Burger
refinery". (AP Corpus, A002:90)
Feature                  Value                    Tag
Recoverability           directly recoverable     D
Direction of reference   anaphoric                A
Phoric type              referential              R
Syntactic function       modifier                 M
Antecedent type          nominal                  N
2. Recoverability — Indirectly Recoverable:

(2) Devco has attempted to contradict both the present and the former minister, and has
put forward the illusion that they have things well in hand. In addition to this distortion of
the situation perpetuated by Devco, the corporation has insinuated that the committee is
interfering in the legitimate bargaining process. (Hansard Corpus, H0100006-H0100007)
Feature                  Value                    Tag
Recoverability           indirectly recoverable   I
Phoric type              not applicable           0
Antecedent type          no antecedent            0
3. Recoverability — Non-Recoverable (akin to Halliday and Hasan's 'exophoric'):

(3) "I never regarded Mr. Head as a master of fiction. In any event, I trusted we would get
back on track with our commitment of assistance to those nations which are less fortunate
than we are." (Hansard Corpus, H0205303-H0205304)
Feature                  Value                    Tag
Recoverability           non-recoverable          N
Direction of reference   not applicable           0
Antecedent type          no antecedent            0
4. Recoverability — Zero-recoverable antecedent:

(4) "We operated the program for six years with great response and high quality" said Bruce
Alton, president of the private, four-year school located here. "And we felt that if one or two
individuals could compromise the program that easily, by necessity we should discontinue
the program altogether." (AP Corpus, A004:28-29)
Feature                  Value                    Tag
Recoverability           not applicable           0
Syntactic function       not applicable           0
Antecedent type          not applicable           0
5. Direction of Reference — Anaphoric:

(5) "'Our offense is designed to shoot lay-ups. If we can't carry on this offense, we find
ourselves sitting on the bench.'" (A005:7-8)
Feature Value Tag

6. Direction of Reference — Cataphoric:

(6) Stevenson has this in common with Bunyan, that his allegory has to do with something
of such universal human experience that we are hardly aware of the indirect approach,
and make the transference from the actual to the symbolic with scarcely a conscious effort.
(BO 100:42)
Feature                  Value                    Tag
Direction of reference   cataphoric               C
Syntactic function       head of noun phrase      H
Antecedent type          propositional            P
7. Phoric Type — Referential:

(7) But now he waited patiently until lone strings of barges appeared. These flew the flag of
the two-headed eagle, and a cross. (B02011:81-81)
Feature                  Value                    Tag
Syntactic function       head                     H
8. Phoric Type — Substitutional:

(8) "The organizers said they hoped the march would launch a new civil rights movement
akin to that of the 1960s" (A001:30)
Feature                  Value                    Tag
Phoric type              substitutional           S
9. Syntactic Function — Head:

(9) We are also very much aware of representations made to us by western provincial min-
isters of agriculture and consumer affairs. These will be considered in detail next Monday.
(H02051:23-24)
Feature Value Tag

10. Syntactic Function — Modifier:

(10) "The street was small and what is called quiet, but it drove a thriving trade on the
week-days. The inhabitants were all doing well, it seemed, and all emulously hoping to do
better still, and laying out the surplus of their gains in coquetry; so that the shop fronts
stood along that thoroughfare with an air of invitation, like rows of smiling saleswomen."
(B01001:26-27)
Feature Value Tag

11. Antecedent Type — Nominal:

(11) The world owes a considerable debt of gratitude to Mr. Byles the Butcher, or to the less
generic alias under which this indispensable tradesman supplied the needs of the Stevenson
family in Bournemouth in 1885. (B01000:1)
Feature Value Tag

12. Antecedent Type — Clausal:

(12) "Novellist Wright Morris says he always is working on a book 'because this is the way
I breathe'" (A003:85).^
Feature                  Value                    Tag
Antecedent type          clausal                  C
^ The spelling for 'novellist' here reflects the American source for this example.
13. Antecedent Type — Adjectival:
(13) From these fear-stricken rovings Markheim's eyes returned to the body of his victim,
where it lay both humped and sprawling, incredibly small and strangely meaner than in life.
In these poor, miserly clothes, in that ungainly attitude, the dealer lay like so much sawdust.
(B0101746-47)
Feature                  Value                    Tag
Antecedent type          adjectival               J
14. Antecedent Type — Propositional/Factual:
(14) "We have a better depth this year so I hope that makes a difference" (AP Corpus, A012,
sentence 4)
Feature                  Value                    Tag
Antecedent type          propositional            P
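
The feature tags illustrated in this appendix can be read as a small lookup
structure. The sketch below is an illustration only, not the software used in the
study. It assumes that a full annotation is written as a five-character string, one
character per feature, in the order Recoverability, Direction of reference, Phoric
type, Syntactic function, Antecedent type. That ordering, the names SCHEME and
decode_tag, and the combined gloss given to '0' for antecedent type are assumptions,
and only tag values attested in the examples above are included.

```python
# Features and tag values as attested in the appendix examples.
# The five-character string format and the feature order are assumptions.
SCHEME = [
    ("Recoverability", {
        "D": "directly recoverable",
        "I": "indirectly recoverable",
        "N": "non-recoverable",
        "0": "not applicable",
    }),
    ("Direction of reference", {
        "A": "anaphoric",
        "C": "cataphoric",
        "0": "not applicable",
    }),
    ("Phoric type", {
        "R": "referential",
        "S": "substitutional",
        "0": "not applicable",
    }),
    ("Syntactic function", {
        "M": "modifier",
        "H": "head",
        "0": "not applicable",
    }),
    ("Antecedent type", {
        "N": "nominal",
        "C": "clausal",
        "J": "adjectival",
        "P": "propositional",
        "0": "no antecedent / not applicable",
    }),
]


def decode_tag(tag):
    """Map a five-character tag string to a {feature: value} dictionary."""
    if len(tag) != len(SCHEME):
        raise ValueError(f"expected {len(SCHEME)} characters, got {len(tag)}")
    decoded = {}
    for char, (feature, values) in zip(tag, SCHEME):
        if char not in values:
            raise ValueError(f"unknown tag {char!r} for feature {feature!r}")
        decoded[feature] = values[char]
    return decoded


if __name__ == "__main__":
    # Example (1) above: D, A, R, M, N.
    for feature, value in decode_tag("DARMN").items():
        print(f"{feature}: {value}")
```

A positional string keeps each annotation compact, but, as noted in the conclusion,
it records only surface features; richer referential or inferential information
would need additional fields.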
