Karin Aijmer and Diana Lewis, Editors
Contrastive Analysis of Discourse-pragmatic Aspects of Linguistic Genres
Yearbook of Corpus Linguistics and Pragmatics
Series Editor
Jesús Romero-Trillo, Universidad Autónoma de Madrid, Spain
Reviews Editor
Dawn Knight, Cardiff University, Cardiff, UK
Editors
Karin Aijmer, University of Gothenburg, Gothenburg, Sweden
Diana Lewis, Department of English and Lerma Research Centre, Aix Marseille University, Aix-en-Provence, France
Introduction
Karin Aijmer and Diana Lewis
The aim of this issue of the Yearbook of Corpus Linguistics and Pragmatics is to
explore the comparability of discourse-pragmatic characteristics of genres across
European languages, using parallel corpora (aligned translated texts) and/or compa-
rable corpora (genre-matched original texts). The articles have their origin in a
seminar at the 12th ESSE conference in Košice, Slovakia, 29 August–2 September
2014 convened by the editors.
Renewed interest in contrastive linguistics over the past couple of decades, together
with increasing availability of specialised digital corpora, has resulted in a new,
usage-based approach to language comparison. The domain of contrastive linguis-
tics centres on the comparison, in synchrony, of two languages. In a break with the
‘applied’ approach to contrastive linguistics of the 1960s and 1970s, which tended
K. Aijmer (*)
University of Gothenburg, Gothenburg, Sweden
e-mail: karin.aijmer@eng.gu.se
D. Lewis
Department of English and Lerma Research Centre, Aix Marseille University,
Aix-en-Provence, France
e-mail: diana.lewis@univ-amu.fr
course strategies, power status, speaker roles associated with the genre can explain
both formal and functional properties of the patterns. The focus is on spoken, writ-
ten or multimodal genres within domains such as political discourse, public com-
munication, journalism, stand-up comedy, academic and professional discourse,
addressing both methodological and theoretical issues. All adopt a usage-based
approach, exploiting a range of corpus material to reveal patterns of form and use in
one or more languages.
Genre-specific recurrent patterns can also be studied contrastively. The contras-
tive point of view highlights the dependence of the patterns on different social and
cultural practices in the compared languages. Languages involved in the compari-
sons with English are Czech, Dutch, French, Polish, Norwegian, Spanish and
Swedish.
Central to the contrastive analysis of linguistic phenomena is the use of parallel and
comparable corpora, and this volume illustrates the use of both types. Parallel cor-
pora consist of translations from one language to the other. Comparable corpora
consist of texts in two or more languages which are comparable with regard to
genre, formality, subject-matter, time-span, etc. (Aijmer 2008).
Parallel corpora can be further characterized as unidirectional or bidirectional,
depending on the translation direction. A bidirectional parallel corpus makes it pos-
sible to analyse how words and constructions in one language have been rendered in
the target language and to retrace the process to find the sources of the translations.
Parallel corpora were first used for lexical and grammatical studies but are now also
used as a resource to study discourse and pragmatic phenomena. As illustrated in
this volume, there now exist parallel corpora for many different language pairs usu-
ally with English as either the source or the target language. A parallel corpus is
above all a method to show differences or similarities between lexical elements or
constructions in two or more languages which may not be apparent to intuition.
Another approach is to use the parallel corpus to test a hypothesis about how a
particular function is expressed in another language, making observations about
correspondences and arriving at a theoretical statement which is empirically based (cf.
Gast 2015). Dyvik (1998) addressed the question of how translational phenomena can
be used for the study of meanings. In this perspective the corpus can provide a
resource in lexical semantics by mirroring meanings and functions of an element in
one language in another language. Translators are excellent informants since they
use their judgments to find the appropriate translation as a part of their professional
duties thus avoiding the observer’s paradox (Labov 1972). Translations should be
used with caution, however. The disadvantages of using parallel corpora are that they
may suffer from ‘translationese’ (Baker 1993; Baroni and Bernardini 2005),
source-text influence, the translator’s fingerprints, and uneven translation quality.
The results of the translation analysis should therefore also be tested on the basis of
The volume is divided into three sections, according to the methodology and the
type of contrastive analysis carried out (cf. Aijmer 2008).
The first section comprises four papers based on parallel (translated-text) corpora.
Karin Aijmer’s contribution discusses obligation across languages and genres.
The starting-point is the observation that in both English and Swedish the meaning
of obligation can be expressed by a modal auxiliary. However, must and its Swedish
cognate måste are not always each other’s translation equivalents, reflecting the fact
that there are several grammatical and lexical ways to express obligation. The trans-
lations of Swedish måste into English showed that måste was frequently translated
by the semi-modal have to especially in fiction. If have to and had to were conflated
the frequency would be even higher than for must as a translation choice. Need to,
should and ought to, on the other hand, were all more frequent in non-fiction than in
fiction. In the Swedish translations from English måste was most frequent both in
fiction and non-fiction. Other alternatives are få (‘may’), ska(ll), skulle (‘shall’,
‘should’), behöva (‘need’), bör (‘ought to’). Must and have to express different
meanings as a translation of måste. Have to, especially when qualified by will, is
downtoning and polite, it can have a general or generic meaning and it can indicate
negative evaluation. In Swedish ska(ll)/skulle can be used to express power and få
indicates that an action is unwelcome to the hearer. In non-fiction must, have to and
need (and their Swedish correspondences) are associated with interactional goals
and how these are evaluated as good or bad. By using impersonal structures with a
collective we as the grammatical subject or an agentless passive the speaker can get
the message across to the hearer with maximum hedging.
The aim of Lieven Buysse’s contribution is to examine the mutual translat-
ability of Dutch dus and English so. The corpus used is the Dutch-English com-
ponent of the Dutch Parallel Corpus (1997–2009). The texts belong to five text
types: fictional and non-fictional literature, journalistic texts, instructive texts,
administrative texts, and external communication. The corpus has been balanced
for text type and for translation direction and amounts in all to five million words.
Since it is a bidirectional corpus all the examples of so translated to dus (and dus
from Dutch into English) were included as well as ‘back-translations’. The func-
tional ranges of these two discourse markers were shown to be remarkably simi-
lar – the polysemies of dus and so overlapped almost completely – but there were
significant differences in frequency and distribution, with dus being both more
frequent overall and more associated with inference than so, which occurred more
typically in resultative contexts. Significant genre effects were found: as well as
being unevenly distributed across genres, dus and so also tended to occur with
different functions in different genres and were differently distributed according
to language. Thus the study not only demonstrates how semantic equivalence does
not result in translation equivalence, with only a sixth of source-text dus being
translated by so in the corpus, but also how genre constrains the markers differ-
ently in the two languages.
Michaela Martinková and Markéta Janebová wanted to investigate the evi-
dential and epistemic senses of the Czech particle prý by studying the functions of
prý reflected in the correspondences in another language. The authors used the
English-Czech and Czech-English sections of the Czech National Corpus, InterCorp,
which is a multilingual parallel corpus of texts written in 39 different languages
with their Czech counterparts. Their study focused on three registers which were
represented in the corpus: fiction, journalistic texts and spoken language. The
journalistic texts are represented by the PressEurope database (2009–2014). The
spoken language in InterCorp comes from Proceedings from the European
Parliament and a corpus of Subtitles. Czech as source and as target language were
not differentiated in order to obtain a sensible amount of text from the European
Parliament. The sub-corpora vary in size and come from different periods. Moreover
there were more translations than target texts in the corpus.
The authors found different patterns of prý usage and different frequencies
according to genre as well as different patterns of translation. From these translation
patterns, and with the great majority of correspondences being evidential, the
authors concluded that the epistemic uses of prý are context-bound: the interpreta-
tion of the particle as conveying doubt may arise in the context as an inference from
the context.
Magdalena Szczyrbak discusses the correspondences between English modal
adverbs of certainty and their Polish correspondences in argumentative legal writing.
The material used for the study consists of 30 Opinions of Advocates General
at the European Court of Justice, issued between 2011 and 2013 and comprising about
576,000 words. The data has been drawn from source texts in English and their
Polish translations. The English texts were written by a native speaker of English,
whereas the translations were made by professionals having Polish as their native
language. At the outset the most frequent modal adverbs were identified in the
English sub-corpus and then the equivalents in the Polish sub-corpus were
determined.
The genre of Opinion was chosen because it was assumed that it would be rich
in persuasive devices. Modal adverbs of certainty have been shown in previous stud-
ies to be useful rhetorical devices inextricably linked to stance and argumentation.
They are for instance used both to foreground and background legal arguments and
to demonstrate power and authority. The modal adverbs studied were indeed, neces-
sarily, of course, clearly and obviously and their Polish correspondences. The trans-
lations were used to study both the conventional meanings of the adverbs and the ad
hoc meanings associated with the particular genre. It is shown that there were
noticeable differences between the English adverbs and their Polish correspon-
dences with regard to the degree of persuasiveness and that the author’s presence
was less visible in the Polish translations. Omission of the modal adverb in the
translation was shown to lessen the rhetorical force of the translated text and its abil-
ity to influence the reader.
The second part of the volume contains two papers based on comparable corpora.
Hilde Hasselgård’s study compares adverbial clause placement in English and
Norwegian cross-linguistically and across the genres fiction and news. End position
was the most common alternative in both English and Norwegian in both registers.
In the initial position there were both language and register differences. It is shown
that initial position was proportionally more frequent in fiction in English than in
Norwegian. However there was a higher frequency of initial clauses in news in both
was given by President Obama in Cairo in 2009 and is 5871 words long. The study
is both quantitative and qualitative. A search was made for the frequency of personal
pronouns, modality markers, mental verbs and negation in both speeches using a
concordancer. The quantitative comparison showed that Obama’s speech had a
higher frequency of modality, especially epistemic modality, negation and first per-
son pronouns. In Bush’s speech negation was infrequent and you was more frequent
than other pronouns. It is argued that Obama’s speech can be interpreted as an
attempt to ‘recontextualise’ the position of the US policy towards the Middle East.
Negation in Obama’s speech is for example used to correct assumptions about the
US by Arabic speakers or about the relations between the US and the Arabic coun-
tries. Obama’s frequent use of modal auxiliaries indicates his personal involvement
with the topic addressed. Bush’s speech, on the other hand, shows a more conven-
tional discourse style characterized by a low frequency of stance markers and nega-
tion and a preference for second person pronouns. The preference for unmodalized
assertions further underlines an authoritative speaking style.
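The kind of frequency search described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure or the concordancer used in the study: the word lists, the category names and the sample sentence are all hypothetical, the inventories are far too small for real work, and mental verbs are left out. Rates are normalised per 1,000 words so that texts of different lengths can be compared.

```python
import re
from collections import Counter

# Hypothetical, deliberately small word lists; a real study would need
# fuller inventories and manual checking of each hit in context.
CATEGORIES = {
    "first_person": {"i", "we", "me", "us", "my", "our"},
    "second_person": {"you", "your"},
    "modal": {"must", "should", "can", "may", "might", "will", "would", "could"},
    "negation": {"not", "no", "never"},
}


def tokenize(text):
    """Lower-case the text and split it into word tokens.

    Contracted negation such as "don't" stays one token and is therefore
    not counted as negation by the toy word list above.
    """
    return re.findall(r"[a-z']+", text.lower())


def rates_per_thousand(text, categories=CATEGORIES):
    """Return the frequency of each category per 1,000 word tokens."""
    tokens = tokenize(text)
    if not tokens:
        return {name: 0.0 for name in categories}
    counts = Counter()
    for token in tokens:
        for name, words in categories.items():
            if token in words:
                counts[name] += 1
    return {name: 1000 * counts[name] / len(tokens) for name in categories}


# Invented sample sentence, not taken from either speech.
sample = "We must not forget that you are not alone."
for name, rate in rates_per_thousand(sample).items():
    print(f"{name}: {rate:.1f} per 1,000 words")
```

Raw counts of this kind only locate candidate forms; deciding whether a given modal is epistemic, for instance, still requires reading each example in context.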
Metadiscourse has been frequently used to characterise academic genres. Tereza
Guziurová draws on the ‘integrative’ or broad approach to metadiscourse in order
to compare the distribution and uses of the engagement markers we and you,
imperatives and questions in academic textbooks and research articles. The discussion focuses
on the pronoun we since this proved to be the most frequent engagement marker in
the data accounting for about 70% in both genres. We was used with a wide range of
semantic reference with different discourse functions depending on the genre. The
majority of examples of we in both genres referred to the writer and his/her readers.
The main reason for using the pronoun in the textbooks is that it draws students into
the shared world of disciplinary understanding. Another reason is that we helps to
make the exposition more interesting, relevant and approachable by referring to
people in general as language users. In research articles the writer uses we with the
aim of disguising him/herself as the agent. The study also discusses the potential
advantages and drawbacks of the integrative approach.
References
Aijmer, K. (2008). Parallel and comparable corpora. In A. Lüdeling & M. Kytö (Eds.), Corpus
linguistics. An international handbook (Vol. 1, pp. 275–291). Berlin: de Gruyter Mouton.
Baker, M. (1993). Corpus linguistics and translation studies: Implications and applications. In
G. Francis, M. Baker, & E. T. Bonelli (Eds.), Text and technology: In Honour of John Sinclair
(pp. 233–252). Amsterdam: John Benjamins.
Baroni, M., & Bernardini, S. (2005). A new approach to the study of translationese: Machine-
learning the difference between original and translated text. Literary and Linguistic Computing,
21(3), 259–274.
Bhatia, V. K. (2002). Applied genre analysis: A multi-perspective model. Ibérica, 4, 3–19.
Connor, U., Nagelhout, E., & Rozycki, W. V. (Eds.). (2008). Contrastive rhetoric: Reaching to
intercultural rhetoric. Amsterdam: John Benjamins.
Dyvik, H. (1998). A translational basis for semantics. In S. Johansson & S. Oksefjell (Eds.),
Corpora and cross-linguistic research. Theory, method, and case studies. Amsterdam/Atlanta:
Rodopi.
Gast, V. (2012). Contrastive analysis: Theories and methods. In B. Kortmann & J. Kabatek (Eds.),
Linguistic theory and methodology (WSK-Dictionaries of Language and Communication
Science). Berlin: Mouton de Gruyter. http://www.personal.uni-jena.de/~mu65qev/papdf/CA.pdf
Gast, V. (2015). On the use of translation corpora in contrastive linguistics. A case study of imper-
sonalization in English and German. Languages in Contrast, 15(1), 4–33.
González, G., de los Angeles, M., Lachlan Mackenzie, J., & Alvarez, E. G. (Eds.). (2008). Current
trends in contrastive linguistics. Amsterdam: John Benjamins.
Gregory, M., & Carroll, S. (1978). Language and situation: Language varieties and their social
contexts. London: Routledge & Kegan Paul.
Hinds, J. (1990). Inductive, deductive, quasi-inductive: Expository writing in Japanese, Korean,
Chinese, and Thai. In U. Connor & A. M. Johns (Eds.), Coherence in writing: Research and
pedagogical perspectives (pp. 87–109). Alexandria, VA: TESOL.
Johansson, S. (2007). Seeing through multilingual corpora: On the use of corpora in contrastive
studies. Amsterdam: John Benjamins.
König, E. (2011). The place of contrastive linguistics in language comparison. ms.
http://www.personal.uni-jena.de/~mu65qev/e-g-ontrasts/papers/koenig_2011.pdf
König, E., & Gast, V. (2009). Understanding English-German contrasts (2nd ed.). Berlin: Schmidt.
Labov, W. (1972). Sociolinguistic patterns. Oxford: Blackwell.
Levinson, S. (1979). Activity types and language. Linguistics, 17, 365–379.
Mauranen, A. (2002). Where’s cultural adaptation? A corpus-based study on translation strategies.
In B. Silvia & Z. Federico (Eds.), CULT2K, special issue of inTRAlinea.
http://www.intralinea.org/specials/article/1677
McEnery, T., & Xiao, R. (2008). Parallel and comparable corpora: What is happening? In
G. Anderman, & M. Rogers (Eds.), Incorporating corpora: The linguist and the translator
(pp. 18–31). Clevedon: Multilingual Matters.
Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge:
Cambridge University Press.
Taboada, M., Suárez, S. D., & Alvarez, E. G. (2013). Contrastive discourse analysis: Functional
and corpus perspectives. Sheffield: Equinox.
Part I
Contrastive Analysis with Parallel Corpora
The Semantic Field of Obligation
in an English-Swedish Contrastive Perspective
Karin Aijmer
Abstract The article examines how genre (fiction and non-fiction) affects the dis-
tribution and uses of the modal auxiliaries must/måste in the obligation meaning and
their more or less grammaticalized alternatives in English and Swedish. In both
languages the obligation markers were associated with specific contexts of use
depending on genre. In fiction the obligation markers were frequent with first and
second person subjects. Must was used for exhortations. Have to was used with
generic subjects and instead of must for more general recommendation. In Swedish
there was no corresponding distinction. Must usually pointed forwards to something
desirable in the context of EU debates. Have to, on the other hand, was also found
in negative contexts in the non-fiction data. Swedish måste was used both about
positive and negative obligation. In Swedish få was an alternative to måste when the
imposition was not in the hearer’s interest.
1 Introduction
In both English and Swedish the meaning of obligation can be expressed by a modal
auxiliary (must or Swedish måste). This is in line with ‘a significant cross-linguistic
trend for languages to have a category of grammatical expression forms, usually
called the “modal” auxiliaries’ (Nuyts 2016: 13). However, must and måste are not
always each other’s translation equivalents, reflecting the fact that there are a large
number of grammatical and lexical alternatives to express obligation.
The English modal auxiliaries have attracted a great deal of interest because of
their changing patterns over time. Less attention has been given to the codification of
certain functions which can take place in particular genres. However, Lewis (2015)
With many thanks to Bengt Altenberg for reading an earlier version of the text.
2 Obligation and Necessity
in the state of affairs’ (van der Auwera and Plungian 1998: 81). Must is for example
deontic in the example below:
John must leave now
with the definition: ‘as far as the person with authority and/or the norm goes,
John’s leaving is necessary’ (van der Auwera and Plungian 1998: 83). The norms
here can be societal norms as well as moral assessments or judgements of desirabil-
ity (Nuyts 2016: 37).
An additional semantic domain is epistemic modality. The epistemic meaning of
must/måste has been defined in terms of a judgment by the speaker rather than in
terms of obligation: ‘a proposition is judged to be uncertain or probable in relation
to some judgment’ (van der Auwera and Plungian 1998: 81). The epistemic mean-
ing is illustrated by:
John must have arrived
Must (and måste) are available in all the domains. However, all the epistemic
examples have been excluded from the investigation. They were less frequent than
the examples with obligation meaning and are mainly restricted to fiction.
In the present study I will focus on the importance of genre for understanding the
different frequencies and uses of the linguistic forms expressing obligation.
Following Biber (1988: 68) I will use the term ‘genre’ ‘to refer to text categoriza-
tions made on the basis of external criteria relating to author/speaker purpose’. The
genres used in the present study represent both fiction and non-fiction.
3 Material and Method
The data are taken from the English-Swedish Parallel Corpus (ESPC) (Altenberg
and Aijmer 2001). The ESPC contains original texts in English and Swedish with
their translations, altogether 2.8 million words making direct comparisons between
the languages possible. The texts represent both fiction and non-fiction texts in
equal proportions. Fiction texts consist of dialogues. Non-fiction is a hyperonym
covering the subject areas memoirs and biography, geography, humanities, natural
sciences, social sciences, applied sciences, legal documents, prepared speech
(Altenberg et al. 2001). I will use translation paradigms as the starting-point and
then compare the most frequent markers of obligation in different contexts of use in
English and Swedish.
The corpus examples were selected in the following way. First all the examples of
måste and must were extracted from the original texts with their translations. On the
basis of the translations we can compare how obligation is expressed in the two
languages (in either fiction or non-fiction). Måste is, for example, not always trans-
lated as must but a large number of alternatives are found. At a second stage, I
examine the contexts and functions of the most important markers of obligation in
the two languages in both fiction and non-fiction.
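The two-stage procedure just described (extract every original sentence containing the marker, then tabulate how it was rendered in the translation) can be sketched as follows. This is a minimal illustration under stated assumptions, not the tooling actually used with the ESPC: the `AlignedPair` structure, the candidate list and the sentence pairs are hypothetical, and the matching is by surface form only, so it cannot separate obligation from epistemic uses.

```python
import re
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One original sentence with its translation (hypothetical structure)."""
    source: str
    target: str
    genre: str  # "fiction" or "non-fiction"


# Hypothetical candidate renderings; multi-word markers come first so that
# "have to" is recognised before any shorter form.
CANDIDATES = ["have to", "had to", "need to", "ought to", "must", "should"]


def normalise(sentence):
    """Lower-case, strip punctuation and pad with spaces for whole-word matching."""
    return " " + " ".join(re.findall(r"\w+", sentence.lower())) + " "


def find_marker(sentence, candidates=CANDIDATES):
    """Return the first candidate marker found in the sentence, else 'other'."""
    padded = normalise(sentence)
    for marker in candidates:
        if f" {marker} " in padded:
            return marker
    return "other"


def translation_paradigm(pairs, source_marker):
    """Count, per genre, how source_marker is rendered in the translations."""
    paradigm = {}
    for pair in pairs:
        if f" {source_marker} " not in normalise(pair.source):
            continue  # stage one: keep only originals containing the marker
        counts = paradigm.setdefault(pair.genre, Counter())
        counts[find_marker(pair.target)] += 1  # stage two: tabulate renderings
    return paradigm


# Invented sentence pairs for illustration only (not corpus data).
pairs = [
    AlignedPair("Jag måste gå nu.", "I must go now.", "fiction"),
    AlignedPair("Vi måste vänta.", "We have to wait.", "fiction"),
    AlignedPair("Han måste stanna.", "He had to stay.", "fiction"),
    AlignedPair("Unionen måste agera.", "The Union must act.", "non-fiction"),
    AlignedPair("Vi måste se över reglerna.", "We need to review the rules.", "non-fiction"),
    AlignedPair("Du får vänta här.", "You must wait here.", "fiction"),  # no måste: skipped
]

for genre, counts in translation_paradigm(pairs, "måste").items():
    print(genre, dict(counts))
```

Running the same function on the English originals with `must` as the source marker would give the paradigm in the other translation direction; in either case the extracted examples still have to be read and classified by hand.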
On the whole, both the auxiliaries were more frequent in non-fiction than in fic-
tion. Moreover, they were more frequent as obligation markers than as epistemic
auxiliaries (see Tables 1 and 2).
In non-fiction Swedish måste had obligation meaning in 96.3% of the examples,
compared with 87.4% for must.
The smaller number of examples of must in the English texts is interesting against
the background that it has been claimed that must has declined in frequency within
a 30-year period during the last century and that it has been replaced by other ‘gram-
maticalizing’ elements (Leech et al. 2009).
Table 3 shows the correspondences of the Swedish måste in English (translations
of Swedish originals into English) and Table 4 (in Sect. 4.2) the correspondences of
English must in Swedish (the translations from the English originals into Swedish).
Must, taking into account all its uses, was more frequent in non-fiction than in
fiction (see Table 2). This difference can be partly explained by the fact that there
are more occurrences of must with epistemic meaning in fiction (32.9% of the
examples were epistemic in fiction to be compared with only 12.6% in non-fiction).
Moreover, as diachronic studies have noted, must has been replaced by have to
in many of its uses (Leech et al. 2009). A genre-type explanation of the discrepancy
is that must has a number of functions in non-fiction texts which are not paralleled
in the fiction texts.
Table 3 The English translations of Swedish måste (SO -> ET). Obligation meanings

                                                   ET fiction    ET non-fiction   Total
must                                               112 (30.4%)   357 (57.0%)      469 (48.3%)
have to                                            84 (22.8%)    85 (13.6%)       169 (17.4%)
had to                                             88            36               124
should                                             5             46               51
need to (or other forms with need) [a]             9             32               41
(have) got to                                      12            4                16
ought to                                           4             11               15
is (was) -ed                                       3             5                8
is necessary, essential                            –             8                8
be going to, will                                  4             1                5
be forced, be compelled, be made, be taken to      –             4                4
x makes sb do sth                                  3             –                3
ø                                                  4             –                4
other                                              6             27               33
Examples occurring once or twice [b]               11            10               21
Total                                              345           626              971

[a] Not all examples with need in the translations are semi-modals (cf ‘I need someone to talk to’).
[b] The following examples occurred once or twice in either fiction or non-fiction: had better,
necessarily, of necessity, be in need of, be a need to, appreciate the need to, I should like to say,
there is no other way but, I cannot help but, be enough to, be due to, it was natural for X to do sth,
it should be incumbent on X to do sth, couldn’t possibly, emphatic do, it’s time, the imperative
to the frequency would have been even higher.) Other frequent obligation markers
are (have) got to, need to and should.
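The effect of conflating have to and had to can be checked directly against the counts in Table 3. The sketch below simply adds the two rows and compares the sum with must in each genre. The counts are copied from the table; the variable and function names are hypothetical, and reading the comparison as one against must is an assumption based on the summary of this chapter in the volume's Introduction.

```python
# Obligation-meaning counts copied from Table 3 (SO -> ET).
TABLE3 = {
    "must": {"fiction": 112, "non-fiction": 357},
    "have to": {"fiction": 84, "non-fiction": 85},
    "had to": {"fiction": 88, "non-fiction": 36},
}


def conflated_have_to(table):
    """Add the have to and had to rows together, genre by genre."""
    return {
        genre: table["have to"][genre] + table["had to"][genre]
        for genre in table["have to"]
    }


combined = conflated_have_to(TABLE3)
for genre, n in combined.items():
    must_n = TABLE3["must"][genre]
    relation = "more" if n > must_n else "less"
    print(f"{genre}: have to + had to = {n}, must = {must_n} "
          f"({relation} frequent than must)")
```

On these figures the conflated form overtakes must in fiction (172 against 112) but not in non-fiction (121 against 357).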
Several other expressions have different frequencies in fiction and non-fiction.
Need to, should and ought to are strikingly more frequent in non-fiction than in fic-
tion. Have got to, on the other hand, occurs above all in fiction.
1 Have to was, for example, translated into behöva (‘need to’) in three examples.
The translation paradigms provide only raw data. They do not provide any infor-
mation about the contexts in which must or its alternatives are chosen as a transla-
tion. In the following sections I will discuss must and its most frequent variants have
to and need and make comparisons with the Swedish correspondences. The
following research questions will be asked: Are the factors determining the distribu-
tion and uses of obligation markers in English and Swedish the same? Are the fac-
tors the same in fiction and non-fiction?
Must and måste can be regarded as ‘close relatives’ but they were not always trans-
lated into each other. Must was translated as måste in 74.8% of the cases but the
correspondence in the other direction was much lower (because of the existence of
English variants such as have to). In this section I will discuss must, have to and
need to as competitors in the fiction texts.
Must and have to often overlap in meaning. For example, the translator may have
chosen must but could also have opted for have to without any difference in mean-
ing. However, there are some contexts where must and have to seem to be doing
different things. With a first person subject the speaker is strongly involved in the
verbal action:
(1) Your mother is lucky she did not choose to eat corned beef on a Saturday
night. On Saturday nights we are extremely busy. Now I must go. A nurse
will be coming along soon.” (ST1)
Er mor kan skatta sig lycklig att hon inte valde en lördagkväll för att äta
corned beef. På lördagkvällarna är vi ytterst upptagna. Nu måste jag gå.
Det kommer snart en sjuksyster.” (ST 1T)
When have to is used the obligation requiring an action from the speaker is imposed
by external circumstances (non-deontic meaning). In (2) the speaker has been
watching the galleries for a long time and now feels obliged by the look of them to
‘work up to them’.
However although have to is not deontic it can be used instead of must with a first
person subject to soften the imposition of the action on the hearer. Will (‘ll) in com-
bination with have to makes the imposition more vague and less strong by placing
the action in the future:
(3) I ‘ll have to think on it and perhaps take a few soundings before I decide
where I can best place it. (FF1)
Jag måste fundera på det och kanske höra mig för här och var innan jag
bestämmer vart jag ska skicka det. (FF1T)
(4) You must allow me this chance in Provence to make up my mind. (BR1)
Du måste låta mig få den här chansen att bestämma mig i Provence. (BR1T)
In (5) the speaker is using have to rather than must because it is less impositive
and therefore more polite. Have to treats the action as negative (face-threatening)
and therefore in need of hedging. Placing the imposition in the future (you’ll have)
is another hedging strategy (cf. I’ll have to):
(5) Reliving, mentally, the events of three days earlier, Andrew said “You ‘ll
have to make allowance for my having been a little dazed at the time.” (AH1)
Andrew gick i tankarna igenom händelserna tre dagar tidigare och sade: “Du
måste tänka på att jag var litet förvirrad just då.” (AH1T)
However, in other examples have to does not overlap with the deontic must but refers
to external circumstances (it is important or crucial that you hurry if you are to get the
colour off the hair):
(6) Matilda said, “I ‘d give it a good wash, dad, if I were you, with soap
and water.
But you ‘ll have to hurry.”(RD1)
Matilda sa: “Om jag vore som du så skulle jag gå och tvätta igenom det
ordentligt, pappa, med tvål och vatten.
Men du blir tvungen att snabba på.”(RD1T)
Out of 34 examples with you as the subject 19 were translated by a generic pronoun
in Swedish making the obligation more vague or general (expressing little speaker
involvement).
In (8) the translator has used behöver (‘need’) to mark what needs to be done (put-
ting less imposition on the hearer):
(8) You only have to drive through the West Midlands to see that if we are in the
Super-League of top industrial nations, somebody must be moving
the goalposts. (DL1)
Man behöver bara köra genom West Midlands för att se att någon måste
ha flyttat på målsnöret för att placera oss i superligan av
industrinationer. (DL1T)
The obligation markers can come with a certain ‘evaluative prosody’ depending on
whether they are associated with something positive or negative (good or bad, desir-
able or undesirable) (Partington 2015).
When the subject has no control over the action have to can come to express
evaluation rather than obligation (Myhill and Smith 1995). In example (9), for
example, have to is chosen to suggest that sitting in the front is something
negative:
(9) He gets carsick and I do not, which is why he has to sit in the front. (MA1)
Han blir bilsjuk och det blir inte jag, det är därför han måste sitta i framsätet.
(MA1T)
In (10) the big bad wolf has to go somewhere else for his dinner (against his will).
(10) The big bad wolf has to go somewhere else to get his dinner; these little
piggies are home free.” (SK1)
Den stora stygga vargen får leta efter sin middag någon annanstans, dom
tre små grisarna har klarat sej.”(SK1T)
Need to and should (or ought to) and their Swedish correspondences encode a
weak deontic meaning (the speaker is open to the possibility that the obligation may
not result in an action). Unlike must these markers do not involve self-imposition (in
the first person) but communicate the speaker’s felt needs to do something
(participant-internal meaning). In non-fiction texts on the other hand need to and
should (ought to) were more frequent and sometimes translated with måste (signal-
ling strong obligation) (Sect. 6.1).
In the following sentence need to conveys that the subject did not feel the need
to sit down:
Need to can also signal the speaker’s positive attitude to the carrying out of the
action. With a generic second person subject need to can, for example, be inter-
preted as a recommendation:
(12) All you need to do is be prudent and not go there again. (RR1)
Allt man behöver göra är att vara försiktig och inte gå dit igen. (RR1T)
In (13) the speaker (a morgue attendant) uses need to rather than must or have to
with directive force:
(13) “We ‘ll need to know what arrangements you want made,” he said. (SG1)
“Vi behöver få veta hur ni vill arrangera begravningen”, sade han. (SG1T)
The authority imposed by the obligation marker is softened by the use of we (rather
than I) and by placing the time when the speaker needs to know in the future (cf. the
use of need to in non-fiction in Sect. 6.1).
The Swedish modal auxiliaries expressing obligation in the data analysed are måste,
få (‘may’), ska/skulle (‘shall’, ‘should’) and bör/borde (‘ought to’) (referred to as
deontic modal auxiliaries in the Swedish reference grammar, Teleman et al. 1999). Få is
also an auxiliary of permission (= may), and ska/skall has developed a
future meaning. Translations can show whether these auxiliaries have been interpreted
as having an obligation meaning.
According to Teleman et al. (1999: 296), få can have ‘approximately the same
meaning as måste in situations where it is clear that the action referred to in the
sentence is not in the hearer’s interest’ (my translation). This makes it different from
permission (the action is in the hearer’s interest). Let us consider some example
sentences with obligation få and their translations into an obligation marker in
English:
If the subject is the second person the verb has the illocutionary force of a
speaker-initiated directive. In (14) får conveys that the hearer does not intend to
open the gate willingly (it is not in his interest to do so):
(15) Men han hade varit medvetslös en god stund och Birger ville inte ta nån
risk.— Du får åka till lasarettet, sa han.
Det hade Vidart ingenting emot.(KE1)
But he had been unconscious for quite a while, so Birger was taking no risks.
“You must go to hospital,” he said.
Vidart had no objections, but he was worried about the milk. (KE1T)
In (16) the action is treated as unwelcome to the hearer (‘you must show me the
harbour even if it involves some extra effort for you’). Få is therefore used with
persuasive force:
In all the examples of få some kind of negative evaluation takes place. Forgiving and
forgetting (an injustice) are a necessary evil if one is to survive.
(17) För man får glömma och förlåta om man ska överleva och förresten hade
priset på potatisen stigit till nästan två kronor för en tunna. (KE2)
One must forgive and forget if one is to survive, not to mention that the
price of potatoes had risen to nearly two kronor a barrel. (KE2T)
Behöva (‘need to’) is found in different patterns with different meanings.2 When
the subject is the first person the verb refers to a need felt by the speaker:
(18) Men jag behöver prata med dig några minuter. (HM2)
“I need to talk to you about something.” (HM2T)
The source of the need can be internal or external. In (19) the translator has chosen
have to indicating that the source is external (for example that the speaker needs the
cassette to make recordings) and to soften the imposition of the action:
In (20) behöver man makes the utterance into a recommendation (translation: ‘you
have to’):
(20) Dom här plastmattorna behöver man bara skölja lite. (SC1)
You only have to wipe these plastic ones.” (SC1T)
The obligation can also be anchored in a certain social or functional norm (duty,
custom, order, normality, appropriateness) (Teleman et al. 1999: 316). In (23) the
reference is to what is important or essential:
(23) Det viktiga är inte att bestämma tidpunkten, knappt ens att resa.
Det viktiga är att man kan resa när tiden är inne.
Men förberedelserna skall vara genomtänkta. (BL1)
The essential thing is not to determine a time to leave, scarcely even
to make a voyage at all; it is being able to leave when the right
time for departure comes.
But the preparations must be carefully made. (BL1T)
To sum up, must expresses strong obligation associated with the speaker’s author-
ity (deontic meaning). Have to was used in several different contexts besides
expressing participant-external obligation compelled by the circumstances. It was
used instead of must in some contexts to express more politeness. Have to was also
used with a loss of the obligation meaning to negatively evaluate an action. Swedish
få (originally with permission meaning) was used with obligation meaning alternat-
ing with måste. In all the examples with få some kind of negative evaluation was
expressed. Swedish ska(ll) makes explicit deontic meanings where the source norm
involves personal or institutionalized authority. Need to and Swedish behöva are
used for favourable evaluation with a weak deontic meaning.
Figure 1 summarises the meanings of the modal markers of obligation in fiction
texts.
The Semantic Field of Obligation in an English-Swedish Contrastive Perspective 25
[Figure 1: diagram of Obligation, branching into participant-internal and participant-external]
The distribution and use of obligation markers is closely associated with genre or
text type. It is therefore interesting to study them in as many different text types as
possible. The non-fiction texts were atypical in that epistemic meaning was rare (cf.
Sect. 4). Moreover, must/måste was more frequent than in fiction. As in fiction, a
large number of (grammaticalized or lexical) expressions were used to express
obligation.
Must and måste were the most frequent obligation markers in non-fiction. Have
to is ‘marked’ in non-fiction where it is ranked below must. On the other hand need
to, should (and ought to) were strikingly more frequent in non-fiction than in
fiction.
In the non-fiction texts must, have to, should and need to are used in a similar way.
They can for example have either strong or weak impositive force depending on the
context and they can express the speaker’s favourable or unfavourable attitude to the
realization of the verbal action. Prosodies can change depending on the syntactic
environment of the marker as well as the discourse type or genre the obligation
markers appear in. This is particularly clear when we make a comparison between
fiction and non-fiction as in this work. In non-fiction texts speakers/writers use
obligation markers primarily as ‘an engine of persuasion’ (Partington 2015: 280). The
markers are directed towards an event in the future which is evaluated either posi-
tively or negatively and they are used by the speaker as a strategy in order to influ-
ence a potential audience. The evaluative meaning of the obligation marker depends
on the meaning of the marker (need/behöva, should/bör have for example positive
meaning) or on extralinguistic features of the discourse. Political discourse is char-
acterized by a number of special features. According to Lewis (2015: 171), ‘it is
often very carefully crafted, every nuance being analysed, and is designed for a
wider audience than the immediate hearers; it aims to impress and persuade and
may have a hortatory function; it has a ceremonial function that favours rhetorical
routines; and above all it deals largely with unrealized affairs.’
In (24) the Swedish translator has indicated (by means of bör ‘should’) his/her
interpretation that granting periods of rest and adequate breaks will have a positive
effect on ensuring the safety and health of Community workers:
(24) Whereas, in order to ensure the safety and health of Community workers,
the latter must be granted minimum daily, weekly and annual periods
of rest and adequate breaks; whereas it is also necessary in this context
to place a maximum limit on weekly working hours; (EEA1)
För att trygga hälsa och säkerhet för arbetstagare inom gemenskapen bör
arbetstagarna ges dygnsvila, veckovila och semester av en viss minsta
längd samt tillräckliga raster. I detta sammanhang är det även nödvändigt
att sätta en övre gräns för veckoarbetstiden. (EEA1T)
(25) Montgomery himself must accept responsibility for one major Allied
misfortune at this time: he asked for, and received, the support of the US
First Army to secure his right flank. (MH1)
Montgomery får själv ta på sig ansvaret för ett av de allierades stora
misslyckanden vid denna tidpunkt. Han begärde och fick stöd av USA:s
Första armé för att säkra sin högra flank. (MH1T)
Table 5 Must, have to, need to and Swedish måste with different types of subject
must have to need to måste
Collective we 48 39 10 60
Collectives (countries, institutions) 47 13 4 29
Abstract nouns 66 10 5 48
Passives 144 5 13 52
I 7 11 12
You 3 4 –
Other (including it, there, this) 31 4 1 1
Generic pronoun (one, they, everyone, people Swe ‘man’) 11 1 1 32
TOTAL 357 85 234 234
As shown by this example the obligation markers can also combine with other rhe-
torical strategies such as the use of ‘impersonalization’ (see Table 5). We in example
(26) is the collective or vague ‘we’ referring to ‘we in this country’, ‘we in the
European Union’, etc. When the grammatical subject was not we it was for example
a third person subject with a passive construction. Other examples are collective
nouns such as ‘Countries of the European Union’ or ‘Swedes living and working
abroad’. Abstract nouns are for example ‘the development of trade’, ‘evaluation’.
Table 5 only includes examples from the category speeches in the European
Parliament and political debates in English and in Swedish:
The low frequency of you as subject is noticeable. Rather than saying ‘you must’
which has a strong impositive force, the collective we (e.g. we need) is used as a tactic
to soften the imposition. A comparison between English and Swedish shows that
Swedish texts use a generic pronoun man which only rarely has a correspondence in
English (one).
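The proportions behind these observations can be recomputed from Table 5. The following is a minimal sketch in Python, assuming only the must and måste cells and column totals that are unambiguous in the table; the function name `share` and the dictionary layout are illustrative choices and not part of the original analysis.

```python
# Counts copied from Table 5 (must and måste columns only).
TABLE5 = {
    "must": {"you": 3, "generic pronoun": 11, "passives": 144, "total": 357},
    "måste": {"generic pronoun": 32, "passives": 52, "total": 234},
}


def share(marker: str, subject_type: str) -> float:
    """Return the percentage of a marker's tokens found with a given subject type."""
    column = TABLE5[marker]
    return round(100 * column[subject_type] / column["total"], 1)


if __name__ == "__main__":
    for marker, subject_type in [
        ("must", "you"),
        ("must", "generic pronoun"),
        ("måste", "generic pronoun"),
    ]:
        print(f"{marker} + {subject_type}: {share(marker, subject_type)}%")
```

On these figures, you accounts for under 1% of the must tokens, while generic subjects make up roughly 14% of the måste tokens against about 3% for must.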
With a passive following the modal marker and a third person subject no direct
reference is made to the speaker and hearer.3 The evaluative potential of the marker
can be exploited for persuasive effects. In (27) the use of need to helps the interpre-
tation that the action (matching the flexibility of EU member states by certain crite-
ria) is judged to be favourable (needs to be done). The imposition is only expressed
weakly since it is not directed to a special individual.
With need to the obligation is also represented as being in the best interest
of ‘us’.
3 Compare also Nokkonen (2006: 60), who describes these uses ‘as cases which are still clearly
deontic, but they are not very subjective in nature’.
With have to and must the obligation can also refer to something which is
regarded as negative or unpleasant. According to Lewis (2015), have to (in political
debates) makes a negative evaluation of the verbal action. Here is an example from
my data:
(28) But we have to implement the nuclear as well as the fossil fuel provisions
of that agreement.
We are taking far too long to decide whether to support the completion
of the Khmelnitsky and Rovno reactors.
I can tell the House that the number of Russian scientists and engineers
in the Khmelnitsky area has greatly increased in recent weeks.
We have to make up our minds.
Are we going to complete those reactors to Western standards or are we
going to leave it to the Russians and let the Memorandum of Understanding
go down as a dead piece of paper? (EADA1)
Men vi måste genomföra både kärnkraftssidan och den fossila bränslesidan
av den överenskommelsen.
Vi tar alldeles för lång tid på oss för att besluta huruvida vi skall stödja
färdigställandet av Khmenilitskij- och Rovno-reaktorerna.
Jag kan berätta för parlamentet att antalet ryska vetenskapsmän och
tekniker i Khmenilitskijområdet har ökat betydligt under
de senaste veckorna.
Vi måste bestämma oss.
Skall vi färdigställa de här reaktorerna med väststandard eller skall vi
lämna det till ryssarna och låta avsiktsförklaringen bli ett
dött papper? (EADA1T)
In (28) the speaker is talking about our ambivalent attitude to nuclear power.
Implementing the provisions of the nuclear agreement is however a necessary evil.
We have to make up our minds although this is unwanted.
The Swedish obligation marker måste is more frequent than must is in English,
reflecting the fact that it has few competitors. Like must, it co-occurs with
impersonal subjects (no special agent is intended). Depending on the context, it can
indicate strong or weak obligation, convey participant-internal or participant-external
meaning, and express ‘positive’ or ‘negative’ evaluation of the action imposed. The
reference of the grammatical subject is vague. However, it is possible to present the
obligation as not being in the best interest of the general public, workers, members
of the European Union etc.:
The strength of imposition (and evaluation) is not an inherent meaning of måste
but depends on the context:
The translator has chosen we need which suggests a positive evaluation. It is desir-
able that we (in the European Union) review our attitudes towards senior citizens.
Måste is vague between different types of evaluative prosody. In (30) the transla-
tor has interpreted the speaker’s attitudes to the activity variously by using either
have to or must. The imposed obligation will be unpleasant for those who prefer a
quick education to a broad education or life-long learning (have to). Must, on the
other hand, implies that integrating working life and education will be for the gen-
eral good.
(30) Den andra faktorn är att vi måste se till att skaffa en utbildning som går att
använda under lång tid när vi skaffar oss en utbildning. Det måste vara en
bred grundutbildning, eftersom samhället förändras i allt snabbare takt.
Det går inte att ha snabba utbildningar. Vidare måste det också vara ett
livslångt lärande. Arbetsliv och utbildning måste helt integreras. (EAND1)
The other factor is that we must ensure that when we obtain an education
we obtain one which can be used for a long time. There has to be a broad
basic education, because society is changing ever more rapidly. It is not
possible to have a quick education. Furthermore, there has to also be
life-long learning. Working life and education must be fully integrated.
(EAND1T)
Får is generally negative as shown by the context. The imposition of the obliga-
tion will have a negative effect on ‘us’ (members of the European Union):
(32) Vi får räkna med att biståndet i krisländerna kommer att tvingas
fungera i en korrupt miljö under lång tid. (CO1)
We must accept that development assistance in crisis countries will have to
operate in a corrupt environment for a long time to come. (CO1T)
Seeing every technique in the light of other techniques is regarded as negative (there
is no other alternative; it follows that it is bad):
Behöver can be used with the same meaning as ‘strong’ obligation markers:
(34) Under de första månaderna då risken för avstötning och infektion efter
organtransplantation är som störst behöver patienterna undersökas
polikliniskt en till två gånger per vecka. (ORG1)
During the first few months, when the risk of rejection and infection after
organ transplantation is greatest, the patients must be examined at the
outpatient clinic once or twice a week. (ORG1T)
While måste is vague between many different interpretations (it does not for
instance refer to a specific source) skall refers to an institutionalized source norm
such as legal regulations:
(35) Arbetslokal skall vara så utformad och inredd att den är lämplig från
arbetsmiljösynpunkt. (ARBM1)
Work facilities must be arranged and equipped in such a way as to provide
a suitable working environment. (ARBM1T)
7 Conclusion
The study raises a number of issues having to do with genre and with on-going
changes in the modal system in English reflected in functional overlaps. In my data
the obligation markers were the same in fiction and non-fiction but we also saw
some genre preferences. Have to was for example more frequent in fiction than in
non-fiction (which is in line with its taking over some of the functions of must). In
both English and Swedish the obligation markers can be associated with specific
contexts of use. In fiction the obligation markers were frequent with first and second
person subjects. Must was used for exhortations (speech acts implying a high degree
[Figure: diagram of Obligation, branching into participant-internal and participant-external; further labels: deontic; Imposition (by circumstances, norms)]
References
Altenberg, B., & Aijmer, K. (2001). The English-Swedish parallel corpus: A resource for contras-
tive research and translation studies. In C. Mair & M. Hundt (Eds.), Corpus linguistics and
linguistic theory. Papers from the 20th International Conference on English Language Research
on Computerized Corpora (ICAME 20) Freiburg im Breisgau 1999 (pp. 15–33). Amsterdam/
Philadelphia: Rodopi.
Altenberg, B., Aijmer, K., & Svensson, M. (2001). The English-Swedish Parallel Corpus (ESPC).
Manual of enlarged version. http://www.sol.lu.se/engelska/corpus/corpus/espc.html#size
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Coates, J. (1983). The semantics of the modal auxiliaries. London: Croom Helm.
Leech, G., Hundt, M., Mair, C., & Smith, N. (2009). Change in contemporary English: A
grammatical study. Cambridge: Cambridge University Press.
Lewis, D. (2015). A comparable-corpus approach to the expression of obligation across English
and French. Nordic Journal of English Studies, 14(1), 152–173.
Myhill, J., & Smith, L. A. (1995). The discourse and interactive function of obligation expressions.
In J. Bybee & S. Fleischman (Eds.), Modality in grammar and discourse (pp. 239–292).
Amsterdam: John Benjamins.
Nokkonen, S. (2006). The semantic variation of NEED TO in four recent British English corpora.
International Journal of Corpus Linguistics, 11(1), 29–71.
Nuyts, J. (2016). Analyses of the modal meanings. In J. Nuyts & J. van der Auwera (Eds.), The
Oxford handbook of modality and mood (pp. 31–49). Oxford: Oxford University Press.
Partington, A. (2015). Evaluative prosody. In K. Aijmer & C. Rühlemann (Eds.), Corpus pragmat-
ics. A handbook (pp. 279–303). Cambridge: Cambridge University Press.
Teleman, U., Hellberg, S., & Andersson, E. (1999). Svenska Akademiens grammatik. Stockholm:
Norstedt.
van der Auwera, J., & Plungian, V. A. (1998). Modality’s semantic map. Linguistic Typology, 2,
79–124.
English so and Dutch dus in a Parallel Corpus:
An Investigation into Their Mutual
Translatability
Lieven Buysse
Abstract English so and Dutch dus have been characterised as the highly frequent
prototypical markers of ‘result’ or ‘inference’ in their respective languages. This
study investigates the functional scope of both based on close scrutiny of the
bidirectional Dutch-English component of the Dutch Parallel Corpus, a 10 million
word sentence-aligned corpus of translated texts. Seven functions are identified in
the ideational, interpersonal and textual domains. The mutual correspondence of the
two markers is mapped in a combined quantitative and qualitative analysis of how
they are translated into the target language for each of their functions, as well as of
how they are backtranslated (e.g. which Dutch forms have so as their translation
equivalent in English?). The results show a high overall correspondence, yet with a
slight translation bias, in that the correspondence is considerably higher when so is
translated into Dutch than when dus is translated into English.
1 Introduction
Pragmatic markers are typically words or phrases that do not belong to the proposi-
tional message of an utterance, and are hence semantically and syntactically
optional, but contribute to it in various subtle ways, such as by expressing speakers’
attitudes to their interlocutors or to the message, or by making plain which relations
hold between an utterance and its co-text or context. Ever since Schiffrin’s (1987)
seminal work on “discourse markers” these linguistic items have featured highly on
the agenda of researchers in pragmatics and discourse analysis. One of the most
daunting challenges in the description of pragmatic markers is posed by their
polyfunctionality, which raises questions as to their functional scope as well as to
L. Buysse (*)
Faculty of Arts, KU Leuven, Brussels, Belgium
e-mail: lieven.buysse@kuleuven.be
(1) Ik ben niet veel geniale mensen tegengekomen in mijn leven, dus ik weet
waarover ik praat als ik hem geniaal noem. [‘…, dus I know what I’m
talking about when I call him genius.’]
I haven’t met many brilliant people in my life, so I know what I’m talking
about when I call him brilliant.1
(Fiction, grue-002593, Dutch-English)
(3) That there, there were only the stones in, the walls to hear me – and herself,
who they kept dumb as a stone, and so could tell no-one.
Dat daar alleen de stenen in de muren me konden horen – en zijzelf, die
moet zwijgen als een steen, en dus niets kan doorvertellen. [‘…, en dus can’t
pass anything on.’]
(Fiction, wat-002588, English-Dutch)
The main formal difference between so and dus is the position they can occupy
in the clause. In all previous examples both take clause-initial position, which is the
1 Unless stated otherwise, all examples have been drawn from the Dutch Parallel Corpus (see Sect.
2). For these examples the source text fragment appears first and is followed by the target text
fragment. An additional literal translation has been provided in square brackets for the Dutch clause in
which the relevant marker occurs. Each example ends with the basic metadata in round brackets:
the text type, the text number in the corpus, and the translation direction.
sole position for so. Dus, however, also takes mid-position, as exemplified in (4),
and even occupies clause-final2 slots, as in (5).
(4) De hoge positie van Nederland wordt dus verklaard door het lage percentage
leerlingen dat onder niveau 2 scoort.
[‘The high position of the Netherlands is dus explained by the low
percentage of pupils that score below level 2.’]
(External communication, vla-001191, Dutch-English)
(5) Groot mag dus, als het maar opvalt en origineel is.
[‘Large is okay dus, as long as it stands out and is original.’]
(External communication, wst-000768, Dutch-English)
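The positional contrast described above can be made concrete with a small classifier. This is only a rough sketch in Python, assuming that each clause has already been delimited by hand; the function name `marker_position` and the short list of coordinators are hypothetical and not taken from the study.

```python
import re

# Assumed, non-exhaustive list of coordinators that may precede a clause-initial marker.
COORDINATORS = {"en", "and", "maar", "but"}


def marker_position(clause: str, marker: str) -> str:
    """Classify where `marker` sits in a single, already delimited clause."""
    tokens = re.findall(r"\w+", clause.lower())
    if marker not in tokens:
        return "absent"
    index = tokens.index(marker)
    # Skip a leading coordinator so that 'en dus ...' still counts as initial.
    first = 1 if len(tokens) > 1 and tokens[0] in COORDINATORS else 0
    if index == first:
        return "initial"
    if index == len(tokens) - 1:
        return "final"
    return "medial"


if __name__ == "__main__":
    print(marker_position("dus ik weet waarover ik praat", "dus"))
    print(marker_position("De hoge positie van Nederland wordt dus verklaard", "dus"))
    print(marker_position("Groot mag dus", "dus"))
```

Applied to the dus clauses of examples (1), (4) and (5), the sketch labels them initial, medial and final respectively. Turn-final so of the kind mentioned in footnote 2 would need separate treatment.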
On a semantic level too the two markers show similarities and differences. They
have both been attributed the status of prototypical causal or resultative markers,
and have been attested as highly frequent items in spoken as well as written lan-
guage. So has particularly been studied as a pragmatic marker in discourse-functional
approaches (see e.g. Schiffrin 1987; Redeker 1990; Müller 2005; Lam 2009; Buysse
2012), whereas dus has received most attention from cognitive linguists (see e.g.
Pander Maat and Sanders 1995, 2000; Pander Maat and Degand 2001; Stukker et al.
2008), who have been particularly interested in mapping coherence relations in the
causal domain. One main assumption is clearly shared by both research strands, viz.
that causal markers can function in three domains: the ideational, interpersonal, and
textual domains.3 Ideational relations connect states of affairs in the world described
in the discourse, as in sentence (a) in (6), where so relates a state of affairs (he is
home) to another state of affairs from which it results (he is sick): John’s being home
was caused by his being sick. In the interpersonal domain markers relate “the illo-
cutionary meaning of one of the discourse units with the locutionary meaning of the
other” (Degand 2001: 79). In sentence (b), for example, the second proposition is a
claim inferred from the state of affairs expressed in the first segment: the speaker
claims that John is home on the basis of the observation that his lights are burning.
Textual relations, finally, are discourse-organising relations (e.g. a list or a digression),
which may also take the form of a speech act, as in sentence (c), where the first
proposition sparks a request for information in the second.
2 So too has been observed to occur in a position resembling clause-final slots, such as turn-final
position. In such cases so indeed does not explicitly preface a segment, but an implied segment can
be retrieved from the context (Schiffrin 1987; Müller 2005; Buysse 2014), which is clearly
different from clause-final tokens of dus.
3 The terminology used here is that of Halliday and Matthiessen’s (2004) metafunctions. Note that
many studies on dus have followed Sweetser’s (1990) terminology, distinguishing between
content, epistemic and speech act domains (e.g. Pander Maat and Sanders 2000; Stukker et al. 2008),
and that other proposals have also been put forward, such as Redeker’s (2006: 354) “components
of discourse coherence” (ideational, rhetorical, sequential). For our present purposes the finer
details of these approaches and their mutual differences are of minor relevance.
The distinction between sentences (a) and (b) corresponds to the well-established
theoretical division between semantic and pragmatic relations (cf. Van Dijk 1979).
Applied to so the distinction can be interpreted as one between so marking a relation
of ‘result’ versus marking one of ‘inference’ (Schiffrin 1987; Müller 2005; Buysse
2012).
As for the textual domain, quite a few studies on so have focused on particular
discourse functions or highly specific contexts. Johnson (2002), for example, looks
at so-prefaced questions in police interviews, whereas Norrick (2008) identifies so
as a conversational response token in a story-telling context, and Bolden (2009)
examines its potential to mark an utterance as having been on the conversational
agenda for some time. More comprehensive functional mappings of so, such as
those devised by Müller (2005), Lam (2009), and Buysse (2012), have exposed a
wide variety of textual functions, including marking a summary, signalling a return
to the main discourse unit (e.g. after a digression), initiating (a part of) a conversa-
tion, starting a new sequence in a story, and marking self-correction.
Observations of similar functions for dus have been scarce. Nevertheless, Evers-
Vermeul (2010) discusses two “discourse marker” uses of dus (2010: 153), both of
which are to do with information status. First, dus may indicate that the information
contained in the segment in which it occurs is already somehow available to the
interlocutor (Evers-Vermeul 2010: 161); second, it may have a double function of
marking a conclusion as well as marking that the conclusion is obvious (since it can
be inferred from the prior co-text) or logical (since it is the only sensible conclusion
one could draw) (2010: 167). In another study Degand (2011) identifies two meta-
discursive functions of clause-final dus in spoken Dutch, viz. reformulation and
floor-yielding, both of which have also been noted for so.
Similar to most other pragmatic markers in English, yet unlike dus in Dutch, so
does not only appear as a pragmatic marker, but may also appear, as Müller (2005)
points out, as an adverb of degree or manner (e.g. she’s so great), a pro-form (e.g. I
think so), in fixed expressions (e.g. and so on), and to express purpose (often in the
form of so that).4
4 Note, though, that diachronic research has shown that dus used to have an anaphoric function with
a meaning similar to thus or in this way, which was gradually lost between the 16th and 19th
centuries (Evers-Vermeul 2010).
In short, the different angles from which so and dus have been approached in
previous research do not allow for a systematic comparison of these two markers,
which nonetheless appear to exhibit many formal and functional resemblances, not
least their status as prototypical markers of ‘result’ or ‘inference’ in their respective
languages. The aim of the present study is, therefore, to map functional similarities
and differences between so and dus by looking into their mutual translatability
in a parallel corpus.
Section 2 describes the data used for this investigation. In Sect. 3 the functional
translation correspondence of dus and so is discussed by focusing on each of the
markers’ functions as attested in the corpus. A quantitative analysis of the corre-
spondence between dus and so is provided in Sect. 4, with conclusions drawn in
Sect. 5.
2 Data
The data for this study have been extracted from the bi-directional Dutch-English
component5 of the Dutch Parallel Corpus or DPC (Macken et al. 2011; Paulussen
et al. 2013), a sentence-aligned corpus of translated texts. The texts that constitute the
DPC were published between 1997 and 2009, and belong to five text types: fictional
and non-fictional literature, journalistic texts, instructive texts, administrative texts,
and external communication (such as press releases, brochures and corporate maga-
zines). The corpus has been balanced proportionally for translation direction as well
as for text type, resulting in 500,000 words for each text type for each translation
direction (i.e. Dutch-English and English-Dutch), which amounts to a corpus of 5
million words for the purposes of this study. All instances of so and dus were extracted
automatically from the corpus (together with the aligned target-text sentences), and
subsequently checked manually to remove any double entries and irrelevant tokens.
The latter especially pertained to those instances where so is an adverb of manner or
degree, a pro-form, part of a fixed expression, or a marker of purpose (see Sect. 1).
Because the corpus is bi-directional, the markers can be approached from two angles.
In addition to considering how so has been translated from English into Dutch (and dus
from Dutch into English), I have taken backtranslations into account: I searched for so
in English target texts and traced the correspondences for these tokens in the Dutch
source texts (and vice versa for dus).
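The automatic part of this procedure can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the extraction tool actually used for the DPC: the `AlignedPair` record, the direction labels "nl-en" and "en-nl", and the function `find_marker` are all hypothetical names introduced here.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    source: str     # source-text sentence
    target: str     # aligned target-text sentence
    direction: str  # "nl-en" or "en-nl" (labels assumed for this sketch)


def find_marker(pairs, marker, side):
    """Return unique pairs whose `side` ('source' or 'target') contains `marker` as a whole word."""
    pattern = re.compile(rf"\b{re.escape(marker)}\b", re.IGNORECASE)
    seen, hits = set(), []
    for pair in pairs:
        # Double entries are dropped by remembering pairs that were already kept.
        if pattern.search(getattr(pair, side)) and pair not in seen:
            seen.add(pair)
            hits.append(pair)
    return hits


if __name__ == "__main__":
    example7 = AlignedPair(
        "Broccoli had zijn bril niet op, dus hij kon niet goed zien.",
        "Broccoli wasn't wearing his glasses, so he couldn't see much.",
        "nl-en",
    )
    corpus = [example7, example7]  # a double entry, as may occur in raw output
    print(len(find_marker(corpus, "dus", "source")))  # translations of dus
    print(len(find_marker(corpus, "so", "target")))   # backtranslation candidates for so
```

Searching the source side yields the tokens whose translations are studied, while searching the target side yields the candidates for backtranslation analysis. The hits would still need the manual check described above, since a word-level search cannot tell a pragmatic marker from, for instance, so as an adverb of degree.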
The main drawback of working with a corpus of translated texts is its inherent
bias towards written registers, whereas pragmatic markers are rather more typical of spoken
registers. As Johansson (2006) aptly points out, though, the target texts are the result
of a thorough process of translation in which translators have independently inter-
preted source texts, and “[w]hat we are studying is the result of this interpretation
(and recreation) process” (2006: 117). This can shed light on how a pragmatic marker
functions in the source language, and how this function can be conveyed in the target
language. Moreover, many texts in the DPC may have appeared in written mode, but
were either meant to be spoken or otherwise reflect spoken language. Many admin-
istrative texts, for example, are transcripts from meetings at the European Parliament
5 The DPC also has a bidirectional Dutch-French component, but this was not included in the
present study.
In the DPC 697 tokens of so and 1229 of dus fulfilling a pragmatic marker function
have been identified and analysed. These tokens fall into seven functional categories
(Table 1), each of which will be discussed in this section. The categories are largely
those identified in Buysse (2012) for so, adjusted on the basis of the present corpus
analysis as the original classification was based on spoken data whereas the DPC
consists mainly of written data. One category has been added that is specific to dus,
viz. Reiteration.
Schiffrin (1987: 191) describes so as a “marker of result”, clearly indicating that this
meaning relation is at the heart of so’s functional spectrum. Subsequent investiga-
tions (Fraser 1990; Müller 2005; Lam 2009; Buysse 2012) have confirmed this. Dus
has been claimed to be more typical of “epistemic” relations (i.e. inferential rela-
tions) than of “content causal” (i.e. resultative) relations (e.g. Pander Maat and
Sanders 2000; Pander Maat and Degand 2001; Stukker et al. 2008, 2009), although
it does occur in such resultative contexts as well. In (7), for example, Broccoli’s
poor vision was caused by the fact that he was not wearing his glasses, and in (8) the
fact that they cannot chop wood is caused by the absence of chainsaws.
(7) Broccoli had zijn bril niet op, dus hij kon niet goed zien. [‘…, dus he could
not see well.’]
Broccoli wasn’t wearing his glasses, so he couldn’t see much.
(Fiction, gru-002593, Dutch-English)
Stukker et al. (2008) contend that when dus is used outside of its habitual epis-
temic (or inferential) context the rhetorical effect of “speaker foregrounding” (2008:
1306) is produced: since epistemic relations inherently involve cognitive processes,
in using dus the speaker/author indicates that s/he is somehow involved in the estab-
lishment of the relation between the segments. For example, in excerpt (7) the two
segments are clearly causally related, yet by the mere use of dus the perspective
shifts subtly in that the second segment could be viewed as the speaker/author’s
personal observation at the time of speaking/writing rather than as an objective
report of a past event. Similarly, in (8) the dus-prefaced proposition is presented as
the speaker/author’s observation rather than as an objective statement of fact.
The ‘resultative’ category takes up 28.6% (N = 93) of all tokens of so that have
been translated into Dutch, and 33.6% (N = 125) of its backtranslations.6 The rates
for dus are considerably lower, with only 13.8% (N = 124) of its translations into
English and 21.6% (N = 72) of its backtranslations. On the whole, so and dus are by
far each other’s preferred translation correspondents in the resultative category
(Table 2).7 It would seem that this correspondence is tighter when English is the
source language: if zero correspondence is ignored (viz. all source text tokens that
do not have a correspondent in the target text and vice versa), a majority of
‘resultative’ tokens of so are translated with dus, and two thirds of target text tokens
of ‘resultative’ dus have so as their correspondent; with translations in which Dutch
6 For want of precise numbers as to the overall size of the various components of the DPC (e.g.
translations Dutch-English, translations English-Dutch, etc.) absolute numbers cannot be
normalized to e.g. 1000 words. Instead the percentages have been calculated within each component.
7 Tables 2, 3, 4, 5, 6, 7 and 8 always mention the three most frequent correspondents. The other
markers are summarized in a single line, indicating how many other markers there are and what the
total frequency of this group of markers is. For example, in Table 2 so is translated by 8 other
markers than those making up the top three, totalling 14 tokens of such other markers. If several
markers in the top three have the same frequency, they are all mentioned in the same rank. For
example, Table 2 indicates that daarom and zodat each occur seven times (“2 × 7”), and that makes
them the second most frequent correspondents of so in Dutch target texts. The percentages are
based on the total number of tokens minus zero correspondences.
Table 2 Top three of correspondents of ‘resultative’ so and dus in translations and backtranslations
in absolute numbers (N) and adjusted percentages (excluding tokens with zero correspondent)
so > Dutch target text
marker                                            N       Adj.%
dus                                               40      54.1
daarom (‘that’s why’), zodat (‘so that’)          2 × 7   2 × 9.5
en (‘and’)                                        6       8.1
8 other                                           14      18.9
zero                                              19
Total                                             93

Dutch source text > so
marker                                            N       Adj.%
dus                                               43      37.7
daarom (‘that’s why’)                             20      17.7
zodat (‘so that’)                                 9       8.0
9 other                                           42      37.2
zero                                              11
Total                                             125

dus > English target text
marker                                            N       Adj.%
so                                                43      45.3
therefore                                         23      24.2
consequently                                      7       7.4
10 other                                          22      23.2
zero                                              29
Total                                             124

English source text > dus
marker                                            N       Adj.%
so                                                40      65.6
therefore                                         6       9.8
as such, hence, since, thus                       4 × 2   4 × 3.3
7 other                                           7       11.5
zero                                              11
Total                                             72
serves as the source language the shares are still very high (37.7% and 45.3%) but
also considerably lower than when English is the source language.
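The adjusted percentages in Table 2 follow from the rule given in the table caption: tokens with zero correspondence are subtracted from the total before each share is calculated. The arithmetic can be sketched in a few lines of Python. The counts are copied from Table 2; the helper name adjusted_pct is a hypothetical label introduced here for illustration and is not part of the original study.

```python
def adjusted_pct(n: int, total: int, zero: int) -> float:
    """Percentage of tokens rendered by one correspondent, ignoring
    tokens that have no correspondent at all (zero correspondence)."""
    return round(100 * n / (total - zero), 1)


# Counts copied from Table 2 ('resultative' category).
# so > Dutch target text: Total 93, zero 19, dus 40
so_to_dus = adjusted_pct(40, total=93, zero=19)
# English source text > dus: Total 72, zero 11, so 40
dus_from_so = adjusted_pct(40, total=72, zero=11)
# Dutch source text > so: Total 125, zero 11, dus 43
so_from_dus = adjusted_pct(43, total=125, zero=11)
# dus > English target text: Total 124, zero 29, so 43
dus_to_so = adjusted_pct(43, total=124, zero=29)

print(so_to_dus, dus_from_so, so_from_dus, dus_to_so)
```

The first two values correspond to the shares with English as the source language, the last two to those with Dutch as the source language, which is the contrast drawn in the paragraph above.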
An inference can be paraphrased as: “from state of affairs X I conclude the follow-
ing: Y” (Buysse 2012: 1768). It has been well established that both so and dus are
considered the prototypical markers of inference in their respective languages (see
Sect. 1). For example, in (9) the speaker/author deduces from the state of someone’s
fingernails in the first segment that this person cannot hold an occupation on the
land or in a factory in the second segment. The difference with a ‘resultative’ rela-
tion is clear: the fact that someone is not working on the land or in a factory is not
caused by their long fingernails.
Dus has the ability to occur in a particular context that is alien to so, by stating
an inferential conclusion that is presented as obvious in that the reader/interlocutor
is expected to rely on shared background knowledge to retrieve the grounds on
which the conclusion is based. Evers-Vermeul (2010) refers to this function of dus
as that of an “accessibility marker” (2010: 171), labelling the stated conclusion as
English so and Dutch dus in a Parallel Corpus 41
(10) While the Americans were unleashing mayhem to the north, the British
were methodically applying Lugard-style colonialism in Basra. They
formed alliances with sheikhs, bribed warlords and won hearts and minds
by going unarmoured. There was optimism in the air. British policy
demanded one thing, momentum towards local sovereignty and early
withdrawal. There was no such momentum.
Terwijl de Amerikanen herrie schopten in het noorden, pasten de Britten in
Basra heel methodisch oude koloniale principes toe: ze gingen allianties aan
met plaatselijke sjeiks, kochten krijgsheren om en wonnen de sympathie van
de bevolking door zich in ongepantserde voertuigen op straat te begeven. De
Britse aanpak vereiste maar een ding: dat er werk gemaakt werd van Iraakse
soevereiniteit en van een vroege terugtrekking der troepen. Maar dat is
er dus nooit van gekomen. [‘But that dus never happened.’]
(Journalistic texts, sta-002559, English-Dutch)
In (11) this inferential prompt is even more explicit. The excerpt has been taken
from a statement by the Dutch Prime Minister, responding to the murder of a contro-
versial politician earlier that day. Halfway through the statement he points out that it
reflects his personal sentiments (rather than his Cabinet’s). Dus presents this state-
ment as obvious because it can be inferred from, for example, the tone the speaker
has used so far or from the situation in which the statement is being delivered.
(11) Maar dat alles schiet natuurlijk door je kop op een moment, zoals nu, dat dat
nieuws tot je komt. Dat je er steeds meer van doordrongen bent van wat er in
Nederland is gebeurd. In Nederland, een verdraagzaam land, met natuurlijk
politieke tegenstellingen, zoals in iedere democratie. Dat is democratie. Maar
wel met respect voor elkaar, respect voor elkaars mening. Respect voor
elkaars mening houdt natuurlijk ook in dat je elkaar daarop kunt bestrijden,
maar met woorden, niet met kogels. Wat hier gebeurd is, is onbeschrijflijk.
Dit zijn dus mijn persoonlijke ontboezemingen. [‘These are dus my personal
sentiments.’] Ik kan het niet anders zeggen. Ik ben er kapot van.
But it all runs through your mind at a moment like this, when you hear news
like this. As it begins to sink in that this has happened in the Netherlands, a
tolerant country, with differences of political opinion, of course, like any
democracy. That is the nature of democracy. But with respect for each other,
respect for each other’s views. Respecting each other’s views means of
course that you can come into conflict with each other, but with words, not
with bullets. What has happened here is indescribable. These are my
personal feelings. I cannot say it any other way. I am devastated.
(Administrative texts, kok-001321, Dutch-English)
Table 3 Top three of correspondents of ‘inferential’ so and dus in translations and backtranslations
in absolute numbers (N) and adjusted percentages (excluding tokens with zero correspondent)
so > Dutch target text
marker                                            N       Adj.%
dus                                               32      76.2
daarom (‘that’s why’), daardoor (‘because of
  that’), dan (‘then’)                            3 × 2   3 × 4.8
en (‘and’), op dezelfde wijze (‘in this way’),
  waardoor (‘by which’), wat maakt (‘which
  makes’)                                         4 × 1   4 × 2.4
zero                                              9
Total                                             51

Dutch source text > so
marker                                            N       Adj.%
dus                                               35      85.4
dan ook (‘as a result’)                           2       4.9
ook (‘also’), vandaar dat (‘hence’), zo
  (‘well’), zodat (‘so that’)                     4 × 1   4 × 2.4
zero                                              6
Total                                             47

dus > English target text
marker                                            N       Adj.%
therefore                                         88      47.6
so                                                35      18.9
thus                                              32      17.3
16 other                                          30      16.2
zero                                              58
Total                                             243

English source text > dus
marker                                            N       Adj.%
so                                                32      50.8
therefore, thus                                   2 × 10  2 × 15.9
and                                               3       4.8
8 other                                           8       12.7
zero                                              32
Total                                             95
The inferential category accounts for 15.7% (N = 51) of source text tokens of so,
and 12.6% (N = 47) of its backtranslations. These ratios are considerably higher for
dus, with 27.1% (N = 243) of its source text tokens indexing an inferential relation
as well as 28.5% (N = 95) of its backtranslations. As Table 3 shows, dus is almost
the exclusive correspondent for so, both for its translations into Dutch and for its
backtranslations. A similar situation holds for the backtranslations of dus, but not
for the inferential tokens of dus that are translated into English, where so changes
places with therefore. Interestingly, 23.9% (N = 58) of inferential tokens of dus are
not translated into English at all, and 33.7% (N = 32) of its backtranslations were
added by the translator as there is no corresponding marker in the English source
text. This is less the case for so with 17.7% (N = 9) of translations and 12.8% (N =
6) of backtranslations not having a correspondent.
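The zero-correspondence rates cited in this paragraph are, unlike the adjusted percentages in the tables, calculated against the full category total. A minimal sketch of that calculation follows, with counts copied from the 'zero' and 'Total' rows of Table 3; the names zero_rate and inferential are hypothetical and serve only this illustration.

```python
def zero_rate(zero: int, total: int) -> float:
    """Percentage of tokens in a category that lack a correspondent."""
    return round(100 * zero / total, 1)


# (zero, Total) pairs copied from Table 3 ('inferential' category).
inferential = {
    "dus > English target text": (58, 243),
    "English source text > dus": (32, 95),
    "so > Dutch target text": (9, 51),
    "Dutch source text > so": (6, 47),
}

rates = {label: zero_rate(z, t) for label, (z, t) in inferential.items()}

for label, rate in rates.items():
    print(f"{label}: {rate}%")
# Note: with these counts, 9 of 51 rounds to 17.6, a tenth of a point
# below the 17.7% cited in the running text.
```

Under these counts the rates for dus are roughly double those for so in the backtranslations, which is the asymmetry the paragraph points to.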
retrieved from a group of segments in the prior co-text. This may take the form of
(i) a summarizing conclusion stating the upshot of that section of the discourse, (ii)
an opinion that rests on preceding argumentation, or (iii) a request motivated in the
prior discourse.
First, the summarizing function has been attributed to so in prior research (cf.
Redeker 1990; Müller 2005; Buysse 2012), and can be confirmed for dus as well in
the DPC. The idea that a successful approach to the climate issue rests on two pillars
has been developed at length in the prior co-text of example (12). To bring this
section of the text to a close this argumentation is summarized to a single claim,
marked with dus.
(12) Naar mijn overtuiging is energie de alfa en omega van de discussie. Energie
is de motor achter ontwikkeling. De drijvende kracht achter een beter leven
voor honderden miljoenen mensen. In developing countries weten grote
groepen mensen met hard werken een betere toekomst te verwerven voor
zichzelf en hun kinderen. Zij ontsnappen aan de armoede en de ellende. Dat
gaat niet zonder energie. Het International Energy Agency schat dat het
energiegebruik van de developing countries de komende halve eeuw met
230 procent zal stijgen. Zij zijn dan goed voor meer dan de helft van het
wereldwijde energiegebruik. Elke succesvolle aanpak van het
klimaatprobleem dient dus gebaseerd te zijn op twee pijlers: energie is de
sleutel en nauwe samenwerking tussen developed en developing countries
is cruciaal.
[‘Any successful approach to the climate problem ought to be dus based on
two pillars: …’]
In my opinion, the whole discussion revolves around energy. Energy is the
driving force behind development. The key to a better life for hundreds of
millions of people. In developing countries, large numbers of people work
extremely hard to secure a better future for themselves and their children.
That’s how they escape poverty and hardship. But it all costs energy. The
International Energy Agency expects that total energy consumption in the
developing world will rise by 230% over the next 50 years. That will be
more than half of total global consumption. So, any successful approach to
climate change must be built on two central ideas: one: energy is the key,
and two: close cooperation between developed and developing countries is
essential.
(Administrative texts, bal.-001248, Dutch-English)
The second type of textual conclusion does not summarize what precedes as
much as it takes the prior co-text as the grounds to voice an opinion and thereby end
a section or turn. Excerpt (13) has been drawn from the proceedings of the European
Parliament. In a debate on immigration policy an MEP describes the situation and
ends with a so-prefaced segment stating that countries ought to work together to
address the challenges sketched in the prior co-text.
(13) As Mr. Duquesne implied in his contribution, if the recent happenings have
shown us anything, it is that we cannot turn a blind eye to events around the
world and hope that they will go away. The problems of people wanting
asylum, the situation and the plight of the people of poor and troubled
countries all over the world are our concerns and they manifest themselves
on our doorsteps, on our shores and in our parliaments if we do not address
them. (…) This is the plight of desperate people seeking desperate
measures to start a new life. But these people are not resorting to this sort
of action lightly; they are escaping from terror, war, torture, rape, vile
regimes posing as governments and, of course, in some cases, poverty.
So there can be no more appropriate time for countries to be working
together to confront these humanitarian challenges.
De heer Duquesne zei het al in zijn bijdrage: als de recente gebeurtenissen
ons iets hebben geleerd, is het dat we gebeurtenissen in de wereld niet
zomaar kunnen negeren en dan maar hopen dat het probleem vanzelf
verdwijnt. De problemen van asielzoekers, de erbarmelijke omstandigheden
waarin mensen in arme en noodlijdende landen over de hele wereld
verkeren, zijn ook onze zaak en als we er niets aan doen, zullen we er van
dichtbij, aan onze eigen kusten en in onze eigen parlementen, mee worden
geconfronteerd. (…) Zo wanhopig zijn deze mensen dat ze op deze
hachelijke wijze een nieuw leven willen beginnen. Maar deze mensen doen
dit niet zomaar; ze zijn op de vlucht voor terreur, oorlog, martelingen,
verkrachting, verachtelijke regimes die zich regering noemen, en in
sommige gevallen natuurlijk ook armoede. Er is dus geen beter moment
voor landen om deze humanitaire problemen met vereende krachten aan te
pakken. [‘There is dus no better moment for countries to address these
humanitarian problems with joint forces.’]
(Administrative texts, erp-000443, English-Dutch)
Third, in interaction so can also mark a speech act of request (cf. Fraser 1990;
Schiffrin 1987; Müller 2005), in which case it relates the request (or, by extension,
a directive) to a preceding motivation or justification. In the DPC dus takes on this
role as well, but contrary to the other two types of textual conclusion, so and dus are
rarely each other’s correspondents in this function. Apart from zero correspondence,
therefore is the most likely alternative in English, and in Dutch dan ook (literally
‘then also’, which translates best into English as ‘therefore’ or ‘hence’) and daarom
(‘that is why’) stand out.
Typically, a lengthy turn is rounded off with a call to (specific members of) the
audience to perform an action based on the argumentation developed in the prior
co-text, as in (14) and (15), both of which have also been taken from a parliamentary
debate. The speech act tends to be explicitly marked by phrases such as I urge you/
ik dring er bij u op aan, I call upon you/ik doe een beroep op u, I ask/ik vraag, etc.
(14) I am glad that the rapporteur has eventually agreed that we need a
compromise on agriculture, so I urge you all to vote for Amendment No 11.
Ik ben blij dat de rapporteur er uiteindelijk mee heeft ingestemd dat er een
compromis nodig was met betrekking tot de landbouw. Ik dring er dan
ook bij u op aan om vóór amendement 11 te stemmen. [‘I urge dan ook you
to vote in favour of amendment 11.’]
(Administrative texts, erp-000447, English-Dutch)
(15) Daarom, collega’s moeten wij ervoor zorgen dat wij voldoende hulp aan
deze regio bieden. En wat doen we? In de begroting 2002 schroeven we de
begrotingsmiddelen voor deze regio terug. Ik roep dus de collega’s van de
Begrotingscommissie en de hele plenaire vergadering op om die begroting
weer recht te zetten en aan Centraal-Azië te geven waar het recht op heeft.
[‘I call dus on the colleagues of the Budgets Committee and the entire
plenary session to rectify this budget and give Central Asia what it is
entitled to.’]
That is why we need to ensure that we lend sufficient support to that region.
And what do we do? In the 2002 budget, we cut back the budgetary
resources allocated to that region. I therefore urge the MEPs of the
Committee on Budgets, together with the entire plenary session, to rectify
that budget and to give Central Asia what it is entitled to.
(Administrative texts, erp-000450, Dutch-English)
Both so and dus mark a textual conclusion in over one quarter of their translations
into the other language (with respectively 26.5%, N = 86, and 27.5%, N = 246). In
backtranslations the shares for this function amount to over one fifth (20.4%, N
= 76 for so; 22.2%, N = 74 for dus). The unchallenged preferred correspondent for
so in translations into Dutch as well as backtranslations is dus (Table 4). This is
reciprocated for the backtranslations of dus, but not for its translations into English,
where therefore ranks highest, accounting for a majority of translations, followed by
over one fifth for so.
Table 4 Top three of correspondents of so and dus as textual conclusion markers in translations
and backtranslations in absolute numbers (N) and adjusted percentages (excluding tokens with
zero correspondent)
so > Dutch target text
marker                                            N       Adj.%
dus                                               42      59.2
daarom (‘that’s why’)                             11      15.5
dan ook (‘as a result’)                           8       11.3
7 other                                           10      14.1
zero                                              15
Total                                             86

Dutch source text > so
marker                                            N       Adj.%
dus                                               45      66.2
daarom (‘that’s why’)                             12      17.7
dan ook (‘as a result’)                           8       11.8
3 other                                           3       4.4
zero                                              8
Total                                             76

dus > English target text
marker                                            N       Adj.%
therefore                                         106     52.2
so                                                45      22.2
thus                                              17      8.4
17 other                                          35      17.2
zero                                              43
Total                                             246

English source text > dus
marker                                            N       Adj.%
so                                                42      68.9
thus                                              8       13.1
therefore                                         3       4.9
6 other                                           8       13.1
zero                                              13
Total                                             74
Boundary markers are signposts for and between larger units in the discourse, and
as such help the recipient of the message to follow the thread of the discourse. In the
DPC we can distinguish three types of boundary markers: those that (i) mark pivotal
transitions between adjacent sections, (ii) mark a return to the main discourse unit, and
(iii) introduce questions.
First, so and dus can mark boundaries between adjacent discourse sections,
introducing sentences that serve a transitional or pivotal goal between two larger
discourse segments. For example, in (16) the author rounds off his discussion of a
first problem in his exposé with a transitional sentence, marked by dus/so, before
moving on to issues of secondary importance, whereas in (17) a new section is
started with a transitional sentence that refers back to the previous section.
(16) Als we diverse kustlijnen in Europa bekijken (…) dan liggen op een afstand
van een paar honderd kilometer twee, misschien wel zes, zeven, acht hele
grote havens waar de concurrentie vooral voortspruit uit die afstand tussen
de havens. Die is namelijk kort. (…) Dit leidt echter tot een valse
concurrentie waarvan we eigenlijk niet gediend zijn. Het gaat dus in eerste
instantie om dat probleem. [‘It concerns dus in first instance this problem.’]
If we look at the various coastlines in Europe (…), there are two, maybe
six, seven or eight major ports located within a few hundred kilometres of
each other, where competition is mainly generated from that small distance
between the ports. (…) However, this leads to distorted competition which
does not really benefit us. So that is the first problem.
(Administrative texts, erp-000458, Dutch-English)
(17) Eerst even een misverstand rechtzetten: het Nederlands is geen kleine taal.
Het is de moedertaal van ruim 15 miljoen Nederlanders en 6 miljoen
Vlamingen; en nog eens 400.000 Surinamers maken er dagelijks gebruik
van. Er zijn ruim 6.000 talen in de wereld en op de ranglijst staat het
Nederlands tussen de 45ste en 50ste plaats. (…) Zij plaatsen het Nederlands
zelfs in de top twintig.
[section heading] Een gezonde taal
Het Nederlands verkeert dus in stralende gezondheid. Toch is er zorg nodig
als we dit zo willen houden, want talen kunnen snel terrein verliezen.
[‘Dutch is dus in radiant health.’]
Let us begin by correcting a misconception: Dutch is not a small language.
There are more than 6000 languages in the world and Dutch is ranked
somewhere between 45th and 50th (…) They place Dutch in the top 20.
[section heading] A healthy language
So, Dutch is in fine fettle. However, care is required if we want to keep it
that way, since languages can easily lose ground.
(Journalistic texts, vla-002265, Dutch-English)
At the start of a new section of a text, so is either translated by dus or not trans-
lated at all. Dus, on the other hand, has a wider range of correspondences besides so,
such as then – as in (18) – thus, therefore, clearly, and it can be seen that.
(18) ELAt zit er warmpjes in met zijn risicokapitaal, van zaaigeld tot
risicokapitaal. Dat geld wordt beheerd door mensen met talent om jonge
bedrijven te begeleiden in hun groei. (…) Gedurende vele jaren, zeker
sedert de “golden sixties”, vestigden zich honderden buitenlandse bedrijven
tussen Leuven-Eindhoven en Aken. Die beweging is niet stilgevallen.
[section heading] Topklasse
ELAt beschikt dus over vele troeven om een belangrijke rol te spelen in het
wereldwijde internationale net van kenniseconomieën. [‘ELAt possesses
dus many assets in order to play an important role in the worldwide net of
knowledge economies.’]
ELAt is awash with capital, from seed money to risk capital. The money is
managed by talented individuals with a view to supporting young
companies in their growth phase. (…) Over a period of many years, and
particularly since the “golden sixties”, hundreds of foreign companies have
set up shop in the area between Leuven-Eindhoven and Aachen, and the
trend continues apace.
[section heading] Top class
ELAt, then, has many assets enabling it to play a major role in the global
international network of knowledge economies.
(Journalistic texts, vla-002265, Dutch-English)
Second, dus and so can act as “pop marker[s]” (Polanyi and Scha 1983: 265),
which indicate a transition to a “main idea unit” (Schiffrin 1987; Müller 2005;
Buysse 2012) or a “return to a main point” (Lam 2010: 662), whereby a relation is
indexed between two non-adjacent discourse segments. In (19) the narrator of the
story gets distracted by his memory of another character’s scent and elaborates on
that before returning to the main focus of the narration.
(20) And yet in his own life he goes to great lengths to avoid company, even
though he does get lonely. “If I didn’t I’d be superhuman. I’m sure even the
Pope gets lonely.” So why does he choose to be alone? “Well, you see, I
consider that to be a privilege. (...)”
Maar in zijn eigen leven heeft hij er alles voor over om gezelschap te
vermijden, ook al voelt hij zich wel degelijk eenzaam. “Anders zou ik een
supermens zijn. Ik weet zeker dat zelfs de paus eenzaam kan zijn.” Waarom
wil hij dan alleen zijn? “Wel, ik beschouw dat als een voorrecht. (...)” [‘Why
does he dan want to be alone?’]
(Journalistic texts, sta-002483, English-Dutch)
(21) In Nederland zijn we ervan overtuigd dat ook de overheid één van de
partners is bij het opvoeden van kinderen. Deze houding betekent een breuk
met het verleden. Waar in het verleden de autonomie van het gezin bijna
onaantastbaar leek, kiest de huidige regering voor een andere benadering.
(…) Het belang van ieder kind. Dus van de 95% van de Nederlandse
kinderen waar het goed mee gaat en die tevreden zijn met hun leven. Maar
ook van de 5% die het moeilijk hebben. Wat doen we dan zoal? Om te
beginnen stellen we het kind centraal. [‘What sort of things do we dan do?’]
My government believes that the state also has a responsibility when it
comes to raising children. That’s a major change in attitude. In the past,
family autonomy was largely unquestioned. But the present government
wants to change that. (…) I’m talking about all children’s interests. Not just
the 95% of Dutch children who are doing well and are happy. But also the
5% who have problems. So what are we doing? To start with, we are putting
children first.
(Administrative texts, bal.-001241, Dutch-English)
The only context in which dus surfaces in an interrogative clause is when the
question merely seeks confirmation of the inference the speaker/author has drawn
from the preceding co-text, as illustrated in (22).
(22) “Enkele jaren geleden stelden wij vast dat wij al 30 jaar teveel
verschillende onderdelen verwerkten in onze vrachtwagens. Dat was duur
en onpraktisch, dus hielden wij ons designsysteem kritisch tegen het licht
en bedachten wij een nieuw en beter alternatief. Vandaag produceren wij
bijvoorbeeld drie verschillende cabines, maar daarin monteren wij steeds
dezelfde spoiler.”
Het gaat dus om een vorm van standaardisering? [‘It concerns dus a form of
standardisation?’]
“We realised that for more than 30 years we were using too many different
components in our trucks. This was expensive and impractical, so we took
a critical look at our design system and worked out a better one. Today, for
example, we produce three different base cabs but they all have the same
windscreen.”
So standardisation is the name of the game?
(External communication, arc-002053, Dutch-English)
The boundary marking function has been attested in 11.7% (N = 38) of transla-
tions of so and 15.6% (N = 58) of backtranslations. The shares for dus are consider-
ably more modest with 3.7% (N = 33) of translations and 2.4% (N = 8) of
backtranslations exhibiting this function. The overview of correspondents for so and
dus in Table 5 indeed suggests that so is used more often for boundary marking than
dus. Interestingly, almost half of boundary marker tokens of so in an English source
text do not have a correspondent in a Dutch target text and at the same time so was
used as a boundary marker 22 times in an English translation without there being a
Table 5 Top three of correspondents of so and dus as boundary markers in translations and
backtranslations in absolute numbers (N) and adjusted percentages (excluding tokens with zero
correspondent)
so > Dutch target text
marker                                            N       Adj.%
dan (‘then’)                                      11      55.0
dus                                               6       30.0
daarom (‘that’s why’), zo (‘well’), de vraag
  is nu (‘the question now is’)                   3 × 1   3 × 5.0
0 other                                           0       0.0
zero                                              18
Total                                             38

Dutch source text > so
marker                                            N       Adj.%
dus                                               12      33.3
dan (‘then’)                                      10      27.8
nu (‘now’), zo (‘well’)                           2 × 4   2 × 11.1
4 other                                           6       16.7
zero                                              22
Total                                             58

dus > English target text
marker                                            N       Adj.%
so                                                12      48.0
therefore                                         5       20.0
thus                                              2       8.0
6 other                                           6       24.0
zero                                              8
Total                                             33

English source text > dus
marker                                            N       Adj.%
so                                                6       75.0
then                                              2 × 1   2 × 12.5
zero                                              0
Total                                             8
Dutch correspondent in the source text. This might point at a tendency whereby
boundaries between large sections of discourse are marked more explicitly in
English than in Dutch.8 The overall numbers for this category are fairly small, however,
so caution should be exercised in drawing any firm conclusions.
8 Similar observations have been made about the correspondence between Swedish and English
(see Altenberg 2007).
Table 6 Top three of correspondents of ‘sequential’ so and dus in translations and backtranslations
in absolute numbers (N) and adjusted percentages (excluding tokens with zero correspondent)
so > Dutch target text
marker                                            N       Adj.%
dus                                               18      66.7
zo (‘well’)                                       3       11.1
dan (‘then’)                                      2       7.4
4 other                                           4       14.8
zero                                              9
Total                                             36

Dutch source text > so
marker                                            N       Adj.%
dus                                               7       31.8
dan (‘then’), en (‘and’), zo (‘well’)             3 × 3   3 × 13.6
daarom (‘that’s why’), toen (‘then’)              2 × 2   2 × 9.1
2 other                                           2       9.1
zero                                              5
Total                                             27

dus > English target text
marker                                            N       Adj.%
so                                                7       87.5
therefore                                         1       12.5
zero                                              4
Total                                             12

English source text > dus
marker                                            N       Adj.%
so                                                18      90.0
and, instead                                      2 × 1   2 × 5.0
zero                                              2
Total                                             22
(23) Renata, who could speak English, became an interpreter with the British
army, and suggested her sister enlist as well. “She said to me, ‘Why don’t
you become an interpreter too?’ I said: ‘I can’t speak English.’ She said, ‘It
doesn’t matter.’ So I became an interpreter and we were part of the British
army.”
Renata, die Engels kende, werd tolk voor het Britse leger en stelde haar zus
voor zich ook aan te melden. “Ze zei me: ‘Waarom word je ook geen tolk?’
Ik zei, ‘Ik ken geen Engels.’ Ze zei dat dat niets uitmaakte. Dus werd ik tolk
en we maakten deel uit van het Britse leger.” [‘Dus I became an
interpreter…’]
(Journalistic texts, sta-002505, English-Dutch)
3.6 Elaboration/Restatement
Rendle-Short 2003; Müller 2005; Buysse 2009, 2012; and Lam 2009, 2010). In (24)
the information on the company’s result is further enhanced by indicating that it is
similar to that of the third quarter of the year before. The use of so in English and
dus in Dutch suggests that this information can be inferred from the wide context of
the company’s figures (albeit not from the preceding co-text). Similarly, the addition
that a diagnosis can now be made without using film in (25) can be inferred from the
context (e.g. one’s knowledge of current practice as well as from the obvious fact
that a computer screen does not require printing) but is worth mentioning as it is
likely to be one of the main advantages of the new technology.
(24) Assuming the value of the US dollar does not further decline relative to the
euro, the Company expects to achieve an operating result before
amortization of consolidation goodwill between euro 7 and 12 million,
so comparable to the third quarter of 2002.
Uitgaande van de veronderstelling dat de waarde van de US dollar niet
verder zakt in verhouding tot de euro, verwacht de groep een operationeel
resultaat vóór afschrijving van de consolidatiegoodwill tussen 7 en 12
miljoen euro, dus vergelijkbaar met dat van het derde kwartaal van 2002.
[‘…, dus comparable to that of the third quarter of 2002.’]
(External communication, bco-002443, English-Dutch)
(25) Next to this there are the graphics controllers that feature state-of-the-art
technology for the rendering of reliable and accurate images for diagnosis
on the screen, so without using film.
Verder zijn er ook de grafische borden die instaan voor heel nauwkeurige
en volledig betrouwbare beelden gebruikt voor diagnose op het scherm,
dus zonder gebruik van film. [‘…, dus without use of film.’]
(External communication, bco-002368, English-Dutch)
(26) Het betreft tevens een vrij soepele formule: gedurende de eerste 6 maanden
van elk jaar (dus van 1 januari tot 30 juni) kunt u de terugbetaling vragen
van uw Record-aandelen. [‘…(dus from 1 January to 30 June)…’]
In addition, it is a relatively flexible formula: during the first 6 months of
each year (i.e. from 1 January to 30 June) you can apply to redeem your
Record shares.
(External communication, ing-001886, Dutch-English)
Of all tokens of so that were translated into Dutch 6.5% (N = 21) had an ‘elabora-
tive’ function as well as 10.5% (N = 39) of its backtranslations. The picture looks
altogether different for dus, which has 23.7% (N = 212) of its translations in an
‘elaborative’ role and 15.9% (N = 53) of its backtranslations. When so is translated
Table 7 Top three of correspondents of so and dus with an elaborative function in translations and
backtranslations in absolute numbers (N) and adjusted percentages (excluding tokens with zero
correspondent)
so > Dutch target text
marker                                            N       Adj.%
dus                                               10      71.4
bijvoorbeeld (‘for example’), en (‘and’), zo
  (‘in this way’), zodoende (‘as such’)           4 × 1   4 × 7.1
0 other                                           0       0.0
zero                                              7
Total                                             21

Dutch source text > so
marker                                            N       Adj.%
dus                                               10      30.3
zo (‘in this way’)                                8       24.2
daarom (‘that’s why’)                             3       9.1
10 other                                          12      36.4
zero                                              6
Total                                             39

dus > English target text
marker                                            N       Adj.%
therefore                                         52      35.6
i.e.                                              25      17.1
in other words                                    15      10.3
21 other (incl. so, N = 10)                       54      37.0
zero                                              66
Total                                             212

English source text > dus
marker                                            N       Adj.%
so                                                10      38.5
thus                                              6       23.1
hence                                             3       11.5
7 other                                           7       26.9
zero                                              27
Total                                             53
into Dutch, dus is by far its most frequent correspondent, but this certainly does not
hold when ‘elaborative’ dus is rendered into English, with so accounting for not
even 4% of corresponding tokens (Table 7).
3.7 Reiteration
There is one functional category that only involves dus, viz. when it marks (a part
of) an utterance as having been stated before at some point in the prior co-text. This
function differs from others in that it does not resume a topic that was temporarily
suspended (as a pop marker would), restate a claim (as with elaboration/restate-
ment) or mark a claim that is to be inferred from the prior co-text. Rather, it subtly
indicates to the reader/hearer that the writer/speaker somehow finds it relevant to
deliberately reiterate a proposition that has already been stated before. For example,
in (27) a section in which a sentence from an Old Dutch manuscript is analysed is
followed by a section on the origins of the letters of the alphabet. In the former sec-
tion the author has already commented on the spelling of w, which also has rele-
vance for the latter section. In the list of examples in that section brief reference is
made to the aforementioned case of w. Dus prompts the reader to recall this example
from its earlier discussion.
(27) Verder maakt het woord ‘uuerk’ duidelijk dat de letter ‘w’ ontstaan is door
een samenvoeging van twee keer een ‘u’. [‘that the letter ‘w’ originates
from a combination of ‘u’ twice’.] In het Engels noemen ze de ‘w’ ook
‘double u’. De ‘w’ kwam pas in de elfde eeuw in gebruik. (...)
[section heading] Herkomst van onze alfabetletters
Veel van onze hoofdletters zijn ontstaan uit tekeningen (pictogrammen). De
M bijvoorbeeld is (…) Maar er zijn ook letters uit andere letters ontstaan.
Zo is de G ontstaan uit de C; er is simpelweg een streepje bijgezet. En de
W is dus twee keer een U. [‘And W is dus U twice.’] De U zelf is trouwens
een geronde V.
The word ‘uuerk’ (‘werk’ in modern Dutch) clearly shows that the letter
‘w’ has its origins in ‘u’ written twice and joined together, as is implied by
the English name for the letter, ‘double u’. The ‘w’ did not come into use
until the eleventh century. (…)
[section heading] Origin of the letters of the alphabet
Many of our capital letters come from drawings (pictograms). The M, for
example, (…) But there are also letters that have come from other letters.
For example, the G came from the C; a little dash has simply been added.
And the W is U written twice. The U itself, incidentally, is a rounded V.
(Non-fictional literature, ons-000476, Dutch-English)
This function is not very common, as it only accounts for 2.9% (N = 26) of
source text tokens of dus into English and 2.7% (N = 9) of backtranslations. Over
two thirds of these tokens do not have a correspondent in English (18 and 6 tokens,
respectively). In translations therefore occurs 3 times, i.e. 2 times, and as, but, and
to do this once each; in backtranslations and, instead and therefore occur once each.
4 Quantitative Analysis of Correspondence between So and Dus
4.1 Correspondents
The overall correspondence between so and dus differs depending on the translation
direction (Table 8). In total 148 tokens of so are rendered in Dutch with dus, which
amounts to nearly 46% of source text tokens of so and slightly over 44% of target
text tokens of dus. These numbers are even considerably higher, with 59.7% and
60.9% respectively, when zero correspondences are ignored. Clearly, when deciding
on a Dutch correspondent for so, translators opt for dus in a majority of cases.
In absolute terms the numbers are similar when Dutch source texts are translated
into English, with 152 tokens of dus translated with so, but in relative terms the
differences are more pronounced: almost 41% of target text tokens of so are translations
of dus, which compares neatly with the number for the English-Dutch
translation direction, but a mere 17% of source text tokens of dus are translated
with so. Again these rates are somewhat higher if zero correspondences are not
taken into account (48.6% and 22.7%, respectively).

Table 8 Overall top three correspondents of so and dus in translations and backtranslations, in
absolute numbers (N) and adjusted percentages (Adj.%, excluding tokens with zero correspondent)

so > Dutch target text
marker                         N        Adj.%
dus                            148      59.7
daarom (‘that’s why’)          22       8.9
dan (‘then’)                   13       5.2
22 other                       65       26.2
zero                           77
Total                          325

Dutch source text > so
marker                         N        Adj.%
dus                            152      48.6
daarom (‘that’s why’)          38       12.1
en (‘and’), zo (‘well’)        2 × 14   2 × 4.5
30 other                       95       30.3
zero                           59
Total                          372

dus > English target text
marker                         N        Adj.%
therefore                      278      41.5
so                             152      22.7
thus                           63       9.4
48 other                       177      26.4
zero                           226
Total                          896

English source text > dus
marker                         N        Adj.%
so                             148      60.9
thus                           26       10.7
therefore                      21       8.6
23 other                       48       19.8
zero                           91
Total                          333
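The raw and adjusted percentages quoted for the two source-text directions follow directly from
the counts in Table 8. As a minimal sketch of that arithmetic (the function name shares and the
dictionary layout are illustrative assumptions, not part of the study's own tooling):

```python
# Counts copied from Table 8 (source-text tokens and their main correspondent).
# The dictionary layout and all names here are illustrative assumptions.
TABLE_8 = {
    "so > Dutch": {"total": 325, "zero": 77, "focus": ("dus", 148)},
    "dus > English": {"total": 896, "zero": 226, "focus": ("so", 152)},
}


def shares(total: int, zero: int, n: int) -> tuple[float, float]:
    """Return (raw %, adjusted %) for a correspondent with n tokens.

    The raw share takes all tokens as its base; the adjusted share
    excludes tokens with a zero correspondent.
    """
    raw = 100 * n / total
    adjusted = 100 * n / (total - zero)
    return round(raw, 1), round(adjusted, 1)


for direction, row in TABLE_8.items():
    marker, n = row["focus"]
    raw, adjusted = shares(row["total"], row["zero"], n)
    print(f"{direction}: {marker} raw {raw}%, adjusted {adjusted}%")
```

The gap between the raw and the adjusted share grows with the number of zero correspondents,
which is why both figures are reported side by side.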
The absence of a correspondent for about one quarter of tokens – 23.7% for so
and 25.2% for dus – is in line with previous findings on the translation of pragmatic
markers and causal connectors (e.g. Bazzanella and Morra 2000; Aijmer and
Altenberg 2002; Altenberg 2007, 2010), and can be explained by their syntactic and
semantic optionality. Interestingly, though, 27.0% of target text tokens of dus do not
have a correspondent in the English source texts and have, therefore, been added by
the translator. This compares to a much lower rate of 15.9% of target text tokens of
so.
The preference for therefore over so may be explained by two main factors. First,
many of the texts included in the DPC have been taken from a fairly formal context.
As therefore is considered more formal than so and is certainly more typical of writ-
ten language (Biber et al. 1999: 887), translators may have felt it more often appro-
priate to translate dus with therefore than with so.
Second, the preferred position of dus more closely resembles that of therefore
than that of so, which may have led translators more easily to the former than to
the latter. In over two thirds of cases (68.3%, N = 612) dus occurs in mid-position
in Dutch source texts, compared to only 27.9% of times (N = 250) in clause-initial
position and 3.8% (N = 34) in final position. Of the 278 tokens of dus that have
been translated with therefore 225, amounting to 80.9%, take mid-position in the
source text.
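The positional argument rests on simple proportions. A minimal sketch, using only the counts
reported in this paragraph (the variable names are illustrative assumptions):

```python
# Position counts for dus in Dutch source texts, as reported in the text.
positions = {"initial": 250, "mid": 612, "final": 34}
total = sum(positions.values())  # all source-text tokens of dus

# Share of each position among all source-text tokens of dus (in %).
position_shares = {
    name: round(100 * count / total, 1) for name, count in positions.items()
}

# Of the 278 tokens of dus translated with "therefore", 225 are in mid-position.
therefore_mid_share = round(100 * 225 / 278, 1)

print(position_shares)
print(therefore_mid_share)
```

Comparing the mid-position share among the tokens rendered as therefore with the mid-position
share overall is what supports the claim that position nudges translators towards therefore.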
4.2 Functions
Figure 1 provides an overview of the functional distribution of so and dus for each
translation direction.⁹ Ideational and interpersonal relations are clearly at the heart of
so and dus, as they predominantly mark ‘resultative’ and ‘inferential’ relations.
Taken together these take up 40–50% of all tokens. There are, however, also func-
tional differences between so and dus.
First, the hierarchy between the ‘resultative’ and ‘inferential’ functions appears to
buttress, on the one hand, Schiffrin’s (1987) characterization of so as a “marker of
result” (1987: 191), and on the other hand, the status awarded to dus by Stukker et al.
(2008) of the prototypical marker of “epistemic causal relations” (2008: 1304). Whereas
28.6% of translations of so and 33.6% of its backtranslations have a ‘resultative’ func-
tion, the shares for dus are considerably lower, at 13.8% and 21.6% respectively. The
opposite holds for ‘inferential’ tokens: 27.1% of translated tokens of dus and 28.5% of
its backtranslations are ‘inferential’, compared to 15.7% and 12.6% for so.
[Figure: functional distribution of so and dus for each translation direction, in % of tokens]

          so > DU    dus > EN    DU > so    EN > dus
REIT      0.0        2.9         0.0        2.7
EL/RE     6.5        23.7        10.5       15.9
SEQ       11.1       1.3         7.3        6.6
BM        11.7       3.7         15.6       2.4
CONC      26.5       27.5        20.4       22.2
INF       15.7       27.1        12.6       28.5
RES       28.6       13.8        33.6       21.6

⁹ The absolute numbers can be found in the discussion of each function (see Sect. 3).

Second, the incidence of tokens serving a boundary marking function is considerably
higher for so than for dus: 11.7% of source text tokens and 15.6% of target text
tokens of so, as compared with 3.7% and 2.4% for dus. As suggested in Sect. 3.4, one
factor in this might be a more pronounced tendency to mark boundaries between large
sections of discourse in English than in Dutch, given the relatively high number of
zero correspondents for so. Another reason is the limited employability of dus for
one of the three types subsumed under the category, viz. introducing questions.
A third and final major observation is the high rate of tokens of dus elaborating
or restating a prior segment (23.7% in source texts and 15.9% in target texts), compared
to more modest rates for so (6.5% and 10.5%, respectively). This textual
function has a strong affinity with inferential conclusions – as both entail a process
of inferential deduction – making it a natural extension of the functional scope of a
prototypical inferential marker like dus.
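The first observation can be checked against the percentages reported in the figure for this
section. The sketch below is illustrative only: the data are copied from the figure, while the
names COLUMNS, FUNCTION_SHARES and core_profile are assumptions introduced here. For each
translation direction it returns the combined share of the two core functions and which of the
two is larger.

```python
# Percentages per function as reported in the figure for this section.
# Column order: so > DU, dus > EN, DU > so, EN > dus.
COLUMNS = ["so > DU", "dus > EN", "DU > so", "EN > dus"]
FUNCTION_SHARES = {
    "RES": [28.6, 13.8, 33.6, 21.6],
    "INF": [15.7, 27.1, 12.6, 28.5],
    "CONC": [26.5, 27.5, 20.4, 22.2],
    "BM": [11.7, 3.7, 15.6, 2.4],
    "SEQ": [11.1, 1.3, 7.3, 6.6],
    "EL/RE": [6.5, 23.7, 10.5, 15.9],
    "REIT": [0.0, 2.9, 0.0, 2.7],
}


def core_profile(column: str) -> tuple[float, str]:
    """Return (RES + INF share in %, label of the larger of the two)."""
    i = COLUMNS.index(column)
    res = FUNCTION_SHARES["RES"][i]
    inf = FUNCTION_SHARES["INF"][i]
    dominant = "RES" if res > inf else "INF"
    return round(res + inf, 1), dominant


for column in COLUMNS:
    print(column, core_profile(column))
```

In both columns for so the resultative share is the larger one, and in both columns for dus the
inferential share is, which is the asymmetry described above.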
The distribution of so and dus across text types is shown in Table 9, which indicates
that they are both highly common in administrative and journalistic texts. The same
holds for the text type literature (30.1% for so and 27.9% for dus), but there is a dif-
ference in the share taken up by the two kinds of literature in the corpus: fictional
and non-fictional literature are almost on a par for so (13.5% and 16.6%, respec-
tively), whereas fictional literature has a low share for dus (3.9%) compared with
non-fictional literature (24.0%). External communication accounts for 11.2% of
tokens of so and 19.5% of tokens of dus. These markers rarely occur in instructional
texts (1.0% and 1.1%, respectively).
An in-depth analysis of all frequencies of so and dus for each text type falls beyond
the scope of this article. Some observations are nonetheless worth pointing out. The
shares of the text types in resultative and inferential tokens of so and dus largely mir-
ror the shares in the overall incidence of these markers, presented in Table 9. For all
other categories, however, those text types that offer a more productive environment for
so and dus to fulfil a specific function exceed the weight that they might be expected
to have based on the overall occurrence of the markers in the text type.
Table 9 Tokens of so and dus according to their incidence in each text type

                             so                dus
Text type                    N       %         N       %
Administrative texts         157     22.5      279     22.7
External communication       78      11.2      240     19.5
Instructions                 7       1.0       14      1.1
Journalistic texts           245     35.2      353     28.7
Literature: Fiction          94      13.5      48      3.9
Literature: Non-fiction      116     16.6      295     24.0
Total                        697     100.0     1229    100.0

Although administrative texts only take up slightly over one fifth of overall
tokens of so and dus (Table 9), this text type accounts for 51.9% of tokens of so
marking a textual conclusion and 36.6% of dus in this function. Clearly, administrative
texts such as speeches and lengthy interventions in parliamentary sessions are
well-suited for so and dus to introduce an opinion at the end of an argumentative
stretch of monologic discourse, to preface a request that rests on such argumentation
or state a summarizing upshot of the prior co-text.
Similarly, journalistic texts dominate the boundary marking function with 50.0%
for tokens of so and 43.9% for tokens of dus, as compared with overall shares for
this text type of 35.2% and 28.7%, respectively. This too can be attributed to the
nature of the text type: questions abound in, for example, interviews, but journalists
also often make use of pivotal sentences as rhetorical signposts for the reader.
Fictional texts equally take up a larger share of boundary marking tokens of so and
dus (24.0% and 14.6%, respectively) than they do of these markers in general
(13.5% and 3.9%, respectively), which is particularly due to the capacity of so and
dus to signal a return to the main discourse unit. The shares of all other genres
within this function are, on the other hand, lower than their overall shares, with
non-fiction reaching a low of 1.0% for so and 14.6% for dus (as opposed to 16.6% and
24.0% overall, respectively).
The sequential category is also dominated by tokens from journalistic texts
(49.2% for so and 58.8% for dus), as such texts often feature stories with a sequen-
tial structure. At the other end of the spectrum administrative texts (3.2% and 2.9%)
and external communication (4.8% and 11.8%) do not contain as many sequential
tokens of so and dus as they do of these markers on the whole.
Many elaborative instances of so and dus, finally, have been taken from jour-
nalistic texts (35.0% and 24.9%), in line with their overall share in the corpus
(35.2% and 28.7%), but the category external communication (21.7% for so and
27.9% for dus) clearly outnumbers its shares in their overall incidence (11.2%
and 19.5%), which can particularly be attributed to the many instances of finan-
cial reports and press releases where specific phrases are further spelt out, as
illustrated in (24).
In sum, so and dus occur in all text types represented in the DPC, but journalistic
and administrative texts together account for the majority of tokens, whereas
their incidence is marginal in instructional texts. For the functions that are closest to
the functional core of so and dus, viz. resultative and inferential uses, each text
type accounts for a share that closely resembles its share in the overall frequency
of the markers. The other functions, however, are more specific and therefore better
suited to certain environments that are more common to one text type than to
another.
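One way to make the comparison between a text type's share within a function and its overall
share explicit is a simple ratio of the two. This ratio is not a measure used in the study; it
is an illustrative assumption, as are the names below. The percentages are those quoted in
Table 9 and in the discussion above.

```python
# Overall share of each text type per marker (Table 9, in %).
OVERALL = {
    "administrative": {"so": 22.5, "dus": 22.7},
    "journalistic": {"so": 35.2, "dus": 28.7},
    "fiction": {"so": 13.5, "dus": 3.9},
    "external communication": {"so": 11.2, "dus": 19.5},
}

# Share of a text type within one function, as quoted in the text (in %).
WITHIN_FUNCTION = {
    ("conclusion", "administrative"): {"so": 51.9, "dus": 36.6},
    ("boundary marking", "journalistic"): {"so": 50.0, "dus": 43.9},
    ("boundary marking", "fiction"): {"so": 24.0, "dus": 14.6},
    ("elaboration", "external communication"): {"so": 21.7, "dus": 27.9},
}


def representation_ratio(function: str, text_type: str, marker: str) -> float:
    """Share within the function divided by the overall share.

    A value above 1 means the text type is over-represented in that function.
    """
    observed = WITHIN_FUNCTION[(function, text_type)][marker]
    expected = OVERALL[text_type][marker]
    return round(observed / expected, 2)


for (function, text_type), by_marker in WITHIN_FUNCTION.items():
    for marker in by_marker:
        ratio = representation_ratio(function, text_type, marker)
        print(f"{function} / {text_type} / {marker}: {ratio}")
```

All pairs listed here yield a ratio above 1, in line with the observation that the more
specific functions cluster in particular text types.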
5 Conclusion
The scrutiny of 1926 tokens of English so and Dutch dus in a bi-directional parallel
corpus has revealed that the functional scope of these markers is highly similar.
Apart from one, quantitatively marginal category (viz. reiteration) all seven
functions that have been attested apply to both markers. Within this functional spec-
trum the incidence of functions tends to vary. So appears to prefer ‘resultative’ over
‘inferential’ relations, and dus does the opposite, thereby confirming traditional
characterisations of the prototypical uses of so and dus. In the textual domain so is
predominantly a marker of boundaries between larger sections of the discourse,
whereas dus indexes elaboration or restatement.
The mutual correspondence of so and dus is overall quite high, although there is
a “translation bias” (Altenberg 1999: 258), in that the degree of correspondence is
considerably higher when so is translated into Dutch than when dus is translated
into English. This could be explained by three factors.
First, the incidence of so is in part determined by that of its most prominent rival
marker, viz. therefore, which can take on most of the roles played by so albeit in a
more formal context. In the relatively formal register in which the texts that make
up the corpus can be situated, translators have often chosen the more formal candi-
date. In this respect, Altenberg’s (1999) suggestion that “asymmetrical correspon-
dence” (1999: 259) between causal markers may be due to a difference in the
markers’ stylistic status in the two language systems appears to hold.
Second, a similar rival marker for dus is lacking in Dutch. The closest candidate
is daarom (‘that’s why’), but its incidence in the DPC comes nowhere near that of
dus. Stukker et al. (2008) contend that daarom is largely confined to “content causal
relations” – i.e. ‘resultative’ relations – for which the DPC proffers further evidence.
Although dus can more often be observed in a ‘resultative’ context than Stukker
et al. might suggest, daarom cannot be witnessed to enjoy the same degree of free-
dom and largely sticks to ‘resultative’ contexts.
Third, there may well be a tendency to mark certain relations more explicitly in
Dutch than in English. We should always bear in mind that the target texts are the
result of a process of translation. If a translator encounters a token of so in an
English text s/he may instantly think of dus as the default translation option, but this
also holds for the opposite translation direction. This makes it all the more remarkable
that dus has more often been added to target texts by translators without there
being a source text correspondent. This could point to a tendency in Dutch to mark
inferential relations more explicitly than in English, and with dus in particular,
which helps to account for the observed translation bias.
It should be stressed that the overview of the functional correspondence between
so and dus laid bare in this investigation is not fully comprehensive. The Dutch
Parallel Corpus suffers from one insurmountable drawback: being a parallel corpus
it is based on written texts, thereby excluding spontaneous speech. Consequently, a
number of functions that have been attested in prior research (e.g. floor-holding and
floor-yielding tokens of so; see e.g. Müller 2005) could not be observed in this
study. Future contrastive studies that concentrate on comparable rather than parallel
corpora could complement the findings for parallel corpora. The remarkable func-
tional similarities (and subtle differences) between so and dus will hopefully also
spark an interest in more such comparative studies, so that eventually a cross-
linguistic map of pragmatic markers in the ‘resultative’/‘inferential’ domain (e.g.
French donc, Spanish pues, German also) may be drawn up.
References
Aijmer, K., & Altenberg, B. (2002). Zero translations and cross-linguistic equivalence: Evidence
from the English-Swedish Parallel Corpus. In L. E. Breivik & A. Hasselgren (Eds.), From the
COLT’s mouth… and others’. Language corpora studies in honour of Anna-Brita Stenström
(pp. 19–41). Amsterdam: Rodopi.
Aijmer, K., & Simon-Vandenbergen, A.-M. (2003). The discourse particle well and its equivalents
in Swedish and Dutch. Linguistics, 41, 1123–1161.
Aijmer, K., Foolen, A., & Simon-Vandenbergen, A.-M. (2006). Pragmatic markers in translation:
A methodological proposal. In K. Fischer (Ed.), Approaches to discourse particles (pp. 101–
114). Amsterdam: Elsevier.
Altenberg, B. (1999). Adverbial connectors in English and Swedish: Semantic and lexical
correspondences. In H. Hasselgård & S. Oksefjell (Eds.), Out of corpora. Studies in honour of Stig
Johansson (pp. 249–268). Amsterdam: Rodopi.
Altenberg, B. (2007). The correspondence of resultive connectors in English and Swedish. Nordic
Journal of English Studies, 6, 2–25.
Altenberg, B. (2010). Conclusive English then and Swedish då: A corpus-based contrastive study.
Languages in Contrast, 10, 102–123.
Bazzanella, C., & Morra, L. (2000). Discourse markers and the indeterminacy of translation. In
I. Korzen & C. Marello (Eds.), Argomenti per una linguistica della traduzione. Notes pour une
linguistique de la traduction. On linguistic aspects of translation (pp. 149–157). Alessandria:
Edizione dell’ Orso.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Grammar of spoken and
written English. Harlow: Longman.
Bolden, G. B. (2009). Implementing incipient actions: The discourse marker ‘so’ in English con-
versation. Journal of Pragmatics, 41, 974–998.
Buysse, L. (2009). So as a marker of elaboration in native and non-native speech. In S. Slembrouck,
M. Taverniers, & M. Van Herreweghe (Eds.), From will to well. Studies in linguistics offered to
Anne-Marie Simon-Vandenbergen (pp. 79–91). Gent: Academia Press.
Buysse, L. (2012). So as a multifunctional discourse marker in native and learner speech. Journal
of Pragmatics, 44, 1764–1782.
Buysse, L. (2014). ‘So what’s a year in a lifetime so.’ Non-prefatory use of so in native and learner
English. Text & Talk, 34, 23–47.
Buysse, L. In press. Question tags in translation. An investigation into the translatability of English
question tags into Dutch. To appear in: Languages in Contrast (accepted).
Degand, L. (2001). Form and function of causation. A theoretical and empirical investigation of
causal constructions in Dutch. Leuven: Uitgeverij Peeters.
Degand, L. (2009). On describing polysemous discourse markers: What does translation add to the
picture? In S. Slembrouck, M. Taverniers, & M. Van Herreweghe (Eds.), From will to well.
Studies in linguistics offered to Anne-Marie Simon-Vandenbergen (pp. 173–183). Gent:
Academia Press.
Degand, L. (2011). Connectieven in de rechterperiferie. Een contrastieve analyse van dus en donc
in gesproken taal. Nederlandse Taalkunde, 16, 333–341.
Denturck, K., & Vandepitte, S. (2009). The translation of stance indexes: Causal connectors Dutch
want and dus and their French and English correspondents. In S. Slembrouck, M. Taverniers,
& M. Van Herreweghe (Eds.), From will to well. Studies in linguistics offered to Anne-Marie
Simon-Vandenbergen (pp. 185–197). Gent: Academia Press.
E-ANS. (2012). Algemene Nederlandse Spraakkunst. http://ans.ruhosting.nl/index.html. Accessed
28 February 2015.
Evers-Vermeul, J. (2010). Dus vooraan of in het midden? Over vorm-functierelaties in het gebruik
van connectieven. Nederlandse Taalkunde, 15, 149–175.
Fraser, B. (1990). An approach to discourse markers. Journal of Pragmatics, 14, 383–395.
Halliday, M. A. K., & Matthiessen, C. M. I. M. (2004). An introduction to Functional Grammar.
London/New York: Arnold.
Hogeweg, L. (2009). The meaning and interpretation of the Dutch particle wel. Journal of
Pragmatics, 41, 519–539.
Johansson, S. (2006). How well can well be translated? On the English discourse particle well and
its correspondences in Norwegian and German. In K. Aijmer & A.-M. Simon-Vandenbergen
(Eds.), Pragmatic markers in contrast (pp. 115–137). Oxford/Amsterdam: Elsevier.
Johnson, A. (2002). So…?: Pragmatic implications of so-prefaced questions in formal police inter-
views. In J. Cotterill (Ed.), Language in the legal process (pp. 91–110). Hampshire: Palgrave
MacMillan.
Lam, P. W. Y. (2009). The effect of text type on the use of so as a discourse particle. Discourse
Studies, 11, 353–372.
Lam, P. W. Y. (2010). Toward a functional framework for discourse particles: A comparison of well
and so. Text and Talk, 30, 657–677.
Macken, L., De Clercq, O., & Paulussen, H. (2011). Dutch parallel corpus: A balanced copyright-
cleared parallel corpus. Meta, 56, 374–390.
Müller, S. (2005). Discourse markers in native and non-native English discourse. Amsterdam/
Philadelphia: John Benjamins.
Niemegeers, S. (2009). Dutch modal particles maar and wel and their English equivalents in dif-
ferent genres. Translation and Interpreting Studies, 4, 47–66.
Norrick, N. R. (2008). Negotiating the reception of stories in conversation: Teller strategies for
modulating response. Narrative Inquiry, 18, 131–151.
Pander Maat, H., & Degand, L. (2001). Scaling causal relations and connectives in terms of
speaker involvement. Cognitive Linguistics, 12, 211–245.
Pander Maat, H., & Sanders, T. (1995). Nederlandse causale connectieven en het onderscheid tus-
sen inhoudelijke en epistemische coherentie-relaties. Leuvense Bijdragen, 84, 349–374.
Pander Maat, H., & Sanders, T. (2000). Domains of use or subjectivity? The distribution of three
Dutch causal connectives explained. In E. Couper-Kuhlen & B. Kortmann (Eds.), Cause, con-
dition, concession, contrast. Cognitive and discourse perspectives (pp. 57–82). Berlin: Mouton
de Gruyter.
Paulussen, H., Macken, L., Vandeweghe, W., & Desmet, P. (2013). Dutch parallel corpus: A bal-
anced parallel corpus for Dutch-English and Dutch-French. In P. Spyns & J. Odijk (Eds.),
Essential speech and language technology for Dutch (pp. 185–199). Heidelberg: Springer.
Polanyi, L., & Scha, R. J. H. (1983). The syntax of discourse. Text, 3, 261–270.
Redeker, G. (1990). Ideational and pragmatic markers of discourse structure. Journal of Pragmatics,
14, 367–381.
Redeker, G. (2006). Discourse markers as attentional cues at discourse transitions. In K. Fischer
(Ed.), Approaches to discourse particles (pp. 339–358). Amsterdam: Elsevier.
Rendle-Short, J. (2003). ‘So what does this show us?’: Analysis of the discourse marker ‘so’ in
seminar talk. Australian Review of Applied Linguistics, 26, 46–62.
Schiffrin, D. (1987). Discourse markers. Cambridge: Cambridge University Press.
Stukker, N., Sanders, T., & Verhagen, A. (2008). Causality in verbs and in discourse connectives:
Converging evidence of cross-level parallels in Dutch linguistic categorization. Journal of
Pragmatics, 40, 1296–1322.
Stukker, N., Sanders, T., & Verhagen, A. (2009). Categories of subjectivity in Dutch causal connec-
tives: A usage-based analysis. In T. Sanders & E. Sweetser (Eds.), Causal categories in dis-
course and cognition (pp. 119–171). Berlin: Mouton de Gruyter.
Sweetser, E. E. (1990). From etymology to pragmatics. Metaphorical and cultural aspects of
semantic structure. Cambridge: Cambridge University Press.
Van Dijk, T. (1979). Pragmatic connectives. Journal of Pragmatics, 3, 447–456.
What English Translation Equivalents Can
Reveal about the Czech “Modal” Particle prý:
A Cross-Register Study
Michaela Martinková and Markéta Janebová
1 State of the Art
In the second meaning, prý introduces somebody else’s direct reported speech.
This is exemplified in (2):
The question to be asked here is whether prý carries the meaning of doubt and
uncertainty in all of those cases in which it does not introduce direct speech.
According to SSČ it indeed does, and so it does according to the major grammar
books: Komárek et al. (1986, 232), for example, discuss prý in the sections on epis-
temic modality, alongside modal particles whose function is to evaluate the degree
of certainty of the content of the text or a part of it, i.e. epistemic particles.
In an entry on reported speech in the Encyclopedia of Czech (Encyklopedický
slovník češtiny; Grepl [2002, 375]), however, utterances with prý are considered to
be a special type of reproducing an original utterance, alongside direct and indirect
speech, and no meaning of uncertainty or doubt is mentioned. This is in agreement
with the etymology of the word: historically, prý goes back to the transitive verbum
dicendi praviti [to say], which was a full-fledged verb with rich inflection.¹ Prý was
originally the 3rd person singular or aorist form of this verb (praví [say:PRS.3SG]
or pravi [say:AORIST]), which later underwent phonetic reduction, via the stages
of praj and prej (Machek 2010, 481), and lost all inflections. The latter form (prej)
is still very frequent in Common Czech, a variety considered as non-standard
(Krčmová 2002, 81).
Grepl notes (2002, 375) that in sentences with prý the original utterance can
either remain unchanged, or its form can be modified in a way that makes it similar
to indirect speech. For convenience, we will distinguish here between prý introduc-
ing direct and indirect speech. Sentence (3) is an example of direct speech as what
we have here is the exact wording of the original utterance Přivezu ti nějaký dárek
‘I will bring you a present’: the verb is in the first person singular form and its
understood subject is the first person singular pronoun, whose referent is “the origi-
nal speaker” (for the terminology, see Huddleston and Pullum 2002, 1023). The
pronominal clitic ti [you] is attached to the verb:
¹ The verb still exists today, but Czech monolingual dictionaries (e.g. SSČ) mark it as bookish and
obsolete.
As demonstrated in (5), prý introducing indirect speech is not limited to the sen-
tence initial position, but it can appear in the middle field as well:
A systematic corpus-based study of the positions of prý is, however, missing,³ and
corpus-based analyses of prý are scarce. In their study of the collocational profile of
prý in the monolingual written SYN2000 corpus of Czech, Hoffmanová and
Kolářová (2007, 101) note a high frequency of prý in journalistic texts and also
briefly mention the important role prý has in the rendering of dialogues in fiction. In
their study of the adverb údajně [allegedly] in journalistic texts (the AnoPress database),
Hirschová and Schneiderová (2012, 2) observe a reporter’s distance from the
reported facts not only for údajně, but also for prý. Importantly, they are the first
ones to discuss both expressions in the context of evidentiality, the evidence being
a verbal report from somebody else.⁴ Arguably, by making it explicit that the
information comes from elsewhere, the journalists try to avoid responsibility for the
truth of the reported statement, or to show disagreement with it (Hirschová and
Schneiderová 2012, 2).

² Sentence (4) allows for another interpretation, in which the source of the reported information is
unknown; this will be dealt with later in this section.
³ There is only a 1951 study by Trávníček.
⁴ Grepl and Karlík (1998, 485) do not use the term evidentiality yet, but they identify a difference
between epistemic particles such as možná [maybe] and those which introduce someone else’s
opinion: while both are considered to mark a speaker’s stance, in the latter case by making it
explicit that the presented information comes from elsewhere the speaker avoids any responsibility
for it.
Though according to the authors both prý and údajně are evidentials of the hearsay
type,⁵ where, as they argue, the source of the reported information is not known
(ibid.), their analysis presents numerous examples in which the source is present.
This is in agreement with Aikhenvald (2004, 178),⁶ who mentions several languages
in which “[t]he same evidential may combine the meanings of a reported and a
quotative”. The evidentiality that prý marks is thus – if we borrow Aikhenvald’s
(2004) terminology – both reported (i.e. the authorship is not specified) and quota-
tive (i.e. the author is introduced). In other words, example (4) quoted above and
repeated here for convenience as (6) can be interpreted as either “He said he would
bring me a present”, or as “It was said he would bring me a present”:
The question arises, however, as to how to account for the meaning of “uncer-
tainty” or “doubt” expressed by prý introducing indirect speech, mentioned in the
Czech linguistic sources quoted above, namely SSČ and Komárek et al. (1986). In
general, these types of meaning belong to the domain of epistemic modality (cf. also
Komárek et al. 1986), which, according to Lyons (1977, 805) “express[es] different
degrees of commitment to factuality”.
Aikhenvald (2004) is quite explicit when it comes to the issue of evidential markers
expressing modality: in her opinion, evidentiality “covers the way in which the
information was acquired, without necessarily relating to the degree of speaker’s
certainty concerning the statement or whether it is true or not” (2004, 3); she goes
on to add that evidentials “may acquire secondary meanings – of reliability, probability,
and possibility (known as epistemic extensions), but they do not have to”
(2004, 6).

⁵ Like Aikhenvald (2004), Hirschová and Schneiderová (2012) maintain that evidentials are
non-truth-conditional. It has been argued, however, that evidential and epistemic adverbials need not be
non-truth-conditional; see e.g. Ifantidou-Trouki (1993) and Papafragou (2006). Ifantidou applies
the embedding tests of truth-conditionality on the hearsay adverbial allegedly; in her test, allegedly
contributes to the proposition expressed, and the same can be said about the Czech adverb údajně,
which translates allegedly. On the other hand, prý does not seem to pass the test, cf. her example If
the cook has allegedly poisoned the soup, the police should make an inquiry, which is acceptable
when translated into Czech by means of údajně (i), but not with prý (ii):
(i) Pokud kuchařka údajně otrávila polévku, měla by policie zahájit vyšetřování.
‘If the cook ÚDAJNĚ poisoned the soup, the police should start the inquiry.’
(ii) ?Pokud prý kuchařka otrávila polévku, měla by policie zahájit vyšetřování.
?‘If the cook PRÝ poisoned the soup, the police should start the inquiry.’
The difference between údajně and prý is worth more attention, but it is beyond the scope of
the present study.
⁶ It has to be remembered, though, that in Czech, as in other Slavic languages, evidentiality is not
a grammatical category; Hirschová and Schneiderová (2012) talk about lexical means of expressing
evidentiality.
An opposite view is endorsed by Palmer (1986, 51ff.), who discusses evidential-
ity in the chapter on epistemic modality. According to him, presenting the informa-
tion as something a speaker has been told about is one of at least four ways in which
a “speaker may indicate that he is not presenting what he is saying as a fact”.
Ultimately, as he argues, all the four ways (the others are speculation, deduction and
appearance) “are concerned with the indication by the speaker of his (lack of) com-
mitment to the truth of the proposition being expressed”. More specifically, by using
an evidential, the speaker “provide[s] an indication of the degree of commitment”
and “qualifies” the proposition “in terms of the type of evidence” he or she has
(1986, 54). In this sense, says Palmer, evidentials are “subjective” (ibid.).
The relationship between modality and evidentiality is a strongly debated issue
(see e.g. Chafe and Nichols 1986, Willett 1988, and Dendale and Tasmowski 2001);
clearly, as Plungian (2001, 354) remarks, epistemic modality is “a domain where
evidential and modal values overlap” because it is concerned with the probability of
a proposition (P), which indicates that “the speaker has no direct knowledge of P”.
As Traugott (1989, 33) notes, “epistemics and evidentials share a great number of
similarities in their semantic development, and the histories of items in the one
domain can illuminate the histories of items in the other. Naturally, though, it may
be useful in some other endeavor, such as a fine-grained analysis of modality, mood,
and data-source/authority to distinguish epistemics and evidentials.”
This paper attempts such a fine-grained analysis. We will investigate whether
the meaning of uncertainty or doubt is encoded in the meaning of prý, i.e. whether
prý in reporting indirect speech either (a) always carries the meaning of uncertainty
or doubt (SSČ and Komárek et al.), or (b) whether perhaps two autonomous senses
can be recognized for prý introducing indirect speech, or (c) whether uncertainty
and doubt are only epistemic overtones, which prý introducing indirect speech may
but need not carry.
We will argue for the last option. At the same time, we will resort to Traugott’s
concept of subjectification (e.g. 1989, 1995, 2010) as “a pragmatic-semantic process”
and “a gradient phenomenon, whereby forms and constructions that at first
express primarily concrete, lexical, and objective meanings come through repeated
use in local syntactic contexts to serve increasingly abstract, pragmatic, interpersonal,
and speaker-based functions” (1995, 32). In her 1995 paper, Traugott discusses
examples “that correlate with change of grammatical status from main verb con-
structions not merely to auxiliaries (i.e. reduced verbs), but to discourse particles
with quasi-adverbial properties” (1995, 36); these examples include, among others,
quotative like, be going to, I think and let’s. Typically, Traugott argues, such changes
“involve a shift from relatively objective reference to use as markers of discourse
reference, i.e. they acquire a metalinguistic function of creating text and signalling
information flow” (1995, 39). This, as we believe, is what happened to prý, when it
developed from the lexical verb praviti meaning “to say”.
68 M. Martinková and M. Janebová
Following Johansson (2007), who argues that “in monolingual corpora we can
easily study forms and formal patterns, but meanings are less accessible”, we will
look at prý through the lens of another language. Since “one of the most fascinating
aspects of multilingual corpora is that they can make meanings visible through
translation” (Johansson 2007, 57), we turn to the English-Czech and Czech-English
sections of InterCorp (through the search engine KonText),7 a multilingual parallel
corpus of texts written or transcribed in 39 languages (as of 2016), all of which are
aligned with their Czech counterparts. Our aim is to investigate the functions of prý8
via its English correspondences in English source and target texts belonging to three
different registers.
As a starting point of the corpus analysis, we are interested to see whether the
source of the reported information (original speaker) is left unexpressed, as the
dictionary definition of prý seems to suggest (Sect. 3), and whether there is a differ-
ence between registers in this respect (Sect. 4). Several methodological issues will
be raised. Ultimately, however, we are interested in knowing in which registers the
correspondences of prý explicate the function of prý as an evidential marker of
reported information or whether it is rendered as an epistemic marker expressing a
lack of the speaker’s commitment to the factuality of the proposition (i.e. uncer-
tainty or doubt). Section 5 thus delves deeper into the status of the component of
doubt in the meaning of prý. More specifically, we question the definition in the
dictionaries which see doubt as an inherent part of its meaning. Here we try to out-
line the pragmatic mechanisms by which the modal overtones of doubt about the
truth of the reported facts arise.
2 Data and Methodology
All studies of register variation in parallel corpora are restricted to the types of texts
available in the corpora, which, in turn, are restricted to texts that are often trans-
lated. According to Hasselgård (2010), “this precludes the study of many types of
text, such as conversation, daily newspapers, and academic prose”. Our study
focuses on three registers which are available in InterCorp: fiction, journalistic texts
and spoken language. Unfortunately, journalistic texts are only represented by the
PressEurope database, which does not contain any news reporting, and spoken lan-
guage in InterCorp is far from spontaneous: our data come from the Proceedings
from the European Parliament (Europarl) and from subtitles taken from the Open
Subtitles Database. In addition, the data representing each register vary in size (the
PressEurope subcorpus is the smallest and the subcorpus of Subtitles translated
7
Both InterCorp and KonText were created at Charles University in Prague. https://ucnk.ff.cuni.cz/intercorp/?lang=en
8
Prý, not the Common Czech prej, was selected for the analysis, since its usage is not restricted to
informal registers.
from English by far the largest; see Table 1) and come from different periods: the
PressEurope texts in InterCorp cover the period between 2009 and 2014 and the
Europarl texts date from 2007 to 2011. Our subcorpus of Fiction was created manu-
ally to include only books published after 1950 and to ensure that not more than two
novels per author are included. As far as the subcorpora of Subtitles are concerned,
since InterCorp only provides information about the year in which the original lan-
guage version of the film was released (no information is provided for Czech as the
target language), and since the number of Czech original films in InterCorp is very
low, all Subtitle data available were included in the subcorpus.
This brings us to two more problems regarding comparability of the subcorpora:
first, Czech as a small language is the source language of a much lower number of
texts than English, and so subcorpora consisting of Czech target texts are always
larger than subcorpora of Czech source texts. This applies to all subcorpora, as
demonstrated in Table 1 above. Second, there is a problem with the concept of
“original language”: while in our subcorpus of Fiction and of Subtitles Czech or
English is always the language of the original, the same cannot be said about the
PressEurope subcorpus; InterCorp does not provide information about the language
of the original text for the texts included in the PressEurope database. Europarl
(which is not annotated for the original language either) questions even the concept
of the source language: as Gast and Levshina (2014, 377–378) argue, “until 2003
the texts were translated directly from the source languages into any of the target
languages. From 2003 onwards … all languages were first translated into English
and then into the relevant target language”.9 What makes this even more compli-
cated is the fact that a large proportion of the Europarl texts are not even annotated
for the source language, which means, for example, that a potential subcorpus of
Czech translations from English source texts only has 9,284 text positions (TPs) and
prý does not occur in it. To obtain a sensible amount of Europarl data for analysis,
following Gast and Levshina (2014),10 we resorted to the methodologically problematic
solution of not differentiating between Czech as source and target language and of
including even translations from other languages. This explains the single number in
the last row in Table 1 above.
9
We hear that this practice is, however, abandoned if an interpreter between e.g. German and
Czech is available; the translation then goes directly from German to Czech (Šárka Timarová, pers.
comm.).
10
Gast and Levshina (2014, 379) argue that the “translations of the EUROPARL corpus are of a
very high quality and certainly come close to that ideal”, namely the ideal of “near equivalence” in
a translation corpus.
A simple Word form query (with Case unmatched) was used to retrieve all tokens of
prý from the individual subcorpora; absolute and normalised (instances per million
words, ipm) frequencies of prý are presented in Table 2.
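For illustration, the retrieval and normalisation step can be sketched in Python. The sketch assumes an already tokenised text; the function names and the toy sentence are hypothetical, and it does not reproduce the KonText query itself, only the idea of a case-insensitive word-form match followed by conversion to instances per million (ipm).

```python
def count_word_form(tokens, form):
    """Count tokens equal to `form`, ignoring case (word-form match, case unmatched)."""
    target = form.casefold()
    return sum(1 for tok in tokens if tok.casefold() == target)


def ipm(hits, corpus_size):
    """Normalised frequency: instances per million tokens."""
    if corpus_size <= 0:
        raise ValueError("corpus_size must be positive")
    return hits * 1_000_000 / corpus_size


# Toy, invented sentence; not taken from InterCorp.
toy_tokens = "Prý to ví . On prý přijde , ale PRÝ až zítra .".split()
toy_hits = count_word_form(toy_tokens, "prý")
print(toy_hits, round(ipm(toy_hits, len(toy_tokens)), 1))
```

Matching on the case-folded form means that sentence-initial Prý and capitalised PRÝ are counted together with prý, which is what leaving case unmatched amounts to.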
It follows from Table 2 that Czech target texts (TTs) in the subcorpus of Fiction
and of PressEurope texts show significant translation effects, namely a lower
normalised frequency of prý in TTs than in the source texts (STs). The difference is
more pronounced in the Fiction texts (LL 101.08, p < 0.0001) than in PressEurope
(LL 44.25, p < 0.0001); no statistically significant difference is observed for the
Subtitles (LL 1.40, p > 0.05).11 These quantitative differences, however, by no means indicate that
the data are not reliable,12 as only two incorrect translations into Czech were identi-
fied. Incidentally, in both of them the Czech translation suggests that the original
author is also the reporter, which Czech does not allow; prý introducing indirect
speech can only be used to report someone else’s words:
(7) But I only thanked her and said no, that I wished to be on my own.
[En-Cz.Fict:BJ_S]
Jen jsem jí ale poděkoval a odmítl jsem, že prý chci být sám.
‘I only thanked her and refused, [saying] that I PRÝ want to be on my own.’
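A minimal sketch of how a log-likelihood (LL) value of the kind reported above can be computed from two frequencies and two corpus sizes is given below. It uses the common two-term formulation found in corpus linguistics; whether the online calculator mentioned in footnote 11 computes LL in exactly this way is an assumption, and the counts in the usage example are invented, since the frequencies of Table 2 are not repeated in the running text.

```python
import math


def log_likelihood(freq_a, size_a, freq_b, size_b):
    """Two-corpus log-likelihood (G2) for one item.

    freq_a, freq_b: absolute frequencies of the item in corpus A and B.
    size_a, size_b: total sizes (in tokens) of corpus A and B.
    """
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected frequencies if the item were equally frequent in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2 * ll


# Invented counts and corpus sizes, for illustration only.
example_ll = log_likelihood(150, 1_000_000, 300, 4_000_000)
print(round(example_ll, 2))
```

With one degree of freedom, an LL value above 3.84 corresponds to p < 0.05, which is the threshold the Subtitles figure (LL 1.40) fails to reach.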
11
For the statistics, we used Andrew Hardie’s online calculator available at http://corpora.lancs.ac.uk/sigtest/
(p-value returned by the Fisher exact test). Naturally, it could not be applied to the
Europarl data, where Czech originals and Czech translations are not distinguished.
12
Compare Altenberg and Granger (2002, 40): “Translation effects, whether induced by the source
language or universal strategies, are seldom violations of the target language system in profes-
sional translations, but quantitative deviations from the target language norm . . . . As such they are
of course eligible as potential translation equivalents.”
13
The legend in square brackets indicates source language, target language, subcorpus and abbrevi-
ated title (in this order, where applicable). For the abbreviations and list of titles quoted in this
paper, see Appendix 1 and 2.
All tokens of prý found in Fiction, PressEurope, Europarl and source texts of
Subtitles, and a random sample of 200 tokens of prý in target texts of Subtitles14
were sorted and subjected to scrutiny. Section 3.1 demonstrates with concrete (but
randomly selected) examples which types of English correspondences were counted
as those in which the source of the reported information (original speaker) is
unknown (not expressed), while Section 3.2 presents the types of correspondences
identifying the original speaker. Section 4 then provides a detailed comparison of
the registers.
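How the random sample of 200 tokens was drawn is not stated; one reproducible way of drawing such a sample from a list of concordance lines is sketched below. The function name, the seed and the placeholder hit identifiers are hypothetical.

```python
import random


def sample_concordance(lines, k=200, seed=2016):
    """Return k concordance lines drawn without replacement.

    A fixed seed makes the sample reproducible; if there are k lines
    or fewer, all of them are returned.
    """
    lines = list(lines)
    if len(lines) <= k:
        return lines
    rng = random.Random(seed)
    return rng.sample(lines, k)


# Placeholder identifiers standing in for concordance lines.
all_hits = [f"hit_{i}" for i in range(1000)]
sampled = sample_concordance(all_hits)
print(len(sampled))
```

Sampling without replacement guarantees that no token is analysed twice, and recording the seed allows the same 200 lines to be retrieved again.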
The correspondences of prý with no reference to the original speaker range from
clauses with nouns such as word or rumour, clauses with the evidential verb seem,
verbs think, suppose and guess, evidential adverbs and the evidential semi-auxiliary
be supposed to15 to reporting clauses with verba dicendi and verbs referring “to the
receptive end of the communication process” (Quirk et al. 1985, 181). Furthermore,
the reporter may seek confirmation of the validity of the reported
statement.
Sentence (8) exemplifies the noun word in the correspondence, sentence (9) a
clause with the verb seem, (10) a clause with the verb suppose, and in (11) the
reporter seeks confirmation of the validity of the reported statement:
(8) Word is that Randy, the Boy Wonder, is convinced that he can turn the center
into a hot acquisition target that will attract one of the big pharmaceutical
companies. [En-Cz.Fict:KJA_FA]
Zázračný chlapec Randy je prý přesvědčený, že může ústav změnit ve
velepřitažlivý cíl investorů a přilákat jednu z největších farmaceutických
firem.
‘The wonder boy Randy is PRÝ convinced that …’
(9) So when Willem began hitting Catharina it seems Tanneke got in between
them to protect her. [En-Cz.Fict:CT_GP]
Takže když Willem začal Catharinu mlátit, Tanneke prý vběhla mezi ně, aby
ji chránila.
‘So when Willem began hitting Catharina, Tanneke PRÝ got in between
them to protect her.’
14
The examples from Czech STs come from films released in the period between 1955 and 2010,
and from the Czech subtitles of films released in English between 1915 and 2012 (but only five
tokens of the two hundred are pre-1950, and 172 are post-1980).
15
Chafe (1986) lists evidently, apparently, and be supposed to as hearsay evidentials.
The most frequent adverbs are the evidential ones, namely apparently, allegedly,
supposedly, and reportedly:
(13) He’s supposed to be after me. So McGonagall reckons he might have sent
it. [En-Cz.Fict:RJK_PA]
Jde mi prý po krku, takže McGonagallovou napadlo, že mi Kulový blesk
možná poslal on.
‘He is PRÝ after me, so …’
Sentences (14)–(18) exemplify the cases of reporting clauses in which the source of
the reported information is underspecified or entirely left out. In (14) and (15) the
reporting verb has a general subject argument, namely the generically used 3rd per-
son plural pronoun they and the noun people; most typically, however, the reporting
verb is used in the passive, as in (16):
(15) People sometimes tell me I’ve missed out on life because I never married
and had children. [En-Cz.Fict:IK_AFW]
Prý jsem o mnoho přišel, protože jsem se neoženil a neměl jsem děti.
‘PRÝ I’ve missed a lot because …’
(16) Though he was said to be in his mid-sixties, he didn’t look to be any older
than her fifty-year-old father. [En-Cz.Fict:RF_HS]
Sentence (17) exemplifies the tokens in which the subject argument of the reporting
verb in the passive voice is the reporter (addressee in the original interchange):
(17) I’m told that the initial tests have gone very well. [En-Cz.
Fict:KJA_FA]
Klinické testy prý zatím probíhají velice dobře.
‘The clinical tests PRÝ are going very well.’
(18) I hear that Dubrovnik is the most beautiful city in the world. ..
[En-Cz.Fict:SAR_HT]
Dubrovník je prý nejkrásnější město na světě …
‘Dubrovnik is PRÝ the most beautiful city in the world …’
(19) The rumor is that Lily and James Potter are – are – that they’re – dead
[En-Cz.Fict:RJK_PS]
A tvrdí se, že Lily a James Potterovi jsou jsou – že prý jsou mrtví.
‘And it is claimed that Lily and James Potter are are – that they PRÝ are
dead.’
Correspondences such as (19), where prý is in fact added (or omitted) in the transla-
tion, since the Czech sentence contains another overt marker of indirect reporting
(one which has its own counterpart in English), will be referred to as “indirect cor-
respondences”. These, in turn, will be kept separate from the zero correspondences
“proper”. The term zero correspondence will only be used for cases such as (20), in
which no direct or indirect correspondence of prý can be identified:
(20) A Japanese team has arrived in Skardu and they’re paying 6 dollars a day.
[En-Cz.Subt:K2]
On the basis of the data, we distinguish two types of direct correspondence in sen-
tences with the known source. First, prý corresponds to an English reporting clause
with a noun or pronoun in the subject argument of the reporting verb; this noun or
pronoun refers to the original speaker:
(22) (My husband doesn’t know about. .. you know…) and your mother says
she’s not going to tell your father, either. [En-Cz.Fict:DC_CW]
A tvoje matka to prý otci taky nepoví.
‘And your mother PRÝ will not tell your father either.’
(23) According to the Bolivians, it was a routine stop, and when they discovered
Mathis’ body, Bond disarmed and shot them. [En-Cz.Subt:QS]
Když prý bolivijská policie našla Mathisovo tělo, začal Bond střílet.
‘When PRÝ the Bolivian police discovered Mathis’ body, …’
Then there are the cases of indirect correspondence: the English clause or phrase
containing reference to the source of the reported information has its own counter-
part, which occurs alongside prý. This is the case of (24) and (25), where prý occurs
in the final clause of a (very long) reported complex, while the reporting clause has
its own counterpart. In (25), this involves the loss of the original sentence boundary;
in the English translation the sentences are joined:
(24) Miss Vavasour insisted that his daughter and her family should all stay for
lunch, that she would cook a chicken… [En-Cz.Fict:BJ_C]
Slečna Vavasourová nedala jinak, než že jeho dcera musí i s celou rodinou
zůstat na oběd, že prý upeče kuře…
‘Miss Vavasour insisted that his daughter and her family should all stay for
lunch, that PRÝ she would cook a chicken …’
(25) The secretary reported that Mr. Uzel had turned up and was maintaining a
vigil in the corridor outside my study, doggedly waiting to see me.
[Cz-En.Fict:SV_SP]
Sekretářka mi hlásí, že hajný Uzel vartuje na chodbě přede dveřmi mé
pracovny. Nedal prý se odbýt.
‘The secretary reports to me that gamekeeper Uzel is maintaining a vigil
in front of the door outside my study. He PRÝ wouldn’t be got rid of.’
In (26) the English reporting clause (introducing the source of the reported informa-
tion) is even found in a larger preceding context, outside the sentence boundary:
(26) (“He thinks it’s going to storm,” Rachel explained when the meeting
was over.) “He says you can go, but he will not send a guide. It’s too
dangerous.” [En-Cz.Fict:GJ_T]
Říká, že můžete odejít, ale průvodce s vámi poslat nechce. Prý by to bylo
příliš nebezpečné.
‘[He] says you can go, but he does not want to send a guide. PRÝ it would
be too dangerous.’
In (27) and (28) the source of the reported information is inferred from the imme-
diately preceding context. It was a participant in the original interchange, which
directly precedes:
(27) (Tracy was on the other line. She was very upset.) Becky has taken a sudden
turn for the worse and has been moved to the ICU. [En-Cz.Fict:CR_T]
Becky se prý náhle zhoršila, a tak ji převezli na JIPku.
‘Becky PRÝ got suddenly worse and so they moved her to the ICU.’
(28) (This is the boarding house Albert. Send us someone at once. A lodger has
gone mad.) Sorry? How do we know? How is he? [Cz-EN.Subt:PSP]
Prosím? Jak se to jeví? Jak se … jak prý se to jeví?
‘Sorry? How does it show? How … how PRÝ does it show?’16
In (29), the reporter is the agent of the verbal event in the secondary predication
after have, i.e. it is semantically present:
(29) – Karen, I’ve had those images of the creature analyzed. – What is it? – It’s
something new, but gorilla-like. [En-Cz.Subt:C]
– Karen, nechal jsem analyzovat záběry těch bytostí. – Co je to? – Prý je
to něco nového, prý něco jako gorila.
‘– Karen, I’ve had those images of the creature analysed. – What is it? –
PRÝ it’s something new, PRÝ something like a gorilla.’
16
This is a telephone conversation between the owner of the boarding house and the police, but we
Figure 1 suggests a difference between the Fiction texts, where the source of the
reported information tends to be expressed (60% of correspondences of prý in
Czech target texts and 57.4% of correspondences of prý in Czech source texts), and
the other subcorpora, in which it is left unexpressed in the majority of tokens; this
is most evident in the Europarl corpus.
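The percentages quoted for Figure 1 can be checked with a few lines of arithmetic. In the sketch below the token counts are read off the bar labels of Figure 1 and matched to the subcorpora by comparing them with the percentages given in the text; that matching, like the variable and function names, is an assumption rather than something stated in the figure itself.

```python
# (source known, source unknown) token counts per subcorpus,
# as read off the bar labels of Figure 1 (matching is an assumption).
FIGURE_1_COUNTS = {
    "Fiction Cz TT": (171, 114),
    "Fiction Cz ST": (89, 66),
    "Subtitles Cz TT": (81, 119),
    "Subtitles Cz ST": (18, 36),
    "Europarl": (7, 44),
}


def share_known(known, unknown):
    """Percentage of tokens whose source of information is expressed."""
    return round(100 * known / (known + unknown), 1)


for name, (known, unknown) in FIGURE_1_COUNTS.items():
    print(f"{name}: {share_known(known, unknown)}% of {known + unknown} tokens")
```

Only in the two Fiction rows does the share exceed 50%, which is the contrast the figure is meant to bring out.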
A closer look at the Fiction subcorpus reveals that if the source of the reported infor-
mation is expressed, the most frequent English correspondence of prý is a reporting
clause with a non-generic subject: 61 tokens (21.4%) of prý in Czech target texts
(TT prý) correspond directly to a reporting clause whose subject is a specific noun
or pronoun referring to the original source. For the Czech source texts the percent-
age is even higher (36.1%, 56 tokens). The most frequent verb is the verb say, which
covers 82% of these correspondences of TT prý (50 of the 61) and 67.9% of these
correspondences of ST prý (38 of the 56). Among the remaining tokens in Czech
source texts where prý has a direct correspondence with a reporting clause in which
an explicit reference to the source of the reported information is made, however,
there are nine tokens of the verb claim, as in (30). This might suggest that the trans-
lators do sometimes try to express the reporter’s lesser commitment to the truth of
the reported statement.17 This issue will be readdressed in Section 5.
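[Fig. 1: stacked bar chart (0–100%) of correspondences of prý with the source known vs. source unknown, by subcorpus. Token counts recoverable from the bar labels (source known / source unknown): Fiction Cz ST 89 / 66; Fiction Cz TT 171 / 114; Subtitles Cz ST 18 / 36; Subtitles Cz TT 81 / 119; Europarl 7 / 44; PressEurope 9 / 11.]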
17
Oxford Advanced Learner’s Dictionary (8th ed.) defines claim as “to say that sth is true although
it has not been proved and other people may not believe it”.
[Figure (Fiction subcorpus; presumably Fig. 2): two panels, Cz ST and Cz TT; stacked bars of direct, indirect and zero correspondences of prý for source known vs. source unknown. Segment labels as extracted (direct correspondences first, remaining segments in stacking order): Cz ST, source known 56 + 33, source unknown 62 + 3 + 1; Cz TT, source known 61 + 110, source unknown 75 + 23 + 16.]
(30) He claimed to have read somewhere (but more likely the possibility just
occurred to him) that lung cancer was infectious, and he was constantly
making a scene about my endangering our child. [Cz-En.Fict:KP_SZS]
Dočetl se prý kdesi (ale spíš si usmyslel), že rakovina plic je nakažlivá …
‘He read PRÝ somewhere (but more likely he just took it into his head)
that …’
The subcorpus of Subtitles shows a different picture (see Fig. 3): first, the absolute
number of the tokens of ST prý is rather low, namely 54, and the source of the
reported information is known only in 33.3% of the correspondences (18 tokens).
TT prý has a higher correspondence with the known source (40.5%; 81 tokens) than
ST prý, but the unknown source still prevails. The most frequent correspondences in
the Subtitles are communication verbs such as hear (e.g. [31]) and understand,
which cover ca. 22% of correspondences of prý (12 out of the 54 tokens of ST prý
and 43 out of the 200 tokens of TT prý).
[Fig. 3 (Subtitles subcorpus): two panels, Cz ST and Cz TT; stacked bars of direct, indirect and zero correspondences of prý for source known vs. source unknown. Segment labels as extracted (direct correspondences first, remaining segments in stacking order): Cz ST, source known 9 + 9, source unknown 26 + 4 + 6; Cz TT, source known 63 + 18, source unknown 98 + 6 + 15.]
A difference can be observed between STs and TTs, namely that TT prý has a
high percentage of direct correspondences with an English reporting clause intro-
ducing the original speaker (63 tokens, i.e. 31.5%), higher than ST prý (9 tokens, i.e.
16.7%) and even higher than TT prý found in Fiction (21.4%). This might be due to
the fact that in Subtitles, where space is very limited, prý as a three-letter word is
considered to be a useful tool as an equivalent of a whole clause (compare also the
fact that, among the target texts of all subcorpora, prý has its highest relative
frequency in the Subtitles where Czech is the target language; ipm 89). In direct
correspondences of TT prý it is again reporting clauses with the verb say that dominate
(covering 40 out of the 63 tokens of a reporting clause with a non-generic subject as
a direct correspondence of prý, i.e. 63.5%), and five out of the nine direct correspon-
dences of ST prý contain a reporting clause with the verb say, as in (32):
The original speaker is sometimes known because the reported utterance immedi-
ately precedes; if the addressee was also present in the original interchange (which
they were both in [33] and [34]), the reporting function is blocked because the
information is not new to them. In such cases, prý expresses a strong detachment of
the reporter from the reported information and the whole statement is ironic:
‘–Daddy, you aren’t talking reasonably. – So now you will be saucy on top
of everything! I PRÝ am not talking reasonably!’
(34) – Well, uh … the way I see it, this is a pretty big favor … – Some big favor.
I could operate that goddamn thing. [EN-Cz.Subt:RLD]
– Prý velká laskavost … – Sám bych uměl obsluhovat tu blbou pec.
‘– PRÝ a big favor …’
The Europarl data show not only the lowest relative frequency of prý (3.4), but also
the lowest percentage of correspondences revealing the original speaker, namely
13.7% (seven tokens). In addition, the context suggests a detachment of the reporter
from the reported statement:
(35) (I offer, as an example of such dogmas, the recent article by Václav Klaus
advising us how to overcome this financial crisis by temporarily softening
social, environmental and health standards) because, he says, these
standards obstruct rational human behaviour. [Cz-En.Europarl]
…protože prý tyto standardy brání racionálnímu lidskému jednání.
‘…because PRÝ these standards obstruct rational human behaviour.’
(36) It is said that Kosovo does not set a precedent. (That is a mistake …)
[Cz-En.Europarl]
Kosovo není prý žádný precedent. (Ale to je omyl …)
‘Kosovo PRÝ is no precedent. (But that is a mistake …)’
As Fig. 4 shows, zero correspondences proper are not found at all and indirect cor-
respondences cover 19.6% (10 tokens) of all correspondences of prý in Europarl,
which is a proportion very close to the proportion of zero and indirect
The PressEurope data are the least reliable, since there are only six tokens of prý in
Czech target texts and 20 tokens of prý in Czech source texts. Among the latter, nine
were classified as introducing the source; however, five of them (including sentence
[37]) come from one section of one text, concluded by “at least, so says Karel Kříž”
(a journalist quoted in the text):
(37) And the same fate awaits the Czech Republic, whose capital is rapidly being
drained away. [Cz-En.PressEurope]
Kolonií je asi Česká republika, protože z ní prý teče kapitál jako z
vodovodní trubky.
‘… the Czech Republic, because PRÝ capital is rapidly being drained away.’
That is to say, texts included in PressEurope are special in that they do not report but
argue, namely for or against the contents of other articles (such as the one written
by Karel Kříž). It thus turns out to be impossible for us to prove or disprove
Hirschová and Schneiderová’s (2012) hypothesis that prý is used by journalists to
avoid responsibility for what they are reporting. What was confirmed, on the other
hand, is a detachment from the reported information. In correspondences with a
reporting clause introducing the original speaker, a wider context proves the reported
statement to be either false, or at least open to discussion:
(38) The Judicial Council denied her sickness benefits, alleging that she was
faking her illness. (Shortly afterwards, she fell into a coma and died of
heart failure.) [Cz-En.PressEurope]
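[Fig. 5 (PressEurope subcorpus): stacked bars of direct, indirect and zero correspondences of prý for source known vs. source unknown; the bar values are not fully recoverable from the extracted text.]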
Figure 5 suggests a high proportion of indirect and zero correspondences (the latter
include three tokens of a missing sentence). The numbers are, however, too low to
allow for any conclusions.
(39) (Only once in a while I am upset by some political outrage or other, like
Sakharov,) who claims that American workers are better off than Soviet
workers. (American workers may have higher wages – and the freedom to
fight for them in continual struggles – but I know very well how workers in
your country live in security, in peace, how they are cared for in every way
by the state.) [Cz-En.Fict:SJ_EHS]
Americký dělník se prý má líp než sovětský.
‘The American worker is PRÝ better off than the Soviet one.’
Detachment from the reported proposition can also be seen in other cases; it is a
wider context which makes it explicit:
(40) The absurd thing is, it was these very lines that some of the critics beat me
over the head with. They said the heroes of my Monologues were simply
the bourgeoisie in proletarian dress. [Cz-En.Fict:SJ_EHS]
Prý hrdinové mých Samomluv jsou buržousti převlečení za proletáře.
‘PRÝ the heroes of my Monologues are simply the bourgeoisie [pejorative]
in proletarian dress.’
(41) But the true source of poetry, I was told by this comrade-person, is not, as I
wrote, Beauty, but Class Hatred, Class War. [Cz-En.Fict:SJ_EHS]
Pramenem poesie však prý není, jak jsem napsal já, Krása, řekl mi soudruh
člověk, ale Třídní Nenávist, Třídní Boj.
‘The true source of poetry, however, PRÝ is not, as I wrote, Beauty, said
this comrade-person, but Class Hatred, Class War.’
This doubt or lack of commitment to the truth of the proposition, however, cannot
be regarded as an inherent part of the meaning of prý because it is not obligatory:
examples can be found (even in Škvorecký and Kundera) where there is no doubt
expressed or implied, as in (42):
(42) Those beautiful, faded eyes from Kiruna are constantly watching me. Her
grandfather, she said, used to own iron mines in Kiruna, a city of the
midnight sun. [Cz-En.Fict.:SJ_EHS]
Vytrvale mě pozoruje krásně vyšisovanýma očima z Kiruny, kde prý měla
kdysi dědečka majitelem železných dolů, v tom městě půlnočního slunce.
‘She constantly watches me with the beautifully faded eyes from Kiruna,
where PRÝ she once had a grandfather, owner of iron mines, in that city
of the midnight sun.’
The same can be said about the subcorpus of Subtitles, where we observed a lower
proportion of correspondences with a reference to the original speaker. Even in
tokens with correspondences where no specific source is expressed and with zero
correspondences no component of doubt is necessarily present. In fact, in the fol-
lowing examples the reporters commit themselves to the factuality of the
proposition:
(43) They said you’d last another two weeks, but we can’t wait that long,
Mrs. Stubb! [Cz-En.Subt: PVV]
Říkali, že prý vydržíte asi tak čtrnáct dnů, jenomže my nemůžeme tak
dlouho čekat, paní Stubová!
‘They said you’ll PRÝ last another fourteen days, but we can’t wait that
long, Mrs. Stubb!’
(44) Life is beautiful, so enjoy it. I’m coming for you in a year.
[Cz-En.Subt: CVP]
Život je prý krásný, tak si ho užívejte, přesně za rok si pro vás přijdu!
‘Life is PRÝ beautiful, so enjoy it. I’m coming for you in a year!’
Most importantly, the overall results show that typical counterparts of prý are not
modal markers (modal verbs/adverbs/adjectives), but a reporting clause (most
frequently with the verb say), communication verbs such as hear, evidential adverbs
such as apparently, allegedly, supposedly and reportedly, or the evidential
semi-auxiliary be supposed to. Moreover, these correspondences can be found in both
of the major categories, i.e. in tokens with the source known and unknown alike.
Therefore, it can be concluded that doubt and uncertainty are not an inherent part
of the meaning of prý, but its overtones: in other words, they may be inferred from
the context. The mechanism which triggers such an interpretation, we believe, is
pragmatic inferencing. This process, which can be called “invited inference” (see
e.g. Traugott and Dasher 2002), has, according to Traugott, “a
cognitive-communicative motivation”: it is “the attempt on the speaker’s part to increase the
informativeness to the interlocutor of what is being said” (1995, 49). This is in
accordance with Aikhenvald (2004, 164), who observes that “[n]ot every reported
evidential implies that the information is unreliable”. Speakers “may choose to
employ the reported evidential for two reasons: firstly, to show his or her objectiv-
ity; that the speaker was not the eyewitness to an event and knows about it from
someone else. Secondly, as a means of ‘shifting’ responsibility for the information
and relating facts considered unreliable” (2004, 180): in the latter case, the reported
evidential gains an “epistemic extension”. Such an extension, as we have seen, is
contextually bound.18
18
In a similar vein, Bybee et al. (1994, 180) argue that “an indirect evidential, which indicates that
the speaker has only indirect knowledge concerning the proposition being asserted, implies that the
speaker is not totally committed to the truth of that proposition and thus implies an epistemic
value”. In other words, “the implication is definitely an epistemic one – that the speaker does not
vouch unconditionally for the accuracy of the information” (1994, 203).
The present-day function of prý is that of signalling information flow in that the
speaker either directly reports an utterance or signals to the hearer that the source of
information is external to the speaker. In this respect, we can speak of the polysemy
of prý in terms of the direct and indirect reporting functions (see Section 1 for the
grammatical differences). Polysemy, as Traugott (2010, 32) put it, is typical of
subjectification: “by hypothesis most new semantic developments emerge as
polysemies, pragmatic to begin with, then semantic”. As our data show, however, it is not
justified to speak of the polysemy (i.e. polysemy of the evidential and modal use) of
prý with the indirect reporting function, because doubt and uncertainty are not part
of its conventional meaning. Prý rather seems to behave like, for example, in fact in
Aijmer and Simon-Vandenbergen’s (2004) analysis. Here the authors argue that the
uses of in fact are “pragmatic implicatures which are conventionalised to a greater
or lesser extent, as some contextual meanings are more frequent and more conven-
tionalised than others” (Aijmer and Simon-Vandenbergen 2004, 1788).
It is interesting to notice under which contextual circumstances the pragmatic
implicatures arise. In recent years, subjectivity has been discussed alongside
intersubjectivity (see e.g. Davidse et al. 2010). In contrast to subjectivity,
intersubjectivity marks the speaker-addressee relationship in a more prominent way.
Through the invited inference, prý can function as a marker of intersubjectivity,
which is described as “the explicit expression of the SP[eaker]/W[riter]’s attention
to the ‘self’ of addressee/reader in both an epistemic sense (paying attention to
their presumed attitudes to the content of what is said), and in a more social sense
(paying attention to their ‘face’ or ‘image needs’ associated with social stance and
identity)” (Traugott 2003, 128). In
this respect, Traugott mentions some uses of hedges such as well or perhaps (2010,
37). When applied to the usage of prý in cases such as (11) (repeated here as [45]),
by using prý the speaker invites the addressee not only to infer that he or she does not
commit to the factuality of the reported statement, but also to confirm it:19
As for the “social” sense, we observed several cases, which, however, express the
opposite of “paying attention” to the face of the addressee. In (46) Draco Malfoy is
humiliating Harry Potter by making him recognize the fact that he fainted:
(46) (As Harry stepped down, a drawling, delighted voice sounded in his ear.)
“You fainted, Potter? (Is Longbottom telling the truth? You actually
fainted?”) [En-Cz.Fict:RJK_PA]
Tys prý omdlel, Pottere?
‘You PRÝ fainted, Potter?’
19
However, we cannot go as far as claiming that prý has undergone the process of
“intersubjectification”, which, according to Traugott (2010), follows, or arises from,
subjectification. Traugott (2010, 37) makes a distinction between intersubjectivity and
intersubjectification along the following lines: “If it is derivable from the context, it is
only a case of increased pragmatic intersubjectivity. In other words, there may be more
addressee-oriented uses, but unless a form–meaning pair has come to code intersubjectivity,
we are not seeing intersubjectification”.
In (33), repeated here as (47), the father reports what his daughter has just said to
him, using prý. Just as it is infelicitous for the speaker to report (indirectly) his or
her own utterance with prý (see example [7] in Section 3), it is normally
pragmatically infelicitous for the speaker to report the hearer’s utterance to the hearer
or the audience who participated in the original interchange; when it does happen, the
sentence expresses a negative attitude, criticism or irony.
(47) – Daddy, do be reasonable! – And saucy you are, too! So I’m not
reasonable. [Cz-En.Subt: BJK]
Tatínku, s Tebou není rozumná řeč! – Tak ty budeš ještě drzá! Se mnou
prý není rozumná řeč!
‘– Daddy, you aren’t talking reasonably. – So now you will be saucy on top
of everything! I PRÝ am not talking reasonably!’
The same applies to (48), where the speaker mocks the original speakers in front of
them:
(48) You littluns started all this, with the fear talk. Beasts! Where from?
[En-Cz.Fict:GW_LF]
Vy mrňousi jste to všechno začali tím ustrašeným žvaněním. Že prý jsou
tu obludy! Kde by se tu vzaly?
‘You littluns started all this, with the fear talk. [That] PRÝ there are beasts
here! Where would they come from?’
However, as Aikhenvald (2004, 183) notices, reported evidentials can (with proper
intonation or gestures) also be used ironically even “when used in a statement that
both the speaker and the hearer know to be true”, i.e. even if they have no “overtones
of unreliable information” (2004, 184). This is exemplified not only in (46) above, but
also in (49), which presents a dialogue between Nicolas Cage as the FBI agent Stanley
Goodspeed and Sean Connery as John Patrick Mason (currently a prisoner), who has
been told before that there is a serious problem and the FBI needs his help urgently:
(49) – [Goodspeed:] I’m Stanley Goodspeed. – [Mason:] But of course you are.
– [Goodspeed:] Of course I am. Huh. – [Mason:] And you have an
emergency. – [Goodspeed:] That’s right. [En-Cz.Subt: TR]
– Jsem Stanley Goodspeed. – Ale jistě. – Jistě. – A prý máte problém.
– Ano.
‘… And PRÝ you have an emergency.’
This again confirms that no modal overtones are encoded in the meaning of prý.
6 Conclusions
This paper investigated the dominant functions of the Czech (arguably modal) par-
ticle prý, which is used to introduce reported speech, both direct and indirect (the
latter function is referred to as evidential). In order to do this, we looked at prý
through the lens of another language – English – and we focused on three registers,
namely fiction, journalistic texts (represented by PressEurope texts) and spoken lan-
guage (represented by Subtitles and Europarl texts) available in the parallel corpus
InterCorp.
As a starting point of the corpus analysis, we were interested to see whether the
source of the reported information (original speaker) is indeed left unexpressed, as
the dictionary definition of prý suggests. There turns out to be a difference between
Fiction, where the source of the reported information tends to be expressed (60% of
correspondences of prý in Czech target texts and 57.4% of correspondences of prý
in Czech source texts), and the other subcorpora, in which it is left unexpressed in
the majority of tokens; this is most evident in Europarl.
It follows that in the texts of Fiction, the dominant function of prý is quotative.
Czech and English are two languages in which evidentiality is not grammaticalised,
but both have lexical markers of evidentiality. Our analysis of the
correspondences of prý in Fiction reveals that prý in the reporting function is more
often added than omitted. This might suggest that in Czech there is a stronger ten-
dency to mark the external information source than in English – even if there is an
evidential marker such as the verb say, in Czech there is a tendency to reinforce it
lexically with prý (our indirect correspondences). In the remaining registers, prý
was mainly a reported evidential (the source was not known).
Ultimately, we wanted to prove or disprove the dictionary definition of prý,
according to which it is a modal particle expressing doubt whenever it does not
introduce direct speech. It turned out that in all registers alike, regardless of whether
the source of information was known or unknown, prý hardly ever corresponded to
a modal marker. In the majority of cases, it was rendered as an evidential marker:
the most typical correspondences included the verb say and evidential adverbs.
In the cases in which there was a lack of speaker commitment to the factuality of
the proposition (i.e. uncertainty or doubt), it was contextually bound, which is why
we concluded that the uncertainty or doubt are only epistemic overtones (“invited
inferences” in the sense of Traugott and Dasher [e.g. 2002]).
We noted that the rise of prý can be described in terms of subjectification in the
sense of Traugott (e.g. 1995) – ultimately, prý is used as a “marker of discourse
reference”; i.e. it has acquired “a metalinguistic function of creating text and signal-
ling information flow” (1995, 39). We also noticed cases in which the default report-
ing function was blocked because the addressee of the reporting event was present
in the original interchange, i.e. the information was not new. A special case of this
occurs when the addressee of the reporting event is the original source: we classify
such cases as instances of the interpersonal function of prý; the sentences express a
negative attitude, criticism or irony. In some of these cases, the modal overtones of
doubt are not present at all.
We believe that the conclusions we draw from our approximations to spoken
language are plausible for spoken language in general. To really confirm this, we
would need a corpus of interpreted spontaneous dialogues. What urgently calls for
investigation now is the diachrony of the process of subjectification of prý.
Appendices
[Cz-En.Subt:BJJK] Byl jednou jeden král (There Once Was a King…). 1955. Dir. Bořivoj Zeman.
[Cz-En.Subt:CVP] Čert ví proč (The Devil Knows Why). 2003. Dir. Roman Vávra.
[Cz-En.Subt:PSP] Pension pro svobodné pány (Pension for Single Gentlemen). 1967. Dir. Jiří Krejčík.
[Cz-En.Subt:PVV] Pane, vy jste vdova! (You’re a Widow, Sir!). 1970. Dir. Václav Vorlíček.
[En-Cz.Subt:C] Congo (Kongo). 1995. Dir. Frank Marshall.
[En-Cz.Subt:K2] K2 (K2). 1991. Dir. Franc Roddam.
[En-Cz.Subt:QS] Quantum of Solace (Quantum of Solace). 2008. Dir. Marc Forster.
[En-Cz.Subt:RLD] The Return of the Living Dead (Návrat oživlých mrtvol). 1985. Dir. Dan O’Bannon.
[En-Cz.Subt:SFA] Shrek Forever After (Shrek: Zvonec a konec). 2010. Dir. Mike Mitchell.
[En-Cz.Subt:TD] The Duellists (Soupeři). 1977. Dir. Ridley Scott.
[En-Cz.Subt:TH] The Rock (Skála). 1996. Dir. Michael Bay.
References
Aijmer, K., & Simon-Vandenbergen, A.-M. (2004). A model and a methodology for the study of
pragmatic markers: The semantic field of expectation. Journal of Pragmatics, 36, 1781–1805.
Aikhenvald, A. Y. (2004). Evidentiality. Oxford: Oxford University Press.
Altenberg, B., & Granger, S. (2002). Recent trends in cross-linguistic lexical studies. In
B. Altenberg & S. Granger (Eds.), Lexis in contrast: Corpus-based approaches (pp. 3–48).
Amsterdam: John Benjamins.
Bybee, J., Perkins, R., & Pagliuca, W. (1994). The evolution of grammar: Tense, aspect, and
modality in the languages of the World. Chicago: Chicago University Press.
Chafe, W. (1986). Evidentiality in English conversation and academic writing. In W. Chafe &
J. Nichols (Eds.), Evidentiality: The linguistic coding of epistemology (pp. 261–272). Norwood:
Ablex.
Chafe, W., & Nichols, J. (Eds.). (1986). Evidentiality: The linguistic coding of epistemology.
Norwood: Ablex.
Davidse, K., Vandelanotte, L., & Cuyckens, H. (Eds.). (2010). Subjectification,
intersubjectification and grammaticalization. Berlin: De Gruyter Mouton.
Dendale, P., & Tasmowski, L. (Eds.). (2001). On evidentiality. Amsterdam: Elsevier. Special issue
of Journal of Pragmatics 33(3).
Fronek, J. (2000). Velký česko-anglický slovník. Prague: LEDA.
Gast, V., & Levshina, N. (2014). Motivating W(h)-clefts in English and German: A hypothesis-
driven parallel corpus study. In A.-M. de Cesare (Ed.), Frequency, forms and functions of cleft
constructions in Romance and Germanic. Contrastive, corpus-based studies (pp. 377–414).
Berlin: De Gruyter Mouton.
Grepl, M. (2002). Reprodukce prvotních výpovědí. In P. Karlík, M. Nekula, & J. Pleskalová (Eds.),
Encyklopedický slovník češtiny. Prague: Nakladatelství Lidové noviny.
Grepl, M., & Karlík, P. (1998). Skladba češtiny. Olomouc: Votobia.
Hasselgård, H. (2010). Parallel Corpora and contrastive studies. In R. Xiao (Ed.), Proceedings of
the international symposium on Using Corpora in Contrastive and Translation Studies 2010
Conference (UCCTS2010).
http://www.lancaster.ac.uk/fass/projects/corpus/UCCTS2010Proceedings/papers/Hasselgard.pdf.
Accessed 1 July 2015.
Hirschová, M., & Schneiderová, S. (2012). Evidenciální výrazy v českých publicistických textech
(případ údajně–údajný). In Gramatika a korpus/Grammar and Corpora 2012.
http://www.ujc.cas.cz/miranda2/export/sitesavcr/data.avcr.cz/humansci/ujc/vyzkum/gramatika-a-korpus/proceedings-2012/konferencni-prispevky/HirschovaMilada_SchneiderovaSona.pdf.
Accessed 1 July 2015.
Hoffmanová, J., & Kolářová, I. (2007). Slovo prý/prej: možnosti jeho funkční a sémantické
diferenciace. In F. Štícha & J. Šimandl (Eds.), Gramatika a korpus/Grammar and Corpora 2005.
Prague: Ústav pro jazyk český Akademie věd České republiky.
Huddleston, R., & Pullum, G. K. (2002). The Cambridge grammar of the English language.
Cambridge: Cambridge University Press.
Ifantidou-Trouki, E. (1993). Sentential adverbs and relevance. Lingua, 90, 69–90.
Johansson, S. (2007). Seeing through multilingual Corpora. On the use of Corpora in contrastive
studies. Amsterdam: John Benjamins.
Komárek, M., Kořenský, J., Petr, J., Veselková, J., et al. (1986). Mluvnice češtiny 2. Tvarosloví.
Prague: Academia.
Krčmová, M. (2002). Čeština obecná. In P. Karlík, M. Nekula, & J. Pleskalová (Eds.),
Encyklopedický slovník češtiny. Prague: Nakladatelství Lidové noviny.
Lyons, J. (1977). Semantics. Cambridge: Cambridge University Press.
Machek, V. (2010). Etymologický slovník jazyka českého. Prague: Nakladatelství Lidové noviny.
Oxford Advanced Learner’s Dictionary. (2010). 8th ed. Ed. A. S. Hornby. Oxford: Oxford
University Press.
Palmer, F. R. (1986). Mood and modality. Cambridge: Cambridge University Press.
Papafragou, A. (2006). Epistemic modality and truth conditions. Lingua, 116, 1688–1702.
Plungian, V. A. (2001). The place of evidentiality within the universal grammatical space. Journal
of Pragmatics, 33(3), 349–357.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the
English language. London: Longman.
Slovník spisovné češtiny pro školu a veřejnost [SSČ]. (2009). Prague: Academia.
Trávníček, F. (1951). Mluvnice spisovné češtiny II. Skladba. Prague: Slovanské nakladatelství.
Traugott, E. C. (1989). On the rise of epistemic meanings in English: An example of subjectifica-
tion in semantic change. Language, 65, 31–55.
Traugott, E. C. (1995). Subjectification in Grammaticalization. In D. Stein & S. Wright (Eds.),
Subjectivity and subjectivisation: Linguistic perspectives (pp. 31–54). Cambridge: Cambridge
University Press.
Corpora
Czech National Corpus – InterCorp. Institute of the Czech National Corpus, Prague.
http://www.korpus.cz.
Czech National Corpus – SYN2000. Institute of the Czech National Corpus, Prague.
http://www.korpus.cz.
Modal Adverbs of Certainty in EU Legal
Discourse: A Parallel Corpus Approach
Magdalena Szczyrbak
1 Introduction
Although traditional studies into modality focus on modal auxiliaries, more recent
approaches recognise the interplay of lexical means including modal nouns, adjec-
tives and adverbs. Modal adverbs of certainty, it can be argued, attest to the dialogic
orientation of discourse, since they help speakers and writers to “contest, refute, or
build an argument toward alternative or different conclusions” (Traugott 2010: 15).
This being the case, modal adverbs are used for rhetorical purposes and they serve,
among other goals, to foreground stronger arguments and to background alternative
voices. This, in turn, makes them a useful rhetorical device which is frequently
deployed in argumentative writing. This chapter examines the use of modal adverbs
M. Szczyrbak (*)
Institute of English Studies, Jagiellonian University, Kraków, Poland
e-mail: magdalena.szczyrbak@uj.edu.pl
Modal adverbs of certainty, whose role in discourse goes far beyond that of marking
varying degrees of certitude, are inextricably linked to stance and argumentation.
On the one hand, as epistemic stance devices, they “can mark certainty (or doubt),
actuality, precision, or limitation” as well as “indicate the source of knowledge or
the perspective from which the information is given” (Biber et al. 1999: 972). On
the other, they “are indexically related to variables in the social situation and are
associated with types of social activity, with social roles and with power” (Simon-
Vandenbergen and Aijmer 2007: 5). Put differently, they are linked to cultural and
social dimensions including social acts, activity types, social identity and relation-
ships (Simon-Vandenbergen and Aijmer 2007: 55–56). As for the conceptualisation
of stance, the term lends itself to a variety of (often complementary or overlapping)
interpretations, given that it can be expressed by a multitude of linguistic and para-
linguistic resources. Starting with Biber et al. (1999: 966), stance is defined as “per-
sonal feelings, attitudes, value judgments, or assessments”. It is also theorized as
“the space in language where literal, figurative, and functional meanings intersect”
(Precht 2003: 239) or, elsewhere, as a situational dimension which encompasses
“types of (epistemic or affective) attitude and degrees of affective intensity or
strength of commitment” (Aijmer 2007: 330). The role of paralinguistic elements in
expressing stance, in turn, is recognised by Chindamo et al. (2012), for whom com-
municative stance denotes an “attitude which, for some time, is expressed and sus-
tained interactively in communication, in a unimodal or multimodal manner.”
Another approach sees stance as “a linguistically articulated form of social action”
(du Bois 2007: 139), whereby social actors, using overt communicative means,
simultaneously evaluate objects, position themselves and others, and align with
other subjects (du Bois 2007: 163).1 Yet another view – which is most relevant to the
current study – is expounded by Hyland (2005), studying the resources which aca-
demic writers employ in order to express their positions and to connect to readers.
Unlike du Bois, whose definition of stance includes the mutual positioning of
1
In agreement with this view, various interactional practices and linguistic resources have been
explored to date, including, for instance, the role of I guess in conversational stancetaking
(Kärkkäinen 2007), digressing (Kärkkäinen 2012), positioning and alignment in news interviews
(Haddington 2007), resonance in storytelling (Niemelä 2011), challenging the prior speaker and
tag questions (Keisanen 2006, 2007) as well as repetition and returning to prior talk (Rauniomaa
2007, 2008).
2
Cf. the concept of prototypicality and prototype theory, as proposed by cognitive linguists (see,
e.g., Geeraerts 2006).
cisely, they can express commitment to the propositional content, convey evaluation
of the propositional content or convey information about the performance of the act
as a whole (Tseronis 2009: 34). What is more, in line with pragma-dialectic assump-
tions, Tseronis (2009: 41) argues that standpoint qualification can be analysed as
part of strategic manoeuvring which language users implement in order to clearly
mark a difference of opinion, while promoting their own interests in the discourse.
Yet, it must be added that, however revealing, Tseronis’s study – following Biber
et al.’s (1999) classification of stance adverbs – aims to provide a theoretical tool for
analysing argumentative discourse rather than to account for the social or cognitive
reasons for qualifying standpoints in argumentation (Tseronis 2009: 12).
The interactional potential and the pragmatic reading of modal adverbs in one
specific discourse, i.e. in legal discourse, are, on the other hand, explored in
Szczyrbak (2014), where it is shown that in the legal context, modal adverbs serve
to foreground and background differing legal arguments and interpretations, on the
one hand, and to demonstrate power and authority, on the other. Again, it is con-
tended that – seen as a site of multiple voices – both spoken and written legal genres
can be approached as polyphonic or heteroglossic. In terms of frequencies, the find-
ings reveal that of course is by far the most common modal adverb in spoken genres
(especially in adversarial proceedings), whereas indeed is most frequently deployed
in written genres (including Opinions of Advocates General) (Szczyrbak 2014: 92).
Analysing the rhetorical effect of selected adverbs, Szczyrbak (2014: 98) also points
out that while of course and certainly are linked to politeness and solidarity, indeed
and clearly are associated with power and authority.3 What the aforementioned
study also shows is that modal adverbs are systematically interwoven into larger
argumentative schemata. Remarkably, in judicial argumentative patterns – incorpo-
rating both the arguer’s actual standpoint and alternative built-in voices – Concessive
sequences comprising claims, acknowledgments (i.e. moves in which the arguer
partly concedes an opposing viewpoint) and counterclaims4 are especially
noticeable.
To conclude this section then: there is a clear link between modal adverbs of cer-
tainty and stance, the principal assumption being that these language devices allow
speakers and writers to engage in a dialogue and to evaluate other standpoints.
Therefore, building on previous studies, in the current investigation I will take the
research into the usage of modal adverbs further by looking at their stancetaking
potential in forensic argumentation, on the one hand, and by examining their canoni-
cal and less obvious meanings, on the other. However, rather than treat stance and
engagement as complementary notions, as proposed by Hyland (2005), I will
conceive of stance as incorporating intersubjective positioning and audience involve-
ment features, among which modal adverbs of certainty play a prominent role.
3
It is also demonstrated that although of course often serves as a solidarity device, it can also be
used to assert authority and superiority of knowledge (Szczyrbak 2014: 97).
4
See Couper-Kuhlen and Thompson (2000) and Barth-Weingarten (2003) for a detailed descrip-
tion of this analytical model.
The present study aims to investigate the role of modal adverbs of certainty in the
Opinions of Advocates General and to explore their polysemy based on the transla-
tion patterns found in English and Polish data. In particular, an attempt will be made
to answer the following questions:
1. What conventional and context-specific meanings of modal adverbs of certainty
are revealed by the bilingual data under study?
2. How frequent are omissions in the translations of these adverbs and what effect
do these omissions have on the argumentative force of the translated texts?
Since in addressing the above issues corpus data will be used, it must be remem-
bered that various types of corpora (e.g. bilingual or multilingual) are widely applied
in contrastive and translation studies for theoretical or practical purposes. While
theoretically-oriented research investigates the manner in which the same ideas are
transmitted in various languages, practically-oriented explorations aim, for instance,
to develop machine translation and computer-assisted translation systems. It should
also be highlighted that, as held by Grisot and Moeschler (2014: 13), corpora allow
“the researcher to uncover on the one hand, what is probable and typical and, on the
other hand, what is unusual about the phenomenon considered.”
At this point, a note clarifying the meaning of parallel corpora is in order, espe-
cially given that there is some confusion related to this term.5 The terminology
adopted for the purposes of the current study is in line, for instance, with Baker
(1999), Hunston (2002) and McEnery and Xiao (2007), who draw a distinction
between a comparable corpus and a parallel corpus. In this approach, a comparable
corpus is defined as one with “the same proportions of the texts of the same genres
in the same domains in a range of different languages in the same sampling period”
(McEnery and Xiao 2007: 20). Thus, the subcorpora composing a comparable cor-
pus are not translations but rather, they use the same sampling frame and show
“similar balance and representativeness.” As regards the definition of a parallel
corpus, for which the sampling period is irrelevant, the same linguists hold
that the term refers to “a corpus that contains source texts and their translations” and
that such a corpus can be either bilingual or multilingual (McEnery and Xiao 2007: 19). What
is more, as McEnery and Xiao (2007: 19) see it, parallel corpora can be uni-
directional, bi-directional or multi-directional, the latter including texts which are
written simultaneously in different languages. Going further, McEnery and Xiao
(2007: 20) subdivide parallel corpora into general and specialised ones, stressing
that specialised parallel corpora (including, for instance, contract law texts) are par-
ticularly useful in domain-specific translation research.
5
For a discussion on the various labels used to describe different types of multilingual corpora, see
McEnery and Xiao (2007).
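The distinction between uni-directional and bi-directional parallel corpora drawn above can be made concrete with a small sketch. The class and function names below (`AlignedPair`, `directions`, `is_bidirectional`) are illustrative assumptions, not part of any corpus tool used in the study; the sample pair is taken from example (1) in Section 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlignedPair:
    """One aligned unit of a parallel corpus (hypothetical structure)."""
    source_lang: str  # language of the original text
    target_lang: str  # language of the translation
    source: str
    target: str


def directions(pairs):
    """Return the set of (source, target) language directions in the corpus."""
    return {(p.source_lang, p.target_lang) for p in pairs}


def is_bidirectional(pairs, lang_a, lang_b):
    """True if the corpus holds translations from lang_a into lang_b and back."""
    found = directions(pairs)
    return (lang_a, lang_b) in found and (lang_b, lang_a) in found


# A uni-directional English-to-Polish corpus with a single aligned pair.
corpus = [
    AlignedPair(
        "en", "pl",
        "Whilst other parts of the Framework Decision do indeed "
        "cross-refer to national law",
        "Choć w innych przepisach decyzji ramowej rzeczywiście "
        "występują odesłania do prawa krajowego",
    ),
]

print(directions(corpus))
print(is_bidirectional(corpus, "en", "pl"))
```

On this reading, a corpus of English originals with Polish translations only has a single direction; adding Polish originals with English translations would make it bi-directional.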
6
As there is only one British Advocate General at the ECJ, the Opinions used to compile the corpus
were written by one person. However, this fact appears to have no bearing on the results, since the
focus of the analysis is on the translation process.
Finally, it should also be noted that the range of the data used in the study was
limited and that, therefore, further research is needed for valid generalisations to be
made. Still, despite this limitation, it is believed that they offer insight into the poly-
semy of modal adverbs of certainty and that they can therefore be relevant to future
investigations focusing on other languages, discourses or genres.
4 Results and Discussion
At the outset of the investigation, the most frequent modal adverbs in the English
subcorpus were identified and then their Polish equivalents in the Polish subcorpus
were determined. As a result of the frequency count, the following modal adverbs
were identified as most common: indeed (83 tokens), necessarily (36 tokens), not
necessarily (35 tokens),7 of course (35 tokens), clearly (32 tokens)8 and obviously (18
tokens). All the other adverbs which had fewer than 10 occurrences were excluded
from the analysis. For the individual translations and their frequencies, see Table 1.
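The selection step just described can be sketched as follows. The counts are those reported above; the function name `select_frequent` and its `min_tokens` parameter are illustrative assumptions and do not reproduce the tooling actually used for the frequency count.

```python
# Token counts reported in the text for the English subcorpus.
ADVERB_COUNTS = {
    "indeed": 83,
    "necessarily": 36,
    "not necessarily": 35,
    "of course": 35,
    "clearly": 32,  # modal uses only; 78 occurrences in total (footnote 8)
    "obviously": 18,
}


def select_frequent(counts, min_tokens=10):
    """Return (adverb, count) pairs with at least min_tokens occurrences,
    most frequent first; ties are broken alphabetically."""
    kept = [(adverb, n) for adverb, n in counts.items() if n >= min_tokens]
    return sorted(kept, key=lambda pair: (-pair[1], pair[0]))


for adverb, n in select_frequent(ADVERB_COUNTS):
    print(f"{adverb}: {n} tokens")
```

Any adverb with fewer than 10 occurrences would simply be dropped by the threshold, which mirrors the exclusion criterion stated in the text.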
In the remainder of this section I will illustrate, through examples, the usefulness
of parallel corpora in exploring the polysemy of English modal adverbs of certainty,
assuming that they can provide insight into what might remain unnoticed if only
monolingual corpora were consulted.
As shown above, indeed was by far the most frequent modal adverb of certainty in
the corpus. Fourteen different translations of indeed were recognised in the Polish
data and as many as 16 omissions. Overall, indeed was found: (1) to co-occur with
the Concessive relation9; (2) to mark rhetorical emphasis or (3) to operate as a dis-
course marker. A relatively frequent co-occurrence pattern was that of the emphatic
do followed by indeed (9 occurrences), linked to Concession and associated with
acknowledgments. Example (1) below illustrates such an acknowledgment, sig-
nalled with the concessive whilst (choć in Polish), where the stress introduced by
indeed is strengthened by the emphatic do. Here, the arguer concedes that other
parts of the Framework Decision include references to national law, but, at the same
time, she claims that there is no such mention in the excerpt under consideration.
7
For the purpose of the analysis, this category subsumes instances of negation + necessarily (e.g.
not necessarily, cannot necessarily, without necessarily, etc.).
8
In total, there were 78 occurrences of clearly including its non-modal use as an adverb of
manner.
9
Following the convention found in Barth-Weingarten (2003), whenever capitalised, Concession
refers to the discourse-pragmatic relation, but when written with a lower-case letter, it denotes the
interclausal relation.
Table 1 (continued)
English adverbs   No. of tokens   Polish translations
CLEARLY           32              w sposób oczywisty/w oczywisty sposób (evidently, in an evident manner) (6)
                                  wyraźnie (plainly/expressly) (5)
                                  oczywiście (of course/obviously) (4)
                                  najwyraźniej (apparently/most obviously) (3)
                                  Ø [omission] (3)
                                  oczywisty (obvious/evident) (2)
                                  bez wątpienia (undoubtedly/without a doubt) (2)
                                  jasno (plainly) (2)
                                  naturalnie (naturally) (1)
                                  właśnie (just) (1)
                                  bezwzględnie (unconditionally) (1)
                                  nie ma wątpliwości (there is no doubt) (1)
                                  w sposób wyraźny (plainly/in an express manner) (1)
OBVIOUSLY         18              oczywiście (of course/obviously) (11)
                                  oczywisty (obvious/evident) (2)
                                  Ø [omission] (2)
                                  w oczywisty sposób/w sposób oczywisty (evidently, in an evident manner) (2)
                                  wprost (simply) (1)
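The counts in Table 1 can be converted into relative shares with a few lines of code. The following is a minimal sketch rather than part of the original analysis: the dictionary layout and the function name translation_shares are illustrative assumptions, and the figures are copied from the table rows for clearly and obviously.

```python
# Counts copied from the Table 1 rows for "clearly" and "obviously".
# "Ø" stands for an omission; the dictionary layout is an illustrative assumption.
TABLE_1 = {
    "clearly": {
        "w sposób oczywisty/w oczywisty sposób": 6,
        "wyraźnie": 5,
        "oczywiście": 4,
        "najwyraźniej": 3,
        "Ø": 3,
        "oczywisty": 2,
        "bez wątpienia": 2,
        "jasno": 2,
        "naturalnie": 1,
        "właśnie": 1,
        "bezwzględnie": 1,
        "nie ma wątpliwości": 1,
        "w sposób wyraźny": 1,
    },
    "obviously": {
        "oczywiście": 11,
        "oczywisty": 2,
        "Ø": 2,
        "w oczywisty sposób/w sposób oczywisty": 2,
        "wprost": 1,
    },
}


def translation_shares(adverb):
    """Return each Polish rendering's share of all tokens of `adverb`."""
    counts = TABLE_1[adverb]
    total = sum(counts.values())
    return {rendering: count / total for rendering, count in counts.items()}


if __name__ == "__main__":
    for adverb in TABLE_1:
        shares = translation_shares(adverb)
        top = max(shares, key=shares.get)  # most frequent rendering
        print(f"{adverb}: {top} ({shares[top]:.0%})")
```

Read this way, the table shows a single dominant rendering for obviously (oczywiście, 11 of 18 tokens), whereas the 32 tokens of clearly are spread over thirteen renderings, none of which exceeds 6 tokens.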
(1)
ENG:
The objective pursued by the Framework Decision has already been
identified: the enforcement of financial penalty decisions through mutual
recognition. (14) The term ‘court having jurisdiction in particular in
criminal matters’ used in Article 1(a)(iii) plays a crucial role in determining
the scope of the Framework Decision, because it defines a category of
financial penalty decision that benefits from mutual recognition and hence
enforcement. Whilst other parts of the Framework Decision do indeed
cross-refer to national law, (15) here there is no such mention.
POL:
Cel decyzji ramowej został już wskazany: wykonywanie orzeczeń
nakazujących uiszczenie kary o charakterze pieniężnym w drodze
wzajemnego uznawania (14). Wyrażenie „sąd właściwy także w sprawach
karnych” zawarte w art. 1 lit. a) pkt iii) odgrywa kluczową rolę przy
określaniu zakresu decyzji ramowej, ponieważ określa ono kategorię
orzeczeń nakazujących uiszczenie kary o charakterze pieniężnym
korzystających z wzajemnego uznawania, a w konsekwencji – wykonania.
Choć w innych przepisach decyzji ramowej rzeczywiście występują
odesłania do prawa krajowego (15), to omawiany przepis ich
nie zawiera. [OAG_7]
Likewise, (2) shows how indeed is deployed in combination with clear to highlight
the part of the argument that is conceded (“that Mrs McCarthy could stay in the
United Kingdom on her own”) and how this acknowledgment is contrasted with the
contested part of the argument (“it is less clear whether the Court considered the
detailed implications”).
(2)
ENG:
0 [IMPLIED CLAIM]
X’ [ACKNOWLEDGMENT]
Whilst it is indeed clear that Mrs. McCarthy could stay in the United
Kingdom on her own by virtue of her nationality and that she was not
being deprived of a right to move under EU law by denying her husband
derived rights as a third country national family member,
Y [COUNTERCLAIM]
it is less clear whether the Court considered the detailed implications.
Perhaps the short answer was simply ‘EU law can’t help: try the ECHR’.
POL:
0 [IMPLIED CLAIM]
X’ [ACKNOWLEDGMENT]
O ile rzeczywiście jest bezsporne, że S. McCarthy sama posiadała prawo
pobytu w Zjednoczonym Królestwie z uwagi na swoje obywatelstwo,
jak również że nie pozbawiano jej prawa do przemieszczania się na
gruncie prawa Unii poprzez odmowę jej mężowi prawa pobytu jako
obywatelowi państwa trzeciego będącemu członkiem rodziny,
Y [COUNTERCLAIM]
o tyle jest już mniej oczywiste, czy Trybunał przeprowadził analizę
szczegółowych implikacji. Niewykluczone, że odpowiedzią jest po
prostu: „Prawo Unii nie może nic zdziałać: spróbujcie w Europejskim
Trybunale Praw Człowieka”. [OAG_3]
(3)
ENG:
I see no basis for saying that, in such circumstances, the EU citizen
should be required temporarily to sacrifice his right to a family life
(or, put slightly differently, that he should be prepared to pay that price in
order subsequently to be able to rely on EU law as against his own
Member State of nationality). Indeed, under Directive 2004/38, family
members are entitled to accompany the EU citizen immediately to the
host Member State. Directive 2004/38 does not make their entitlement to
that derived right conditional on a minimum residence requirement for
the EU citizen. Rather, the conditions applicable to the dependents vary
with length of residence in the territory.
POL:
Nie widzę żadnych podstaw do twierdzenia, że w takich okolicznościach
od obywatela Unii można wymagać tymczasowego poświęcenia prawa
do życia rodzinnego (albo, ujmując rzecz nieco odmiennie, że powinien
on być przygotowany na zapłatę tej ceny za możliwość powołania się w
terminie późniejszym na prawo Unii względem państwa członkowskiego,
którego obywatelstwo posiada). Zgodnie bowiem z dyrektywą 2004/38
członkowie rodziny uprawnieni są do towarzyszenia obywatelowi Unii
bezpośrednio w państwie członkowskim pochodzenia. Dyrektywa 2004/38
nie uzależnia ich ewentualnego uprawnienia do prawa pochodnego od
wymogu minimalnego pobytu dla obywatela Unii. Przeciwnie, warunki
mające zastosowanie względem osób pozostających na utrzymaniu mogą
różnić się w zależności od długości pobytu na terytorium. [OAG_3]
We may wonder what effect the insertion of bowiem in the excerpt in (3) has on
the interpretation of the relation holding between the sentence with indeed and the
preceding one. In the English excerpt, indeed indicates a kind of sequential relation-
ship between the sentences and it may well be paraphrased as “what is more” (cf.
Aijmer 2007: 332). In addition, it “signals that what follows is not only in agree-
ment with what precedes, but is additional evidence being brought to bear on the
argument” (Traugott and Dasher 2002: 164 quoted in Aijmer 2007: 332). The Polish
translation, however, is not entirely consistent with the source text, since the use of
bowiem in the Polish sentence suggests that according to the author of the Opinion,
EU citizens should not be required to sacrifice their right to family life and that this
fact follows from Directive 2004/38 under which “family members are entitled to
accompany the EU citizen immediately to the host Member State.” In addition, in
the source text, indeed is not used to mark causality; rather, it adds emphasis and has
a discourse marker status. Finally, the authority associated with the English indeed
is no longer detectable in the Polish wording. Interestingly, the cross-checking of
the English correspondences of bowiem found in the Polish data suggests that bow-
iem is sometimes inserted in the Polish translation to mark cohesion, even where
there is no direct equivalent in the English source text.
In a similar vein, omission of indeed in the Polish version of the Opinion seems
to lessen the rhetorical force of the translated text and, potentially, its ability to influ-
ence the reader’s attitude and beliefs. During the analysis, several patterns became
visible. Firstly, it was observed that sentence-initial occurrences of indeed were
sometimes left untranslated (7 tokens), as in (4). It must be admitted, however, that
although these accounted for 7 of the 16 omissions of indeed (about 44%),
sentence-initial uses of this adverb were more frequently rendered in Polish as w
istocie (as a matter of fact) or bowiem (for/because), discussed above.
(4)
ENG:
The objective of those articles was to protect shareholders and creditors
from market behaviour that might reduce a company’s capital and falsely
raise its share price. That objective is not defeated by a company acquiring
its own shares where a legal obligation requires it to do so. Indeed, as the
Portuguese Government and the Commission rightly point out, Article
20(1)(d) specifically permits Member States to allow a company to acquire
shares ‘by virtue of a legal obligation’ without having recourse to the
procedures laid down in Article 19.
POL:
Celem tych artykułów była ochrona akcjonariuszy i wierzycieli przed
zachowaniami rynkowymi, które mogą zmniejszyć kapitał spółki lub
sztucznie podwyższyć cenę akcji spółki. Z celem tym nie jest sprzeczne
nabycie przez spółkę jej akcji w wykonaniu obowiązku przewidzianego
prawem. [OMISSION] Jak trafnie wskazały rząd portugalski i Komisja,
art. 20 ust. 1 lit. d) pozwala państwom członkowskim na nabycie akcji
właśnie „w wykonaniu obowiązków ustawowych”, bez konieczności
stosowania procedur przewidzianych w art. 19. [OAG_5]
Secondly, the strategy of omission was seen also in the case of parenthetical uses
of indeed, most notably in the structures: and indeed, or indeed and though indeed,
as illustrated in (5), in which “the right to impose criminal sanctions” is no longer
emphasised in the Polish text, unlike the English original.
104 M. Szczyrbak
(5)
ENG:
Article 25 merely confirms that the administrative measures and
sanctions that it requires Member States to impose are ‘without prejudice
to their civil liability regime[s]’ (or indeed to their right to impose
criminal sanctions).
POL:
Artykuł 25 jedynie potwierdza, że środki i sankcje administracyjne,
których nakładania wymaga on od państw członkowskich, pozostają „bez
uszczerbku dla ich systemu odpowiedzialności cywilnej” (lub
[OMISSION] ich prawa do nakładania sankcji karnych). [OAG_5]
(6)
ENG:
It follows from the references there to ‘sufficiently serious’, ‘severe
violation’ and ‘accumulation … which is sufficiently severe’ that not
every violation of human rights (repugnant though it indeed may be)
will fall to be considered as an ‘act of persecution’ for the
purposes of Article 9.
POL:
Z zawartych w nim wyrażeń: „wystarczająco poważne”, „poważne
naruszenie” i „kumulacja […] naruszeń […], które są wystarczająco
poważne” wynika, że nie każde naruszenie praw człowieka
(niezależnie od tego, jak [OMISSION] może być dolegliwe) można
uznać za kwalifikujące się jako „akt prześladowania” do celów art.
9 dyrektywy. [OAG_9]
In sum, the contrastive analysis has shown that indeed can adopt different meanings
and that these meanings are not always interchangeable. It was also demonstrated
that during the translation process the rhetorical force of arguments may be affected
due to the omission of this adverb – which itself can be interpreted as another mean-
ing – or through the choice of non-conventional equivalents of the co-occurring
adjectives.
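The kind of tally summarised above (which Polish rendering, if any, corresponds to each occurrence of an English adverb) can be approximated automatically on sentence-aligned data. The sketch below is an illustration under stated assumptions, not the procedure used in the study: the function name translation_profile, the candidate list and the three sample pairs (shortened from examples (1), (3) and (5)) are introduced here for demonstration only, and simple string matching cannot replace manual inspection of each concordance line.

```python
import re
from collections import Counter

OMISSION = "Ø"  # label used when no candidate rendering is found

# Shortened sentence pairs taken from examples (1), (3) and (5).
PAIRS = [
    ("Whilst other parts of the Framework Decision do indeed cross-refer to national law, here there is no such mention.",
     "Choć w innych przepisach decyzji ramowej rzeczywiście występują odesłania do prawa krajowego, to omawiany przepis ich nie zawiera."),
    ("Indeed, under Directive 2004/38, family members are entitled to accompany the EU citizen.",
     "Zgodnie bowiem z dyrektywą 2004/38 członkowie rodziny uprawnieni są do towarzyszenia obywatelowi Unii."),
    ("(or indeed to their right to impose criminal sanctions)",
     "(lub ich prawa do nakładania sankcji karnych)"),
]


def translation_profile(pairs, source_item, candidates):
    """Count which candidate rendering appears in the target of each pair
    whose source contains `source_item`; count an omission otherwise.

    Each pair is counted once, even if the source item occurs repeatedly.
    """
    source_re = re.compile(r"\b" + re.escape(source_item) + r"\b", re.IGNORECASE)
    profile = Counter()
    for source, target in pairs:
        if not source_re.search(source):
            continue  # the adverb does not occur in this source sentence
        target_lower = target.lower()
        # First candidate found in the target wins; otherwise record an omission.
        match = next((c for c in candidates if c.lower() in target_lower), OMISSION)
        profile[match] += 1
    return profile


if __name__ == "__main__":
    profile = translation_profile(PAIRS, "indeed", ["rzeczywiście", "w istocie", "bowiem"])
    for rendering, count in profile.most_common():
        print(rendering, count)
```

A candidate such as bowiem may of course appear in the target sentence for reasons unrelated to indeed, so any automatic profile of this kind would still need to be checked by hand.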
(7)
ENG:
I therefore have little difficulty in agreeing with the majority of the
submissions to the Court on this question that Article 5(2)(a) of the
Directive covers only analogue to analogue copying. The word
‘photographic’ necessarily requires optical input of an analogue
original, and the need for paper or a similar output medium means
that the output must also be analogue.
POL:
Z tego względu nie mam wielkich trudności, aby zgodzić się ze
zgłoszonym Trybunałowi w zakresie tego pytania stanowiskiem
większości, zgodnie z którym art. 5 ust. 2 lit. a) dyrektywy obejmuje
tylko kopiowanie „z formatu analogowego na analogowy”. Słowo
„fotograficzna” koniecznie wymaga optycznego wprowadzenia
oryginału w formie analogowej, a potrzeba posłużenia się papierem lub
podobnym nośnikiem wyjściowym oznacza, że etap wyjścia musi
dotyczyć formy analogowej. [OAG_12]
The effect of omitting necessarily in the translated text, as compared with the
original, can in turn be observed in (8) and (9). The Polish wording in (8), i.e.
“miał on wiedzę” (he was aware), lacks any equivalent unit signalling the
writer’s epistemic stance conveyed in the English text by necessarily.11 The same
holds for (9), in which the deontic modalisation expressed by necessarily is no
longer present in the unmodalised Polish statement “a w konsekwencji arbitralny”
(and hence arbitrary).
Remarkably, it was the most frequent translation strategy in the case of this adverb.
10
As pointed out by Simon-Vandenbergen and Aijmer (2007: 188), epistemic uses of necessarily
11
(8)
ENG:
I am also far from certain that he would necessarily have been aware
of the (limited) possibilities of applying to this Court for legal aid.
POL:
Daleka jestem również od pewności, że miał on [OMISSION] wiedzę
na temat (ograniczonych) możliwości zwrócenia się do Trybunału
o pomoc prawną. [OAG_7]
(9)
ENG:
In order to avoid this logical conundrum, most legal residence tests
specify a fixed (and hence necessarily arbitrary) ‘qualifying’ period
of presence before residence is achieved. There is no objective difference,
however, between presence the day before and presence the day after
the magic figure is attained.
POL:
W celu uniknięcia tej łamigłówki logicznej większość kryteriów
prawnych zamieszkania przewiduje określony (a w konsekwencji
[OMISSION] arbitralny) okres „kwalifikacyjny” obecności, zanim
nastąpi zamieszkanie. Nie ma jednak żadnej obiektywnej różnicy
pomiędzy obecnością w dniu poprzedzającym magiczną cezurę a
obecnością w dniu następującym po niej. [OAG_7]
In contrast to the translations of necessarily, less variety was observed in the case
of not necessarily, with 20 instances of the prototypical niekoniecznie, 11 attesta-
tions of nie musieć (not have to) and only one omission. While the translations of
necessarily emphasised inevitability or necessity, Polish renditions of not necessar-
ily revealed the writer’s epistemic stance, as in (10). At this point it might also be
remarked, following Simon-Vandenbergen and Aijmer (2007: 190), that since nega-
tion presupposes its counterpart in the discourse, not necessarily marks the counter-
ing of an expectation based on the writer’s own experience or logical assumptions.
This was clearly reflected by the Polish translations such as, for instance, nie musi
(wcale) (not have to at all) or nie zawsze (not always).
(10)
ENG:
Where one physically resides is a question of fact. However, the place
where a person actually lives or is registered as living may not
necessarily be the place at which a Member State defines, as a matter
of law, that person to have his permanent residence or domicile.
POL:
Fizyczne miejsce zamieszkania jest kwestią z zakresu okoliczności
faktycznych. Jednakże miejsce, w którym dana osoba faktycznie
zamieszkuje lub jest zameldowana jako zamieszkała, może
niekoniecznie być miejscem, które państwo członkowskie określa na
gruncie prawa jako miejsce jej stałego zamieszkania. [OAG_11]
Overall, the study has shown that the modalisation expressed by necessarily was
sometimes lost in the Polish version of the Opinions, even though it must be
acknowledged that in the majority of occurrences Polish equivalents of necessarily
were identified in the text. These translations, as predicted, conveyed the meaning
of external necessity and inevitability. On the other hand, in the case of not neces-
sarily, the translations confirmed the meaning of counterexpectancy and, signifi-
cantly, only one occurrence of this adverb was left untranslated.
(11)
ENG:
Y [COUNTERCLAIM]
The difficulties associated with preferring a uniform interpretation
over one that defers to national law in its definition of that provision are,
in my view, more theoretical than real.
X’ [ACKNOWLEDGMENT]
It is, of course, true that each Member State has its own particular
structure of courts; and that neither this Framework Decision nor any
other has thus far attempted any degree of harmonisation in that field.
Y’ [RETURN TO COUNTERCLAIM]
However, I point out that, from a practical point of view, whether a
‘court having jurisdiction in particular in criminal matters’ is interpreted
as an autonomous concept or interpreted by reference to the law of the
issuing State makes no actual difference to the court in the executing
State. It is still faced with the basic problem that it is (probably)
unfamiliar with the court structure of the issuing State. It may therefore
be unable, without making further enquiries, to satisfy itself whether
or not the court in the issuing State satisfies that definition.
POL:
Y [COUNTERCLAIM]
Trudności związane z przedkładaniem jednolitej wykładni nad
wykładnią, która odsyła do prawa krajowego w celu zdefiniowania
owego przepisu, mają charakter bardziej teoretyczny niźli rzeczywisty.
X’ [ACKNOWLEDGMENT]
Pozostaje oczywiście prawdą, że każde państwo członkowskie
posiada swój własny ustrój sądów, a także że ani niniejsza decyzja
ramowa, ani pozostałe decyzje ramowe dotychczas nie podejmowały
próby dokonania harmonizacji w tym zakresie.
Y’ [RETURN TO COUNTERCLAIM]
Zwracam jednakże uwagę, iż z praktycznego punktu widzenia to,
czy wyrażenie „sąd właściwy także w sprawach karnych” jest
interpretowane jako autonomiczne pojęcie, czy też w drodze odesłania
do prawa państwa wydającego, nie stanowi faktycznie żadnej różnicy dla
sądu w państwie wykonującym. Sąd ten dalej stoi przed zasadniczym
problemem (prawdopodobnej) nieznajomości ustroju sądów w państwie
wydającym. Zatem bez zasięgnięcia bardziej szczegółowych informacji
może on nie być w stanie przekonać się, czy sąd w państwie wydającym
spełnia tę definicję. [OAG_7]
the necessary condition for the request to be treated as valid is that it has been com-
pleted…” Again, of course is used for interpersonal ends and it operates as an
engagement device which shows that the Advocate General – to use Hyland’s (2005)
words – recognises the presence of readers and tries to connect to them and to pull
them along with her argument.
(12)
ENG:
This Court cannot, of course, say which interpretation is correct but it
seems to me that neither would be inconsistent with Articles 7 and 22
of Directive 92/12 – provided, of course, that (i) the request is treated
as valid once it has been completed and (ii) the relevant provisions are
sufficiently clear to ensure that whatever procedure is applied complies
with the requirements of legal certainty.
POL:
Trybunał nie może z pewnością rozstrzygnąć, która interpretacja jest
prawidłowa, lecz moim zdaniem ani pierwsza, ani druga nie są sprzeczne
z art. 7 i 22 dyrektywy 92/12, oczywiście pod warunkiem że i) wniosek
jest uznany za prawidłowy po uzupełnieniu go oraz ii) odpowiednie
przepisy są wystarczająco jasne, aby zapewnić, że niezależnie
od stosowanej procedury odpowiada ona wymogom pewności
prawa. [OAG_14]
Finally, it should be noted that, unlike indeed and necessarily, in the case of which
“the authorial imprint” was lost in the translation process, of course was almost
always translated and oczywiście was the translators’ preferred choice.12
The translations of clearly which were found in the corpus suggested the following
meanings: (1) “obviousness resulting from accessible evidence”, such as in w oczy-
wisty sposób (evidently/in an evident manner) or oczywiście (of course/obviously)
and (2) authority and conviction, as indicated by the translations wyraźnie (plainly/
expressly) and najwyraźniej (apparently/most obviously).13 By analogy to of course,
the adverb clearly was used both sentence-initially and sentence-medially.
12
It is interesting to note that in the case of Swedish, Dutch and German correspondences of of
course, the most frequent translations, as attested by Simon-Vandenbergen and Aijmer (2007:
342–343), i.e. naturligtvis, natuurlijk and natürlich, respectively, are conventional equivalents of the
English naturally, which suggests that “naturalness” or the fact of being “expected and accepted”
is the most salient meaning of of course. This, however, is not corroborated by the Polish data
analysed here, where only two instances of of course were translated as naturalnie (naturally).
13
On the other hand, the non-modal use of clearly, typical of legalese and linked to explicitness (as
in clearly defined or clearly indicate) was translated as wyraźnie (plainly/expressly) or jasno
(plainly).
As pointed out above, the obviousness indicated by clearly was mirrored by the
Polish translation oczywiście (of course), which is shown in (13) below, whereas
authority and conviction based on accessible evidence were conveyed by w sposób
wyraźny (in a clear manner), as in (14).14 Interestingly enough, although the latter
translation seems to indicate an adverb of manner,15 its sentence-initial occurrence
in English, though not marked off by a comma, excludes this possibility.
(13)
ENG:
I can accept that a measure which reduces the amount of duty payable
on the purchase of a new principal residence is likely to facilitate
moving in general, and that that may include moving closer to one’s
place of work, with the health and environmental benefits attendant
thereon. But that begs the question: why not facilitate, in the same way,
moving into (or out of) the Flemish Region (which would clearly be
beneficial in order to limit cross-border commuting)? The disputed
measure, however, links availability of the offset to sequential purchases
within the Flemish Region.
POL:
Jestem w stanie przyjąć, że środek zmniejszający kwotę opłaty należnej
przy zakupie nowej nieruchomości stanowiącej główne miejsce
zamieszkania może ogólnie ułatwiać przenoszenie się, co może
obejmować przenoszenie się bliżej miejsca pracy danej osoby z
towarzyszącymi temu korzyściami dla zdrowia i środowiska. Rodzi to
jednak pytanie: dlaczego nie ułatwiać w ten sam sposób przenoszenia
się do Regionu Flamandzkiego (lub poza ten region) (co byłoby
oczywiście korzystne dla ograniczenia dojazdów transgranicznych)?
Sporny środek łączy jednakże dostępność możliwości odliczenia z
kolejnymi zakupami w Regionie Flamandzkim. [OAG_29]
(14)
ENG:
Clearly there are points of similarity between the contested measures
in those cases and the present matter: Indeed, the Commission alleges
discrimination and restriction of Treaty freedoms in all three.
POL:
W sposób wyraźny pomiędzy spornymi środkami w tychże sprawach
oraz w obecnej sprawie istnieją elementy podobieństwa: Komisja zarzuca
dyskryminację i ograniczenie swobód traktatowych we wszystkich
trzech sprawach. [OAG_29]
14
Cf. the most common German translations of clearly and of course, that is deutlich and natürlich,
respectively (Simon-Vandenbergen and Aijmer 2007: 331, 343), which indicate the difference
between the two adverbs. In the Polish translations analysed here, this difference is less obvious.
15
Only one such translation was attested by the data.
For illustrative purposes, omissions of clearly in the Polish text are shown in (15)
and (16) below. Again, the absence of the Polish equivalent in the translation results
in the unmodalised statements “it has to look” and “the General Court has,”
respectively.
(15)
ENG:
In order for a national court to do this effectively, it clearly has to
look beyond the wording of the Decree.
POL:
Aby sąd krajowy mógł to skutecznie rozważyć, powinien [OMISSION]
kierować się czymś więcej, niż tylko brzmieniem dekretu. [OAG_24]
(16)
ENG:
In any event, the General Court clearly has ‘full jurisdiction’ for the
purposes of Article 6(1) ECHR (not to be confused with the EU concept
of unlimited jurisdiction to review financial penalties).
POL:
W każdym razie Sąd [OMISSION] posiada „pełne kompetencje
orzecznicze” w rozumieniu art. 6 ust. 1 EKPC (nie należy tego
mylić z unijnym pojęciem „nieograniczonego prawa orzekania” w
zakresie kontroli kar finansowych). [OAG_30]
The last adverb to be discussed in this chapter is obviously, which was translated
chiefly as oczywiście (of course). Alternative translations included the adjective
oczywisty (obvious) as well as the adverbials w oczywisty sposób (evidently/in an
evident manner) and wprost (simply). With regard to omissions, only two instances
were found. As for position in the sentence, the adverb occurred mostly medially
and once initially. The core meaning of obviously as borne out by the Polish data
was that of “obviousness,” rather than its evidential status. A point worth noting
here is that in the dataset analysed, both of course and obviously had the same Polish
counterpart, i.e. oczywiście, as its preferred translation (see (12) and (17)). This is in
contrast with what Simon-Vandenbergen and Aijmer (2007: 219–220) say about the
differences between of course and obviously. In their view, although both adverbs
share the backgrounding function, of course means “as everybody knows or should
know” or “according to expectations,” whereas obviously means “as evidence
shows” or “as knowledge of the world shows.” Thus, of course is more forceful and
authority-oriented, while obviously is evidential and does not necessarily imply the
hearer’s knowledge. This, however, was not so manifest in the Polish data, in which
the aspect of “obviousness” stood out, rather than the evidential status of the adverb.
It can be posited then that while English distinguishes between of course and obviously,
linked to “expectation” and “evidence,” respectively, such a distinction
appears to have little or no relevance in Polish, where only oczywiście is used.
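The positional observation made above (initial versus medial use of an adverb) lends itself to a simple automatic check. The sketch below is only one possible approximation: the function name adverb_position and the word-based definition of "initial" are assumptions introduced here, and the study's own criteria for position may well have been different.

```python
import re


def adverb_position(sentence, adverb):
    """Classify the first occurrence of `adverb` in `sentence` as
    'initial', 'medial' or 'final'; return None if it does not occur.

    Position is judged on word tokens only, so punctuation is ignored
    and multi-word items such as "of course" are supported.
    """
    words = re.findall(r"\w+", sentence.lower())
    target = adverb.lower().split()
    n = len(target)
    for i in range(len(words) - n + 1):
        if words[i:i + n] == target:
            if i == 0:
                return "initial"
            if i + n == len(words):
                return "final"
            return "medial"
    return None


if __name__ == "__main__":
    # Sentences shortened from examples (14), (17) and (12).
    samples = [
        ("Clearly there are points of similarity between the contested measures.", "clearly"),
        ("The General Court's appraisal is obviously correct and KME obviously wrong.", "obviously"),
        ("This Court cannot, of course, say which interpretation is correct.", "of course"),
    ]
    for sentence, adverb in samples:
        print(adverb, "->", adverb_position(sentence, adverb))
```

Because only the first occurrence in each sentence is classified, a sentence such as the one in (17), which contains obviously twice, would need to be split into clauses or processed occurrence by occurrence in a fuller implementation.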
(17)
ENG:
KME asks the Court of Justice to replace the General Court’s appraisal
by KME’s preferred test. Not only is that inadmissible but the General
Court’s appraisal is obviously correct and KME obviously wrong.
POL:
KME zwraca się do Trybunału o zastąpienie oceny dokonanej przez
Sąd preferowanym przez KME kryterium. Jest to nie tylko
niedopuszczalne, ale też ocena Sądu jest oczywiście prawidłowa, podczas
gdy ocena KME jest oczywiście błędna. [OAG_30]
5 Concluding Remarks
As the study of bilingual data has shown, modal adverbs of certainty are polyse-
mous, with more conventional meanings being enriched with ad hoc readings. It is
clearly seen that apart from reflecting the author’s varied degrees of certainty, the
adverbs are used for rhetorical and argumentative purposes, that is to convey autho-
rial stance and to dialogue with alternative standpoints. To be precise, the analysis
confirmed the canonical meanings of indeed, necessarily, not necessarily and of
course, as reflected in the Polish translations w istocie, z konieczności, niekoniec-
znie and oczywiście, respectively. At the same time, somewhat surprisingly, it was
established that indeed was translated non-conventionally as bowiem (for/because),
which did not fit under the general meaning of this adverb. It was also remarkable
to observe that sentence-initial and parenthetical occurrences of indeed were often
left untranslated. Similarly, omission was the most common strategy in the case of
necessarily. Not necessarily, conversely, was almost always retained in the transla-
tion, and so was of course, performing the role of a backgrounding device or a soli-
darity marker. As expected, both of course and indeed were found in Concessive
contexts, in which they prefaced disagreement. Obviousness, in turn, seemed salient
in the case of clearly, which conveyed authority and conviction, too. At the same
time, it was noted, the distinction between of course and clearly appeared to be less
visible than was the case in the translations into other languages.16 Finally, the
16
Cf. Simon-Vandenbergen and Aijmer (2007).
17
Conveniently, in the case of EU legal discourse, multilingual corpora representing the official
languages of the EU Member States are freely available to an analyst.
References
Adams, H., & Quintana-Toledo, E. (2013). Adverbial stance marking in the introduction and conclu-
sion sections of legal research articles. Revista de Lingüística y Lenguas Aplicadas, 8, 13–22.
Aijmer, K. (2007). Modal adverbs as discourse markers: A bilingual approach to the study of
indeed. In J. Rehbein, C. Hohenstein, & L. Pietsch (Eds.), Connectivity in grammar and dis-
course (pp. 329–344). Amsterdam/Philadelphia: John Benjamins.
Baker, M. (1999). The role of corpora in investigating the linguistic behaviour of professional
translators. International Journal of Corpus Linguistics, 4, 281–298.
Bakhtin, M. M. (1981). The dialogic imagination. In M. Holquist (Ed.), Four essays by
M.M. Bakhtin (C. Emerson & M. Holquist, Trans.). Austin: University of Texas Press.
Barth-Weingarten, D. (2003). Concession in spoken English. On the realisation of a discourse-
pragmatic relation. Tübingen: Narr.
Biber, D., & Finegan, E. (1988). Adverbial stance types in English. Discourse Processes, 11, 1–34.
Biber, D., & Finegan, E. (1989). Styles of stance in English: Lexical and grammatical marking of
evidentiality and affect. Text, 9, 93–125.
Biber, D., et al. (1999). Longman grammar of spoken and written English. Harlow: Longman.
Chafe, W. L. (1986). Evidentiality in English conversation and academic writing. In W. L. Chafe
& J. Nichols (Eds.), Evidentiality: The linguistic coding of epistemology (pp. 261–272).
Norwood: Ablex.
Chindamo, M., Allwood, J., & Ahlsén, E. (2012). Some suggestions for the study of stance in com-
munication. 2012 ASE/IEEE International Conference on Social Computing and 2012 ASE/
IEEE International Conference on Privacy, Security, Risk and Trust (pp. 617–622).
Couper-Kuhlen, E., & Thompson, S. A. (2000). Concessive patterns in conversation. In E. Couper-
Kuhlen & B. Kortmann (Eds.), Cause, condition, concession, contrast: Cognitive and dis-
course perspectives (pp. 381–410). Berlin/New York: Mouton de Gruyter.
du Bois, J. W. (2007). The stance triangle. In R. Englebretson (Ed.), Stancetaking in discourse:
Subjectivity, evaluation, interaction (pp. 139–182). Amsterdam/Philadelphia: John Benjamins.
Downing, A. (2009). Surely as a marker of dominance and entitlement in the crime fiction of
P.D. James. Brno Studies in English, 35(2), 79–92.
Geeraerts, D. (2006). Prospects and problems of prototype theory. In D. Geeraerts (Ed.), Cognitive
linguistics: Basic readings (pp. 141–167). Berlin/New York: Mouton de Gruyter.
Grisot, C., & Moeschler, J. (2014). How do empirical methods interact with theoretical pragmat-
ics? The conceptual and procedural contents of the English Simple Past and its translation into
French. In J. Romero-Trillo (Ed.), Yearbook of Corpus Linguistics and Pragmatics 2014: New
empirical and theoretical paradigms (pp. 7–33). Dordrecht: Springer.
Haddington, P. (2007). Positioning and alignment as activities of stancetaking in news interviews.
In R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction
(pp. 283–317). Amsterdam/Philadelphia: John Benjamins.
Hoye, L. (1997). Adverbs and modality in English. Essex: Longman.
Huddleston, R. D., & Pullum, G. K. (2002). The Cambridge grammar of the English language.
Cambridge: Cambridge University Press.
Hunston, S. (2002). Corpora in applied linguistics. Cambridge: Cambridge University Press.
Hyland, K. (1994). Hedging in academic writing and EAP textbooks. English for Specific Purposes,
13, 239–256.
Hyland, K. (2005). Stance and engagement: A model of interaction in academic discourse.
Discourse Studies, 7(2), 173–192.
Kärkkäinen, E. (2007). The role of ‘I guess’ in conversational stancetaking. In R. Englebretson
(Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction (pp. 183–219).
Amsterdam/Philadelphia: John Benjamins.
Kärkkäinen, E. (2012). On digressing with a stance and not seeking a recipient response. In
E. Kärkkäinen & John du Bois (Eds.), Stance, affect, and intersubjectivity in interaction:
Sequential and dialogic perspectives. Special issue of Text and Talk, 32(4), 477–502.
Keisanen, T. (2006). Patterns of stance taking: Negative yes/no interrogatives and tag questions in
American English conversation. Acta Universitatis Ouluensis, B71. Oulu: Oulu University
Press. http://urn.fi/urn:isbn:9514280393. Accessed 14 Jan 2014.
Keisanen, T. (2007). Stancetaking as an interactional activity: Challenging the prior speaker. In
R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction
(pp. 253–281). Amsterdam/Philadelphia: John Benjamins.
McEnery, A., & Xiao, Z. (2007). Parallel and comparable corpora: What are they up to? In
G. Anderman & M. Rogers (Eds.), Incorporating corpora: Translation and the linguist.
Translating Europe (pp. 18–31). Clevedon: Multilingual Matters.
Niemelä, M. (2011). Resonance in storytelling: Verbal, prosodic and embodied practices of stance
taking. Acta Universitatis Ouluensis, B95. Oulu: Oulu University Press. http://urn.fi/
urn:isbn:9789514294174. Accessed 14 Jan 2014.
Precht, K. (2003). Stance moods in spoken English: Evidentiality and affect in British and
American conversation. Text, 23(2), 239–257.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the
English language. London: Longman.
Rauniomaa, M. (2007). Stance markers in spoken Finnish: minun mielestä and minusta in assessments.
In R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction
(pp. 221–252). Amsterdam/Philadelphia: John Benjamins.
Rauniomaa, M. (2008). Recovery through repetition. Returning to prior talk and taking a stance in
American-English and Finnish conversations. Acta Universitatis Ouluensis, B85. Oulu: Oulu
University Press. http://urn.fi/urn:isbn:9789514289248. Accessed 14 Jan 2014.
Salager-Meyer, F. (1995). I think that perhaps you should: A study of hedges in written scientific
discourse. Journal of TESOL France, 2, 127–143.
Salmi-Tolonen, T. (2005). Persuasion in judicial argumentation: The opinions of the Advocates
General at the European Court of Justice. In H. Halmari & T. Virtanen (Eds.), Persuasion
across genres. A linguistic approach (pp. 59–101). Amsterdam/Philadelphia: John Benjamins.
Simon-Vandenbergen, A.-M. (1992). The interactional utility of of course in spoken discourse.
Occasional Papers in Systemic Linguistics, 6, 213–226.
Simon-Vandenbergen, A.-M., & Aijmer, K. (2007). The semantic field of modal certainty. A
corpus-based study of English adverbs. Berlin/New York: Mouton de Gruyter.
Simon-Vandenbergen, A.-M., White, P., & Aijmer, K. (2007). Presupposition and ‘taking-for-
granted’ in mass communicated political argument. An illustration from British, Flemish and
Swedish political colloquy. In A. Fetzer & G. E. Lauerbach (Eds.), Political discourse in the
media (pp. 31–74). Amsterdam/Philadelphia: John Benjamins.
Szczyrbak, M. (2014). Of course, indeed or clearly? The interactional potential of modal adverbs
in legal genres. SKASE Journal of Theoretical Linguistics, 11(2), 90–102.
http://www.skase.sk/Volumes/JTL26/pdf_doc/05.pdf. Accessed 20 Jan 2015.
Traugott, E. C., & Dasher, R. B. (2002). Regularity in semantic change. Cambridge: Cambridge
University Press.
Traugott, E. C. (2010). Dialogic contexts as motivations for syntactic change. In R. A. Cloutier,
A. M. Hamilton-Brehm, & W. A. Kretzschmar (Eds.), Variation and change in English gram-
mar and lexicon (pp. 11–27). Berlin/New York: Mouton de Gruyter.
Tseronis, A. (2009). Qualifying standpoints. Stance adverbs as a presentational device for manag-
ing the burden of proof. Utrecht: LOT.
White, P. (2003). Beyond modality and hedging: A dialogic view of the language of intersubjective
stance. Text, 23(2), 259–284.
Primary Sources
Opinions of Advocates General at the European Court of Justice. Downloaded from:
http://curia.europa.eu/jcms/jcms/j_6/. Accessed 10 Jan 2014.
Part II
Contrastive Analysis with Comparable
Corpora
Adverbial Clauses in English and Norwegian
Fiction and News
Hilde Hasselgård
Abstract This paper considers the placement of adverbial clauses in English and
Norwegian with regard to their form, meaning, information status and semantic
relation to the matrix clause proposition. The study is based on comparable original
texts in both languages, representing two registers: fiction and news reportage. End
position of adverbial clauses is most common in both languages, with initial posi-
tion as an alternative in many cases. Positional freedom is found to differ greatly
between finite and non-finite clauses, and also across different semantic types of
adverbial clauses. For those types of adverbial clauses that vary across positions,
mostly time and contingency clauses, information status (new vs. anchored) is
found to have some influence. Iconic order was found to be less important, but was
more noticeable in fiction than in news. The placement of adverbial clauses seems
to be guided by similar principles in both languages. Register differences are identi-
fied in both languages, but they do not show consistent patterns.
1 Introduction
H. Hasselgård (*)
Department of Literature, Area Studies and European Languages, University of Oslo,
Oslo, Norway
e-mail: hilde.hasselgard@ilos.uio.no
Adverbial clauses are defined by Hetterle (2015: 2) as “clausal entities that modify,
in a very general sense, a verb phrase or a main clause and explicitly expresses a
conceptual-semantic concept such as simultaneity, anteriority, posteriority,
causality or conditionality”. In more traditional terms they are subordinate
clauses, finite and non-finite, which have the function of (adjunct) adverbial in a
matrix clause (e.g. Biber et al. 1999: 194). Finite adverbial clauses in both English
and Norwegian are typically marked by “a subordinator indicating the relationship to
the main clause” (ibid.; see also Faarlund et al. 1997: 800). English non-finite adverbial
clauses include infinitive clauses, participle clauses (-ing and -ed) and verbless
clauses, as well as a category of ‘prepositional clauses’, i.e. clauses governed by a
preposition.1 Norwegian non-finite clauses in the present material are invariably
1
The reason for regarding such constructions as clauses rather than phrases is that they invariably
contain a proposition and are also clause-like in their positional preferences; see Hasselgård (2010:
37).
(2) Til tross for at han hadde drukket konjakk, ønsket han å være usynlig. (OEL1)
Lit: “In spite of that he had drunk cognac, wished he to be invisible”
In spite of the cognac he wanted to be invisible. (OEL1T)
The adverbial clauses are furthermore classified semantically into the following cat-
egories: time, space, manner, contingency, respect, and comparison; see Hasselgård
(2010: 39). Contingency clauses comprise adjuncts of condition, concession, cause
and purpose (ibid.); see examples (6)–(11) in Sect. 5.3.
Adverbial positions are classified as in Biber et al. (1999: 771) and Hasselgård
(2010: 41 ff), into initial, medial and end position. Initial is the position before the
matrix clause, as in (2) and (3); medial position is after the subject, but before any
postverbal obligatory element of the matrix clause, as in (4); and end position is
after the matrix clause, as in (1) and (5). The same positions are identified for both
languages. For definitions concerning information structure and text strategy, see
Sects. 6 and 7.
(3) Unless something’s done about her she’ll end up like her mother.
(ICE-GB: W2F)
(4) A 19th-century ornithologist, Robert Gray, when visiting the island in the
1860s, described an occasion on Ailsa Craig when he disturbed the puffin
population. (ICE-GB: W2C)
(5) Josh nodded before straightening up away from the gate. (ICE-GB: W2F)
2
Examples (1) and (2) come from the English-Norwegian Parallel Corpus (ENPC). In ENPC
examples the original is given first. Norwegian examples are followed by a word-for-word transla-
tion, while the published (idiomatic) translation is followed by a tag ending in -T.
3 Previous Work on Adverbial Clauses in English and Norwegian
(see Diessel 2005: 454). Diessel argues that the placement of adverbial clauses is
governed by three competing forces: processing (which favours final placement
(ibid.: 459)), semantics (acknowledging that “different semantic types of adverbial
clauses differ in their distribution”, (ibid.: 465)), and discourse-pragmatic factors,
including information structure and iconicity, which can often explain the choice of
initial position (ibid.). In a follow-up study, Diessel (2008) looks specifically at how
the placement of temporal clauses introduced by when, after, before, once, and until
may be determined by “iconicity of sequence” (passim). Iconicity is found to have
a “strong and consistent effect on the linear structuring of complex sentences with
temporal adverbial clauses” (2008: 483), but this factor is more clearly visible with
initial than with final adverbial clauses. The placement of an adverbial clause is also
found to be influenced by its length relative to the main clause as well as by the
conjunction introducing it (ibid.: 484).
Thompson et al. (2007: 271 ff) discuss initial adverbial clauses as a means of
cohesion both within and across paragraphs. In both cases an initial adverbial clause
is cohesive by means of back-reference to the previous sentence or paragraph.
However, initial adverbial clauses are also said to be “bidirectional, linking what has
gone before to what is to come” (2007: 296). Conversely, the information encoded
in a postposed adverbial clause “may be significant, closely parallel to that encoded
in clauses in coordination”, and an adverbial clause in end position may even “con-
vey globally crucial information and mark a turning point or peak” (ibid.).
Hasselgård (2010) studies adjunct adverbials in general, and makes particular
note of adjuncts realized by clauses. 74% of adverbial clauses are found in end posi-
tion, 24% in initial position, and 2% in medial position (2010: 87). The semantic
type to occur most frequently in initial position is contingency (ibid.), followed at a
distance by time. However, the same two categories are also the most common ones
in end position (ibid.: 136), reflecting that time and contingency are the most com-
mon meanings conveyed by adverbial clauses overall. It is suggested that
adverbial clauses are placed initially if they do one or more of the following discourse jobs:
(i) provide a setting / frame of reference for the following clause(s); (ii) provide a relevant
and/or necessary restriction on the validity of the matrix clause proposition; (iii) provide a
link to the preceding discourse by means of given information or cohesive devices (2010:
91).
duced by fordi (‘because’) are said to occur in initial position if they convey presup-
posed information and in end position if they convey new information (1997: 1036).
Purpose clauses introduced by slik at (‘so that’) are typically in end position while
those introduced by for at (‘for that’) can vary between the positions (ibid.: 1040 f).
No positional tendency is noted for conditional clauses in general, but it is claimed
that conditionals marked by inversion rather than a subjunction are always initial
(ibid.: 1046).
Fossestøl (1980: 280 ff) discusses the relationship between the temporal sequence
of events and the linear sequence of clauses, noting that adverbial clauses with fordi
(‘because’) tend to be sentence final, thus reversing the temporal sequence of the
cause and consequence. However, he does not offer a detailed study of adverbial
clause placement, but simply puts forward some principles of text organization.
Meier (2001) is a contrastive study of causal subordination in English and
Norwegian based on the English-Norwegian Parallel Corpus (ENPC). Meier found
that clauses introduced by because and its closest Norwegian counterpart fordi are
typically found in end position while clauses introduced by other causal subordina-
tors (English as, since and Norwegian siden, ettersom) are more likely to occur in
initial position. This is linked to the information typically conveyed by such clauses
as well as the range of pragmatic functions typically served.
Hasselgård (2014a) investigates the discourse functions of initial adjunct adver-
bials in English and Norwegian, based on the same material as the present study
(see Sect. 4). Initial adjuncts are found to be more frequent in Norwegian than in
English, partly as a consequence of a generally higher frequency of adjuncts. Initial
placement of adjuncts seems to be less marked in Norwegian, and initial adjuncts
are commonly used for discourse linking.
Hasselgård (2014b) studies conditional clauses in English and Norwegian on the
basis of the non-fiction part of the ENPC. Conditionals are most frequently found in
initial position in both languages, but in original texts, end position is more common
in Norwegian than in English. This is linked to the division of conditionals into
open, hypothetical and pragmatic (p. 192 f.): in particular, open conditionals are
more frequently sentence-final in Norwegian than in English. The similarity between
the languages is, however, extensive enough for the position of the conditional
clause to be changed very rarely in translation between the languages (p. 198).
Diessel (2001: 433 f), in a typologically oriented study, argues that the placement
of adverbial clauses in languages that use both initial and final position varies with
the meaning and function of the clauses, and to some extent with the choice of sub-
ordinator. Hetterle (2015: 121–127) makes similar observations on the positions of
adverbial clauses in a number of languages (not including Norwegian or other
Scandinavian languages).
As will have been noted, all the studies point to variation in adverbial clause
placement according to the semantic type of adverbial clause, information structure,
and discourse coherence. While English adverbial clauses have been extensively
studied and fairly well described, the contribution of the present study will be the
language comparison and the results for Norwegian.
Based on the previous studies, the following findings can be expected for the
present one:
• Initial placement of adverbial clauses will be more frequent in Norwegian than
in English, partly as a consequence of an overall higher frequency of adverbial
clauses, and partly because of different positional preferences between the lan-
guages (Hasselgård 2014a, b).
• News will use initial position more often than fiction (Hasselgård 2014a).
• Different syntactic types of clauses will have different positional preferences
(Diessel 2005; Hasselgård 2010). In particular, non-finite clauses will have less
freedom of position. The preferences may vary between languages and
registers.
• Different semantic types of clauses will have different positional preferences
(Diessel 2005, 2008; Hasselgård 2010). For example, conditional and causal
clauses will prefer initial and end position, respectively (Hasselgård 2014b;
Altenberg 1987). The preferences may vary between languages and registers.
• Adverbial clauses containing given information are more likely to be sentence-
initial; those containing new information are more likely to be sentence-final
(Ford and Thompson 1986; Diessel 2005; Hasselgård 2010).
• Experiential iconicism/iconic order (Enkvist 1981; Diessel 2008) is likely to
influence the order of subordinate and matrix clause with the possible exception
of causal clauses introduced by because/fordi; cf. Fossestøl (1980), Altenberg
(1987), and Meier (2001).
The English material has been culled from the British component of the International
Corpus of English (ICE-GB), and is a subset of the material used for the study of
adjunct adverbials in Hasselgård (2010). The Norwegian fiction texts come from the
English-Norwegian Parallel Corpus (ENPC), while the Norwegian news texts are a
collection of news articles sampled from various online newspapers in March 2011;
see Table 1 and the list provided in the references section for details. The adverbial
clauses were extracted and analysed manually. A subset was used for the case stud-
ies of information structure and experiential iconicism (Sects. 6 and 7). Table 1 also
shows the frequency of adverbial clauses per 10,000 words, which gives an
indication that in English such clauses are more frequent in fiction, whereas in
Norwegian they are more frequent in news.
[Figure 1: bar chart with two series, mean sentence length and adverbial clauses per 100 sentences, for English fiction, English news, Norwegian fiction and Norwegian news]
Fig. 1 Mean sentence length and frequency of adverbial clauses per 100 sentences across languages and registers
However, as occurrence per number of words is not an ideal measure for the fre-
quency of adverbial clauses, the number of adverbial clauses per 100 orthographic
sentences was also calculated. The number of sentences in the Norwegian material
was calculated with WordSmith Tools (Scott 2014), while for the ICE-GB texts the
number of ‘text units’ given for each corpus text was used. The mean sentence length
is practically identical between English and Norwegian, but the registers differ in both
languages, with sentences being almost twice as long in news as in fiction (see Fig. 1).
This indicates that sentence complexity is greater in news, which correlates with a
markedly higher frequency of adverbial clauses per 100 sentences in news than in fic-
tion, as shown by Fig. 1. Frequencies per 100 sentences highlight similarities between
the languages and differences between the registers, and thus give a different picture
than the calculation per 10,000 words reported in Table 1: in terms of frequency per
sentence Norwegian fiction has fewer adverbial clauses than English fiction, while
Norwegian news has more than English news. It should be noted, however, that the
opportunity of occurrence for adverbials is not the sentence, but the clause; thus fre-
quency per sentence is not an ideal measure either. The quantitative findings of this
study will therefore mainly be given in terms of raw frequencies or proportional dis-
tribution of adverbial clauses across positions within each subcorpus.
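The contrast between the two normalisations can be illustrated with a short Python sketch. The counts below are invented for illustration and are not the figures of Table 1; the function names are likewise hypothetical.

```python
def per_10k_words(clauses: int, words: int) -> float:
    """Adverbial clauses per 10,000 running words."""
    return 10_000 * clauses / words


def per_100_sentences(clauses: int, sentences: int) -> float:
    """Adverbial clauses per 100 orthographic sentences."""
    return 100 * clauses / sentences


# Invented subcorpora (NOT the values of Table 1): equal size in words,
# but the sentences of subcorpus B are twice as long as those of A.
corpus_a = {"clauses": 150, "words": 10_000, "sentences": 1_000}
corpus_b = {"clauses": 120, "words": 10_000, "sentences": 500}

for name, c in (("A", corpus_a), ("B", corpus_b)):
    print(
        name,
        per_10k_words(c["clauses"], c["words"]),
        per_100_sentences(c["clauses"], c["sentences"]),
    )
```

With these invented counts, B has fewer adverbial clauses than A per 10,000 words but more per 100 sentences, which is the kind of reversal that longer sentences can produce.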
The hypothesis that Norwegian will use initial position more often than English is
at best only partially confirmed, as shown in Table 2: initial position is proportion-
ally more frequent in Norwegian fiction than in English fiction, but for news, the
opposite is the case. However, Fisher’s exact test shows that the cross-linguistic
difference is not statistically significant for either register.3
Table 2 Frequency of positions of adverbial clauses in English and Norwegian fiction and news
            E fiction      E news         N fiction      N news
            N      %       N      %       N      %       N      %
Initial     27     18.5    27     22.9    57     24.8    33     21.2
Medial      1      0.7     2      1.7     2      0.9     0      0
End         118    80.8    89     75.4    171    74.3    123    78.9
Total       146    100     118    100     230    100     156    100
E/I ratio   5.4            4.4            4.0            3.7
[Figure 2: bar chart with two series, initial and end, for E fiction, E news, N fiction and N news]
Fig. 2 The percentage of sentences in each subcorpus that contain an initial or final adverbial
clause
Table 2 may indicate that
the hypothesis of a (proportionally) more frequent use of initial position in news
than in fiction is correct for English, but not for Norwegian, though the apparently
different distribution of initial vs. end position between fiction and news is found
not to be statistically significant in either language.
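The cross-linguistic comparison can be sketched in Python as follows. This is a minimal standard-library implementation of a two-sided Fisher's exact test, applied to the initial and end counts of Table 2 (medial position is left out, as in the calculation reported here). The function name and the two-sided convention (summing all tables no more probable than the observed one) are assumptions of this sketch; the software actually used for the study is not stated.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / n_tables

    p_observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_observed * (1 + 1e-9))
    return min(1.0, p)


# (initial, end) counts taken from Table 2
TABLE_2 = {
    "E fiction": (27, 118),
    "E news": (27, 89),
    "N fiction": (57, 171),
    "N news": (33, 123),
}

p_fiction = fisher_exact_two_sided(*TABLE_2["E fiction"], *TABLE_2["N fiction"])
p_news = fisher_exact_two_sided(*TABLE_2["E news"], *TABLE_2["N news"])
print(f"fiction: p = {p_fiction:.4f}; news: p = {p_news:.4f}")
```

Both p-values should come out above the conventional 0.05 threshold, in line with the non-significant result reported for the two registers.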
Figure 2 gives a different perspective on the frequencies, which alters the picture
to some degree. The figure shows the percentage of sentences in each subcorpus (cf.
Fig. 1) that contain an adverbial clause in initial and end position, respectively.
From this perspective, initial adverbial clauses are more frequent in news in both
languages, but so, it must be noted, are adverbial clauses in end position. For the
present, I will not pursue the calculations per sentence any further.
The findings presented here are inconclusive with regard to the hypotheses pre-
sented above. There is a higher frequency of initial adverbial clauses in news than
in fiction in both languages, but as this is matched by a higher frequency of final
clauses, the percentage of clauses in initial position is greater in news than in fiction
only in English. Contrary to expectation, Table 1 and Figs. 1 and 2 show adverbial
clauses to be less frequent in Norwegian than in English fiction. However, none of
the frequency differences observed between languages and registers have proved to
be statistically significant.
3
The calculation took only initial and end position into account.
It was predicted that non-finite clauses would have less positional freedom than
finite ones, and findings support this hypothesis. Figure 3 shows the percentage of
initial position for finite and non-finite clauses across languages and registers. Non-
finite clauses are consistently less frequent in initial position than finite ones across
languages and registers. The register difference is greater in Norwegian than in
English as regards non-finite clauses, but it is smaller for finite clauses.
The raw frequencies underlying Fig. 3 are shown in Table 3. The differences
between finite and non-finite clause placement are consistent across the material
and across different types of non-finite clauses. That is, the overwhelming prefer-
ence of non-finite clauses in both languages and both registers is for end position.
The differences in distribution between initial and end position are statistically sig-
nificant across the material, at p < 0.01 for English fiction and Norwegian news and
p < 0.0001 for English news and Norwegian fiction. Interestingly, prepositional
finites show the same tendency as non-finites: only one out of 14 such clauses in the
Norwegian material was found in initial position.
[Figure 3: bar chart with two series, finite and non-finite clauses, for English fiction, English news, Norwegian fiction and Norwegian news]
Fig. 3 The percentage of clauses occurring in initial position (in contrast to end position)
(6) Time: When he loses his temper with her she runs off (ICE-GB: W2F)
(7) Space: but he was better off where he was, keeping a low profile.
(ICE-GB: W2F)
(8) Manner: Hun så på klokken som om han skulle begynne med det samme.
(OEL1)
Lit: “She looked at the watch as if he should begin at once”
She looked at her watch as though he was going to begin right away.
(OEL1T)
(11) Respect: Han sier han ble kontaktet og advarte Andhøy mot å seile i
området ... (News: VG2)
Lit: “He says he was contacted and warned Andhøy against to sail
in the area…”
4
Examples from the Norwegian newspaper material are accompanied by a translation (produced
by the author) intended to show the structure of the original without being entirely literal.
Table 4 The positional distribution of semantic subtypes of adverbial clauses across languages
and registers (percentages)
              English fiction   English news      Norwegian fiction Norwegian news
              N = 145           N = 116           N = 227           N = 153
              Initial  End      Initial  End      Initial  End      Initial  End
Time          9.7      38.6     6.9      30.2     17.2     37.4     6.5      21.3
Space         0.7      1.4      0.9      0.9      0        0.4      0        0.6
Manner        0        4.8      0        5.2      0.4      11.5     0.6      7.7
Contingency   8.3      35.2     15.5     39.7     7.0      21.6     14.4     35.5
Respect       0        0        0        0.9      0        2.6      0        11.0
Comparison    0        1.4      0        0        0.4      1.3      0        2.6
Total         18.7     81.4     23.3     76.9     25       74.8     21.5     78.7
tion; in fact their proportion in Norwegian fiction is the highest in the material. In
news, contingency clauses are the most frequent type found in initial position in
both languages.
Most semantic types of adverbial clauses are rare or non-existent outside end
position in the present material. It is only with time and contingency clauses that
there seems to be a real choice between the positions – at least they are the only
categories that are frequent enough in both positions to allow a real comparison.
The focus of the next two sections will thus be on these two.
in the context one should look for given information, but in practice, given the win-
dow size of the context in the software used, the span was approximately ten sen-
tences (or s-units).
The typical pattern can be expected to be as in example (12), where the adverbial
clause in sentence (i) is anchored (marked as [A]) in the description of the farm
given in the previous context. The matrix clause is predominantly discourse-new
(marked as [N]), although ‘he’ refers anaphorically to ‘Prince Charles’. Note that
the initial anchored clause also gives a framework of interpretation for the rest of the
passage, by specifying the fundamental premise for the ensuing events. In sentence
(ii) the matrix contains references to both the farm and the sale implied in sentence
(i). The adverbial clauses are discourse-new (i.e. while it can be inferred that a farm
that has become available is for sale, it cannot be inferred that this will happen
‘without going on the market’). Sentence (iii) is much like (ii) in that the matrix
contains references to the preceding context while the sentence-final adverbial
clause has discourse-new information.
Example (13), on the other hand, is text-initial, so both the matrix and the adverbial
clause contain new information. However, it is the information in the adverbial
clause that is developed in the immediately following context, which makes end
position a natural choice.
5
Note that the study of information structure is restricted to time and contingency clauses, which
are the only ones to vary between initial and end position.
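[Figure 4: anchored and new adverbial clauses in initial position; data labels]
              Anchored  New
Eng fiction   19        2
Eng news      21        4
Nor fiction   24        8
Nor news      26        6
[Figure 5: anchored and new adverbial clauses in end position; data labels]
              Anchored  New
Eng fiction   18        61
Eng news      21        57
Nor fiction   14        60
Nor news      20        62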
The numbers underlying Fig. 4 are small, and percentages may enlarge the dif-
ferences between languages and registers. However, the general trend is clear, and
the patterns in Fig. 4 support the main hypothesis about information structure: ini-
tial adverbial clauses are anchored in the majority of the cases, as illustrated by
sentence (i) in example (12). Anchored initial clauses mainly co-occur with either
discourse-new or anchored matrixes. Discourse-new initial adverbials, in contrast,
typically co-occur with discourse-new matrixes, e.g. in text-initial sentences. There
are more new initial clauses in Norwegian than in English, especially in fiction.
Figure 5 shows the distribution of anchored and new adverbial clauses in end
position, and gives an almost reverse picture of the patterns in initial position: the
information is discourse-new in 75–80% of the cases. Anchored adverbial clauses in
end position co-occur with anchored and new matrixes about equally often. There is
little difference between the languages. However, the registers differ: the proportion
of anchored adverbial clauses in end position is greater in news in both languages.
Information load thus seems to be a good predictor of adverbial clause placement.6
However, the apparently neat patterns involve a potential chicken-and-egg problem:
since syntactic subordination may signal downgrading of information, the fact that
a proposition contains anchored information may cause the writer to encode it as a
subordinate clause and place it in initial position.
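The association between information status and position can be made concrete with a short Python sketch. It assumes that the data labels of Figs. 4 and 5 are raw counts of anchored and new clauses, in that order; the helper names are hypothetical.

```python
# Data labels of Figs. 4 and 5, read as raw counts: (anchored, new)
INITIAL = {
    "Eng fiction": (19, 2),
    "Eng news": (21, 4),
    "Nor fiction": (24, 8),
    "Nor news": (26, 6),
}
END = {
    "Eng fiction": (18, 61),
    "Eng news": (21, 57),
    "Nor fiction": (14, 60),
    "Nor news": (20, 62),
}


def share_anchored(anchored: int, new: int) -> float:
    """Percentage of clauses carrying anchored information."""
    return 100 * anchored / (anchored + new)


def odds_ratio(initial: tuple, end: tuple) -> float:
    """Odds of being anchored in initial position relative to end position."""
    (a, b), (c, d) = initial, end
    return (a * d) / (b * c)


for name in INITIAL:
    print(
        f"{name}: initial {share_anchored(*INITIAL[name]):.1f}% anchored, "
        f"end {share_anchored(*END[name]):.1f}% anchored, "
        f"odds ratio {odds_ratio(INITIAL[name], END[name]):.1f}"
    )
```

An odds ratio well above 1 in every subcorpus means that anchored clauses are far more likely to be initial than new ones, which is the pattern described above.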
6
In fact, Fisher’s exact test shows it to be highly significant for the selection of position, at p <
0.0001 for all parts of the material.
In any case the investigation of information structure has shown that adverbial
clauses introducing new information are indeed more frequent in end position in
both languages, while those carrying information anchored in the preceding context
are more frequently initial. Similarly, clauses with information that is developed in
the following context are more likely to be final. However, the picture is not consis-
tent: anchored information can occur in end position – and there are more cases of
this than of new information in initial position.
(14) Da han kom fram til trappen, stanset han, tok av seg pelsluen og hanskene.
(LSC2)
When he got to the steps he stopped and took off his fur hat and gloves.
(LSC2T)
(15) Lente jeg meg langt nok ut og så den andre veien, kunne jeg få et glimt av
pissoaret nedenfor Fagerborg kirke. (LSC2)
If I leaned out far enough and looked the other way, I could get a glimpse
of the urinals down by Fagerborg Church. (LSC2T)
The principle of temporal iconicism may apply to clauses other than temporal ones
too, as illustrated by (15): the leaning out is not only a condition for seeing the uri-
nals, it also needs to be prior in time. Kortmann (1991: 137) discovered “marked
tendencies for adjuncts/absolutes expressing ‘time before’ or condition to precede
their matrix clause, and for those receiving a ‘time after’, result, purpose, contrast,
addition/accompanying circumstance or exemplification/specification interpreta-
tion to occur in final position”. In similar fashion one might expect conditions to
occur before consequences (Ford and Thompson 1986; Hasselgård 2014b), as in
(15), and cause to be mentioned before effect (although Altenberg (1987) and
Diessel (2008) have shown that this is not necessarily the case).
The analysis of iconicism was manual, based on close reading of each adverbial
clause in relation to its matrix clause. As in the study of information structure, only
time and contingency clauses were considered. Figure 6 shows the proportion of
clauses that reflect what will henceforth be referred to as ‘iconic order’. This order
is slightly more frequent in English than in Norwegian. As regards the register comparison,
the iconic order is slightly more frequent in fiction than in news in both
languages, but not significantly so.7
[Figure 6: iconic and reverse order; data labels]
                    Iconic  Reverse
English fiction     57      14
English news        56      26
Norwegian fiction   46      21
Norwegian news      49      31
Fig. 6 The iconic principle in the order of adverbial and matrix clauses
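The proportions behind this comparison can be recomputed with a few lines of Python, on the assumption that the data labels of Fig. 6 are raw counts of clauses in iconic and reverse order; the names used are illustrative.

```python
# Data labels of Fig. 6, read as raw counts: (iconic, reverse)
FIG_6 = {
    "English fiction": (57, 14),
    "English news": (56, 26),
    "Norwegian fiction": (46, 21),
    "Norwegian news": (49, 31),
}


def iconic_share(iconic: int, reverse: int) -> float:
    """Percentage of clauses whose position mirrors the order of events."""
    return 100 * iconic / (iconic + reverse)


shares = {name: iconic_share(*counts) for name, counts in FIG_6.items()}
for name, share in shares.items():
    print(f"{name}: {share:.1f}% iconic")
```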
Based on previous findings, e.g. Diessel (2008), it appears that a more fine-
grained semantic division of adverbial clauses is needed for a study of iconic order.
In particular, conditional and causal clauses should not be lumped together, as they
have very different positional patterns (Diessel 2008; Hasselgård 2014b). It is
important to note that iconic order works differently with different types of adver-
bial clauses. For time clauses, iconic order implies that the order of clauses mirrors
the temporal succession of events. Thus a temporal clause will precede its matrix if
it is about an event prior to the matrix event (and vice versa). For conditional clauses,
iconic order means that the protasis precedes the apodosis, i.e. the condition is men-
tioned first. For causal clauses iconic order implies that cause is mentioned before
effect. This might pull causal (because) clauses to initial position and defer purpose
and result clauses to end position.
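The definitions above can be summarised as a small decision function. The Python sketch below is only an illustration: the labels for clause types and temporal relations are invented here, and treating simultaneous time clauses and other clause types as falling outside the iconic/reverse distinction is an assumption of the sketch rather than a statement of the coding scheme.

```python
from typing import Optional


def is_iconic(
    clause_type: str,
    position: str,
    temporal_relation: Optional[str] = None,
) -> Optional[bool]:
    """True if clause order mirrors event order, False if it reverses it,
    None where the distinction is taken not to apply (an assumption)."""
    if position not in ("initial", "end"):
        raise ValueError("only initial and end position are considered")
    if clause_type == "time":
        if temporal_relation == "before_matrix":
            return position == "initial"  # earlier event mentioned first
        if temporal_relation == "after_matrix":
            return position == "end"  # later event mentioned last
        return None  # simultaneous events have no sequence to mirror
    if clause_type in ("condition", "cause"):
        return position == "initial"  # condition/cause before consequence
    if clause_type in ("purpose", "result"):
        return position == "end"  # outcome after the matrix situation
    return None


print(is_iconic("cause", "end"))  # a sentence-final because-clause
print(is_iconic("time", "initial", "before_matrix"))
```

On this reading, a sentence-final because-clause counts as reverse order, while a sentence-final purpose clause counts as iconic.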
Table 5 presents the placement of subcategories of contingency and time clauses.
Time clauses have been subdivided according to their temporal relationship with the
matrix clause; i.e. whether they refer to an event occurring before that of the matrix
clause (MC) after it, or simultaneously with it (cf. also Diessel 2008: 473). The
shaded cells mark iconic order; bold type marks the most frequent position.
As Table 5 shows, most time clauses occur in end position in both languages and
in both registers, whether they refer to an event that is prior to, simultaneous with,
or posterior to that of the matrix clause. There is thus no consistent reflection of
iconic order. However, a time adjunct that denotes an event prior to the one in the
matrix clause, as in (16), seems more likely to be initial than one that denotes an
event posterior to the matrix event. Nevertheless, end position is more common even for
adverbial clauses denoting prior events; example (17) illustrates this.8 For temporal
clauses denoting an event that follows the one in the matrix clause, initial position
is unlikely, albeit not impossible.
7
Significance according to Fisher’s exact test: English news vs. English fiction: p = 0.1006;
Norwegian news vs. Norwegian fiction: p = 0.3894; Norwegian fiction vs. English fiction: p =
0.1235; Norwegian news vs. English news: p = 0.4114.
8
Diessel (2008: 474) reports a slight majority of initial placement of “prior” temporal clauses, and
of the temporal clauses placed in initial position, a clear majority reflect iconic order. However, the
adverbial clauses in end position do not reflect iconicity to the same extent (ibid.: 475).
Table 5 Adverbial clause meanings and iconic order (marked by shaded cells). Raw frequencies
(16) Once that is achieved, he still faces the choice of whether to call a General
Election in June ... (ICE-GB:W2C)
(17) Men gamle kelner Olesen dukket opp da Helen kom inn i kafeen. (OEL1)
But the old waiter Olesen appeared when Helen came into the cafe. (OEL1T)
Conditional clauses are the only ones to consistently precede the matrix more often
than they follow it, although end position is only slightly less common. The same
tendencies can be observed in both languages and both registers. Many of the
clause-final conditionals occur in dependent matrix clauses (as in (19)), especially
in Norwegian.
Purpose and result clauses occur almost consistently in end position across the
material, in agreement with iconic order, as they convey a possible outcome of the
matrix clause situation. Note, however, that purpose clauses tend to be non-finite in
both languages, which is another strong reason why they should favour end posi-
tion, cf. Sect. 5.2. Examples are given in (20) and (21).
(20) Purpose: Fredsprisvinneren Muhammad Yunus går til retten for å påklage
avskjedigelsen fra Grameen Bank. (News: VL2)
Lit: “Peace prize winner M. Yunus goes to court to appeal against his
dismissal from Grameen Bank.”
136 H. Hasselgård
(21) Result: People would get full counselling before starting the process of
buying so that they were aware of the commitments of home ownership.
(ICE-GB: W2C)
Causal clauses occur predominantly in end position, thus violating iconic order.
However, this was expected on the basis of Altenberg’s (1987) and Diessel’s (2008)
findings as well as the predictions of Fossestøl (1980) and Faarlund et al. (1997).
The typical order is thus as shown in (22).
(22) Og det var blitt for sent fordi pengene egentlig aldri hadde interessert ham.
(OEL1)
And it had been too late because the money had never really interested him.
(OEL1T)
The present investigation has reaffirmed the fact that register is a factor that cannot
be ignored in studies of grammar and discourse organization. While this is becom-
ing an established truth in usage-based studies of English, it has as yet not been
visible in studies of Norwegian. Furthermore, the frequency information about
Norwegian adverbial clause placement has given a more accurate and nuanced pic-
ture of language use in this area than what has emerged from previous
descriptions.
The cross-linguistic comparison has shown that English and Norwegian are alike
in placing adverbial clauses predominantly in initial and end position while medial
position is rare. End position is the more common choice in both languages and in
both registers investigated. The first hypothesis presented in Sect. 4 was that
Norwegian would use initial position more often than English. The material showed
no consistent pattern: there was a greater proportion of adverbial clauses in initial
position in Norwegian fiction than in English fiction, but the other way round in the
news register. Thus, the register comparison also turned out to have conflicting
results: news has a greater proportion than fiction of its adverbial clauses in initial
position only in English. The hypothesis of news making more extensive use than
fiction of initial position for adverbial clauses was thus true only of contingency
clauses, not of temporal ones. No other semantic types were frequent enough to
show reliable patterns of variation between initial and end position.
It was clear that the syntactic type of an adverbial clause influences its position
in both languages: non-finite clauses occur less freely in initial position. Prepositional
finites (occurring in Norwegian only) follow the same positional tendencies as their
non-finite counterparts. Different semantic categories also have their own positional
preferences in both languages. The preferences are rather similar across languages
and registers. Contingency clauses are slightly more frequent in initial position in
news, and time clauses in fiction.
The study of information structure and iconic order concerned only time and
contingency clauses, as these were the only ones to be frequent enough in both ini-
tial and end position to study positional variation. The results show that adverbial
clauses containing anchored information are more likely to be sentence-initial, and
those with new information are more likely to be sentence-final. Initial clauses with
new information are likely to co-occur with new matrix clauses.
The principle of iconic order would predict that causes and conditions are men-
tioned before consequences and that temporal clauses are placed such that the order
of adverbial and matrix clause reflects the temporal succession of events. There was,
however, no clear evidence in the material that iconicism was vital to adverbial
clause placement, except possibly with regard to condition and purpose clauses,
which showed definite preferences for initial and end position, respectively. It is,
however, likely that the positional preferences of semantic categories are more
important than iconic order, since other semantic categories do not seem much
affected by iconicism.
The best predictors of adverbial clause placement thus seem to be finiteness and
semantic category. Among finite time and contingency clauses, information value is
also a good predictor of position. There were surprisingly few cross-linguistic dif-
ferences apart from frequency: Norwegian and English adverbial clauses seem to be
placed according to the same semantic and discourse-pragmatic principles.
The register comparison revealed the following tendencies: the frequencies of
adverbial clauses in both positions differed between registers but in opposite direc-
tions in English and Norwegian. Iconic order was slightly more frequent in fiction
in both languages. Anchored clauses were most common in initial position across
the material, but initial discourse-new clauses were more frequent in fiction than in
news in Norwegian, but more frequent in news than in fiction in English. Discourse-
new clauses were most common in end position in all the subcorpora but surpris-
ingly there was a slightly higher percentage of final anchored clauses in news (in
both languages).
The relatively inconclusive results, mainly due to the small size of the material,
call for further research into the positional variation of adverbial clauses across
languages and registers. Any further analysis of information structure and iconic
order would benefit from a larger sample as well as additional registers and a broader
text distribution.
References
Diessel, H. (2005). Competing motivations for the ordering of main and adverbial clauses.
Linguistics, 43(3), 449–470.
Diessel, H. (2008). Iconicity of sequence. A corpus-based analysis of the positioning of temporal
adverbial clauses in English. Cognitive Linguistics, 19, 457–482.
Enkvist, N. E. (1981). Experiential iconicism in text strategy. Text, 1(1), 97–111.
Faarlund, J. T., Lie, S., & Vannebo, K. I. (1997). Norsk referansegrammatikk. Oslo:
Universitetsforlaget.
Ford, C. E. (1993). Grammar in interaction. Adverbial clauses in American English conversations.
Cambridge: Cambridge University Press.
Ford, C. E., & Thompson, S. A. (1986). Conditionals in discourse: A text-based study from
English. In E. C. Traugott, A. ter Meulen, J. S. Reilly, & C. A. Ferguson (Eds.), On conditionals
(pp. 353–372). Cambridge University Press.
Fossestøl, B. (1980). Tekst og tekststruktur: veier og mål i tekstlingvistikken. Oslo:
Universitetsforlaget.
Hasselgård, H. (2010). Adjunct adverbials in English. Cambridge: Cambridge University Press.
Hasselgård, H. (2014a). Discourse-structuring functions of initial adverbials in English and
Norwegian news and fiction. In Lefer, M.-A. & S. Vogeleer (Eds.), Genre- and register-related
discourse features in contrast, Special issue of Languages in Contrast, 14(1), 73–92.
Hasselgård, H. (2014b). Conditional clauses in English and Norwegian. In H. P. Helland & C. M.
Salvesen (Eds.), Affaire(s) de grammaire (pp. 183–200). Oslo: Novus.
Hetterle, K. (2015). Adverbial clauses in cross-linguistic perspective. Berlin/Boston: de Gruyter
Mouton.
Hwang, S. J. J. (1994). Relative clauses, adverbial clauses, and information flow in discourse.
Language Research, 30(4), 673–705.
Kortmann, B. (1991). Free adjuncts and absolutes in English: Problems of control and interpreta-
tion. London/New York: Routledge.
Kreyer, R. (2007). Inversion in modern written English: syntactic complexity, information status
and the creative writer. In R. Facchinetti (Ed.), Corpus linguistics 25 years on (pp. 187–204).
Amsterdam: Rodopi.
Meier, E. (2001). “Since you mention it”: A contrastive study of causal subordination in English
and Norwegian. MA thesis, University of Oslo. www.hf.uio.no/ilos/forskning/prosjekter/sprik/
pdf/em/HovedoppgEinarMeier22.pdf
Prince, E. F. (1981). Toward a taxonomy of given–new information. In P. Cole (Ed.), Radical
pragmatics (pp. 223–255). New York: Academic Press.
Prince, E. F. (1992). The ZPG letter: Subjects, definiteness, and information-status. In W. C. Mann
& S. A. Thompson (Eds.), Discourse description: Diverse linguistic analyses of a fund-raising
text (pp. 295–326). Amsterdam: Benjamins.
Ramsay, V. (1987). The functional distribution of preposed and postposed IF and WHEN clauses
in written narrative. In R. Tomlin (Ed.), Coherence and grounding in discourse (pp. 383–408).
Amsterdam: Benjamins.
Scott, M. (2014). WordSmith Tools 6. Stroud: Lexical Analysis Software.
Thompson, S. A., Longacre, R. E., & Hwang, S. J. J. (2007). Adverbial clauses. In T. Shopen (Ed.),
Language typology and syntactic description. Volume II: Complex constructions (pp. 237–
300). Cambridge: Cambridge University Press.
Wiechmann, D., & Kerz, E. (2013). The positioning of concessive adverbial clauses in English:
Assessing the importance of discourse-pragmatic and processing-based constraints. English
Language and Linguistics, 17(1), 1–23. doi:10.1017/S1360674312000305.
Corpus Material
Diana Lewis
1 Introduction
D. Lewis (*)
Department of English and Lerma Research Centre, Aix Marseille University,
Aix-en-Provence, France
e-mail: diana.lewis@univ-amu.fr
claimed that French has a preference for a greater density of discourse marking
(e.g., Fetzer and Johansson 2010 on causation marking).
This paper examines discourse marking in the genre of political speeches, a
genre of written-to-be-spoken language that is broadly speaking persuasive in
intent. The study is based on a French-English comparable corpus of speeches.
The paper is organized as follows. Section 2 discusses the additive coherence
relation in the context of discourse coherence. Section 3 gives an overview of the
genre-specific comparable corpus on which the study is based – political speeches –
and describes the procedures. The findings on additive markers across the French
and English speeches are presented in Sect. 4. Section 5 focuses on the uses of two
additive markers that are commonly given as ‘dictionary equivalents’: French en
effet and English indeed. The implications of the findings are discussed in the con-
cluding Sect. 6.
2 Discourse Coherence, Information Structure and Additive Relations
Discourse coherence concerns the level at which the speaker, putting together her
discourse, needs to enable the hearer to build an ongoing representation where each
upcoming ‘idea’ – theme or proposition – finds its place. Information structure
refers here to thematic progression, in the sense of structuring given and new infor-
mation, as well as informational salience: means used by the speaker to foreground
or background ideas, creating an information contour for the discourse.
Both coherence relations and information structure may be encoded in some
linguistic device (such as prosodic pattern, lexical expression/construction or
syntactic structure/construction), or may be left implicit for the hearer/reader to
pragmatically infer. Some particular linguistic device may mark simultaneously a
coherence relation and an information structural relation. In fact, some approaches
to discourse tie the two together so that each coherence relation has an inherent
information contour or grounding relation. This is the case, for instance, of
Rhetorical Structure Theory (RST) (Mann and Thompson 1986). Others, such as
Relational Discourse Analysis (RDA) (Oberlander and Moore 2001), distinguish
‘semantic’ coherence relations from ‘functional’ information structure.
Coherence relations (also known as discourse relations or rhetorical relations)
include such notions as ‘contrast’, ‘concession’, ‘result’, ‘elaboration’, ‘exemplifi-
cation’, ‘addition’, ‘justification’ and so on. They refer to the various ways in which
the segments (or groups of segments) of a text or discourse fit into the rest of the text
or discourse; that is, how each part relates to the parts that precede and follow it, and
thus contributes to the overall meaning of the text.
These types of meaning can themselves be thought of as propositional. (In fact,
they are referred to by Mann and Thompson (1986) as ‘relational propositions’, an
area of meaning that is relatively grammaticalized into particles and adverbs, but
Coherence Relations and Information Structure in English and French Political Speeches 143
ply put, it is ‘more in the same vein’. (This use of ‘additive’ differs from that of
other authors such as Halliday (1994), for example.) The relation may be between
two states of affairs (‘content’ use) or between two speaker arguments
(‘presentational’ use); often both types of relation obtain between two ideas (cf. Hasselgård
2014: 72). A single occurrence of a discourse marker might therefore be interpreted
as encoding a state-of-affairs relation, an argumentational relation and an informa-
tion structural relation. In (1), for instance, What’s more can be interpreted as intro-
ducing an additional event and an additional speaker argument, as well as signalling
that the upcoming event/argument is more salient (rhetorically stronger for the
speaker) than the previous idea that it links to.
(1) if they had been cheating I would have known. What’s more, I would have
been the first to complain. [BNC CH7, newspaper]
The aim of the study is to compare the usages of additive coherence relation
markers by speakers of the political speech genre in the two languages and to iden-
tify potential discourse constructions built around an additive coherence relation.
Consonant relations in general are expected to be less marked (for example, by a
discourse marker) than dissonant relations. This is because ‘coherence’ in the lay
sense excludes incompatibility: the bare assertion of two apparently incompatible
ideas results in incoherence. Where a proposition may appear to the hearer to be
either at odds with what went before or irrelevant to it, some marker is called for to
at least acknowledge the counterexpectation. But where an idea follows on naturally
and unsurprisingly, it will usually be enough to use discourse continuity intonation,
a discourse continuity marker such as English ‘and’, or simple juxtaposition, for the
coherence to be understood. This can be seen from example (1), where the removal
of What’s more does not render the sequence incoherent. As Patterson and Kehler
point out, “the more difficult recovering the correct relation would be without a con-
nective, the more necessary it is to include one” (2013: 915). Additive markers are
therefore more optional than markers of other relation types.
This notion of uneven marking of relations is compatible too with the uniformity
of information density (UID) hypothesis, according to which predictability largely
explains variability in reduction. That is, the more predictable an upcoming item is,
the more likely it is to be reduced (phonetically, syntactically, discoursally) (Levy
and Jaeger 2007). Asr and Demberg (2012: 84) apply this hypothesis to discourse
marking and observe that easily inferable relations are on average marked more
ambiguously than relations which are less expected, in a fashion that arguably
reflects discourse-level information density smoothing.
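The notion of predictability behind the UID hypothesis is standardly quantified as surprisal, the negative log probability of an item in its context. The sketch below shows that calculation in Python. The function name and the probabilities are invented for illustration and are not drawn from Levy and Jaeger (2007) or Asr and Demberg (2012).

```python
from math import log2

def surprisal(probability):
    """Information content, in bits, of an item with the given contextual probability."""
    if not 0 < probability <= 1:
        raise ValueError("probability must be in the interval (0, 1]")
    return -log2(probability)

# Hypothetical probabilities of a coherence relation given the preceding context
expected_relation = 0.5      # e.g. an easily inferable, consonant continuation
unexpected_relation = 0.125  # e.g. a relation the hearer does not anticipate

print(surprisal(expected_relation))
print(surprisal(unexpected_relation))
```

On this view a less expected relation carries more information, so an explicit marker can be seen as spreading that information over more linguistic material, whereas a highly expected relation can be left unmarked or marked ambiguously.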
Starting from lists of potential additive markers in French and English, an overall
picture of marking was drawn up for the texts in the two languages. The lists of
markers were drawn up following consultation of a variety of sources: Danlos et al.
(2015), Roze (2009), the digital resource Dictionnaire des synonymes français,
Quirk et al. (1985) and Roget’s Thesaurus.
Discourse-connective and and et were excluded from the study as they typically
mark discourse continuity rather than addition, and often precede markers of other
coherence relations (cf. And yet, Et pourtant and so on). Donc and so were also
excluded for being still inherently causal, though both can arguably also mark dis-
coursal addition. The additive uses of the markers listed in Table 3 were counted.
Surprisingly, the frequency of besides, a fairly typical marker in English conver-
sation and other genres (cf. Hasselgård 2014), was zero. The most frequent 15
markers in each language are listed with their frequencies in Fig. 1.
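Marker frequencies from corpora of different sizes are usually compared as normalized rates; the discussion of Fig. 1 refers to occurrences per 100k words. A minimal Python sketch of that arithmetic follows. The helper names, the sample sentence and the corpus size are assumptions made for illustration. Note also that simple surface counting of this kind would overcount, since the study counted only the additive uses of each marker, which requires manual interpretation.

```python
import re

def count_markers(text, markers):
    """Count whole-word (or whole-phrase) occurrences of each marker, ignoring case."""
    lowered = text.lower()
    counts = {}
    for marker in markers:
        # Lookarounds keep 'also' from matching inside a longer word
        pattern = r"(?<!\w)" + re.escape(marker.lower()) + r"(?!\w)"
        counts[marker] = len(re.findall(pattern, lowered))
    return counts

def per_100k(raw_count, corpus_tokens):
    """Normalize a raw count to occurrences per 100,000 words."""
    return raw_count * 100_000 / corpus_tokens

# Hypothetical sample text and corpus size
sample = "Indeed, it is also true. It is indeed so, and d'ailleurs not new."
counts = count_markers(sample, ["indeed", "also", "d'ailleurs"])
print(counts)
print(per_100k(30, 200_000))
```

Normalizing in this way is what makes a threshold such as "10 per 100k words" comparable across the French and English subcorpora even if they differ in size.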
French speeches clearly contain more frequent and more varied additive marking
than the English ones. Aussi, ainsi, également, enfin, en effet, par ailleurs, d’ailleurs,
de même, en outre, [et] puis, d’autre part all occur at more than 10 per 100 k words,
in an additive function, across a range of speakers. The English speakers, by con-
trast, rely largely on juxtaposition and on also; the only other frequent markers
being too, indeed, and as well. English additive discourse markers such as in addi-
tion, moreover, similarly, thus, further[more], likewise, what is more, in fact, in the
same way, here again, besides, etc. are rare (<10 per 100 k words). The use of the
French additive markers can be viewed as helping to create, or as reflecting, a par-
ticular style of parallelism, using additive marking to pile up consonant propositions
and create a layered, cumulative case. Each layer of ideas seems to add equal weight
to the overall argumentation, but may be internally structured into more salient and
less salient points. The English speakers, by contrast, rely more on juxtaposition
and structural similarity to create argumentation that is less explicitly cumulative.
Hobbs (1985) discusses parallelism as follows:
Considerations of coherence in general allow us to string together arbitrarily many parallel
arguments. But it is a convention of argumentation for there to be just three, and those
ordered by increasing strength. In political rhetoric, one also hears sequences of parallel
statements, but for maximum effectiveness, they should be more than just the semantic
parallelisms characterized by the theory of coherence. They should also exhibit a high
degree of lexical and syntactic parallelism. (Hobbs 1985: 27)
[Figure 1: two horizontal bar charts, not reproduced here. French markers, in the order listed on the axis: effectivement, parallèlement, d'autre part, [et] puis, en outre, de même, d'ailleurs, ensuite, par ailleurs, surtout, en effet, enfin, également, ainsi, aussi. English markers, in the order listed on the axis: what is more, then, likewise, furthermore, thus, again, similarly, moreover, and of course, equally, in addition, as well, too, indeed, also. The frequency axis of each chart is labelled from 20 to 200.]
Fig. 1 Frequencies of the 15 most frequent additive markers in the French and English speeches
These devices are made quite explicit in the French speeches through both lexi-
cal and syntactic parallelism and the regular framing of arguments by discourse
markers. This kind of parallelism is exemplified in (2), which shows the coherence
markers (in bold, with the additives underlined) and the hierarchical structure
(indentation).
(3) It matters to Britain that EMU should succeed, even if we never join it.
The emergence of a euro-zone in the middle of our largest market, the
Single Market, will directly affect us in this country, whatever we do. We
want EMU to be solid, durable and stable because a euro-zone would
inevitably be our most important trading base. Already growth or recession
on the continent feeds quickly into the UK economy.
If a euro-zone failed, the disruptive effect on us would be enormous, even if
we were outside it.
[Clarke, 18/12/1996]
These speakers of French and English are using quite different rhetorical
templates.
Both en effet and indeed seem to be particularly typical of the genre of speeches.
And both, as sentence adverbials, are used overwhelmingly in the context of conso-
nant discourse relations. Both are anaphoric, dependent for interpretation on a
previous idea from a previous segment of discourse being accessible to the hearer.
They can both, therefore, be characterized as typical or ‘central’ additive markers.
Moreover, they are considered dictionary equivalents (e.g., Dictionnaire Le Robert
and Collins 2013). We shall see below, however, that although their functions over-
lap, they cannot be considered functional equivalents in the context of political
speeches.
To identify the probable functions of the discourse markers, the procedure was to
interpret, independently of the marker, the degree of coherence and the most plau-
sible type of relevance holding between the proposition in the host discourse seg-
ment and that in the previous discourse segment. This interpretation was then
compared to the interpretation with the marker.
5.1 Indeed
Instances of such adverbials found pre-verbally (no Aux) in other corpora were
rarely connective.)
The different functions of indeed apparent in the corpus correlate closely with
position in the host (Table 5). Indeed in both final and medial positions is a modal
adverb. Final position corpus occurrences, after an AdjP or AdvP host, are all exem-
plars of the construction <very Adj|Adv indeed>, in which indeed combines with
very to indicate ‘extremely’ (4).
(4) … the rationale for having such a power is clear and we shall want to look
at it very closely indeed. [Lloyd, 09/06/1997]
In medial position (5), indeed stresses the veracity of its host where there may have
been doubt (cf. really, truly, definitely); it can be said to be counterexpectational.
(5) … the indications are that conditional fees are indeed widening access to
justice. [Hoon, 23/09/1997]
In this position indeed may also combine with an adversative marker to form a con-
cessive construction <indeed p, [adversative DM] q> as in (6), or with if to form a
concessive-conditional construction (7). In these constructions it can also be
described as counterexpectational.
(6) Companies are indeed observing those rules, but not always in a way which
positively informs shareholders and employees, or responds to their
concerns. [Becket, 04/03/1998]
(7) Who do you think should run such a bidding system, if indeed you are
persuaded by its attractions? [Aitkin, 15/03/1995]
Initial occurrences, by contrast, are all discourse connective; and by virtue of this
position, indeed acts as a presentative. The hosts are not all full clauses, as exempli-
fied in (8).
(8) Hong Kong stands as a monument to what the human spirit can, indeed will,
achieve. [Rifkin, 12/02/1997]
In the great majority of cases (v. Table 5), the indeed host is a wider, stronger
claim than the preceding one. The examples in (9) are typical. In contexts such as
(9a), the relation is usually expressed in French with au contraire (v. Lewis 2005:
45–46).
(9) a. NATO has not collapsed. Indeed – the best test of success – countries are
queuing for membership. [Portillo, 05/12/1995]
b. The new government in Britain has a clear plan about how it intends to
shape British foreign policy, and indeed to shape the world in which Britain
lives. [Symons, 10/10/1997]
c. Hong Kong, as so often in its history, has defied the pessimistic smart
Alecs. Indeed it has defied the odds. [Major, 04/03/1996]
In a few instances, the indeed host largely repeats the previous idea (10), or provides
some detail or additional information about it that exemplifies, clarifies or justifies
it (11). These contexts are the closest to the French contexts in which en effet is
found.
(11) … trade has always been the backbone of Anglo-Tunisian relations. Indeed,
our first formal treaty in 1662 was about commerce. [Hanley, 09/01/1997]
beliefs. This rhetorical function depends on indeed being seen in the wider context
of a discourse construction, <p indeed q> where q is a wider or stronger claim than
p, set in a wider-still context of a thematic chunk of discourse.
German modal particles (see also Schoonjans 2012 for similarities between French
and German particles). Given this kind of data, it looks as though these high-
frequency discourse markers may be at a relatively advanced stage of grammatical-
ization (in the broad sense), and may be part of an emergent discourse-level
schematic construction in which there is a post-verb/auxiliary ‘slot’ for the ana-
phoric marker.
In the speeches corpus, the en effet host is a full clause in every case, unlike
indeed. En effet occurs mainly in declaratives, but also in interrogatives. The
speeches being monologues, the interrogatives are, of course, rhetorical questions.
When the en effet host (the segment to which it attaches) contains a speaker-attitude
predicate, there can be some ambiguity as to whether the marker has (pragmati-
cally) scope over the speaker attitude, over the following proposition, or both. The
position of the discourse marker, along with the context, suggests that in most cases
it is at least the speaker attitude and often both (12).
(12) L’action du gouvernement repose sur l’ouverture d’un débat public. J’ai
en effet la conviction que les solutions ne peuvent être imposées d’en haut
à la société. [Jospin, 25/08/1997]
‘The government’s actions depend on setting up a public debate. I am
en effet convinced that solutions cannot be imposed on society from above.’
often occurs with a less specific elaborative relation, especially a move from the
general (in the previous idea) to the particular (in the host idea). This typically
involves reiterating the thematic element of the idea and providing greater detail
(16) and (17).
In (13) the same idea is expressed in both clauses. The effect of the discourse
marker is to emphasize their equivalence; without it, ‘a lot being at stake’ might
come across as stronger than ‘particularly important’.
Example (14) illustrates the typical justification use, the en effet host being the jus-
tification for the speaker not going into detail.
(14) … le collectif prévoit une diminution voisine de 3,3 milliards d’euros par
rapport à la LFI, sur laquelle je ne m’étends pas: votre rapporteur général
a en effet décrit l’ensemble des évolutions prévues par ce collectif de
manière exhaustive dans son rapport écrit. [Mer, 29/07/2002]
‘… the revised budget involves a reduction of around 3.3bn euros from the
initial budget; I will not go into that in detail: your Rapporteur-general has
en effet described all the changes involved in the revision thoroughly in his
written report.’
In (15), evidence for the first assertion is presented in the second. At the same time,
the evidence provides a justification for making the first statement, so that evidence
and justification are closely linked.
(15) je sais qu’il n’est point nécessaire de vous convaincre que la recherche
universitaire doit aujourd’hui s’inscrire résolument dans un espace
européen. Votre colloque annuel qui s’est tenu voici 2 mois à Bordeaux
était en effet consacré pour une large part à la discussion de la
communication de la Commission intitulée “Vers un espace européen de la
recherche”. [Schwartzenberg, 18/05/2000]
‘I know you do not need to be convinced that university research today
must be firmly anchored in a European context. Your annual conference
held two months ago in Bordeaux was en effet largely devoted to
discussion of the Commission paper entitled “Towards a European research
area”’.
The en effet host in example (16) can be interpreted as elaborating in more detail on
what women point out; but also as explaining why new legislation is not the obvious
answer or justifying the speaker’s statement that it is not the obvious answer.
(17) a. Ce régime est plus sévère que celui de la loi de 1995. En effet, le seuil au
dessus duquel les condamnations à une peine d’emprisonnement avec
sursis simple ne sont pas amnistiées a été abaissé par rapport à la loi de
1995: il passe en effet de neuf mois à six mois. [Perben, 23/07/2002]
‘This regime is more severe than that of the 1995 act. En effet, the threshold
beyond which ‘simple’ suspended sentences cannot be amnestied has been
lowered from that of the 1995 act: it has gone en effet from nine months to
six months’
b. … vous vous inscrivez dans une de nos plus anciennes traditions. C’est
en effet au milieu du XVIIIe siècle … que les premiers prix du concours
furent discernés. [Darcos, 02/07/2002]
‘… you are joining one of our most ancient traditions. It was
en effet in the middle of the 18th century that the first competition prizes
were awarded.’
(18) J’ai évoqué tout à l’heure le paradoxe agricole de notre pays. Mais celui-ci
se double d’un paradoxe rural. D’un côté en effet, nous assistons à un
certain renouveau démographique de nos campagnes. Mais de l’autre, nos
compatriotes ruraux s’interrogent devant la méconnaissance par la France
urbaine de certaines spécificités de leur modes de vie …
[Gaymar, 04/07/2002]
‘I spoke just now about the agricultural paradox in our country. But there is
also a rural paradox. On one hand, en effet, we are witnessing a certain
demographic renewal in the countryside. But on the other hand, our rural
compatriots are concerned that urban France is ignorant of the
particularities of their way of life …’
In several cases what is striking is the way en effet occurs as part of a series of
discourse markers that together create a rhetorical frame for a chain of interlinked
ideas, as seen in Sect. 4, each with its anaphoric marker. In (19) the en effet host is
a simple repetition, after a parenthesis, of a previous proposition (‘This law will be
exemplary’ – ‘Our future law will be exemplary’). To maintain coherence, it needs
to be marked as old information, the function of en effet here.
(19) Je ne souhaite pas que cette disposition …puisse masquer le fait que la
France, par l’adoption de ce projet de loi, sera l’un des pays les mieux
armés pour lutter contre la corruption internationale.
Je me prononcerai donc en faveur de l’amendement [1] …
Enfin, j’approuve également l’amendement [2] …
Notre future loi sera ainsi exemplaire, et je tiens une fois encore à remercier
votre Commission et Monsieur Jacky DARNE, votre rapporteur, pour son
utile contribution à l’élaboration de ce dispositif législatif.
Cette loi sera en effet exemplaire, d’abord par son effet dissuasif …
Elle traduira ainsi le souci de la France de combattre sans relâche ce fléau
économique et social que constitue la corruption nationale et
internationale. [Guigou, 29/02/2000]
‘I do not want this provision to be able to conceal the fact that France, in
passing this bill, will be one of the countries best equipped to fight
international corruption.
I will donc vote in favour of amendment [1] …
Enfin, I approve également of amendment [2] …
Our future law will ainsi be exemplary, and I would like once again to
thank your Commission and Mr Jacky Darne, your rapporteur, for their
useful contribution to the drafting of this legislative package.
This law will en effet be exemplary, first of all due to its dissuasive effect …
It will ainsi answer France’s concern to fight relentlessly the economic and
social scourge of national and international corruption.’
For many occurrences, then, more than one relation plausibly holds between the
conjuncts; for others, there seems to be no relation other than continuity. We suggest
that the range of contexts in which en effet occurs in the political speeches genre
reflects its vagueness rather than polysemy. Across different context types, it implies
consonance and helps validate or in some way reaffirms the previous idea.
To summarize, en effet links its host segment to the previous segment, thereby
creating a two-segment discourse pattern. The en effet host expresses an idea that is
entirely consonant with the previous idea, which it reformulates or expands on with
a more particular, or, more rarely, a broader idea. There is a range of similar rela-
tions with which use of en effet is compatible, and its removal does not result in
incoherence. The frequency and contexts of en effet point to its being highly
bleached, and rather than consider that en effet is polysemous, it better fits these data
to characterize it as vague: we can hypothesize that these relations are contiguous in
conceptual space.
As mentioned above, en effet occurred in full clauses. In this genre, a theme is
typically introduced in general terms in one clause and then fleshed out or expanded
on in the next. Insofar as the en effet host provides the additional detail, it is infor-
mationally subordinate to the previous segment (a ‘nucleus-satellite’ relation typi-
cal of elaboration, in RST terms, or a ‘core-contributor’ relation in RDA terms).
Oberlander and Moore (2001) cite corpus studies showing that, in English at least,
a discourse marker is much less likely to be used when there is nucleus-satellite
(core-contributor) order, since this order is easy to process, and marking is superflu-
ous. All this suggests that there may be reasons other than coherence marking and/
or information structure marking for such frequent occurrence of en effet. And when
seen in wider rhetorical context, it appears that en effet forms part of a network of
markers providing thematic continuity and lending a particular rhetorical rhythm to
the discourse through parallelism.
Two discourse constructions for en effet can be identified in this genre: (i) <p En
effet q> and (ii) the more frequent <p q> where q is <Subj – V/aux – en effet –
Compl>. While the relation is the same for both (p is any proposition and q is pre-
sented as confirming or expanding on p), the information structure differs, reflecting
that of the higher-level constructions (i) <p DM q> and (ii) <p q> where q is <Subj –
V/aux – DM – Compl>. The regularity of the post-verb/auxiliary position, shared
with other very high-frequency connectives, suggests the second is the more
grammaticalized.
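The two positional patterns can be told apart mechanically in a first pass over a corpus. The following is a minimal Python sketch under stated assumptions, not the procedure used in this study: the names `en_effet_position` and `count_positions` are hypothetical, and the heuristic only separates clause-initial occurrences (construction i) from non-initial ones (a rough proxy for construction ii). It does not verify that a non-initial occurrence actually follows the verb or auxiliary, which would require part-of-speech tagging.

```python
import re
from collections import Counter
from typing import Iterable, Optional

# Hypothetical pattern for the two-word marker, matched case-insensitively.
EN_EFFET = re.compile(r"\ben effet\b", re.IGNORECASE)

# Characters that may precede a sentence-initial marker (spaces, quotes, brackets, dashes).
LEADING = " \t\n\"'«“‘(–-"


def en_effet_position(sentence: str) -> Optional[str]:
    """Return 'initial', 'medial', or None if the marker is absent."""
    match = EN_EFFET.search(sentence)
    if match is None:
        return None
    # Anything left before the marker once leading punctuation is removed
    # means the marker is not sentence-initial.
    prefix = sentence[: match.start()].strip(LEADING)
    return "initial" if prefix == "" else "medial"


def count_positions(sentences: Iterable[str]) -> Counter:
    """Tally marker positions over an iterable of sentences."""
    return Counter(
        position
        for position in map(en_effet_position, sentences)
        if position is not None
    )


if __name__ == "__main__":
    sample = [
        # Invented sentence illustrating construction (i).
        "En effet, cette loi sera exemplaire.",
        # Taken from example (19), illustrating construction (ii).
        "Cette loi sera en effet exemplaire, d'abord par son effet dissuasif.",
        # No marker present.
        "Notre future loi sera ainsi exemplaire.",
    ]
    print(count_positions(sample))
```

A tally of this kind gives only the relative frequency of the two slots; deciding which relation holds between p and q still requires manual annotation.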
5.3 Comparison
Both en effet and indeed are modal adverbs that retain some epistemic sense but
have now taken on discourse structuring functions too. Both are found overwhelm-
ingly in contexts of elaboration in this genre.
Halliday describes ‘elaboration’ as the relation in which “one clause elaborates on the meaning
of another by further specifying or describing it” (1994: 225). In paratactic elabora-
tion, the secondary (elaborating) clause may have one of three functions: (i) “to
restate the thesis of the primary clause in different words, to present it from another
point of view, or perhaps just to reinforce the message”, (ii) to develop the thesis of
the primary clause “by becoming more specific about it, often citing an actual exam-
ple” and (iii) to clarify the thesis of the primary clause, “backing it up with some
form of explanation or explanatory comment” (1994: 226). This sense of elabora-
tion comes close to matching the predominant political speech use of en effet, which
is found in all three contexts.
Connective indeed is used more narrowly, either to present a stronger version of
the same claim, or to make a further and stronger claim related to the first claim. Its
initial position and parenthetical syntax are typically presentative. There is thus a
significant difference in the information structuring functions of the two expres-
sions, en effet marking its host as old or given information (from a new aspect or in
more detail), while indeed introduces a new and more surprising claim
(counterexpectation).
A second difference, as we have seen, is that en effet appears to be more grammaticalized
than indeed, which ties in with its much greater frequency and its
bleached semantics, which allow it to occur in a wider range of contexts.
Finally, the markers should be seen in the context of the wider rhetorical patterns
of the genre. En effet contributes, along with other markers, to a pattern of parallel
ideas each explicitly linked to the previous discourse. The English speeches make
more use of juxtaposition, so that indeed does not function as part of a network of
markers.
6 Conclusion
It has been seen that the overall effect of the use of additive markers in French political
speeches is to create even-paced stretches of discourse where each segment forms a
link in a well-constructed chain of arguments and where the hierarchical structures
(the rhetorical dependencies) are transparent and conventional. One of these con-
ventions is the regular, almost rhythmic use of additive discourse markers such as
également, de même, en effet, ainsi, all occurring overwhelmingly in the same post-
verb/auxiliary position in the host, acting as the ‘hooks’ attaching each segment to
the previous discourse in a series of parallelisms. Metaphorically speaking, these
markers can be seen as pinning the content of the discourse to its rhetorical
backcloth.
The use of dedicated connectives – for coordination, subordination and discourse
connectivity – has been linked to literacy. Speakers conjoin fewer constituents than
writers. Non-literate languages rely heavily on juxtaposition and often lack gram-
maticalized coordination or acquire it through language contact (Mithun 1988). The
density of additive marking in the French speeches does convey a literary impres-
sion as well as a degree of formality that is less striking in the more conversational-
sounding English ones. This is no doubt a reflection of the greater distance between
literary and conversational French than is the case for English. Additive markers
combine with other coherence markers to form a network that knits the discourse
together into a tightly-structured whole. In the English speeches, by contrast, addi-
tive discourse relations are more often left implicit, and the resulting discourse is
more loosely woven.
As seen in Sect. 2, markers of consonant discourse relations are expected to be
relatively infrequent because discourse coherence can be established by juxtaposi-
tion within a logical ordering. These most frequent French markers, however, seem
to function in this genre as text-structuring devices marking information flow more
than as relational propositions. In political discourse, a regular filling of this French
discourse-marker ‘slot’ seems almost obligatory. The French markers are more fre-
quent, more bleached and arguably more grammaticalized than their English
counterparts.
Further research will need to situate discourse marking in this genre with respect
to other genres and discover to what extent the discourse constructions frequent in
political speeches are used across other genres, and how these constructions may be
evolving.
Note
1. Translation agencies regularly advise their clients that the ‘expansion rate’ in
translation from English to French is between 15 and 20% by word count. See,
for example, <http://www.kwintessential.co.uk/translation/articles/expansion-retraction.html>,
<http://translation-blog.trustedtranslations.com/prices-according-to-source-word-count-2010-02-25.html>,
<https://e2f.com/203/> and
<http://www.andiamo.co.uk/resources/expansion-and-contraction-factors>. Conversely,
translations from French to English are shorter by word count. Armstrong (2015: 193)
discusses “the high expansion rate usually seen in translation from English to
French”.
References
Aijmer, K. (2007). Modal adverbs as discourse markers. A bilingual approach to the study of
indeed. In J. Rehbein, C. Hohenstein, & L. Pietsch (Eds.), Connectivity in grammar and dis-
course (pp. 329–344). Amsterdam: John Benjamins.
Aijmer, K. (2008). The actuality adverbs ‘in fact’, ‘actually’, ‘really’ and ‘indeed’ – Establishing
similarities and differences. In M. Edwardes (Ed.), Proceedings of the BAAL conference 2007
(pp. 111–120). London: Scitsiugnil Press.
Armstrong, N. (2015). Culture and translation. In F. Sharifian (Ed.), The Routledge handbook of
language and culture (pp. 181–195). London: Routledge.
Asr, F. T., & Demberg, V. (2012). On the Information conveyed by discourse markers. Proceedings
of the Workshop on Cognitive Modeling and Computational Linguistics, Sofia, 08 August 2013,
pp. 84–93. Association for Computational Linguistics.
Bertin, A. (2002). L’émergence du connecteur en effet en moyen français. Linx, 46, 37–50.
Charolles, M., & Fagard, B. (2012). ‘En effet’ en français contemporain: de la confirmation à la
justification/explication. Le Français Moderne, 80(2), 171–197.
Chuquet, H., & Paillard, M. ([1987] 1989). Approche linguistique des problèmes de traduction
anglais<>français, revised edition. Paris: Ophrys.
Couper-Kuhlen, E., & Thompson, S. A. (2000). Concessive patterns in conversation. In E. Couper-
Kuhlen & B. Kortmann (Eds.), Cause, condition, concession, contrast: Cognitive and dis-
course perspectives (pp. 381–410). Berlin: Mouton de Gruyter.
Danlos, L., Colinet, M., & Steinlin, J. (2015). FDTB1, première étape du projet « French Discourse
Treebank » : repérage des connecteurs de discours en corpus. Discours, 17.
doi:10.4000/discours.9065. http://discours.revues.org/9065
Dictionnaire des synonymes français. Digital resource developed by the CNRS, University of
Lyon 1 and University of Caen. http://dico.isc.cnrs.fr/dico/fr
Dictionnaire Le Robert & Collins français-anglais et anglais-français. 2013. Editions Le Robert.
Fetzer, A., & Johansson, M. (2010). Cognitive verbs in context. In S. Marzo, K. Heylen, & G. de
Sutter (Eds.), Corpus studies in contrastive linguistics, special issue of the International
Journal of Corpus Linguistics, 15(2), 240–266.
Gallagher, J. D. (1995). L’effacement des connecteurs adversatifs et concessifs en français mod-
erne. In M. Ballard (Ed.), Relations discursives et traduction (pp. 201–220). Lille: Presses
Universitaires de Lille.
Guillemin-Flescher, J. (1981). Syntaxe comparée du français et de l’anglais. Problèmes de traduction.
Paris: Ophrys.
Halliday, M. A. K. (1994). An introduction to functional grammar. London: Edward Arnold.
Hasselgård, H. (2014). Additive conjunction across languages: ‘Dessuten’ and its correspondences
in English and French. In S. O. Ebeling, A. Grønn, K. Rå Hauge, & D. Santos (Eds.), Corpus-
based studies in contrastive linguistics. Oslo Studies in Language, 6(1), 69–89.
Hobbs, J. R. (1985). On the coherence and structure of discourse (Report No. CSLI-85-37, Center
for the Study of Language and Information). Stanford: Stanford University.
Holmes, J. (1994). Inferring language change from computer corpora. ICAME Journal, 18, 27–40.
Levy, R., & Jaeger, T. F. (2007). Speakers optimize information density through syntactic
reduction. In B. Schölkopf, J. Platt, & T. Hoffman (Eds.), Proceedings of the twentieth annual
conference on neural information processing systems (pp. 849–856). Cambridge, MA: MIT
Press.
Lewis, D. M. (2005). Mapping adversative coherence relations in English and French. In K. Aijmer,
H. Hasselgård, & S. Johansson (Eds.), Contrast in context. Special issue of Languages in
Contrast, 5(1), 33–48.
Mann, W. C., & Thompson, S. A. (1986). Relational propositions in discourse. Discourse
Processes, 9(1), 57–90.
Mason, I. (2001). Translator behaviour and language usage: Some constraints on contrastive stud-
ies. Hermes, 26, 65–80.
Mithun, M. (1988). The grammaticalization of coordination. In J. Haiman & S. A. Thompson
(Eds.), Clause combining in grammar and discourse (pp. 275–329). Amsterdam: John
Benjamins.
Murray, J. D. (1997). Connectives and narrative text: The role of continuity. Memory and Cognition,
25(2), 227–236.
Núñez Pertejo, P. (2008). The multifunctionality of ‘indeed’ in contemporary spoken and written
English. English Studies, 89, 716–736.
Oberlander, J., & Moore, J. D. (2001). Discourse cues: Further evidence for the core contributor
distinction. Cognitive Linguistics, 12(3), 317–332.
Patterson, G., & Kehler, A. (2013). Predicting the presence of discourse connectives. In Proceedings
of the conference on Empirical Methods in Natural Language Processing EMNLP, pp. 914–
923. Association for Computational Linguistics.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the
English language. London: Longman.
Roget’s Thesaurus. Online at http://www.roget.org/
Rossari, C. (2016). L’approbation dans un dialogue devient-elle une concession dans un mono-
logue ? Etude de ‘certes’, ‘en effet’, ‘effectivement’, ‘d’accord’, ‘OK’. In L. Sarda, D. Vigier,
& B. Combettes (Eds.), Connexion et indexation. Ces liens qui tissent le texte. Mélanges pour
Michel Charolles. Lyon: ENS Éditions. http://books.openedition.org/enseditions/6847.
Roze, C. (2009). Base lexicale des connecteurs discursifs du français. Master’s dissertation,
University of Paris Diderot.
Schoonjans, S. (2012). Topologie contrastive des particules de démodulation. Comparaison de
l’allemand et du français. Leuven Working Papers in Linguistics, 1, 62–76.
Schoonjans, S. (2014). Oui, il y a des particules de démodulation en français. CogniTextes, 11.
Simon-Vandenbergen, A.-M., & Aijmer, K. (2007). The semantic field of modal certainty. A
corpus-based study of English adverbs. Berlin: Mouton de Gruyter.
Taboada, M. (2009). Implicit and explicit coherence relations. In J. Renkema (Ed.), Discourse, of
course (pp. 125–138). Amsterdam: John Benjamins.
Traugott, E. C., & Dasher, R. (2002). Regularity in semantic change. Cambridge: Cambridge
University Press.
Part III
Contrastive Analysis Across Genres of
English
Callbacks in Stand-Up Comedy:
Constructing Cohesion at the Macro Level
Within a Specific Genre
Catherine Chauvin
Abstract The paper is a discussion of the type of cohesive devices that can be
found in stand-up comedy, focusing more specifically on callbacks. Other cohesive
devices are also mentioned so as to provide some background on how stand-up
comedy shows are structured. Stand-up comedy shows are indeed quite generally
ignored in the discussion of genre-related cohesion-building mechanisms, and this
paper aims at filling this gap. The paper uses as theoretical backdrop the functional
linguistics analyses of cohesion, as well as some of the discussions of topic continu-
ity and sequencing done in Conversation and Discourse Analysis. A short compari-
son with some of the devices used in literary narratives is also proposed, using the
tools of French structuralist narratology (Genette’s analepses, in particular), which
allows us to delve further into the specificities of the genre. It is shown that the call-
back technique used in stand-up comedy offers very interesting data on how a dis-
course can be made coherent at a macro level, vs. the inter-sentential one; such
techniques should therefore be included in the repertoire of cohesion-building
tokens when these are discussed across genres.
1 Introduction
The topic we will be dealing with in this paper is that of callbacks in stand-up com-
edy. We will be discussing what they are, the extent to which they may be said to be
constitutive of a specific genre, and, more particularly, what they can bring to the
analysis of the inventory of cohesion-building devices. Cohesion-building devices
have been extensively studied in linguistics, with, of course, Halliday and Hasan (1976)
constituting a seminal background study. Moreover, Conversation Analysis and
C. Chauvin (*)
Department of English, University of Lorraine, Nancy, France
e-mail: catherine.chauvin@univ-lorraine.fr
2 The Corpus
The corpus is a series of full-length stand-up comedy shows, so that coherence can
be examined at the level of a whole show. We focus on the shows of British come-
dians performing in English, the earliest ones being performed in the 1990s and the
most recent one in 2016. A previous study on cohesion-marking in stand-up comedy
was published in 2015 (in French); references to it are included in this chapter.
1 A few studies have been carried out; there are several student papers, some of which
can be found online, sometimes without a clear author’s name (http://rudar.ruc.dk/,
http://library.binus.ac.id/; the topic of comedy seems to have become popular for both Master’s theses and
seminars). Schwarz’s study (Schwarz 2010) does not specifically focus on cohesion; other studies
may not be strictly linguistic, or not linguistic at all, but they may shed interesting light on the
genre and the problems we are discussing here (e.g., Glick 2007; Bolens 2015). The question of
narrativity will be dealt with here in relation to cohesion; other dimensions of narrativity may not
be included in the paper.
The 27 shows listed in the references were all watched with this question of
cohesion-marking in mind; the relevant passages were noted down, together with
brief remarks on the technique that seemed to be involved.
The average length of a show is about one hour and a half to two hours. The shows
of Bill Bailey (four; see references) and Eddie Izzard (seven; see references, as well
as performed versions of Force Majeure (Reloaded) and Stripped) were more spe-
cifically focused upon, but attention was paid to varying the sources, and references
are made to the other comedians and shows, too (see the next sections). Another
source that was used as a complement to the shows themselves was texts about the
shows: certain interviews, in particular the DVD commentary to Force Majeure
which contains a few remarks on writing, and self-help books or documents written
for young comedians (again, see references), giving advice on how to write and
perform shows. Excerpts are quoted in Sect. 5.1.
Some shows were dealt with in greater depth because there were a number of
interesting examples, Bill Bailey and, in particular, Eddie Izzard, being astute users
of cohesion devices, and, maybe even more, of callbacks.2 Other shows, being more
or less entirely organized around one-liner jokes (Tim Vine’s shows are a case in
point), are arguably less directly relevant to the study of cohesion-building mechanisms.
But, in fact, certain devices can still be present even in such apparently non-,
or a-, coherent routines,3 and a number of shows make use of certain or all of these
devices, as is illustrated by some of the examples used in this paper. Izzard’s shows
are long monologues in which the comedian goes from one topic to another in a
seemingly random way, and they are known for their surrealist streak (Izzard has
famously been called “the lost Python” by John Cleese). Bailey is also a musician,
so the shows alternate between monologues, video-based passages and musical
numbers. Eddie Izzard’s and Bill Bailey’s shows are, of course, therefore partly
idiosyncratic, because they are shows by specific comedians in specific contexts;
but even if they may use the mechanisms in specific ways, these mechanisms can be
found in other shows as well (see, again, the examples discussed in this paper), and
callbacks in particular can be deemed to be characteristic of the genre because: (1)
if some comics use them less often or less strikingly than others, a number of them
do make use of the device in some way or another, and (2) in the documents written
for would-be comedians, they are often mentioned as a good technique to use, as
will be seen in Sect. 5.1. It can be noted, to conclude this section, that a certain level
of idiosyncrasy probably has to be present in some way or another in comedy, as
probably is the case in all creative genres: creative genres do not, or even cannot,
aim at complete reproduction, because of their very creative nature. But a number
of codes also tend to be followed, and certain practices tend to be shared. We intro-
duce some of these in the next section.
2 Harry Hill is also known for his callbacks, but we have not yet analyzed his shows.
3 Certain one-liners follow each other thematically, for instance, which creates (usually very) short
sub-sections within the shows. Shows made entirely of a succession of one-liners, vs monologues,
are nonetheless the exception rather than the rule in our corpus (also see next section).
We will only focus here on what is relevant to the analysis; Double (2014) can be
consulted for a more detailed presentation.
Stand-up comedy is a type of comedy that was for a long time particularly asso-
ciated with the English-speaking world, although it is now much more international.
Its birth and development in the U.S. and the U.K. have sometimes been linked to
such local traditions as that of the music hall, or of the vaudeville (Double 2014). It
started around the 1950s–60s in the U.S., and gained prominence in the 1960s–70s.
In the U.K., a creative alternative4 scene also emerged in the 1980s–90s and, nowadays,
many comedians have become household names, some of them (Eddie Izzard,
in particular) filling stadiums and arenas as big as Wembley and the O2.5 Some of
them are television personalities as well as performers. The comedians that are
quoted by others as “models”, or the founding fathers, so to speak,6 tend to be
American: Richard Pryor, for instance, is often quoted as a model by fellow comedians.
But they may also, albeit less commonly, be British: Billy Connolly is thus
often quoted by fellow comedians as an inspiration. Since we will be focusing on
U.K. shows in this paper, we will be dealing with a relatively long-lived and active
scene.
A stand-up comedy show is a form of live humorous performance during which
the comedian is typically alone on stage, with very few or no accessories except a
microphone, and speaks on a variety of topics for a given length of time. There are
exceptions to this, since Bill Bailey, as was mentioned in the preceding section, is
also a musician and uses instruments on stage, and some (Jimmy Carr, Bill Bailey,
Dylan Moran) also use screens and projections. The use of props nevertheless tends
to be kept to a minimum. A succession of one-liners can constitute the whole of the
show (as in Tim Vine’s case), but more often than not, one-liners, if they are present,
are included in the structure of the show, or can be present in certain dedicated
sequences (Jimmy Carr), with other sequences combining videos and one-liners, or
interaction with the audience. A performance (a show, or gig) can last from a few
minutes, for collective shows and/or for newcomers, to somewhere between 1 and 2
hours for an individual performance. During this time span, the comedian speaks and
entertains the audience. The formats and types of humour are truly varied. Some of
the shows can be rude or crude, but clearly not all shows are: Izzard’s and Bailey’s
are not. There also is a tendency for observational humour to constitute the default
4 Although it is difficult to explain it in just a few lines, “alternative” – the word is used by the
comedians themselves – refers to a form of comedy that is (meant to be) different from what
existed before, in its contents (for instance, maybe, self-reference, i.e., using one’s own life as
comedy material, for U.S. performers in particular, but not only them) or form (improvisation;
performances in small venues such as pubs and clubs; humour is no longer based on jokes…).
5 Wembley Stadium in London seats 90,000 people and the O2, 20,000.
6 Or mothers, although a number of them are men. Examples of women performers are, for instance,
Ellen DeGeneres and Elaine May in the U.S.; in the U.K., Jo Brand and Sarah Millican, who are
mentioned in this paper, are also women.
basis of a show. The comedian either shares personal anecdotes with the audience
or points out “observed” facts (have you noticed that this always happens when..?),
focusing for instance on weird behaviour or surprising connections. As mentioned
in the preceding section, it is also important for each comedian to create their own
identifiable persona. This comedy form is therefore both varied and unified, but
there are strong marks that “delineate” it and signify that the show is a stand-up
show, like the microphone, the quasi-absence of props, the energetic monologue.
What can be said to be partly constitutive of the genre is also the relationship that
is built between the performer and his/her audience. Unlike traditional theatre,
stand-up comedy generally does not recognize the existence of a “fourth wall”, the
invisible frontier that separates the stage from the rest of the room and assumes that
the space of the stage is also the space of fiction, a different world that has no direct
interaction with the floor. There is no symbolical opposition between the stage and
the floor in stand-up comedy, which means that interaction with the public is pos-
sible, although, again, the amount of interaction may depend on the personality and
style of the comedian. Some interaction is not elicited but forced upon the comedian,
as when a member of the audience calls out to the comedian directly (this is called
heckling). The more famous and renowned the comedian and the larger the room,
the less interaction there may also be and the more the space of the stage becomes a
symbolically separate space again. There is a connection between this aspect of
stand-up comedy and its content: what is often discussed on stage is supposed to be
“real life”, and normally not fiction, which is another interesting dimension of the
genre.
The subject matter of shows also clearly varies: yet another specificity of the
genre that is directly relevant to this study is the fact that comedians constantly shift
from one topic to another. Ross Noble, for instance, is known for his
“tangents”. This means that the shows might not always be, or at least appear to be,
structured, which also makes the analysis of cohesion markers and coherence-
building central and interesting. Callbacks, which we will define further down, are
one of the ways in which a continued flow of heterogeneous topics can be made to
form a whole. But it is also clear that no stand-up comedy show is truly or evidently
a “whole”: it is often made up of successive parts that are clearly disconnected.
Performances are both collections of unrelated items and a specific kind of unit,
which we will now explore.
This section will be an overview of some devices that are used in stand-up comedy
shows and can be considered to be cohesive devices. We will just present them here
so that they can provide the background against which callbacks have to be assessed.
The connection between callbacks and these cohesive devices will also be briefly
explored.
As is well known, cohesion has to do with the linguistic (or, sometimes, paralinguis-
tic; see below) features that make a text a text, and not just a collection of random
utterances, whether spoken or written. The term is therefore used in a similar way to
that found in Cohesion in English (Halliday and Hasan 1976), which famously
listed ways in which the very textuality of a text affects its structure: anaphora/reference,
ellipsis, substitution, conjunction, lexical cohesion. Cohesive devices are
linguistic devices, perhaps sometimes paralinguistic, too (again, see below), that are
there because a given utterance is part of a whole and so comes after something else
and before something else. Part of the difficulty may consist in drawing up a list of
such devices, although a number of studies have done that, in different ways (e.g.,
Halliday and Hasan 1976; Duchan et al. 1995). Another problem is to show why,
and how, a given device actually creates cohesion. Coherence is to be sought at a
semantic, or logical level; it has to do with how relevant and/or logically sound the
different connections that are made between the ideas, arguments or events are.
Cohesion-building devices may contribute to making a discourse more coherent,
but the connection between the two is notoriously complex: a cohesive text may be
entirely incoherent (?This house is blue, because it likes it), and it is possible to
build a very coherent reasoning with no known cohesive devices at all (He came. I
was happy. We all wept.) Links may be construed without them being necessarily
expressed, and be computed pragmatically, or discursively. The connection or
absence of connection between cohesion and coherence may come to play a role in
stand-up comedy, as it can be a source of humour (Chauvin 2015). This question
will be briefly taken up in Sect. 6.3. The ways in which a given series of utterances
is made to be a spoken text have also been studied in Conversation Analysis, and
will be used as background hypotheses in this study (cf. the presence and construc-
tion of topic continuity). The recognition and discussion of the existence of a level
of subordinate structures (cf. sequences) is also of relevance. A large number of
analyses of “pragmatic” or “discourse” markers have also resulted in the addition of
such elements to the list of cohesive devices; well, oh, so, etc. have all been studied
in relation to the building of cohesion, at the ideational but also the interpersonal
level (for instance, Schiffrin 1987). They will only be mentioned briefly here but a
few examples will be cited in relation to the main questions discussed in the paper.
Cohesion devices have been studied in a number of genres, and, partly, as was
said before, written texts, or conversation. Now, choosing the relevant framework
for stand-up comedy may be a question to be dealt with at the outset. Stand-up com-
edy is neither “text” nor “conversation” – it is prepared speech, but spoken, with
possible room for improvisation. The routines have a clear conversational dimen-
sion (see Sect. 3 and the absence of the fourth wall), too, but are also mostly mono-
logues. Narratives, which could perhaps be considered to be specific kinds of texts,
have been analysed separately by (literary) narratologists and linguists, but stand-up
comedy is not one organized narrative, and comedy shows do not belong to the type
of texts that have been described in such classical studies as that of fairy tales by
Propp 1970, or even the spoken narratives such as those analysed in Labov and
Waletzky 1967, Labov 1997, 2001, 2004, 2006, or such as the “frog narrative”
(Slobin 2005). The French structuralists, Barthes, Genette, Todorov in particular
(cf., for instance, Barthes 1966; Genette 1966, 1972; Todorov 1966, 1967, 1971),
have dealt with the organization of (again, literary) narrative in ways that turn out to
be partly relevant here, but probably not entirely. The humorous dimension of shows
may of course also leave its mark on the type of devices that are used. These
links, and differences, have to be kept in mind.
Looking at stand-up comedy shows with the study of cohesion devices in mind
reveals a number of interesting facts. As was said before, one point of clear interest
is the fact that in stand-up comedy there is very little (at least, obvious) topic
continuity. A stand-up comedy show is not supposed to be “about” one given topic or
even one specific story; stand-up comedy routines are made up of a succession of
unrelated anecdotes with swift and regular shifts, so that they do not have an overt
coherent form. Some comedians may just keep it random, but a number of comedi-
ans do try and build something out of very unsystematic material.
A previous study of the question (Chauvin 2015) has allowed us to describe
some of the devices that are found in shows. Among the techniques that can be
found are:
–– Structural repetition;
–– Incremental topic shift;
–– The use of discourse markers;
–– NP-based topic introduction.
These devices may be used separately or conjointly; we will start by describing
them with the help of a few examples.
Structural repetition is very close to rhetorical (vs. linguistic) anaphora, where a
given word or structure is repeated at the beginning of each verse. In our case, each
sequence7 may be introduced with the same phrasing. Even though the sequences
have nothing or little to do with each other, repetition creates artificial cohesion and
makes the show “feel” like a whole. In Qualmpeddler (Chauvin 2015), Bill Bailey
introduces three very different sequences in this way (example 1):
7
We will not discuss how the notion of sequences can be applied to shows in detail here.
Force Majeure starts in a similar way: a first topic is openly introduced as a new
topic as he pretends to be thinking about a possible way to start (example 8):
(8) So where should we start this show? Ah, yes –human sacrifice! That’s a good
place to start.
These last two examples may seem to emphasize the fact that shows are basically
discontinuous, but they also mean that sequences may be organized around topics,
and routines organized thematically and formally. Announcing topics in this way
may be abrupt, but this procedure also draws the audience’s attention to topic intro-
duction or change, and functions as a warning. At the discourse level there may be
discontinuity, but this is partly amended at the interpersonal level, as the hearer is
warned that there is going to be an unexpected jump.
So all in all, in spite of its apparent randomness, stand-up comedy does make use
of cohesive devices. A number of these, if not all, may not be specific to the genre,
but the uniqueness of stand-up comedy may be found in the combination of such
features. Now callbacks may be more specific to the genre. Their nature and role
will now be examined.
9
Cf. Phill Jupitus in Quadrophobia (or other shows, cf. QI, Series 10, Episode 3) who uses yeah
–as well as Good thing, and True story– to imitate Eddie Izzard.
10
When the audience does not seem to respond to something, Izzard, speaking to himself, says
something along the lines of Do not ever mention that again, Never use these two together again,
and pretends to write it on his hand for future reference. This has become a well-known gesture and
is used across shows.
Callbacks are a type of “device” that is, in fact, known to performers; the term is not
introduced theoretically from an outsider’s point of view. Now the term may also be
a cover term that refers to different types of techniques, and some of the implica-
tions of the use of such (a) technique(s) need to be delved into.
As was just said, callbacks may constitute one device or perhaps a family of devices.
A number of websites and books written for would-be or professional comedians
openly mention callbacks as being one of the ways in which a heterogeneous rou-
tine can be made to function as a whole:
A callback is a reference a comedian makes to an earlier joke in a set. Callbacks are usually
made in a different context and remind the audience of an earlier joke, creating multiple
layers and building more than one laugh from a single joke. When used at the end of a set,
callbacks can bring a comic’s routine full circle and give closure to the set. Also Known As:
call back. (Callbacks, Glossary of Comedy by Patrick Bromley, Comedians Expert, About: Entertainment,
http://comedians.about.com/od/glossary/g/callback.htm)
Callback — A punchline that refers, or “calls back,” to a joke or premise from earlier in the
performance […]. One of the most reliable comedy tricks, a callback can elevate a marginal
joke to legendary. “And then he closed strong by tying it all together with a callback to his
opening joke about lupus.” (http://www.creatingacomic.com/comedy-glossary/)
Callbacks. A callback is when you call back, or mention again, something you brought up
earlier in the act. (Carter 2001)
Call Back. An invaluable trick of the comedian’s trade is the “callback.” Imagine a guest
coming out later in Conan O’Brien’s show wearing Google Glass; the host could get big
laughs by miming a punch. The writer’s version of a callback is a glancing reference to a
detail, metaphor or phrasing from earlier in the piece. The device flatters readers and adds
to the continuity of the work, so give it a try. Please.
https://theamericanscholar.org/seven-things-writers-can-learn-from-stand-up-comedians/#.VVjZerntnBo
The call back. A callback is a reference to something said earlier in a routine or sketch. The
reference is usually a previous joke, but stand-up comics often use callbacks after interact-
ing with the audience –an audience member’s name will be inserted into a later joke. For a
callback to work, the time between the original reference and the callback must be rela-
tively brief. Repeated callbacks can be used (but never more than three times, of course11).
(Helitzer and Shatz 2005: 247)
11
The authors are probably thinking of the “rule of three” (three is linked to good rhythm) often
The authors of those practical guidelines say that callbacks create continuity. They
also often emphasize another function: callbacks are supposed to produce laughter
because they build a connection with the audience (our bold characters):
Audiences like callbacks because repeated references cause them to feel as if they are
part of a shared experience. (Helitzer and Shatz 2005: 247)
This gives the routine cohesion and involves the audience, who have to work out what the
comedian is on about. (Ritchie 2012: 12)
It is a clever and inclusive strategy which makes the audience feel more involved in the
show because they have to work it out. (Ritchie 2012: 121)
Callbacks therefore consist in re-injecting material that was mentioned earlier in the
show at a later point in the same show. Let us now describe this using a series of
attested examples.
12
Although the last quotation says that callback is not used in the U.K., many examples were
found. So reincorporation is just a different name for it.
13
He mentions it in the commentaries of the Force Majeure DVD.
example (10), the rocket reference is also accompanied by a rapid upward gesture,
but the gesture, this time, does not replicate what was done for the first mention, but
is used to refer to the preceding passage visually as well as verbally, which is partly
different.) The next examples in which simple words are replicated may also some-
times be considered to be emblems when they are in fact a combination of intona-
tion/accent/word rather than just the repetition of a word.
A repeated word, or word form, can, thus, also be used. Bill Bailey, who uses
rhetorical anaphora in Qualmpeddler (see example 1), often also uses word- or
word-form-based callbacks. In these cases the word itself is used as a formal refer-
ence to an earlier part of the show; it is not (just) the content of it but also the phras-
ing that is brought to the fore. An example is “turns out” (example 14). Turns out is
first used in the anecdote that is told about the reality-TV star (see example 2).
When she realizes that the sun and the moon were not one and the same thing, she
is quoted as saying, Turns out they’re not the same thing. Bill Bailey then uses the
phrase “turns out” to introduce another statement, isolating turns out from the rest
of the sentence and making it stand out. Another example is dorsum (example 15):
the same anecdote about the reality-TV star leads to a discussion of the names of
body parts (see, again, example 2), and the name of the top of the foot (the rare,
technical term dorsum) is revealed and discussed. The word is re-injected again at
the end of the show in a reference to “the blood of Christ, the dorsum of Christ”.
Another example, this time not just word- but word-form-based, is the use of
hashtags and acronyms (example 16). In Qualmpeddler, Bill Bailey uses acronyms
early in the show, and then more acronyms are mentioned and created. And at the
end of the show, as a video about the (non-)appeal of the Christian faith to young
people is shown, it is made to conclude on a still that says “The Church OMG!
#justprayin’”, the use of the hashtag being itself another callback, since at the beginning
of the show, #justsaying is used several times. In Force Majeure (example 17),
Eddie Izzard uses the French phrase Et voilà, explains that it can be used in all sorts
of contexts (for instance, he says as he pretends to be wiping his mouth with the
back of his hand, when you have just “vomited on the head of a child”), moves on
to other things, and then later in the show concludes one of his sequences by saying,
Et voilà.
Although they may not entirely qualify as a different type of callback, audience-
based callbacks can be used when back references are not woven into the fabric of
the show, but based on something that comes from the audience (this is mentioned
in one of the quotations used in Sect. 5.1, Helitzer and Shatz 2005: 247). Jimmy
Carr does it often (example 18), as well as Sarah Millican (example 19). The technique
consists in engaging in talk with a member of the audience, preferably from
one of the front rows, or a heckler,14 and getting their names and/or some information
14
See Sect. 3 for an explanation.
on who they are. Then the comedian re-uses their names, or part of the information
that they gave, later in the show. The content of the information cannot be
anticipated, but the fact that something is asked of one member of the audience, and
memorized to be used again, has to be.
References to previous shows may also be used in specific cases; the comedian has
to be at least minimally famous and have formed a strong fan base in order to be
able to do that. This could be very close to the use of (external) cultural references,
except that the common culture that is recalled is that of the shows. Eddie Izzard
does it in Force Majeure, in which a retake of one of his famous routines, the “Death
Star canteen”, is proposed (example 22),15 and some characters that were just men-
tioned in passing but have become pet names for the audience, like “Mr. Stevens,
head of catering”, reappear and take center stage. This device can be seen as the
equivalent of “hit” songs in concerts, and the references are not entirely equivalent
to callbacks, but they partly have the same function(s) since they bring cohesion,
this time, to a whole series of shows rather than just one given show.
The use of structural repetition and of topic branching may also be broadly
assimilated to callbacks, as they may also have the role of bringing something that
was mentioned before back to mind. The typology that we have just proposed shows
that there are, in fact, different forms of devices involved. A series of cases can be
described and distinguished that are both functionally close and formally different.
We will now examine some of the implications of the existence of such examples.
We will start by discussing what seem to be two possible consequences of the use
of callbacks in terms of cohesion. One has to do with the importance of the macro
level (and its interaction with the micro level); the other one also concerns the imbri-
cation of the local and the general level in terms of interpretation and
understanding.
It seems that one of the essential elements that callback techniques can bring to
the discussion of cohesion-building is the fact that they function at a macro level,
that of a whole “text”, i.e., here, the show, or sometimes (cf. example 22) several
shows, rather than at a linear-intersentential level. Although they do also function
more loosely, a number of traditionally recognized linguistic devices that are used
to create cohesion, like connectors (subordinators, coordinators) or anaphora, create
local links between consecutive utterances, or between adjoining parts of a given
text. Of course, they may also join together elements that are further apart, or func-
tion at a level that is larger than that of individual consecutive sentences, but the type
of cohesion that is created here functions at the macro level and is not (entirely)
based on consecution.16
Callbacks also provide evidence of the fact that a full “text”, “conversation”, and
in this case, show, is not (just) a succession of utterances: the fact that they belong
to a whole is woven into their very structure. We may now wonder if in this case, a
15
This retake on a previous routine is what Bolens 2015 mostly focuses on.
16
Anaphors are of course known to (sometimes) function across whole paragraphs, and anaphor
chains also function across a whole paragraph, or text. What makes the kind of techniques we have
described before perhaps specific is that they necessarily function at the level of the show.
callback is truly linguistic in nature, i.e. if it affects the actual linguistic form of the
show, the language, the structure, the words that are used, or if the callbacks are
(“just”) an(other) story-telling technique. In fact, it has already been shown that the
very wording of the storytelling can be impacted. A callback can have an influence on the
turns of phrase that are used in the show (see examples 13–17), as is the case with
the word-based callbacks. It may also affect the way in which a given word or
phrase is uttered (example 13; maybe 14, 15, 17), as the intonation/accent/voice
quality may, for instance, mimic that of the previous use so as to make the connec-
tion clearer. The presence of the macro level has at least occasionally, and perhaps
more generally, a true impact on the form of the micro level.
Another impact, which has to do with cohesion but also narrativity and humour,
as will be seen later, is on how the interpretation of certain words or phrases is to
function. A word or phrase that is used as a callback is both present in a given con-
text, and made to refer to a preceding context. This may lead to an interpretation in
which several contexts are, in fact, merged, as opposed to forming an interpretation
that is made independently at a given point in discourse. A word or phrase that is
included in a callback is no longer a word that is used in a single utterance, or in a
single sequence; it is a word that is both used at a given moment and indexed to a
preceding context of use. Although this is something that, again, may be true of all
(or most?) words or phrases, it is specifically relevant in the case of callbacks, in
which the audience is openly encouraged to pay attention to connections. When
something is re-used in an alien environment, it is not just referred to: it may need
to be newly interpreted on the basis of where it occurs in the discourse as well as
what it refers back to. The use of OMG in The Church OMG (example 16) is thus
not interpreted in the same way after a first series of attacks against acronyms in the
first part of the show as it would have been had it been a first occurrence.
17
We will focus on what is directly linked to callbacks here. A more general discussion, which
would necessarily have to be more detailed, will have to be left for elsewhere.
18
In the commentary that is to be found in the Force Majeure DVD.
of events, that is typical of a narrative (cf. Labov and Waletzky 1967; Labov 1997,
2001, 2004, 2006) is generally not found at the level of the show.
The complexities of ordering in literary narratives were discussed by Genette
(1972) in particular, who noted that the temporal ordering of narratives was often
intricate. He discusses, for instance, the presence of prolepses, which hint at some-
thing that will happen later, and analepses, which refer back to something that was
mentioned before. He also mentions previous references to the question, such as the
use of the term Rückgriffe by Lämmert, for which Genette suggests a translation as
“retroceptions”.19 Analepses and Rückgriffe can in fact be reminiscent of what has
been discussed here under the term “callback” for stand-up comedy. But partly
because of the heterogeneity of the content of shows again, the comparison may not
be entirely viable as there may be no real sustained story line involved. The opposi-
tion that he also makes between homodiegetic and heterodiegetic devices (devices
that function within the story or are, basically, comments from the outside) would
have to be delved into in more detail to see to what extent it may apply to stand-up
comedy shows or not. Genette also argues that some (homodiegetic) analepses are
completive (Genette 1972), as they add something that was left unsaid; others are
called repetitive. Delayed closure (examples 20, 21) may thus be completive in
nature, although random material occurs between the two “parts” of a (sub-)story in
ways that it does not in a more traditional, self-contained narrative. In fact, a typol-
ogy of cases may be devised: sometimes, the material that is re-imported is part and
parcel of the new context in which it is used, which we could call grafting, as in the
Et voilà example in Force Majeure (example 17), where the fresh use of the phrase
does conclude the new section relevantly as well as being a callback. In the cases of
delayed closure (examples 20, 21), something is mentioned again but is not neces-
sarily woven into a new context; it is just a delayed continuation of one preceding
sequence. Now, whether they are graftings or maybe “just” re-injections, the pres-
ence of callbacks generates connections and creates parallels, which, even in the
absence of strong connections, can contribute to the creation of thematic or formal
motifs: as he discusses the formal parallels between two sequences of Definite
Article, Glick (2007) uses the phrase “subtle poetic repeating patterns”. They construct
something that is, at least, a formal whole, in which an impression of unity is
created. And if interpretation, again, is to be sought at the general level and not just
linearly, this also goes towards creating a unified, though not unique, narrative as
well.
Humour is obviously a central dimension of comedy shows, and the fact that the
shows are supposed to be funny can certainly have an impact on their form(s), and
vice versa. Although this theme will not be developed in full in this paper, it might
19
Bauformen des Erzählens, Stuttgart, 1955, 2nd part (Genette 1972: 95).
20
We are not using the word “incongruous” or discussing the incongruity theory of humour on
purpose as this would require a specific discussion. Incongruity may create humour, but is, possi-
bly, not the sole source, or a necessarily straightforward source, of humour. We will therefore
deliberately not go into this debate here.
21
It is only partly the case in his example, as this is the beginning of a sequence, and so is known
to be used when a topic is introduced; this is arguably not a “therefore” so. The next example con-
sequently illustrates the kind of problem we are dealing with here in a clearer way.
(example 22) creates cohesion between shows (it is, partly, delayed closure at the
level of a series of shows), but it also serves other purposes: recognition, bonding,
with the comedian and between members of the audience that feel that they belong
to a group of people who know the same things. These dimensions, which are part
of the theatrical requirements, may also come into play, and lead to the use of cohe-
sive devices, not (just) for their own sake, but for other functional reasons of the
kind we have just mentioned.
7 Conclusion
The aim of this synthetic presentation of callbacks in stand-up comedy was to show
that it was interesting, and possibly important, to include the genre and the
technique(s) in the list of possible cohesive devices. Stand-up comedy has a number
of specificities that seem to make it interesting to study as a genre, in itself, and so
as to see what it can bring to more general debates. As far as cohesive devices are
concerned, we have emphasized the fact that callbacks induce a form of cohesion
that functions at the macro level, but with possible implications on the micro level,
too: cohesive devices can have an impact on the wording of certain forms, or on how
parts of the routines may have to be interpreted at both the local and the general
level. The ways in which shows can be considered to form, and not to form, a whole
have also been discussed. The fact that shows are both heterogeneous and partly
made to be homogeneous means that stand-up comedy offers very interesting
ground for the analysis of such questions, again, both within the genre, and more
generally. Some dimensions at work within stand-up comedy may, as was sug-
gested, be considered to be functional (e.g., how comedy shows are supposed to
function and “work”: bonding with, or within, the audience; possible reference to
past shows; and, of course, humour), which can be related to their presence within
the genre, with certain functional requirements leading to certain uses and practices.
More aspects could have been included, like the fact that the use of topic continuity
makes shows close(r) to “spontaneous” discourse but the use of isolated NPs and
structural anaphora are in fact more typical of formal discourse (Chauvin 2015).
The ways in which reference to something can be made to be more or less evident
can also be discussed further (cf. how “emblems” work, for instance, in a system-
atic, or less systematic, way); and the links between the different techniques and the
ways in which they are implemented by different comedians and in different shows
continue to be explored. Now as was seen, callbacks illustrate problems that can
overlap into literary studies, the analysis of comedy, and, of course, linguistics. It
therefore seems that such techniques clearly ought to be included in the repertoire
of cohesion-building devices when these are discussed across genres.
References
Web Pages
Alexander, C. J. Creating a Comic. Bombing, killing, and other occupational hazards of stand-up
comedy (blog about « breaking in stand-up comedy »), http://www.creatingacomic.com/
comedy-glossary/. Accessed 17 May 2015.
Corpus
Laura Hidalgo-Downing and Yasra Hanawi
Abstract The present article compares the stance styles in the two speeches
addressed to the Arab World by US Presidents George W. Bush in Abu Dhabi in
2008 and Barack Obama in Cairo in 2009. The main theoretical concepts addressed
are stance and recontextualization. Halliday’s (An introduction to functional gram-
mar, 2nd edn. Arnold, London, 1994) model of stance is adopted in the present
study to establish stance categories and degrees of subjectivity. Additionally, co-
occurrence with personal pronouns and negation is explored. Corpus methodology
and discourse pragmatic analysis are used in combination. The main claim is that
the higher frequency of markers of stance in Obama’s speech, in particular modal
verbs and negation and their co-occurrence with first person pronouns, evokes inter-
textually his predecessor’s speech and stance towards the Arab World, together with
commonly held assumptions about the relations between the US and the Arab
World. Results show significant differences in the choice and frequency of markers
of modality and negation and their co-occurrence with personal pronouns in the two
speeches. A different stance style characterizes each speech, with an effort in
Obama’s speech to recontextualize and reformulate the predominant discourse and
social practice in US foreign policy.
1 Introduction
The motivation for the present study is the socio-historical impact of the speech
delivered by US President Barack Obama to the Arab World in Cairo in 2009, early
on after his election as US president. His speech, entitled ‘A new beginning’, was
met with great expectations not only in Arab countries but also in Israel and the
whole Western world. According to US opinion polls from 2009, Obama’s speech
had a very positive overall world-wide reception, and created expectations of a new
era of international cooperation and stability. These expectations stood in stark con-
trast with the preceding two decades, which had witnessed the two Iraq wars, the
invasion of Afghanistan and the 9/11 attacks of 2001. This preceding period of tension
between the US and Arab countries had taken place under the Presidency of
George W. Bush, who, after the 9/11 attacks, declared the ‘War on terror’ against
what was called ‘the axis of evil’, constituted by Iran, Iraq and North Korea.
The present study explores the linguistic resources used by the two politicians to
index their political and personal stances with regard to the topic addressed and
their positioning towards their audience in the context of the Arab World and the
Middle East. The main argument is that the differences in stance styles which
emerge from the quantitative study point to the crucial role played by the analyzed
linguistic resources in the recontextualization of the political relations between the
US and the Arab World. In other words, in order to put forward his ‘New beginning’
in the relations between the US and the Middle East, Obama makes extensive use of
linguistic resources such as modality, the use of first person pronouns and negation,
which refer intertextually to previously held assumptions on these relations.
Negation, in particular, plays a crucial role in the modification and correction of
previous assumptions; additionally, Obama uses modality and first person pronouns
to engage with his audience and open up a common space of collaboration.
The objectives of the study are the following:
1. To identify the main types of stance following Halliday’s (1994) model of types
of subjectivity.
2. To measure the frequency and statistical significance of stance markers, specifi-
cally, personal pronouns, modality markers, mental verbs and negation by means
of corpus tools.
3. To explore and discuss the co-occurrence of the selected features and their dis-
course pragmatic functions as markers of stance in each speech.
4. To argue that the higher frequency of stance markers in Obama’s speech, in comparison
with Bush’s previous speech in the same context (the Arab World),
points to a strategy of intertextual recontextualization of previous discourses,
social practices and held assumptions about the relations between the US and the
Middle East.
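As an illustration of the second objective, the comparison of marker frequencies across two speeches of unequal length can be sketched as follows. This is a minimal sketch only: it assumes the log-likelihood (G2) statistic that is commonly used in corpus linguistics, which the present text does not name as the test actually applied, and the function names, word totals and counts are all hypothetical and are not taken from the study.

```python
import math


def log_likelihood(count_a, total_a, count_b, total_b):
    """Two-cell log-likelihood (G2) for one marker observed in two texts."""
    combined_rate = (count_a + count_b) / (total_a + total_b)
    expected = (total_a * combined_rate, total_b * combined_rate)
    g2 = 0.0
    for observed, exp in zip((count_a, count_b), expected):
        if observed > 0:  # a zero count adds nothing to the sum
            g2 += observed * math.log(observed / exp)
    return 2 * g2


def per_thousand(count, total):
    """Relative frequency per 1,000 words, so texts of unequal length compare."""
    return 1000 * count / total


# Hypothetical figures (NOT taken from the study): marker -> (speech A, speech B)
SPEECH_A_WORDS, SPEECH_B_WORDS = 3000, 6000
toy_counts = {"modal verbs": (30, 120), "negation": (30, 60)}

for marker, (a, b) in toy_counts.items():
    g2 = log_likelihood(a, SPEECH_A_WORDS, b, SPEECH_B_WORDS)
    # 3.84 is the chi-square critical value for p < 0.05 with 1 degree of freedom
    flag = "p < 0.05" if g2 > 3.84 else "n.s."
    print(f"{marker}: {per_thousand(a, SPEECH_A_WORDS):.1f} vs "
          f"{per_thousand(b, SPEECH_B_WORDS):.1f} per 1,000 words, "
          f"G2 = {g2:.2f} ({flag})")
```

Normalising to a rate per 1,000 words matters here because raw counts from a longer speech would otherwise look higher simply by virtue of length; in the toy data, the negation counts differ (30 vs 60) but the rates are identical.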
Bush and Obama’s Addresses to the Arab World: Recontextualizing Stance… 189
2 Theoretical Background
The present study draws, first, on previous studies on the concepts of stance and
markers of modality and negation; second on the concept of recontextualization as
a process in which a piece of discourse refers intertextually to previous discourses
and social practices and to held assumptions on a particular topic (Linell 2009;
Semino et al. 2013); third, on the literature on political discourse and the signifi-
cance of uses of linguistic features such as modality and personal pronouns in the
indexing of stance (Boyd 2014a, b; Charteris-Black 2004, 2011; Chilton 2004;
Evans and Chilton 2010; Fairclough 1989, 1992, 2010; Fetzer 2013; Marín-Arrese
et al. 2013; Marín-Arrese 2015; Wilson 1990). Finally, corpus linguistic tools are
used to explore the frequency and co-occurrence of stance markers related to the
areas of indexicality, modality and negation, which are then discussed from a quali-
tative perspective (see, for example, Simon-Vandenbergen and Aijmer 2007; Biber
et al. 1999; Charteris-Black 2004; Rayson 2008).
Stance has been the focus of attention of numerous studies in discourse grammar
and discourse pragmatics (Halliday 1994; Biber and Finegan 1988; Biber et al.
1999; Englebretson 2007; Hunston and Thompson 2000; Marín-Arrese et al. 2013;
Martin and White 2005; Thompson and Alba-Juez 2014, among others). What
emerges from these studies is that there is a complex relation between the concepts
of evaluation, stance and positioning in discourse. Numerous proposals have been
put forward in order to identify categories of stance and the linguistic resources
which characterize different stance styles. Biber and Finegan (1988: 93) define
stance as ‘The lexical and grammatical expression of attitudes, feelings, judge-
ments, or commitments concerning the propositional content of a message’. In the
present study we address grammatical stance, in particular as described by Halliday
(1994). Halliday proposes a classification of stance types based on degrees of sub-
jectivity. Table 1 below shows a distinction between subjective explicit stance,
characterized by the use of first person pronouns and mental verbs (believe, know,
think), subjective implicit stance, marked by the use of modal verbs, objective
explicit stance, marked by the presence of stance adjectives and adverbs, and,
finally, objective implicit stance, indicated by the impersonal structure ‘it + be +
stance adjective’. We have included the negative form not in brackets as part of the
marking of stance because, as argued above, the frequency and distribution of nega-
tion across stance types is significant for the discourse pragmatic interpretation of
the differences between the two political speeches.
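The four stance types just described can be approximated with simple surface patterns. The sketch below is a rough illustration, not the procedure used in the study: the mental verbs are those listed above, but the modal, adjective and adverb lists are hypothetical samples, and the rule that each clause receives only the first matching label is an assumption made for simplicity.

```python
import re

MENTAL_VERBS = r"(?:believe|know|think)"  # verbs named in the text
MODALS = r"(?:can|could|may|might|must|shall|should|will|would)"
STANCE_ADJ = r"(?:clear|certain|possible|likely|necessary|important)"  # sample list
STANCE_ADV = r"(?:clearly|certainly|possibly|probably|perhaps)"  # sample list

# Ordered from most to least specific; an optional "not" mirrors the
# bracketed negative form included in the marking of stance.
PATTERNS = [
    ("subjective explicit",
     re.compile(rf"\b(?:I|we)\s+(?:do\s+not\s+|don't\s+)?{MENTAL_VERBS}\b", re.I)),
    ("objective implicit",
     re.compile(rf"\bit\s+(?:is|was)\s+(?:not\s+)?{STANCE_ADJ}\b", re.I)),
    ("subjective implicit",
     re.compile(rf"\b{MODALS}\b", re.I)),
    ("objective explicit",
     re.compile(rf"\b(?:{STANCE_ADV}|{STANCE_ADJ})\b", re.I)),
]


def classify(clause):
    """Return the first stance type whose pattern matches, or None."""
    for label, pattern in PATTERNS:
        if pattern.search(clause):
            return label
    return None


for clause in ("I believe we can work together",
               "It is clear that progress is needed",
               "We must act",
               "This is certainly a hard problem"):
    print(f"{classify(clause)}: {clause}")
```

Surface matching of this kind over-generates (for instance, will as a noun or may as a month name would be counted as modals), so any real count would need manual checking of the concordance lines.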
With respect to the relation between negation and stance, it is worth pointing out
that scholars who address the grammatical marking of stance focus on its realization
by means of modality (see Biber et al. 1999; Halliday 1994; Halliday and Matthiessen
2004; Hidalgo Downing and Núñez-Perucha 2013; Thompson 2004; Marín-Arrese
et al. 2013; Givón 1993). From this perspective, negation is considered by several
scholars as one of the language modalities; as explained by Givón, modalization in
language is a cline which goes from strong positive assertion, modality and irrealis,
presupposition, to strong negative assertion (adapted from Givón 1993: 170).
Negation and modality are particularly interesting because of the relative values
introduced by modal terms and because of the capacity of negation to evoke presup-
posed concepts and to introduce strong negative assertions.
Indeed, the use of negation is a well-known strategy in political discourse, in
which, as argued by Jordan (1998), two-part or three-part structures are used in
order to correct a previous assumption and pave the path for a new idea. A classic
example is the opening of Mark Antony’s speech in Shakespeare’s play Julius
Caesar (Shakespeare 1991):
Example (1) illustrates a two-part structure in which the speaker uses negation to
defeat possible expectations held by his audience and correct them.
Du Bois’s notion of ‘the stance triangle’ is particularly significant for the under-
standing of stance as a dialogic and intersubjective phenomenon which underlies
the process of recontextualization discussed in the present paper. Du Bois argues
that we use language to establish relations with texts, with the topic at hand and with
other speakers (2007: 163). This concept of stance is based, as in the work of other scholars (see
Martin and White 2005), on the dialogic view of discourse (Bakhtin 1981). In this
sense, all discourses refer to previously produced texts and discourses intertextually.
The role of negation in this process is particularly significant, since in order to deny
an idea or defeat an expectation, the idea or expectation needs to be mentioned. In
the present article we argue that negation plays a crucial role in the process of recon-
textualization in political discourse as a social practice. Linell (2009) describes
recontextualization as a process in which language is re-used and adapted to new
contexts and situations, including new genres. He distinguishes three types of
recontextualization: intratextual, intertextual and interdiscursive. In the present
study we make use of the second type of recontextualization, intertextual
recontextualization (see also Boyd 2014a, b). From this perspective, Obama’s
speech in Cairo in 2009 can be seen in the light of a process of recontextualization
which involves a series of significant contextual changes in the genre of the presi-
dential address to a foreign community as a social practice: a change in the US
political program, which is the result of the change of president, a change in time,
from 2008 to 2009, and a consequent change in the socio-political context. Within
Critical Discourse Analysis and Political Discourse Analysis, this process has been
described as one of ‘re-imagining’ a social practice (Fairclough 2010; Boyd 2014a,
b). In the case analyzed in the present paper, recontextualization does not occur
across genres, but within the same genre by two different politicians at two different
moments in time. These differences in personal identity and time shape the recon-
textualization process as one in which what is re-imagined by the world community
is the relation between the US and Arab countries, and, consequently, the US for-
eign policy in international affairs. This is clearly consistent with the title of Obama’s
speech ‘A New Beginning’.
With regard to political discourse, we draw on studies which approach this type
of discourse as social practice, which consequently has ideological implications
(Boyd 2014a, b; Charteris-Black 2011; Chilton 2004; Fairclough 1989, 1992, 2010;
Wilson 1990). Though Bush’s and Obama’s discourses have been the object of
extensive study by numerous scholars (Boyd 2014b), the present article contributes
to current scholarship in this field of study by focusing on the two US Presidents’
approaches to the issue of the Arab World, a particularly contentious one in US
policy, and the way their speeches appeal intertextually both to the issue at hand and
to the assumptions and expectations of the audience they address.
The role of features such as modality, pragmatic markers, personal pronouns and
metaphor has been discussed by numerous scholars (see for example, Charteris-
Black 2004, 2011; Chilton 2004; Fetzer 2013; Boyd 2014a, b; Marín-Arrese et al.
2013; Marín-Arrese 2015). However, the role of negation has not received sufficient
attention as a strategy used by politicians to deny previous concepts and simultane-
ously introduce new ideas (for an example of this strategy in scientific discourse see
Hidalgo Downing 2014). A great part of the discussion in the present study focuses
on the interaction between modality, personal pronouns and negation, and on how
this interaction articulates the process of recontextualization in Obama’s speech.
3 Data and Method
3.1 Data
The data consists of the two speeches delivered by US Presidents George Bush Jr.
and Barack Obama to the Arab World. The former was delivered by President Bush
in Abu Dhabi on June 13th 2008 and is 3380 words long. The latter, entitled ‘A New
Beginning’, was delivered by President Obama in Cairo on June 4th 2009 and is
5871 words long. The socio-historical significance of the speeches has already been
described in the introduction to the present article.
3.2 Method
4 Results
Figure 1 below shows the frequency per 1,000 words of the stance categories fol-
lowing Halliday’s (1994) model: subjective explicit (SE), subjective implicit (SI),
objective explicit (OE) and objective implicit (OI), as explained in the section on
Theoretical Background above. The frequency per 1,000 words is calculated with
regard to the number of words in the whole corpus of each speech. The result of the
χ² test is p < 0.005, which indicates that the overall differences with regard to stance
categories in each political speech are statistically significant.
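The normalization used throughout the Results section can be sketched as follows. This is a minimal illustration, not the authors' procedure: the speech lengths are those reported in the Data section, whereas the raw counts and the names `per_thousand`, `SPEECH_LENGTHS` and `raw_counts` are hypothetical and serve only to show the arithmetic.

```python
def per_thousand(count: int, total_words: int) -> float:
    """Normalize a raw count to a rate per 1,000 words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return count * 1000 / total_words


# Speech lengths as reported in the Data section.
SPEECH_LENGTHS = {"Bush": 3380, "Obama": 5871}

# Hypothetical raw counts for one stance category (illustration only).
raw_counts = {"Bush": 20, "Obama": 60}

# Normalized rates make the two speeches comparable despite unequal length.
rates = {
    speaker: per_thousand(raw_counts[speaker], SPEECH_LENGTHS[speaker])
    for speaker in SPEECH_LENGTHS
}

for speaker, rate in rates.items():
    print(f"{speaker}: {rate:.2f} per 1,000 words")
```

Because Obama's speech is almost twice as long as Bush's, raw counts alone would overstate the difference; the normalized rate is what the figures compare.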
Taking into consideration each stance category, the greatest difference is revealed
in the category Subjective Implicit stance, in which items are much more frequent
in Obama’s speech than in Bush’s. Additionally, a difference is observed in the pref-
erence of each politician in the objective stance category: while Bush’s speech
shows a preference for objective implicit stance, Obama’s speech shows a prefer-
ence for objective explicit stance.
Figure 2 below shows the various types of modality in the two political speeches:
epistemic and deontic modality. The frequency per 1,000 words is calculated within
the overall category of modal verbs in each corpus. The result of the χ² statistical test
is p < 0.005, which indicates that the differences in each political speech are statisti-
cally significant.
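A χ² test of this kind can be sketched for a two-by-two table (two speeches by two modality types). The counts below are hypothetical, since the chapter reports only the resulting p-values, and the sketch assumes a plain Pearson statistic without continuity correction; whether the authors applied a correction is not stated. The function names are illustrative.

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table.

    Assumes every row total and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


def p_value_one_dof(statistic):
    """Upper-tail p-value of the chi-square distribution for 1 degree of freedom only."""
    return math.erfc(math.sqrt(statistic / 2))


# Hypothetical counts: rows = speeches, columns = (epistemic, deontic).
observed = [
    [30, 5],   # Bush (illustrative)
    [60, 45],  # Obama (illustrative)
]

statistic, dof = chi_square(observed)
p = p_value_one_dof(statistic)
print(f"chi2 = {statistic:.3f}, df = {dof}, p = {p:.4f}")
```

The closed-form p-value holds only for one degree of freedom; for larger tables (for example the four stance categories in Fig. 1) a statistics library or a table of critical values would be needed.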
Results in Fig. 2 reveal that Obama’s speech shows an overall higher frequency
of markers of modality, in particular of epistemic modality, followed by deontic
modality. Deontic modality shows a low frequency in Bush’s speech.
Fig. 1 Types of stance in Bush’s and Obama’s speeches (SE subjective, explicit, SI subjective,
implicit, OE objective, explicit, OI objective, implicit)
Fig. 3 Negation and distribution of negation types in Bush’s and Obama’s speeches
Figure 3 above shows the frequency per 1,000 words of overall negation types
and the distribution of types of negation. Statistical significance is determined on
the total number of negative types in terms of frequency per 1,000 words in each
corpus.
Results in Fig. 3 show that the difference in the frequency of use of negation in the
two speeches is extremely significant, with p < 0.001.
Figure 4 shows the frequency per 1,000 words of personal pronouns in the two
political speeches with regard to the total number of pronouns in each speech. The
statistical test shows differences are extremely significant, with p < 0.0001. Results
show Obama’s preference for first person pronouns and Bush’s preference for sec-
ond and third person pronouns.
Figure 5 shows the frequency and distribution of first person pronouns in Bush’s
and Obama’s speeches. Frequency per 1,000 words is calculated within the total
number of first person pronouns. The statistical test shows that results are extremely
significant, with p < 0.0001. While Obama uses the first person pronouns I and we
much more frequently than Bush, Bush shows a preference for the second person
pronoun you.
Figure 6 shows the frequency of second person pronouns per 1,000 words within
the total number of second person pronouns. The statistical test shows that
differences are extremely significant, with p < 0.0001. Bush’s speech has a much
higher frequency of use of second person pronouns.
As a complement to the quantitative results provided so far, keyword signifi-
cance of lexical words, personal pronouns and modal verbs in each speech is pro-
vided in Table 2 and Table 3 above.
Results in Table 2 show the keyword significance of lexical items, personal pro-
nouns and modal verbs in Bush’s speech. Results show that the second person pro-
nouns you and your are highly significant in terms of keyness, occupying the first
two places in the list. No other markers of stance discussed in the present study
occur as significant in terms of keyness. The lexical items, however, reveal prefer-
ences in Bush’s discourse which stand out against lexical choices in Obama’s
speech. For example, the terms free, freedom and liberty are highly frequent in
Bush’s discourse, in line with the title of his address. The term terrorist, though not
used frequently, is a significant keyword, since this term is absent in Obama’s
speech.
Table 3 shows keyness significance of selected lexical items, pronouns and
modal verbs in Obama’s speech. Results reveal the keyword significance of the
stance markers I, first person pronoun, our, plural first person possessive pronoun,
but, concessive, and must, deontic modal verb.
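The chapter does not name the statistic behind the keyness values. One widely used measure is the log-likelihood ratio, and the sketch below assumes that measure and assumes that each speech serves as the reference text for the other; both are assumptions, not statements about the authors' method. The speech lengths come from the Data section, and the word frequencies are hypothetical.

```python
import math


def log_likelihood(freq_target, size_target, freq_reference, size_reference):
    """Log-likelihood keyness of one word in a target text against a reference text."""
    total_freq = freq_target + freq_reference
    total_size = size_target + size_reference
    expected_target = size_target * total_freq / total_size
    expected_reference = size_reference * total_freq / total_size
    ll = 0.0
    # A zero observed frequency contributes nothing (0 * ln 0 is taken as 0).
    if freq_target > 0:
        ll += freq_target * math.log(freq_target / expected_target)
    if freq_reference > 0:
        ll += freq_reference * math.log(freq_reference / expected_reference)
    return 2 * ll


BUSH_WORDS, OBAMA_WORDS = 3380, 5871  # speech lengths from the Data section

# Hypothetical case: a word used 5 times in one speech and never in the other.
score = log_likelihood(5, BUSH_WORDS, 0, OBAMA_WORDS)
print(f"log-likelihood = {score:.2f}")
```

The zero-frequency branch shows how a word that is infrequent in one speech can still come out as key when it is entirely absent from the other, which is the situation described for terrorist.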
5 Discussion
The present section provides a detailed discussion of the results in the preceding
section by analysing selected concordances from each of the political speeches.
Each of the categories is discussed in turn.
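Concordance lines of the kind discussed in this section can be extracted with a simple keyword-in-context routine. The following is a minimal sketch rather than the tool used in the study: the function name `concordance` and the `width` parameter are illustrative, and the sample text is taken from two of the Bush examples quoted in this section. The node word is wrapped in double square brackets, following the marking used in the numbered examples.

```python
import re


def concordance(text, node, width=40):
    """Return keyword-in-context lines with the node word wrapped in [[ ]]."""
    # Whole-word, case-insensitive match so that "we" also finds "We".
    pattern = re.compile(r"\b" + re.escape(node) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        lines.append(f"{left}[[{match.group()}]]{right}")
    return lines


sample = (
    "We believe that stability can only come through a free and just Middle East. "
    "Yet we also know that for all the difficulties, a society based on liberty "
    "is worth the sacrifice."
)

hits = concordance(sample, "we", width=30)
for line in hits:
    print(line)
```

Co-occurrence patterns, such as a pronoun followed by a mental verb, would then be identified by reading or filtering these lines; that step is left out here.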
The quantitative results show that President Obama’s speech has a significantly
higher frequency of markers of stance than Bush’s speech. In particular, Obama
makes use of subjective implicit stance, that is, a stance style that is characterized
by the use of modal verbs, followed by subjective explicit stance, which is charac-
terized by the use of first person pronouns and mental verbs. Bush, on the other
hand, shows a preference for subjective explicit stance, followed by subjective
implicit stance. In addition to this main difference in the stance preferences of the
two politicians, the distribution of pronominal forms and their co-occurrence with
the stance markers of modality, together with the frequency of negation, reveals
further differences in intersubjective positioning in the two politicians.
Within the category of subjective explicit stance, Bush shows a preference for the
co-occurrence of the first person plural we (referring to the US government) and
mental verbs, as in examples (1) to (6) below:
(1) We [[believe]] that trade and investment is the key to the future of hope and
opportunity.
(2) We [[believe]] that stability can only come through a free and just Middle East.
(4) Yet we also [[know]] that for all the difficulties, a society based on liberty is
worth the sacrifice.
(5) We [[know]] that democracy is the only form of government that treats
individuals with the dignity and equality that is their right.
(6) We [[know]] from experience that democracy is the only system of government
that yields lasting peace and stability.
Obama, by contrast, shows a preference for the use of the first person pronoun I; the
main difference in the use of this pronoun by Bush and Obama is that while Bush
uses it to refer to his identity as President of the US, Obama uses it both as a referent
to his public identity as President but also as a referent to his personal history. The
use of I as referent to the public identity of the US Presidents is clear in the conven-
tional expressions of gratitude that characterize the openings and closings of politi-
cal speeches, as in examples (7) and (8), from Bush’s speech below:
(7) [[I]] am honored by the opportunity to stand on Arab soil and speak to the
people of this nation.
(8) And [[I]] appreciate the fact that your country sent a delegate.
The high frequency of use of the pronoun I in Obama’s speech shows the President’s
intention to position himself as personally committed to the task he has taken on
board towards the Arab world; this indicates personal involvement in addition to the
guarantee provided by the use of the public I for his positioning as US President.
This difference can be observed in examples (9) to (12), which illustrate the use of
private or personal I, and examples (13) and (14), which illustrate the use of public
or Presidential I (note that examples (9) and (10) do not illustrate co-occurrences
with mental verbs):
(9) Part of this conviction is rooted in my own experience. [[I]] am a Christian, but
my father came from a Kenyan family that includes generations of Muslims.
(10) We see it in the history of Andalusia and Cordoba during the Inquisition.
[[I]] saw it firsthand as a child in Indonesia.
(12) [[I]] know, too, that Islam has always been a part of America’s story.
(13) The fourth issue that I will address is democracy. [[I]] know there has been
controversy about the promotion of democracy in recent years.
(14) Those are mutual interests. That is the world we seek. But we can only
achieve it together. [[I]] know there are many – Muslim and non-Muslim
– who question whether we can forge this new beginning.
If we consider the uses of the pronoun we, it is worth pointing out that while Bush
makes use of this pronoun exclusively to refer to the US government, as in examples
(1) to (6) above, Obama uses we both to refer to the US government and to refer to
the American people and people from the Middle East countries, including Israel,
Palestine and other Arab countries, as in examples (16) and (17):
(16) These needs will be met only if [[we]] act boldly in the years ahead.
(17) Some suggest that it isn’t worth the effort – that [[we]] are fated to disagree,
and civilizations are doomed to clash.
(18) The people of the world can live together in peace. We [[know]] that is God’s
vision. Now, that must be our work here on Earth. Thank you.
(19) So whatever we [[think]] of the past, we must not be prisoners of it. Our
problems must be dealt with through partnership.
(20) The United States [[will]] always stand with Israel in the face of terrorism.
(21) And as you build a Middle East growing in peace and prosperity, the United
States [[will]] be your partner.
(22) The United States [[will]] continue to support you as you build the institutions
of a free society.
(24) And when that good day comes, you [[will]] have no better friend than the
United States of America.
(25) The United States [[will]] help you build the institutions of democracy and
prosperity.
(27) To the leaders across the Middle East who are fighting the extremists:
The United States [[will]] stand with you as you confront the terrorists
and radicals.
(28) And as you struggle to find your voice and make your way in this world,
the United States [[will]] stand with you.
(29) And in a free and just society, individuals can rise as far as their talents
and hard work [[will]] take them.
(30) The day [[will]] come when the people of Iran have a government that
embraces liberty and justice.
The use of the modal will by Obama also shows commitment to the role of the US
government in the reconciliation process, but, additionally, it shows numerous co-
occurrences with the first person pronoun I, demonstrating Obama’s personal com-
mitment to this process of change.
(31) America [[will]] align our policies with those who pursue peace, and say in
public what we say in private to Israel.
(33) The sooner the extremists are isolated and unwelcome in Muslim communities,
the sooner we [[will]] all be safer.
(34) And [[I]] will host a Summit on Entrepreneurship this year to identify how
we can deepen ties between business leaders.
(35) That is why I ordered the removal of our combat brigades by next August.
That is why we [[will]] honor our agreement with Iraq’s democratically-
elected government to remove combat troops from Iraq.
Obama also makes use of will to make reference to the difficult issues that need to
be addressed by the US and the countries in the Middle East, as in example (36):
(36) It [[will]] be hard to overcome decades of mistrust, but we will proceed with
courage, rectitude and resolve.
While the modal verbs used by both politicians include epistemic will, may, might,
can, could and deontic can, must and should, there is a great difference in the fre-
quency of use of specific modal verbs by each politician. Thus, the modal verbs can,
must and should are significantly more frequent in Obama’s speech than in Bush’s.
The uses of these modal verbs are illustrated in examples (37) to (43) below from
Bush’s speech, together with occurrences of the negative form cannot (examples
(44) to (47)):
(37) And in a free and just society, individuals [[can]] rise as far as their talents
and hard work will take them.
(38) All know the lasting stability that only freedom [[can]] bring.
(39) The Palestinian people aspire to build a nation of their own – where they
[[can]] live in dignity and realize their dreams.
(40) We believe that stability [[can]] only come through a free and just Middle
East.
(41) Power is a trust that [[must]] be exercised with the consent of the governed.
(42) And the people of the Middle East [[must]] continue to work for the day
where that is also true of the lands that Islam first called home.
(43) As we demand you open your markets we [[should]] open ours, as well.
The negative form of can, cannot, is used by Bush in co-occurrence with the second
person pronoun you as impersonal you, and to refer to third persons.
(44) You [[cannot]] build trust when you hold an election where opposition
candidates find themselves harassed or in prison.
(45) You [[cannot]] expect people to believe in the promise of a better future
when they are jailed for peacefully petitioning their government.
(46) And you [[cannot]] stand up a modern and confident nation when you do not
allow people to voice their legitimate criticisms.
Examples (48) to (54) illustrate the use of must in Obama’s speech, indicating an
effort on the part of the US and countries in the Middle East to carry out a joint
endeavor to achieve peace and stability. These examples additionally illustrate the
tendency for first person we to co-occur with modal verbs.
(48) This cycle of suspicion and discord [[must]] end. I have come here to seek a
new beginning.
Obama shows a great degree of commitment in demanding all the parties’ involve-
ment by using the modal must to appeal to each of the countries that are addressed:
(56) Israel [[must]] also live up to its obligations to ensure that Palestinians can
live, and work, and develop their society.
(57) Finally, the Arab States [[must]] recognize that the Arab Peace Initiative was
an important beginning.
The modal can, famously part of the presidential campaign slogan ‘Yes we
can’, is used extensively by Obama to point out how peace can be achieved in
partnership:
(58) That is the world we seek. But we [[can]] only achieve it together
(59) I will host a Summit on Entrepreneurship this year to identify how we [[can]]
deepen ties between business leaders, foundations and social entrepreneurs
in the United States and Muslim communities around the world.
Can is also used to refer to the doubt or skepticism shown by certain communities
with regard to the process of peace, as in examples (61) and (62):
(61) Muslim and non-Muslim – who question whether we [[can]] forge this new
beginning.
(62) Many more are simply skeptical that real change [[can]] occur.
The results show that markers of stance in these categories are virtually not used by
either Bush or Obama (2 and 3 markers by each politician as shown in the results
section above). This may be interpreted as a preference in the two political speeches
for Subjective stance, which allows for the display of a subjective identity and posi-
tioning which may have a stronger persuasive effect. This is clear in the co-
occurrence of personal pronouns with mental verbs and modal verbs in the two
speeches.
5.2 Negation
The last section of the Discussion addresses the role of negation in the two speeches.
As observed in previous sections, negation is significantly more frequent in Obama’s
speech than in Bush’s. The use of negation in Bush’s speech is illustrated in exam-
ples (63) to (69) below:
(63) They hate your government because it does [[not]] share their dark vision.
(65) History teaches us that the road to freedom is [[not]] always even, and
democracy does not come overnight.
(67) The road to freedom is not always even, and democracy does [[not]] come
overnight.
(68) Most people do [[not]] want war and bloodshed and violence.
(69) They say that the Arab people are [[not]] “ready” for democracy.
The examples above show a use of negation to introduce general statements (the
road to freedom is not always even, and democracy does not come overnight, most
people do not want bloodshed and violence) or to present the dark vision of the rela-
tion between the US and the Middle East, where loaded lexical items such as hate
and terrorist stand out (examples (63) and (64)).
Negation in Obama’s speech is used extensively to correct previous assumptions
about the relations between the US and the Middle East, as well as to correct assump-
tions the US may have about Arab countries and Arab countries may have about
the US:
5.2.1 Correcting Assumptions About Relations Between the US and Arab Countries
(70) America and Islam are [[not]] exclusive, and need not be in competition.
(71) You must maintain your power through consent, [[not]] coercion.
(72) Just as Muslims do [[not]] fit a crude stereotype, America is [[not]] the crude
stereotype of a self-interested empire.
(73) America and Islam must be based on what Islam is, [[not]] what it isn’t.
(75) But we should choose the right path, [[not]] just the easy path.
5.2.2 Correcting Assumptions Arab Countries Are Thought to Hold About the US
Negation is used to correct assumptions held by Arab countries about the US, as in the
following examples:
(76) Those are [[not]] just American ideas, they are human rights, and that is why
we will support them everywhere.
(77) These are not opinions to be debated; these are facts to be dealt with. Make
no mistake: we do [[not]] want to keep our troops in Afghanistan. We seek
no military bases there.
(78) America does [[not]] presume to know what is best for everyone, just as we
would not presume to pick the outcome of a peaceful election.
(79) In Ankara, I made clear that America is [[not]] – and never will be – at war
with Islam.
5.2.3 Correcting Assumptions the US Is Thought to Have About Arab Countries
(80) The attacks of September 11th, 2001 and the continued efforts of these
extremists to engage in violence against civilians has led some in my
country to view Islam as inevitably hostile [[not]] only to America and
Western countries, but also to human rights.
5.2.4 Reinforcement of His Personal Commitment and Personal Story as Exemplary and Positioning of the US Government
Finally, Obama makes use of negation to reinforce his personal commitment to his
new policy and to make reference to his personal experience as a guarantee of his
position towards the Arab World.
(82) No system of government can or should be imposed upon one nation by any
other. That does [[not]] lessen my commitment, however, to governments
that reflect the will of the people.
(83) Much has been made of the fact that an African-American with the name
Barack Hussein Obama could be elected President. But my personal story
is [[not]] so unique.
(84) You must maintain your power through consent, [[not]] coercion; you must
respect the rights of minorities, and participate with a spirit of tolerance and
compromise.
(85) Just as Israel’s right to exist cannot be denied, neither can Palestine’s.
The United States does [[not]] accept the legitimacy of continued Israeli
settlements.
6 Conclusions
The present study has addressed the differences between the stance styles of Bush’s
and Obama’s addresses to the Arab World in 2008 and 2009 respectively. In the
quantitative results and the qualitative discussion, it has been argued that the higher
frequency of subjective stance markers (mental verbs and modal verbs) and their
co-occurrence with first person pronouns and negation in Obama’s discourse can be
interpreted as an attempt to recontextualize and ‘re-imagine’ the position of the US
policy towards the Middle East. The combination of stance markers and negation in
Obama’s speech evokes commonly held assumptions about the relations between
the US and the Middle East in order to correct these assumptions and propose ‘A
new beginning’ based on cooperation and partnership. Obama’s preference for the
first person pronoun I and his more frequent use of epistemic and deontic modality
show a more personal involvement both with the topic addressed and with his audi-
ence, revealing an attempt to engage actively and personally in a change in the rela-
tions between the US and the Middle East through cooperation and partnership.
Bush’s speech, with a significantly lower frequency of stance markers and nega-
tion and a preference for second person pronouns instead of first person pronouns,
shows a more conventional discourse. The low frequency of negation seems to indi-
cate that there is no need to deconstruct previous assumptions about the status quo
in US international affairs. With regard to the low frequency of modal markers, and
in particular of deontic modals, together with a higher frequency of second person
pronouns, Bush’s speech shows a preference for unmodalized assertions, and con-
sequently, a more authoritative stance. This authoritative stance is reinforced by the
use of the impersonal you and the first person plural we referring to the US govern-
ment, which tend to indicate an avoidance of responsibility on the part of the
speaker.
In brief, while Bush’s stance in his Abu Dhabi speech seems to maintain a previ-
ously accepted status quo in the relations between the US and the Middle East,
Obama’s stance in his Cairo speech seems to be staged as a deconstruction of previ-
ously held assumptions. Obama’s speech can be interpreted as a riskier speech
which relies heavily on the identity and personal projection of the speaker, sup-
ported by the high frequency of first person pronouns referring to his personal iden-
tity, personal I, presidential I, we as the US government and we to refer to all the
parties concerned in the proposed change of international policy. The high fre-
quency of negation suggests that the topic is perceived as controversial and a strong
stance is required. Hence, the politician’s voice is positioned against previously held
assumptions which need to be corrected, though his frequent use of
markers of modality shows that he is open to other possible alternatives. Indeed,
seen in the light of recent changes in the US political scenario, the present study is
also open to alternative interpretations.
Acknowledgements This study has been carried out as part of the research work of two research
projects, the first one funded by the Ministerio de Ciencia e Innovación (FFI-2008-01471FILO),
and the second one funded by the Ministerio de Economía y Competitividad (FFI-201-30790) to
whom we are grateful.
References
Bakhtin, M. M. (1935 [1981]). The dialogic imagination: Four essays (M. Holquist, Ed. and
C. Emerson & M. Holquist, Trans.). Austin: University of Texas Press.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman grammar of
spoken and written English. London: Longman.
Biber, D., & Finegan, E. (1988). Adverbial stance types in English. Discourse Processes, 11, 1–34.
Semino, E., Deignan, A., & Littlemore, J. (2013). Metaphor, genre and recontextualization.
Metaphor and Symbol, 28(1), 41–59.
Simon-Vandenbergen, A. M., & Aijmer, K. (2007). The semantic field of modal certainty. A
corpus-based study of English adverbs. Berlin: Mouton de Gruyter.
Thompson, G. (2004). Introducing functional grammar (2nd ed.). London: Hodder Education.
Thompson, G., & Alba-Juez, L. (Eds.). (2014). Evaluation in context. Amsterdam: John Benjamins.
Wilson, J. (1990). Politically speaking. Oxford/Cambridge MA: Basil Blackwell.
The Role of Metadiscourse in Genre Analysis:
Engagement Markers in Undergraduate
Textbooks and Research Articles
Tereza Guziurová
T. Guziurová (*)
Faculty of Arts, Centre for the Research of Professional Language, University of Ostrava,
Ostrava, Czech Republic
e-mail: tereza.guziurova@osu.cz
1 Introduction
In the last 30 years or so, metadiscourse has been used quite frequently as a tool for
characterizing various genres, especially the genres of academic discourse. It might
be surprising that its popularity endures despite numerous definitions of the concept
that vary significantly. It is probably not possible to say any more that metadis-
course is “undertheorized”, as Hyland and Tse suggested in 2004; however, the
differences between its conceptualizations are not negligible, but have important
methodological and practical implications for research. The common aspect of a
number of earlier definitions was their stress on the non-propositionality of meta-
discourse. Crismore et al., for example, defined it as “linguistic material, written or
spoken, which does not add anything to the propositional content” (1993, p. 40).
However, this criterion has not proved to be satisfactory, so writers have instead
looked for a definition of metadiscourse in Halliday’s theory of metafunctions.
Halliday’s ideational metafunction was connected with propositional content; meta-
discourse, on the other hand, was believed to convey either interpersonal or textual
meanings (Vande Kopple 1985, pp. 84–85).
Basically, it is possible to identify two approaches to metadiscourse¹ – integrative
and non-integrative – depending on whether it includes only text-organizing
elements and elements referring to the text itself, or also the writer’s epistemic and
affective attitude to the text and interaction with the reader. The integrative
(“broad”) approach investigates linguistic elements revealing how the text is orga-
nized, but it also focuses on the writer’s presence in the discourse, the ways in which
he or she comments on the text or expresses his or her attitudes towards it. The
broad approach includes ‘stance’ as a metadiscursive category, usually under head-
ings such as validity markers (hedges, emphatics) and attitude markers. This strand
has been applied in many studies, and it was probably taken furthest by Hyland,
who suggests that “all metadiscourse is interpersonal in that it takes account of the
reader’s knowledge, textual experiences and processing needs” (Hyland 2005,
p. 41).
The non-integrative (“narrow”) approach primarily investigates aspects of
text organization and elements referring to the text itself. This conception of meta-
discourse does not include the writer’s presence in the text in general; rather, meta-
discourse is an explicit expression of a writer’s awareness of the current text.
Therefore, it is defined as “the writer’s explicit commentary on his or her own ongo-
ing text” (Mauranen 1993, p. 154). The second important feature of the non-
integrative approach is the ‘current text’, meaning that references to other texts are
not included within this approach. The narrow approach has been applied (with
some modifications) for example by Mauranen (1993), Schiffrin (1980) and Bunton
(1999).
Drawing on the integrative approach to metadiscourse, this study focuses on one
category, engagement markers, and aims to compare their occurrence and use in two
academic genres, the research article and the undergraduate textbook, within one
discipline – linguistics. Since the analysis is part of a larger research project aiming
at the description of metadiscursive features in the undergraduate textbook, the
chapter also briefly discusses the integrative approach to metadiscourse, represented
especially by Hyland (2005). Despite its popularity among researchers (e.g.
Crismore and Farnsworth 1990; Luukka 1994; Hyland 2005; Bondi 2001; Boggel
2009; Kuhi 2012), the integrative approach has recently been criticized for covering
disparate language phenomena that cannot easily be put under one umbrella term
(text-organizing elements, the expression of stance, writer-reader interaction).
¹ The two approaches were probably first distinguished by Mauranen (1993). For a detailed discussion,
see, for example, Ädel (2006).
The Role of Metadiscourse in Genre Analysis: Engagement Markers in Undergraduate… 213
Therefore, the study also assesses the potential advantages and drawbacks of this
model.
2 Data and Methodology
The investigation is carried out on the basis of a corpus consisting of seven under-
graduate textbooks and eight research articles written by native speakers of English.
They are all from one discipline, linguistics, in order to avoid the disciplinary varia-
tion in metadiscourse shown by previous studies. The textbooks were published
between 1997 and 2010 and they are all “introductory” textbooks in the sense that
they are regarded as introductions to linguistics or relevant linguistic disciplines
(e.g. phonology, morphology). One or two chapters from each textbook were ana-
lyzed so that the total number of words would be approximately the same (see
Table 1 below). The resulting material consists of over 51,500 words altogether. The
research articles were taken from two well-established journals, English for Specific
Purposes and Journal of Pragmatics, published between 2002 and 2008. Eight
complete articles have been analysed, totalling 53,145 words (see Table 1). All the
articles were selected randomly, but they needed to comply with two criteria: to be
written by native speakers of English, and to be single-authored. The second crite-
rion is particularly important for the present analysis of engagement markers
because one of the aims was to find out how the pronoun we functions in both
genres. Since all the textbooks were single-authored, I tried to compile a comparable
corpus of single-authored research articles.
As mentioned above, this study is part of a larger research project focusing on
metadiscourse in the genres of undergraduate textbook and research article. Table 2 below
shows the overall distribution of metadiscourse categories in my corpus. The individual
categories overlap with Hyland’s (2005), with the exception of transitions, for
which I followed Mauranen’s (1993) approach and considered only inter-sentential
connectors. The present study thus applies the integrative approach to metadiscourse,
which covers both textual and interpersonal features. However, it does not
aim to discuss all the categories; rather, it focuses on engagement markers, which, as
Table 2 indicates, account for the largest proportion of metadiscourse elements in
the textbooks (14.6 items per 1000 words) and also show the greatest difference
between the genres, with only 5.1 items per 1000 words in the research articles.
Both figures are higher than those from Hyland’s research, which was carried out
on larger corpora (8.4 in textbooks and 2.5 in RAs; 2005, pp. 144, 162), but
the ratio between the two genres is similar.
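The similarity of the two ratios can be checked with simple arithmetic. The following minimal Python sketch uses only the four rates quoted above; the names RATES and genre_ratio are illustrative and are not part of the study.

```python
# Engagement markers per 1000 words, as quoted in the text.
RATES = {
    "this study": {"textbooks": 14.6, "research_articles": 5.1},
    "Hyland (2005)": {"textbooks": 8.4, "research_articles": 2.5},
}


def genre_ratio(rates):
    """Return the textbook rate divided by the research-article rate."""
    return rates["textbooks"] / rates["research_articles"]


# Ratio of textbook to research-article rates, rounded to two decimals.
ratios = {source: round(genre_ratio(r), 2) for source, r in RATES.items()}

for source, ratio in ratios.items():
    print(f"{source}: textbooks contain about {ratio} times as many engagement markers")
```

Both ratios come out at roughly three to one, which is the sense in which the two sets of results are comparable even though the absolute rates differ.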
As for the individual categories, Hyland (2005, pp. 50–54) characterizes them as
follows:
Transition markers are mainly connectives and adverbial phrases which help read-
ers interpret connections between steps in an argument (e.g. furthermore, in
addition, but).
Frame markers signal text boundaries (e.g. first, next, finally). Items included here
also announce discourse goals (my purpose is) and explicitly label text stages (to
summarize).
Endophoric markers are expressions that refer to other parts of the text (e.g. noted
above, in chapter 1).
Evidentials express the intertextual character of academic writing and are defined
as representations of an idea from another source.
Code glosses supply additional information by rephrasing or explaining what has
been said (e.g. in other words, for example).
Hedges allow writers to withhold complete commitment to a proposition (e.g.
might, perhaps).
Boosters allow writers to express certainty in what they say (e.g. clearly,
obviously).
Attitude markers express the writer’s affective, rather than epistemic, attitude to
propositions (e.g. I agree, unfortunately).
Self mention concerns the explicit authorial presence in the text expressed by first-
person pronouns and possessive adjectives (I, exclusive we, me, our etc.).
Engagement markers explicitly address readers in order to focus their attention or
include them as discourse participants (e.g. inclusive we).
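The category descriptions above can be read as a small inventory of search items. The sketch below is a hypothetical first pass, not the procedure used in this study: it counts candidate markers per category and normalizes them per 1000 words. The dictionary MARKERS contains only the examples quoted above, and the function name is an assumption. Evidentials, self mention and engagement markers are left out because they have no fixed word list or, in the case of we, require manual disambiguation; every hit would in any case have to be checked in context.

```python
import re

# Example markers taken from the category descriptions above (after Hyland 2005).
# The inventory is illustrative only; a real study needs a far longer list.
MARKERS = {
    "transition": ["furthermore", "in addition", "but"],
    "frame": ["first", "next", "finally", "my purpose is", "to summarize"],
    "endophoric": ["noted above", "in chapter 1"],
    "code gloss": ["in other words", "for example"],
    "hedge": ["might", "perhaps"],
    "booster": ["clearly", "obviously"],
    "attitude": ["i agree", "unfortunately"],
}


def candidates_per_1000_words(text):
    """Count candidate markers per category, normalized per 1000 words."""
    lowered = text.lower()
    n_words = len(re.findall(r"[a-z']+", lowered))
    rates = {}
    for category, items in MARKERS.items():
        # Whole-word (or whole-phrase) matches only; context is not checked.
        hits = sum(
            len(re.findall(r"\b" + re.escape(item) + r"\b", lowered))
            for item in items
        )
        rates[category] = 1000 * hits / n_words if n_words else 0.0
    return rates


sample = "Perhaps this might work. In other words, it clearly helps."
rates = candidates_per_1000_words(sample)
print(rates)
```

String matching of this kind only produces candidates: an item such as but or first is not metadiscursive in every context, so the counts are an upper bound that has to be reduced by manual analysis.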
It should be noted that the quantitative approach to metadiscourse has some limi-
tations. Characterizing genres by the amount of metadiscourse elements can be
problematic because while the quantitative differences may be significant, it is also
the qualitative differences that seem to be genre-specific. For example, the number
of hedges was expected to be significantly higher in RAs than in textbooks because
the former genre is more argumentative in nature; it has been described as “the arena for
conflicting views” (Myers 1992, p. 6). While this expectation was confirmed, the analysis
also showed that textbooks include a considerable number of hedges, but of a different
type. The most frequent hedging devices in the RAs proved to be epistemic lexical
verbs, whereas the textbooks showed the highest incidence of adverbs, specifically
approximators (see Guziurová 2014). Therefore, the quantitative analysis will only
be a starting point, and I will focus on the different functions metadiscourse elements
fulfil in the two genres. The next section deals with engagement markers and
their use in the corpus; specifically, it discusses the pronoun you, imperatives and
questions, and then focuses on the pronoun we, which proved to be the most frequent
engagement marker.
3 Engagement Markers
Engagement markers are defined as devices that “explicitly address readers, either
to focus their attention or include them as discourse participants” (Hyland 2005,
p. 53). Two main purposes are distinguished: firstly, to meet readers’ expectations
of inclusion and disciplinary solidarity, addressing them as participants in the dis-
course (mainly by the pronouns you, your and inclusive we); secondly, to engage
readers in the discourse at critical points, predicting possible objections and guiding
them to particular interpretations (by questions and directives such as note, con-
sider) (Hyland 2005). Table 3 shows an overall distribution of engagement markers
in the undergraduate textbooks and research articles in the corpus.
The most common device functioning as an engagement marker is the pronoun
we (inclusive we, but see below), accounting for approximately 70% of engagement
markers in both genres, and it will thus be the main focus of this study. The other
engagement markers included the second person pronouns, questions, imperatives
and various comments whose function is to address readers as participants; they will
be discussed briefly in the following passage. Hyland’s list of engagement markers
also includes obligation modals; however, these were mostly regarded as attitude
markers in the analysis since they often seem to express the writer’s attitude to
216 T. Guziurová
Table 5 Distribution of engagement markers (EM) in research articles (raw number/number per 1000 words)

RAs  RA1    RA2     RA3    RA4     RA5  RA6     RA7      RA8
EM   9/2.5  15/2.2  4/0.7  11/1.6  0/0  64/8.5  92/11.1  76/10.5
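The figures in Table 5 can be cross-checked against the overall rate reported in Sect. 2. The short Python sketch below pools the raw counts from the table and normalizes them by the size of the research-article subcorpus (53,145 words); the variable and function names are illustrative, and the word counts of the individual articles, which are not given here, are not needed for the pooled figure.

```python
# Raw counts of engagement markers per research article, copied from Table 5.
RAW_COUNTS = {
    "RA1": 9, "RA2": 15, "RA3": 4, "RA4": 11,
    "RA5": 0, "RA6": 64, "RA7": 92, "RA8": 76,
}
# Total size of the research-article subcorpus as reported in Sect. 2.
TOTAL_WORDS = 53_145


def per_1000_words(count, words):
    """Normalize a raw count to a rate per 1000 words."""
    return 1000 * count / words


total_markers = sum(RAW_COUNTS.values())
pooled_rate = round(per_1000_words(total_markers, TOTAL_WORDS), 1)

# Percentage of all engagement markers found in the three articles
# with the highest raw counts.
top_three = sorted(RAW_COUNTS.values(), reverse=True)[:3]
top_three_share = round(100 * sum(top_three) / total_markers)

print(total_markers, pooled_rate, top_three_share)
```

The pooled rate agrees with the 5.1 items per 1000 words given for the research articles, and the calculation also shows how unevenly the markers are distributed: three of the eight articles account for well over four fifths of all occurrences.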
The pronoun you accounts for almost 17% of engagement markers in the text-
books, but only 7% in the RAs. The low frequency especially in the RAs is not
surprising since generic you is considered rather informal (Quirk et al. 1985) and
you with a specific reference refers directly to the addressee but excludes the
speaker. As Kuo (1999) suggests, “from the perspective of the reader-writer rela-
tionship in a journal article, you could sound offensive or detached since it separates
readers as a different group from the writer”. Since the writer appeals to the readers’
approval of the claims he makes, it is not desirable to use exclusive you – it is his
peers in the same disciplinary community he addresses. Second-person you and
your thus occurred only in a single article from the Journal of Pragmatics, in which
the author wanted to illustrate the cooperative principle and opted for an example
showing the alternation of I and you:
(1) In a situation where you wish to borrow my car, ask how it is running and
I say, “I have just had it thoroughly checked,” you would take me to be
implicating that it is in good order. That depends on you supposing I am
being cooperative […].³ (RA8: 1898)
(2) If someone tells you Your brother is waiting outside, there is an obvious
presupposition that you have a brother. If you are asked Why did you arrive
late?, there is a presupposition that you did arrive late. (TB3: 133)
In addition, the author can address his or her readers (students) directly if they
can try and practice certain things by themselves, as in the description of articulators
in the chapter on phonetics or in the production of certain consonants (example 3).
(3) The hard palate is often called the “roof of the mouth”. You can feel its
smooth curved surface with your tongue. (TB5: 9)
However, the pronoun you might also reflect the unequal relationship between
participants. In example 4, the writer fulfils the role of an expert who gives training
(4) If you have not thought about such things before, you may find some difficulty
in understanding the ideas that you have just read about. (TB5: 40)
(5) This was not the role in the fourteenth century of the church (see section
5.1), nor of a French-speaking court. (TB2: 53)
Generally, directives (as Hyland calls them) can be used to guide readers through
the discourse, but more importantly to guide readers’ reasoning, making sure they
understand a point in a certain way (e.g. note, think of, compare).
Finally, questions are regarded as a good strategy to engage readers in a dis-
course. Textbook authors used questions mainly for instructional purposes; they
asked a question and immediately provided an answer, as in example 6. However, in
some cases the questions were not so simple to answer, but their main purpose was
to promote the readers’ interest and show them that the topic is complex. The ques-
tions in example 7 also served as an introduction to a new topic, as they were situ-
ated at the beginning of a chapter. Interestingly, questions appeared several times in
clusters, adding emphasis to the writer’s claims and stressing that despite being an
expert he can only speculate about certain things (example 8). Together with inclu-
sive we, they might evoke an atmosphere of solidarity.
(7) But what is it that makes us doubt the realness of the reality we experience?
Could it be language? (TB6: 70)
(8) If we want to test the realness of this reality, how can we? By stepping outside
language? Is that possible? (TB6: 72)
Surprisingly, the research articles in the corpus contained more questions than
the textbooks. However, these questions did not function as typical engagement markers;
rather, they presented the issues which the
authors aimed to explore.
(9) Surprisingly, however, there has been very little explicit consideration of the
interrelationship between the two concepts. For example, to what extent are
identity and face similar or different? How may theories of identity inform
our understanding of face, and how may they aid our analyses of face?
This paper takes up the challenge of exploring these questions. (RA6: 639)
The authors presented the questions that were addressed in their articles or, alter-
natively, the questions that should be addressed in further research.
3.2 Pronoun We
Much attention has been given to the use of pronoun we in academic writing (e.g.
Kuo 1999; Myers 1989; Harwood 2005; Fløttum et al. 2006). The pronoun we has
been traditionally studied in terms of the different discourse functions it fulfils in
journal articles (e.g. Kuo 1999; Harwood 2005), or in other genres of academic
discourse, even in the context of mathematics classrooms (Rounds 1987). Two
aspects are regularly investigated in connection with we: semantic reference (who
the pronoun refers to) and discourse functions or rhetorical motivations for its use.
This subchapter is going to focus on the semantic reference and the use of pronoun
we mostly in the genre of undergraduate textbook because it was used rather fre-
quently in that genre in the corpus, accounting for 70% of all engagement
markers.
The pronoun we is traditionally perceived as having two semantic functions:
inclusive, in which the addressee is included (I + you), and exclusive, in which the
addressee is excluded (I + they) (Rounds 1987). In addition, special uses of we are
sometimes distinguished, for example ‘inclusive authorial we’ and ‘editorial we’
(Quirk et al. 1985, p. 350). Fløttum et al. (2006) acknowledge the fundamental ref-
erential vagueness of we; it includes the author(s), but there is variation in terms of
whether the reader is included and whether others are, as well as who these potential
others might be (p. 95). Furthermore, the pronoun can be used figuratively, referring
to a single author or even to readers. Rounds (1987) speaks about “semantic remap-
pings” (traditional inclusive and exclusive we are within the range of semantic map-
ping for we) and she recognizes three such cases: (1) we used about the speaker (we
for I), (2) we used about the readers (we for you), and (3) we whose actual referent
is anyone who does the action.
There were 525 instances of the pronoun we (our, us) in the corpus which were
considered explicit or implicit engagement markers. However, their semantic refer-
ents varied and the inclusive/exclusive distinction was not sufficient to describe all
of them, especially due to the fact that all the textbooks were single-authored.
The range of semantic reference of we is given in Table 6; however, the division
is only tentative since the pronoun can be ambiguous, which might be one of the
reasons why it is so popular in academic writing.
The majority of instances referred to the writer and his readers. Writers orga-
nized their texts, stating what was done or what is going to follow, as in the follow-
ing examples:
(11) In the next chapter, we will investigate what these normal stages are.
(TB3: 166)
(12) As we have seen, the Neogrammarians took the view that linguistics, in so
far as it is scientific and explanatory, must necessarily be historical.
(TB4: 218)
(13) We have all experienced difficulty, on some occasion(s), in getting brain and
speech production to work together smoothly. (Some days are worse than
others, of course.) (TB3:160)
Inclusive pronouns also cluster around longer examples or pictures that are sup-
posed to illustrate a point in the exposition. The chapter on pragmatics starts with a
brief definition and then continues with two pictures which should illustrate that
often more is being communicated than is said (example 14). The passage ends with
a more generalised statement that we (human beings) are actively involved in the
interpretation of what we read and hear.
(14) In the other picture, […] we can recognize an advertisement for a sale of
clothes for those babies and toddlers. The word clothes doesn’t appear in
the message, but we can bring that idea to our interpretation of the message
as we work out what the advertiser intended us to understand. We are
actively involved in creating an interpretation of what we read and hear.
(TB3: 129)
Another strategy is what Kuo (1999) calls “assuming shared knowledge, goals,
beliefs, etc.” (p. 131). He analysed 36 scientific journal articles and found that this
discourse function was the most frequent one for inclusive we in his corpus.
Textbook authors also used this strategy in order to mark that readers’ background
knowledge can be presupposed as well as their ability to follow the arguments.
(16) All the sounds we make when we speak are the result of muscles contracting.
(TB5: 8)
We thus includes the writer, his or her readers and all the other people using lan-
guage. Wales (1996) distinguishes between specific exophoric reference of we,
functioning within the immediate context of situation, and generalised/homophoric
we, functioning within the context of culture (p. 59). The distinction between them
is not clear-cut, but rather resembles a continuum. Wales also points out that “even
generalised reference has a strong inter-personal base, speaker- or addressee-
oriented, reflecting we and you’s origins” (ibid.). The use of generalised we thus
enables us to identify writers and readers as part of a group of human beings (in
linguistics textbooks, language users), which serves an educational aim. Relating
the topic to students, showing how we as human beings use language, helps to make
the exposition more interesting, relevant and approachable.
Another use of we seems to be referring solely to the writer, and thus it would
seemingly be exclusive. This includes instances such as we have emphasized, we
have described, we can say that etc., and since in each textbook there was only one
author, it could be regarded as the so-called ‘authorial we’. However, as Rounds
(1987) has pointed out, these instances could potentially be interpreted by the
addressee as inclusive signs, since it is a common teacher’s strategy to talk about
our discussion, even though it is the teacher who makes the exposition. The teacher
wants to make students part of a potential dialogue, as he or she would do in a
classroom.
Furthermore, the pronoun we in the textbooks referred to the whole community
of ‘linguists’ or ‘experts’. Here writers were usually discussing concepts or speak-
ing as members of a disciplinary community (18).
(18) We use the term speech act to describe actions such as “requesting,”
“commanding,” “questioning” or “informing.” We can define a speech act
as the action performed by a speaker with an utterance. If you say, I’ll be
there at six, you are not just speaking, you seem to be performing the
speech act of “promising.” (TB3: 133)
LINGUISTS + T + STUDENTS (T = teacher)
The writers could have opted for a different means of expression – the previous
examples could have been formulated impersonally. Example (18) could have been
rephrased as The term speech act is used […]. It can be defined as […]. The writer
nevertheless decided to use inclusive we. He invites his readers to be part of a scien-
tific discourse, even though they do not belong to an expert disciplinary community
yet. Also, if the writers used passive voice, they would avoid expressing the agent;
however, by using a personal subject, they admit that it is the scientific community
that defines the notion in this way in this particular scientific paradigm, which is
important for students to realize.
Finally, in 5 cases the semantic reference of we seemed to be to students alone
(example 19). It is in fact the students who learned how to write the rules, not the
writer, but to meet the expectations of the learning process as cooperation, the writer
used the first person plural. The roles of the participants seem to be blurred to a
certain extent. This use of we can even be regarded as a little arrogant since it is only
superficial. For example, Quirk et al. (1985, p. 350) point out that in doctor/patient
communication the use of inclusive pronouns (How are we feeling today?) may be
understood as condescending since it actually refers to the patient alone and it is
only cosmetic.
(19) We have actually looked at some such restrictions in chapter 3 (section 3.2),
when we learned how to write lexeme formation rules. We learned that
there could be different sorts of restrictions on what sorts of base an affix
might attach to […]. (TB7: 64)
⁴ These discourse functions are not connected with the pronoun we alone, but rather result from the
structure in which it occurs, i.e. the semantics of the verb phrase and the context (see
Dontcheva-Navrátilová 2013).
(20) If, therefore, we argue that face is always interactionally constituted, it will
be necessary to interpret the concept ‘interaction’ very broadly, so that it
includes not only synchronous, face-to-face interaction, but also
asynchronous communication and general public awareness. (RA6: 653)
larly to the textbooks). Nevertheless, the division is only tentative since there were
ambiguous cases resulting, for example, from the fact that the audience primarily
addressed are linguists, that is a part of a disciplinary community. It was sometimes
difficult to decide if the reference is to readers only or to the disciplinary members/
linguists as a whole. Again, there were differences between individual articles, with
the majority of instances occurring in the Journal of Pragmatics, while one article
from the ESP Journal did not contain any first person pronoun at all.
Inclusive we was again used as a discourse guide, referring forward to announce
what is going to come, or backward to remind readers of salient points, which also
allows the writer to make a summary or provide new relations between what was
said earlier in the text and the present point (21).
(21) By contrast, the proper subject for sociology – the social – would appear to
be almost limitless. (As an example germane to this paper, sociology feels
itself qualified to comment on the scientific – as we have seen in the sub-
discipline of ‘sociology of scientific knowledge’ – whereas for physics the
social is quite outside its ambit.) (RA2: 358)
Inclusive we referring to “the writer and readers” also serves the function of
assuming shared knowledge, goals, beliefs etc. (as identified by Kuo 1999). In (22),
the writer presupposes that readers understand the example similarly, engaging
them in the research process at the same time.
(22) In the example above, we understand that all members of the field know that
uniformity along the length of the superconducting wire is one of the key
factors and that this knowledge is shared by, but not limited to, the thesis
writer him/herself. (RA4: 315)
The authors of the articles from the Journal of Pragmatics also used inclusive we
in the examples the function of which is to illustrate their points. Again, it is prob-
ably the topic that enables them to engage readers – the analysis of ‘face’ in (23),
and intonation patterns in (24) which can be read aloud.
(24) In (2a–c), we can choose to place our main emphasis on any of the three
words, to reflect their relative importance in the situation in which we make
the utterance. (RA7: 1544)
books but it did not occur so often (there was not a single occurrence in the ESP
Journal) since the writers probably preferred impersonal expressions.
(26) Among other things, it suggests to us, as applied linguists and teachers,
that we need always to keep an analytical eye not only on our texts, but
also on those who would engage with them […]. (RA2: 362)
(28) It is important to know more about these matters; for if we rely too much on
an exclusively textual approach, we may draw conclusions too readily
about the educational value of such texts, and also about the ways in
which disciplinary cultures are revealed to our students. (RA2: 362)
3.3 Discussion
This section has focused on the use of engagement markers in undergraduate text-
books and RAs. The most common device in the two genres is the pronoun we,
accounting for approximately 70% of engagement markers in both genres. The high
number of engagement markers in linguistics textbooks indicated their interactive
character. Textbook authors addressed readers directly (the pronoun you, direc-
tives), or engaged them in the discourse using inclusive we. The asymmetrical rela-
tionship between the writer as an expert and the readers as novices in the discipline
can be seen in the use of pronoun you, which might imply a certain distance between
participants, and the use of directives. On the other hand, inclusive we can mitigate
the distribution of power by drawing students into the shared world of disciplinary
understanding. Another reason for employing engagement markers is educational.
Relating the topic to students, showing how we as human beings use language, helps
to make the exposition more interesting, relevant and approachable.
Generally, the main difference in the use of pronoun we between undergraduate
textbooks and RAs seems to result from the character of the genres themselves.
According to Martin (1997), genres are social processes so they reflect the relation-
ship between the subjects that participate in them. The same forms of engagement
markers have different functions in the two genres. A high frequency of the pronoun
we in textbooks suggests their interactive character, engaging readers in the dis-
course, addressing them as discourse participants. Various uses of the pronoun may
suggest the atmosphere in the classroom, which enables the teacher to use we even
for the actions that he does himself (e.g. we emphasized that). The main aim seems
to be to engage and motivate readers/students since it complies with the main com-
municative purpose of the genre, which is educational.
On the other hand, the pronoun we seems to fulfil other functions besides engage-
ment in RAs. The writer engages the reader but with an additional aim of disguising
himself as the real agent. If the writer says we argue that (as in example 20
above), when he or she is the one making the claim, it seems much harder for the reader
to disagree. Similarly, in (29) it is the author who found 19 parallel examples, but
using we enables him to seek agreement.
(29) It is seen that all of the nominalizations of Motte’s translation were also
nominalized processes in Newton’s original Latin, where we find the 19
parallel examples for the 168 word Latin text. (RA1: 352)
As Mühlhäusler and Harré (1990) point out, “the use of ‘we’ instead of ‘I’ also
diminishes the responsibilities of the speaker, since he or she is portrayed as col-
laborating with the hearer” (p. 175). That is also probably why a number of cases of
the ‘authorial we’ function as a hedge at the same time:
(30) We could say that linguistic cooperation can expand like an accordion to
encompass what has been described as ‘extra-linguistic cooperation’ […].
(RA8: 1902)
Even though this strategy appears in textbooks as well, it is not so prominent and
mostly neutralized by other uses of the pronoun.
Different functions of the pronoun we also account for the fact that I decided not
to treat the examples of exclusive authorial we as engagement markers in RAs, but
I did so in textbooks. The division exclusive/inclusive seems too simplified to be
able to explain all the complex relations, and it can be argued that even these uses
of we for I may be interpreted by the addressee as inclusive signs, especially in the
textbooks (see Rounds 1987 above). On the other hand, they seem to fulfil a differ-
ent function in RAs since they are used rather to disguise the writer than to engage
readers in the discourse, which is why they have not been classified as engagement
markers. Generally, the use of pronoun we may diminish writer responsibility,
which can be shared with his colleagues, readers etc.
It remains to be said that there are other motivations for the use of pronouns in
academic writing which, however, are not the subject of this study. In studying lan-
guage as a meaning potential (Halliday 1978) we ask why certain expressions are
preferred over others in particular contexts. These choices might result not only
from the pragmatic interpretations of functions outlined above, but, more generally,
from the functioning of language itself. For example, the use of active/passive forms
may be conditioned by the information structure of the sentence, specifically by the
principles of end-focus and end-weight. Studies have shown that although the pas-
sive voice plays an important role in academic writing, it is the active voice that
prevails; Dušková, for example, found that the passive voice accounted for
20.68% of finite verb forms in her corpus of scientific writing (1999, p. 140). She
also points out that “the use of we + the active voice of a transitive verb makes pos-
sible late placement of the rhematic element, the object, which in the passive would
come as the subject first” (ibid.). Similar cases can undoubtedly be found in my
corpus (in both genres), for example:
(32) In the list here, we also find the word acidity, plus four other derivatives of it
(subacidity, nonacidity, hypoacidity, hyperacidity). (TB7: 65)
This chapter is going to end where it started – with the discussion of metadiscourse.
Applying the ‘integrative approach’ represented by Hyland’s model to two aca-
demic genres uncovered several important points. First, the integrative approach
enables the researcher to characterize the genre from different viewpoints since it is
⁵ See e.g. Wales (1996, p. 65) who discusses different views of the so-called ‘authorial we’.
According to Henry David Thoreau “it should only be used by royalty, editors, pregnant women
and people who ate worms” (ibid.).
not limited to text organizing elements or elements referring only to the text. Genres
can thus be studied from all the aspects that do not add to the propositional content
of the text (even though the term propositional content is controversial itself).
Metadiscursive categories are clearly interrelated and the actual expressions in the
text are multifunctional (e.g. it is now important to reflect on), signalling both tex-
tual and interpersonal meanings. Furthermore, the integrative approach covering
different interpersonal aspects seems suitable for the analysis of academic genres
since the expressions of interpersonality are not so frequent and expected in com-
parison with other genres.
On the other hand, the integrative approach has its drawbacks. As already men-
tioned at the end of Sect. 1, it has been criticized for covering different language
phenomena that cannot be easily put under one umbrella term. Hyland defined
metadiscourse as “the cover term for the self-reflective⁶ expressions used to negotiate
interactional meanings in a text, assisting the writer (or speaker) to express a
viewpoint and engage with readers as members of a particular community” (Hyland
2005, p. 37). Furthermore, he considers non-propositionality to be one of the key
principles characterizing metadiscourse, stating that “metadiscourse is distinct from
propositional aspects of discourse” (Hyland 2005, p. 38). However, both of these
criteria can be challenged.
If we look at all the categories in Hyland’s model, it might be possible to classify
them in the following way (see Table 8): endophoric markers, frame markers, code
glosses and transitions can be considered self-reflective in a narrow sense in that
they refer to the current text. However, they differ in the degree of explicitness of
text reflexivity, with endophoric markers being probably the most explicit and transitions
the least.⁷ Evidentials are not self-reflective but refer to other textual sources,
covering citations, paraphrases etc. Hedges, boosters and attitude markers can be
considered pragmatic categories commenting on the content of the propositions:
they express epistemic and affective stance, and are thus self-reflective in a different
sense, in that they convey the writer’s evaluation of the state of affairs expressed in
the propositions. Finally, self mention and engagement markers primarily address
writer-reader interaction.
Non-propositionality as a further criterion of metadiscourse also seems problematic.
‘Proposition’ is a semantic term, which originated in logic, and as such it is
not easily transferable to discourse analysis. The traditional truth-conditional criteria
cannot separate metadiscourse from propositional content, because a number of
metadiscursive statements can themselves be described as true or false.
However, even if we loosen the criteria and distinguish
“things in the world and things in the discourse, propositions and metadiscourse”,
as Hyland proposed (Hyland 2005, p. 38), there are still many cases which remain
problematic. Considering attitude markers, for example, one of Hyland’s examples
reads: “The basis of the enormous productivity and affluence of modern industrial
societies is their fantastic store of technological information” (Hyland 2005, p. 164).
It is questionable whether “enormous” is an expression of the writer’s attitude.
While Hyland interpreted it as an attitude marker, i.e. as non-propositional (thus
qualifying as metadiscourse), it could equally be argued that it contributes to the
proposition expressed by the text. Similarly, certain hedges are believed to affect the
propositional content, e.g. approximators (somewhat, sort of, approximately).
What seems to be a common denominator in all of Hyland’s categories is the
writer’s explicit presence in the discourse. Although it is undoubtedly a matter of
degree, it can be argued that it is the writer who comments on the form of the text,
expresses a stance towards its content and interacts with the reader.
Considering the Jakobsonian model of language functions, Ädel (2006) regards the
metalinguistic function as the indispensable one in her reflexive model. In Hyland’s
approach (and generally all broad approaches to metadiscourse) it would be the
expressive function which is crucial (although in a more general sense than formu-
lated by Jakobson [1980, p. 82], i.e. “a direct expression of the speaker’s attitude
toward what he is speaking about”).
Generally, it seems that the integrative approaches to metadiscourse have moved
away from text reflexivity as the capacity of a language to refer to or describe itself,
and instead foreground the interpersonal meanings. Whether this approach is justi-
fied today (when the interpersonal aspects of language are conceptualized under the
headings of stance, evaluation or positioning) is difficult to say. However, at least in
academic writing the concept seems to have been useful in showing how writers
project themselves into their discourses in order to structure them, negotiate meanings
and engage readers as discourse participants.
7
Degrees of explicitness in text reflexivity are discussed in Mauranen (1993), who also considers
internal connectors to be of low explicitness.
References
Corpus
Textbook Chapters
(TB1) Meyer, C. F. (2009). Introducing English linguistics (pp. 2–15). Cambridge: Cambridge
University Press.
(TB2) Knowles, G. (1997). A cultural history of the English language (pp. 46–62). London:
Arnold.
(TB3) Yule, G. (2010). The study of language (4th ed., pp. 127–136; 156–166). Cambridge:
Cambridge University Press.
(TB4) Lyons, J. (2002). Language and linguistics (pp. 216–235). Cambridge: Cambridge
University Press. (Original work published 1981).
(TB5) Roach, P. (2006). English phonetics and phonology (3rd ed., pp. 8–11; 38–47). Cambridge:
Cambridge University Press.
(TB6) Penhallurick, R. (2003). Studying the English language (pp. 70–84; 58–60). Hampshire/
New York: Palgrave Macmillan.
(TB7) Lieber, R. (2010). Introducing morphology (pp. 59–73; 75–85). Cambridge: Cambridge
University Press.
Research Articles
(RA1) Banks, D. (2005). On the historical origins of nominalized process in scientific text. English
for Specific Purposes, 24(3), 347–357.
(RA2) Moore, T. (2002). Knowledge and agency: A study of ‘metaphenomenal discourse’ in text-
books from three disciplines. English for Specific Purposes, 21(4), 347–366.
(RA3) Flowerdew, J. (2003). Signalling nouns in discourse. English for Specific Purposes, 22(4),
329–346.
(RA4) Charles, M. (2006). Phraseological patterns in reporting clauses used in citation: A corpus-
based study of theses in two disciplines. English for Specific Purposes, 25(3), 310–331.
(RA5) Crossley, S. (2007). A chronotopic approach to genre analysis: An exploratory study.
English for Specific Purposes, 26(1), 4–24.
(RA6) Spencer-Oatey, H. (2007). Theories of identity and the analysis of face. Journal of
Pragmatics, 39, 639–656.
(RA7) House, J. (2006). Constructing a context with intonation. Journal of Pragmatics, 38,
1542–1558.
(RA8) Lumsden, D. (2008). Kinds of conversational cooperation. Journal of Pragmatics, 40,
1896–1908.