Contrastive Analysis of Discourse

Yearbook of Corpus Linguistics and Pragmatics
Karin Aijmer
Diana Lewis Editors
Contrastive Analysis
of Discourse-
pragmatic Aspects of
Linguistic Genres
Series Editor
Jesús Romero-Trillo, Universidad Autónoma de Madrid, Spain
Reviews Editor
Dawn Knight, Cardiff University, Cardiff, UK
Advisory Editorial Board

Karin Aijmer, University of Gothenburg, Sweden
Belén Díez-Bedmar, Universidad de Jaén, Spain
Ronald Geluykens, University of Oldenburg, Germany
Anna Gladkova, University of Sussex and University of Brighton, UK
Stefan Gries, University of California, Santa Barbara, USA
Leo Francis Hoye, University of Hong Kong, China
Jingyang Jiang, Zhejiang University, China
Anne O’Keeffe, Mary Immaculate College, Limerick, Ireland
Silvia Riesco-Bernier, Universidad Autónoma de Madrid, Spain
Anne-Marie Simon-Vandenbergen, University of Ghent, Belgium
Esther Vázquez y del Árbol, Universidad Autónoma de Madrid, Spain
Anne Wichmann, University of Central Lancashire, UK
More information about this series at http://www.springer.com/series/11559
Karin Aijmer • Diana Lewis
Editors
Contrastive Analysis of
Discourse-pragmatic Aspects
of Linguistic Genres
Editors
Karin Aijmer Diana Lewis
University of Gothenburg Department of English and
Gothenburg, Sweden Lerma Research Centre
Aix Marseille University
Aix-en-Provence, France
ISSN 2213-6819 ISSN 2213-6827 (electronic)

ISBN 978-3-319-54554-7 ISBN 978-3-319-54556-1 (eBook)
DOI 10.1007/978-3-319-54556-1
Library of Congress Control Number: 2017936967
© Springer International Publishing AG 2017

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of
the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,
broadcasting, reproduction on microfilms or in any other physical way, and transmission or information
storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology
now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the
editors give a warranty, express or implied, with respect to the material contained herein or for any errors
or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims
in published maps and institutional affiliations.
Printed on acid-free paper
This Springer imprint is published by Springer Nature

The registered company is Springer International Publishing AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Contents
Introduction...................................................................................................... 1
Karin Aijmer and Diana Lewis
Part I Contrastive Analysis with Parallel Corpora

he Semantic Field of Obligation in an English-Swedish
T
Contrastive Perspective................................................................................... 13
Karin Aijmer
English so and Dutch dus in a Parallel Corpus:
An Investigation into Their Mutual Translatability..................................... 33
Lieven Buysse
hat English Translation Equivalents Can Reveal
W
about the Czech “Modal” Particle prý: A Cross-Register Study................. 63
Michaela Martinková and Markéta Janebová
odal Adverbs of Certainty in EU Legal Discourse:
M
A Parallel Corpus Approach........................................................................... 91
Magdalena Szczyrbak
Part II Contrastive Analysis with Comparable Corpora

dverbial Clauses in English and Norwegian Fiction and News................ 119
A
Hilde Hasselgård
oherence Relations and Information Structure in English
C
and French Political Speeches......................................................................... 141
Diana Lewis
v
vi Contents
Part III Contrastive Analysis Across Genres of English

allbacks in Stand-Up Comedy: Constructing Cohesion
C
at the Macro Level Within a Specific Genre.................................................. 165
Catherine Chauvin
ush and Obama’s Addresses to the Arab World:
B
Recontextualizing Stance in Political Discourse............................................ 187
Laura Hidalgo-Downing and Yasra Hanawi
he Role of Metadiscourse in Genre Analysis: Engagement Markers
T
in Undergraduate Textbooks and Research Articles.................................... 211
Tereza Guziurová
Contributors
Karin Aijmer University of Gothenburg, Gothenburg, Sweden

Lieven Buysse Faculty of Arts, KU Leuven, Brussels, Belgium
Catherine Chauvin Department of English, University of Lorraine, Nancy, France
Yasra Hanawi Department of English, Facultad de Filosofía, Letras Universidad
Autónoma de Madrid, Madrid, Spain
Hilde Hasselgård Department of Literature, Area Studies and European Languages,
University of Oslo, Oslo, Norway
Laura Hidalgo-Downing Department of English, Facultad de Filosofía, Letras
Universidad Autónoma de Madrid, Madrid, Spain
Markéta Janebová Department of English and American Studies, Faculty of Arts,
Palacký University, Olomouc, Czech Republic
Diana Lewis Department of English and Lerma Research Centre, Aix Marseille
University, Aix-en-Provence, France
Tereza Guziurová Faculty of Arts, Centre for the Research of Professional
Language, University of Ostrava, Ostrava, Czech Republic
Michaela Martinková Department of English and American Studies, Faculty of
Arts, Palacky University, Olomouc, Czech Republic
Magdalena Szczyrbak Institute of English Studies, Jagiellonian University,
Kraków, Poland
vii
Introduction
Karin Aijmer and Diana Lewis
Abstract The aim of this issue of the Yearbook of Corpus Linguistics and

Pragmatics is to explore the comparability of discourse-pragmatic characteristics of
genres across European languages, using parallel corpora (aligned translated texts)
and/or comparable corpora (genre-matched original texts). The articles have their
origin in a seminar at the 12th ESSE conference in Kosiče, Slovakia 29 August–2
September 2014 convened by the editors.
Keywords Contrastive linguistics • Parallel corpora • Comparable corpora • Genre
The aim of this issue of the Yearbook of Corpus Linguistics and Pragmatics is to
explore the comparability of discourse-pragmatic characteristics of genres across
European languages, using parallel corpora (aligned translated texts) and/or compa-
rable corpora (genre-matched original texts). The articles have their origin in a
seminar at the 12th ESSE conference in Kosiče, Slovakia 29 August–2 September
2014 convened by the editors.
1 The ‘New’ Contrastive Analysis
Renewed interest in contrastive linguistics over the past couple of decades, together
with increasing availability of specialised digital corpora, have resulted in a new,
usage-based approach to language comparison. The domain of contrastive linguis-
tics centres on the comparison, in synchrony, of two languages. In a break with the
‘applied’ approach to contrastive linguistics of the 1960s and 1970s, which tended
K. Aijmer (*)
University of Gothenburg, Gothenburg, Sweden
e-mail: karin.aijmer@eng.gu.se
D. Lewis
Department of English and Lerma Research Centre, Aix Marseille University,
e-mail: diana.lewis@univ-amu.fr
© Springer International Publishing AG 2017 1

K. Aijmer, D. Lewis (eds.), Contrastive Analysis of Discourse-pragmatic
Aspects of Linguistic Genres, Yearbook of Corpus Linguistics and Pragmatics,
DOI 10.1007/978-3-319-54556-1_1
2 K. Aijmer and D. Lewis
to focus on particular differences in the structural features of two languages with a

view to predicting L2 learner difficulties, recent work has been more descriptive and
theoretical, and it has taken language usage into account, to compare frequencies
and distributions as well as structures. A number of recent monographs (e.g.
Johansson 2007, König and Gast 2009), collections (e.g. Gómez -González et al.
2008, Taboada et al. 2013) and journal issues (v. special issues of Languages in
Contrast) bear witness to the breadth and vigour of this new approach.
The ‘new’ contrastive linguistics has found its place among linguistic typology,
historical linguistics, cross-cultural communication and intralanguage variation,
overlapping somewhat with each of these other domains of enquiry (v. König 2011).
The current collection of papers can be seen as falling at the intersection of contras-
tive analysis and intralanguage variation.
Contrastive linguistics is “concerned with pairs of languages which are ‘socio-
culturally linked’” (Gast 2012: 1). The two languages analysed are spoken by bilin-
guals fluent in both, are mutually translated, and have some comparable socio-cultural
institutions and practices that form a backdrop for comparison. This is particularly
relevant for contrastive genre analysis, since ‘equivalent’ socio-cultural practices
across the two speech communities will allow a genre in one speech community to
be paired with a genre in the other, to provide the tertium comparationis for the
comparison.
2 The Notion of Genre
Language needs to be studied in relation to aspects of the communication situation

and the wider cultural context. These aspects include the textual genre. Genre is
however a problematic concept. There is, for example, no agreement about terminol-
ogy but scholars use different terms such as genre, activity type (cf. Levinson 1979),
register, text type reflecting different perspectives and approaches. The definition of
genre usually includes sociolinguistic and contextual parameters. According to
Bhatia, ‘analysing genre means investigating instances of conventionalised or insti-
tutionalised textual artefacts in the context of specific institutional and disciplinary
practices, procedures and cultures in order to understand how members of specific
discourse communities construct, interpret and use these genres to achieve their com-
munity goals and why they write them the way they do’ (Bhatia 2002: 6). Text type,
on the other hand, is generally used to refer to a group of related texts in a corpus.
The term ‘register’ is used above all in systemic functional linguistics as a ‘contex-
tual category correlating groupings of linguistic features with recurrent situational
features’ (Gregory and Carroll 1978: 4, quoted from Swales 1990: 40).
Genres often show distinctive patterns of frequency and distribution of linguistic
features in relation to other genres or to the wider language. The articles deal with
patterns across English and (an)other language(s), in areas such as modality,
pragmatic markers, speech acts, coherence relations and information structure.
Using a genre-based perspective the authors draw attention to how different dis-
Introduction 3
course strategies, power status, speaker roles associated with the genre can explain
both formal and functional properties of the patterns. The focus is on spoken, writ-
ten or multimodal genres within domains such as political discourse, public com-
munication, journalism, stand-up comedy, academic and professional discourse,
addressing both methodological and theoretical issues. All adopt a usage-based
approach, exploiting a range of corpus material to reveal patterns of form and use in
one or more languages.
Genre-specific recurrent patterns can also be studied contrastively. The contras-
tive point of view highlights the dependence of the patterns on different social and
cultural practices in the compared languages. Languages involved in the compari-
sons with English are Czech, Dutch, French, Polish, Norwegian, Spanish and
Swedish.
3 Parallel Corpora and Comparable Corpora
Central to the contrastive analysis of linguistic phenomena is the use of parallel and
comparable corpora, and this volume illustrates the use of both types. Parallel cor-
pora consist of translations from one language to the other. Comparable corpora
consist of texts in two or more languages which are comparable with regard to
genre, formality, subject-matter, time-span, etc. (Aijmer 2008).
Parallel corpora can be further characterized as unidirectional or bidirectional,
depending on the translation direction. A bidirectional parallel corpus makes it pos-
sible to analyse how words and constructions in one language have been rendered in
the target language and to retrace the process to find the sources of the translations.
Parallel corpora were first used for lexical and grammatical studies but are now also
used as a resource to study discourse and pragmatic phenomena. As illustrated in
this volume, there now exist parallel corpora for many different language pairs usu-
ally with English as either the source or the target language. A parallel corpus is
above all a method to show differences or similarities between lexical elements or
constructions in two or more languages which may not be apparent to intuition.
Another approach is to use the parallel corpus is to test a hypothesis about how a
particular function is expressed in another language by making observations about
correspondences and arrive at a theoretical statement which is empirically based (cf.
Gast 2015). Dyvik (1998) addressed the question how translational phenomena can
be used for the study of meanings. In this perspective the corpus can provide a
resource in lexical semantics by mirroring meanings and functions of an element in
one language in another language. Translators are excellent informants since they
use their judgments to find the appropriate translation as a part of their professional
duties thus avoiding the observer’s paradox (Labov 1972). Translations should be
used with caution, however. The disadvantages of using parallel corpora is that they
may suffer from ‘translationese’ (Baker 1993; Baroni and Bernardini 2005),
source-text influence, the translator’s fingerprints, and uneven translation quality.
The results of the translation analysis should therefore also be tested on the basis of
monolingual corpora in the compared languages. Depending on the research aims,

comparable corpus analysis may also be desirable to avoid certain translation effects
(Baker 1993; McEnery and Xiao 2008:24).
There now exist parallel corpora dealing with many different language pairs.
However, parallel corpora are generally small compared with monolingual corpora
and restricted to genres which exist in translation. While a large number of texts
have been translated from English into other languages there may be fewer transla-
tions in the other direction. It is therefore sometimes necessary to use comparable
corpora rather than parallel corpora, for example if one wants to compare varieties
that are seldom translated.
In contrastive genre analysis, comparable corpora may be needed as well as, or
instead of, parallel corpora in certain cases. One such case is where similar cultural
practices in two linguistic communities give rise to dissimilar genres. Translators
are often faced with passages containing propositions that a target language speaker
would not express at all. As Mauranen (2002) puts it, “not only the expression but
the content of the original needs to be changed”. Instead, the translations in her
corpus “seem largely to transmit source culture linguistic and pragmatic practices”
and display “culturally untypical target language pragmatics” (Mauranen 2002).
Such differences in rhetorical traditions have led to a line of research in contrastive
rhetoric (e.g. Hinds 1990, Connor et al. 2008) based on analysis of native texts. At
the other extreme, certain genres, scientific or technical, may display remarkable
similarities across languages. A genre belongs to the (often highly specialist) dis-
course community that uses it and thereby participates in its ongoing evolution, and
such communities can cross linguistic boundaries. Comparable corpora are well
suited to revealing such patterns of relative frequency and distribution.
The main drawback of comparable corpora lies in corpus design. Texts are com-
parable if they are produced in comparable situations for comparable purposes, but
establishing comparability is problematic. Extreme care is called for in building
comparable corpora, and since true equivalence of genre is rarely possible across
language communities, the analyst must settle for approximation.
4 An Overview of the Volume
The volume is divided into three sections, according to the methodology and the
type of contrastive analysis carried out (cf. Aijmer 2008).
5 Contrastive Analysis with Parallel Corpora
The first section comprises four papers based on parallel (translated-text) corpora.
Karin Aijmer’s contribution discusses obligation across languages and genres.
The starting-point is the observation that in both English and Swedish the meaning
of obligation can be expressed by a modal auxiliary. However must and its Swedish
Introduction 5
cognate måste are not always each other’s translation equivalents reflecting the fact
that there are several grammatical and lexical ways to express obligation. The trans-
lations of Swedish måste into English showed that måste was frequently translated
by the semi-modal have to especially in fiction. If have to and had to were conflated
the frequency would be even higher than for must as a translation choice. Need to,
should and ought to, on the other hand, were all more frequent in non-fiction than in
fiction. In the Swedish translations from English måste was most frequent both in
fiction and non-fiction. Other alternatives are få (‘may’), ska(ll), skulle (‘shall’,
‘should’), behöva (‘need’), bör (‘ought to’). Must and have to express different
meanings as a translation of måste. Have to, especially when qualified by will, is
downtoning and polite, it can have a general or generic meaning and it can indicate
negative evaluation. In Swedish ska(ll)/skulle can be used to express power and få
indicates that an action is unwelcome to the hearer. In non-fiction must, have to and
need (and their Swedish correspondences) are associated with interactional goals
and how these are evaluated as good or bad. By using impersonal structures with a
collective we as the grammatical subject or an agentless passive the speaker can get
the message across to the hearer with maximum hedging.
The aim of Lieven Buysse’s contribution is to examine the mutual translat-
ability of Dutch dus and English so. The corpus used is the Dutch-English com-
ponent of the Dutch Parallel Corpus (1997–2009). The texts belong to five text
types: fictional and non-fictional literature, journalistic texts, instructive texts,
administrative texts, and external communication. The corpus has been balanced
for text type and for translation direction and amounts in all to five million words.
Since it is a bidirectional corpus all the examples of so translated to dus (and dus
from Dutch into English) were included as well as ‘back-translations’. The func-
tional ranges of these two discourse markers were shown to be remarkably simi-
lar – the polysemies of dus and so overlapped almost completely – but there were
significant differences in frequency and distribution, with dus being both more
frequent overall and more associated with inference than so, which occurred more
typically in resultative contexts. Significant genre effects were found: as well as
being unevenly distributed across genres, dus and so also tended to occur with
different functions in different genres and were differently distributed according
to language. Thus the study not only demonstrates how semantic equivalence does
not result in translation equivalence, with only a sixth of source-text dus being
translated by so in the corpus, but also how genre constrains the markers differ-
ently in the two languages.
Michaela Martinková and Markéta Janebová wanted to investigate the evi-
dential and epistemic senses of the Czech particle prý by studying the functions of
prý reflected in the correspondences in another language. The authors used the
English-Czech and Czech-English sections of the Czech National Corpus- InterCorp,
which is a multilingual parallel corpus of texts written in 39 different languages
with their Czech counterparts. Their study focused on three registers which were
represented in the corpus: fiction, journalistic texts and spoken language. The
journalistic texts are represented by the PressEurope database (2009–2014). The
spoken language in InterCorp comes from Proceedings from the European
Parliament and a corpus of Subtitles. Czech as source and as target language were
not differentiated in order to obtain a sensible amount of text from the European
Parliament. The sub-corpora vary in size and come from different periods. Moreover
there were more translations than target texts in the corpus.
The authors found different patterns of prý usage and different frequencies
according to genre as well as different patterns of translation. From these translation
patterns, and with the great majority of correspondences being evidential, the
authors concluded that the epistemic uses of prý are context-bound: the interpreta-
tion of the particle as conveying doubt may arise in the context as an inference from
the context.
Magdalena Szczyrbak discusses the correspondences between English modal
adverbs of certainty and their Polish correspondences in argumentative legal writ-
ing. The material used for the study consists of 30 Opinions of Advocated General
at the European Court of Justice, issued between 2011 and 2013 comprising about
576,000 words. The data has been drawn from source texts in English and their
Polish translations. The English texts were written by a native speaker of English,
whereas the translations were made by professionals having Polish as their native
language. At the outset the most frequent modal adverbs were identified in the
English sub-corpus and then the equivalents in the Polish sub-corpus were
determined.
The genre of Opinion was chosen because it was assumed that it would be rich
in persuasive devices. Modal adverbs of certainty have been shown in previous stud-
ies to be useful rhetorical devices inextricably linked to stance and argumentation.
They are for instance used both to foreground and background legal arguments and
to demonstrate power and authority. The modal adverbs studied were indeed, neces-
sarily, of course, clearly and obviously and their Polish correspondences. The trans-
lations were used to study both the conventional meanings of the adverbs and the ad
hoc meanings associated with the particular genre. It is shown that there were
noticeable differences between the English adverbs and their Polish correspon-
dences with regard to the degree of persuasiveness and that the author’s presence
was less visible in the Polish translations. Omission of the modal adverb in the
translation was shown to lessen the rhetorical force of the translated text and its abil-
ity to influence the reader.
6 Contrastive Analysis with Comparable Corpora
The second part of the volume contains two papers based on comparable corpora.
Hilde Hasselgård’s study compares adverbial clause placement in English and
Norwegian cross-linguistically and across the genres fiction and news. End position
was the most common alternative in both English and Norwegian in both registers.
In the initial position there were both language and register differences. It is shown
that initial position was proportionally more frequent in fiction in English than in
Norwegian. However there was a higher frequency of initial clauses in news in both
Introduction 7
languages. The best predictors of adverbial clause placement were shown to be

finiteness and semantic property. Moreover, the positional preferences associated
with the semantic category of the adverbial were more important than iconic order.
With regard to information status initial adverbial clauses were ‘anchored’ in the
preceding discourse in 75–90% of the cases in both languages. However there were
more discourse-new initial clauses in Norwegian than in English, especially in fic-
tion. In final position the picture was the reverse. The information associated with
the adverbial clause was discourse-new in 75–80% of the examples.
The genre of political discourse is addressed in the chapter by Diana Lewis.
Using a comparable corpus of French and English ministerial speeches, she com-
pares patterns of discourse marking across the two languages. The focus is on the
marking of Additive relations, which are expected to be the least marked relations.
The French speeches are found to contain dense networks of Additive discourse
markers that both provide conventional frameworks or templates for the discourse
and regulate information flow. In the English speeches, by contrast, Additive dis-
course markers are sparse, and speakers rely mainly on also or on simple juxtaposi-
tion in such contexts. The very frequent French markers are seen to be bleaching
and grammaticalizing into information-structuring devices. A comparison of dic-
tionary equivalents en effet and indeed illustrates the French/English difference: en
effet appears to be grammaticalizing into a presentative within a wider ‘discourse
construction’ in which it is a quasi obligatory element, while indeed is rare. Overall,
English political discourse relies more on content allowing coherence relations to
be inferred, while French political discourse uses discourse markers to ‘frame’ the
content into more conventional, formal structures.
7 Contrastive Analysis Across Genres of English
The final section consists of three studies of particular English genres.

Catherine Chauvin’s chapter addresses cohesion in a rather particular genre --
that of stand-up comedy. It is a genre of relatively few speakers, each of whom cre-
ates a micro-genre by which to become an ‘identifiable persona’. Providing cohesion
to a stand-up comedy act is a highly skilled art, as the routine typically consists of a
series of self-contained gags, each on a new topic. ‘Callbacks’ (references to earlier
jokes) not only provide one way of creating some cohesion across the range of dis-
connected topics, to make the routine function as a whole, but are also themselves
humorous. They are shown to operate at the ‘macro’ level, helping to build the
entire act into a single cohesive whole. The chapter also explores the humorous
effects that can be achieved through the use of cohesive devices in contexts that are
clearly heterogeneous.
The study by Laura Hidalgo-Downing and Yasra Hanawi compares the differ-
ent stance styles used by presidents Bush and Obama in speeches addressed to the
Arab world. The former speech which was delivered by President Bush in Abu
Dhabi in 2008 is 3308 words long. The latter speech entitled ‘A New Beginning’
was given by President Obama in Cairo in 2009 and is 5871 words long. The study
is both quantitative and qualitative. A search was made for the frequency of personal
pronouns, modality markers, mental verbs and negation in both speeches using a
Concordancer. The quantitative comparison showed that Obama’s speech had a
higher frequency of modality, especially epistemic modality, negation and first per-
son pronouns. In Bush’s speech negation was infrequent and you was more frequent
than other pronouns. It is argued that Obama’s speech can be interpreted as an
attempt to ‘recontextualise’ the position of the US policy towards the Middle East.
Negation in Obama’s speech is for example used to correct assumptions about the
US by Arabic speakers or about the relations between the US and the Arabic coun-
tries. Obama’s frequent use of modal auxiliaries indicates his personal involvement
with the topic addressed. Bush’s speech, on the other hand, shows a more conven-
tional discourse style characterized by a low frequency of stance markers and nega-
tion and a preference for second person pronouns. The preference for unmodalized
assertions further underlines an authoritative speaking style.
Metadiscourse has been frequently used to characterise academic genres. Tereza
Guziurová draws on the ‘integrative’ or broad approach to metadiscourse in order
to compare the distribution and uses of the engagement markers we and you, imper-
atives, questions in academic textbooks and research articles. The discussion focuses
on the pronoun we since this proved to be the most frequent engagement marker in
the data accounting for about 70% in both genres. We was used with a wide range of
semantic reference with different discourse functions depending on the genre. The
majority of examples of we in both genres referred to the writer and his/her readers.
The main reason for using the pronoun in the textbooks is that it draws students into
the shared world of disciplinary understanding. Another reason is that we helps to
make the exposition more interesting, relevant and approachable by referring to
people in general as language users. In research articles the writer uses we with the
aim of disguising him/herself as the agent. The study also discusses the potential
advantages and drawbacks of the integrative approach.
References
Aijmer, K. (2008). Parallel and comparable corpora. In A. Lüdeling & M. Kytö (Eds.), Corpus
linguistics. An international handbook (Vol. 1, pp. 275–291). Berlin: de Gruyter Mouton.
Baker, M. (1993). Corpus linguistics and translation studies: Implications and applications. In
G. Francis, M. Baker, & E. T. Bonelli (Eds.), Text and technology: In Honour of John Sinclair
(pp. 233–252). Amsterdam: John Benjamins.
Bhatia, V. K. (2002). Applied genre analysis: A multi-perspective model. Ibérica, 4, 3–19.
Baroni, M., & Bernardini, S. (2005). A new approach to the study of translationese: Machine-
learning the difference between original and translated text. Literary and Linguistic Computing,
21(3), 259–274.
Connor, U., Nagelhout, E., & Rozycki, W. V. (Eds.). (2008). Contrastive rhetoric: Reaching to
intercultural rhetoric. Amsterdam: John Benjamins.
Introduction 9
Dyvik, H. (1998). A translational basis for semantics. In S. Johansson & S. Oksefjell (Eds.),
Corpora and cross-linguistic research. Theory, method, and case studies. Amsterdam/Atlanta:
Rodopi.
Gast, V. (2012). Contrastive analysis: Theories and methods. In: B. Kortmann & J. Kabatek (Eds.),
Linguistic theory and methodology. (WSK-Dictionaries of Language and Communication
Science). Berlin: Mouton de Gruyter. http://www.personal.uni-jena.de/~mu65qev/papdf/
CA.pdf
Gast, V. (2015). On the use of translation corpora in contrastive linguistics. A case study of imper-
sonalization in English and German. Languages in Contrast, 15(1), 4–33.
González, G., de los Angeles, M., Lachlan Mackenzie, J., & Alvarez, E. G. (Eds.). (2008). Current
trends in contrastive linguistics. Amsterdam: John Benjamins.
Gregory, M., & Carroll, S. (1978). Language and situation: Language varieties and their social
contexts. London: Routledge & Kegan Paul.
Hinds, J. (1990). Inductive, deductive, quasi-inductive: Expository writing in Japanese, Korean,
Chinese, and Thai. In U. Connor & A. M. Johns (Eds.), Coherence in writing: Research and
pedagogical perspective, Alexandria (pp. 87–109). VA: TESOL.
Johansson, S. (2007). Seeing through multilingual corpora: On the use of corpora in contrastive
studies. Amsterdam: John Benjamins.
König, E. (2011). The place of contrastive linguistics in language comparison. ms. http://www.
personal.uni-jena.de/~mu65qev/e-g-ontrasts/papers/koenig_2011.pdf
König, E., &. Gast (2009). Understanding English-German contrasts (2nd ed.). Berlin: Schmidt.
Labov, W. (1972). Sociolinguistic patterns. Oxford: Blackwell.
Levinson, S. (1979). Activity types and language. Linguistics, 17, 365–379.
Mauranen, A. (2002). Where’s cultural adaptation? A corpus-based study on translation strategies.
In: B. Silvia & Z. Federico (Eds.) CULT2K, special issue of inTRAlinea. http://www.intralinea.
org/specials/article/1677
McEnery, T., & Xiao, R. (2008). Parallel and comparable corpora: What is happening? In
G. Anderman, & M. Rogers (Eds.), Incorporating corpora: The linguist and the translator
(pp. 18–31). Cleveland: Multilingual Matters.
Swales, J. M. (1990). Genre analysis: English in academic and research settings. Cambridge:
Cambridge University Press.
Taboada, M., Suárez, S. D., & Alvarez, E. G. (2013). Contrastive discourse analysis: Functional
and corpus perspectives. Sheffield: Equinox.
Part I
Contrastive Analysis with Parallel Corpora
The Semantic Field of Obligation
in an English-Swedish Contrastive Perspective
Karin Aijmer
Abstract The article examines how genre (fiction and non-fiction) affects the dis-
tribution and uses of the modal auxiliaries must/måste in the obligation meaning and
their more or less grammaticalized alternatives in English and Swedish. In both
languages the obligation markers were associated with specific contexts of use
depending on genre. In fiction the obligation markers were frequent with first and
second person subjects. Must was used for exhortations. Have to was used with
generic subjects and instead of must for more general recommendation. In Swedish
there was no corresponding distinction. Must usually pointed forwards to something
desirable in the context of EU debates. Have to, on the other hand, was also found
in negative contexts in the non-fiction data. Swedish måste was used both about
positive and negative obligation. In Swedish få was an alternative to måste when the
imposition was not in the hearer’s interest.
Keywords Obligation • Genre • Parallel corpus • must/måste
1 Introduction
In both English and Swedish the meaning of obligation can be expressed by a modal
auxiliary (must or Swedish måste). This is in line with ‘a significant cross-linguistic
trend for languages to have a category of grammatical expression forms, usually
called the “modal” auxiliaries’ (Nuyts 2016: 13). However must and måste are not
always each other’s translation equivalents reflecting the fact that there are a large
number of grammatical and lexical alternatives to express obligation.
The English modal auxiliaries have attracted a great deal of interest because of
their changing patterns over time. Less attention has been given to the codification of
certain functions which can take place in particular genres. However, Lewis (2015)
With many thanks to Bengt Altenberg for reading an earlier version of the text.
K. Aijmer (*)
University of Gothenburg, Gothenburg, Sweden
e-mail: karin.aijmer@eng.gu.se

DOI 10.1007/978-3-319-54556-1_2
14 K. Aijmer
has drawn attention to the specific distribution of obligation markers in political

speeches in English and French. As Lewis (2015: 153) puts it, ‘[w] here there is a
“marked” or atypical distribution of modal markers in a particular genre, there may
also be an atypical distribution in the “equivalent” genre in another language.’
The present study can be described as a contrastive corpus-based genre analysis
of modal expressions meaning obligation. The aim of the study is to examine how
genre (fiction and non-fiction) affects the distribution and uses of the modal auxilia-
ries must/måste and their more or less grammaticalized alternatives in English and
Swedish. The comparison of the modal forms will be carried out on the basis of
translation texts in the English-Swedish Parallel Corpus.
The structure of the paper is as follows. Obligation and necessity are defined in
Sect. 2. Section 3 presents the material and the methodology associated with the use
of parallel corpora to study modality across varieties. The marking of obligation and
the frequencies of the obligation markers is described in Sect. 4. Section 5 contains
the analysis of the obligation markers in fiction texts in English and Swedish.
Section 6 deals with the English and Swedish obligation markers in non-fiction.
Section 7 contains a concluding discussion.
2 Obligation and Necessity
It would be hard to give a dictionary description of the semantic notion of obligation

that could be used as the basis for comparing elements in the two languages.
Obligation is associated with modality and with the modal auxiliaries. However,
modality is a broad notion which is difficult to define. It has traditionally been sub-
classified into root (obligation, permission) and epistemic modality (eg Coates
1983). Van der Auwera and Plungian (1998: 83) start from a different perspective
and suggest a division into four different semantic domains. Participant-internal
modality (cf Nuyts 2016: 34 ‘participant-inherent dynamic modality’) refers to the
speaker’s internal needs. It is illustrated by the verb need to:
Boris needs to sleep ten hours every night for him to function properly.
Participant-external modality refers to ‘circumstances that are external to the
participant, if any, engaged in the state of affairs and that make this state of affairs
either possible or necessary’ (van der Auwera and Plungian 1998: 80) (cf Nuyts
2016: 34 ‘participant-imposed dynamic modality’). In the example below the exter-
nal circumstances making something necessary are referred to explicitly:
To get to the station, you have to take bus 66.
A special case of participant-external modality is deontic modality. The use of
the term ‘deontic modality’ is ‘supposed to be fully unproblematic’ (van der Auwera
and Plungian 1998: 83). Deontic modality ‘identifies the enabling or competing
circumstances external to the participant as some person(s), often the speaker, and/
or as some social or ethical norm(s) permitting or obliging the participant to engage
The Semantic Field of Obligation in an English-Swedish Contrastive Perspective 15
in the state of affairs’ (van der Auwera and Plungian 1998: 81). Must is for example
deontic in the example below:
John must leave now
with the definition: ‘as far as the person with authority and /or the norm goes,
John’s leaving is necessary’ (van der Auwera and Plungian 1998: 83). The norms
here can be societal norms as well as moral assessments or judgements of desirabil-
ity (Nuyts 2016: 37).
An additional semantic domain is epistemic modality. The epistemic meaning of
must/måste has been defined in terms of a judgment by the speaker rather than in
terms of obligation: ‘a proposition is judged to be uncertain or probable in relation
to some judgment’ (van der Auwera and Plungian 1998: 81). The epistemic mean-
ing is illustrated by:
John must have arrived
Must (and måste) are available in all the domains. However all the epistemic
examples have been excluded from the investigation. They were less frequent than
the examples with obligation meaning and are mainly restricted to fiction.
In the present study I will focus on the importance of genre for understanding the
different frequencies and uses of the linguistic forms expressing obligation.
Following Biber (1988: 68) I will use the term ‘genre’ ‘to refer to text categoriza-
tions made on the basis of external criteria relating to author/speaker purpose’. The
genres used in the present study represent both fiction and non-fiction.
3 Material and Method
The data are taken from the English-Swedish Parallel Corpus (ESPC) (Altenberg
and Aijmer 2001). The ESPC contains original texts in English and Swedish with
their translations, altogether 2.8 million words making direct comparisons between
the languages possible. The texts represent both fiction and non-fiction texts in
equal proportions. Fiction texts consist of dialogues. Non-fiction is a hyperonym
covering the subject areas memoirs and biography, geography, humanities, natural
sciences, social sciences, applied sciences, legal documents, prepared speech
(Altenberg et al. 2001). I will use translation paradigms as the starting-point and
then compare the most frequent markers of obligation in different contexts of use in
English and Swedish.
16 K. Aijmer
4 The Marking of Obligation in English and Swedish
4.1 English Obligation Markers in a Translation Perspective
The corpus examples were selected in the following way. First all the examples of
måste and must were extracted from the original texts with their translations. On the
basis of the translations we can compare how obligation is expressed in the two
languages (in either fiction or non-fiction). Måste is, for example, not always trans-
lated as must but a large number of alternatives are found. At a second stage, I
examine the contexts and functions of the most important markers of obligation in
the two languages in both fiction and non-fiction
On the whole, both the auxiliaries were more frequent in non-fiction than in fic-
tion. Moreover, they were more frequent as obligation markers than as epistemic
auxiliaries (see Tables 1 and 2).
In non-fiction Swedish måste had obligation meaning in 96.3% of the examples
to be compared with must in 87.4% of the cases.
The smaller number of examples of must in the English texts is interesting against
the background that it has been claimed that must has declined in frequency within
a 30-year period during the last century and that it has been replaced by other ‘gram-
maticalizing’ elements (Leech et al. 2009).
Table 3 shows the correspondences of the Swedish måste in English (translations
of Swedish originals into English) and Table 4 (in Sect. 4.2) the correspondences of
English måste in Swedish (the translations from the English originals into Swedish).
Must, taking into account all its uses, was more frequent in non-fiction than in
fiction (see Table 2). This difference can be partly explained by the fact that there
are more occurrences of must with epistemic meaning in fiction (32.9% of the
examples were epistemic in fiction to be compared with only 12.6% in non-fiction).
Moreover, as noticed by the diachronic linguist, must has been replaced by have to
in many of its uses (Leech et al. 2009). A genre-type explanation of the discrepancy
is that must has a number of functions in non-fiction texts which are not paralleled
in the fiction texts.
Table 1 Epistemic and Fiction Non-fiction

obligation meanings of
Epistemic 109 (24%) 24 (3.7%)
Swedish måste
Obligation 345 (76%) 626 (96.3%)
Total 454 (100%) 650 (100%)
Table 2 Epistemic and Fiction Non-fiction

obligation meanings of
Epistemic 69 (32.9%) 42 (12.6%)
English must
Obligation 141 (67%) 292 (87.4%)
Total 210 (100%) 334 (100%)
Table 3 The English translations of Swedish måste (SO ->ET). Obligation meanings
ET fiction ET non-fiction Total
must 112 (30.4%) 357 (57.0%) 469 (48.3%)
have to 84 (22.8%) 85 (13.6%) 169 (17.4%)
had to 88 36 124
should 5 46 51
need to (or other forms with need)a 9 32 41
(have) got to 12 4 16
ought to 4 11 15
is (was) -ed 3 5 8
is necessary, essential – 8 8
be going to, will 4 1 5
be forced, be compelled, be made, be taken to – 4 4
x makes sb do sth 3 – 3
ø 4 – 4
other 6 27 33
Examples occurring once or twiceb 11 10 21
Total 345 626 971
a
Not all examples with need in the translations are semi-modals (cf ‘I need somone to talk to’).
b
The following examples occurred once or twice in either fiction or non-fiction: had better, neces-
sarily, of necessity, be in need of, be a need to, appreciate the need to, I should like to say, there is
no other way but, I cannot help but, be enough to, be due to, it was natural for X to do sth, it should
be incumbent on X to do sth, couldn’t possibly, emphatic do, it’s time, the imperative
Obligation can be expressed in many different (grammatical and lexical) ways

although with different frequencies Must is the prototypical obligation marker in
English (and måste in Swedish). If we look at the translations we see that måste is
translated as must in almost 50% of the examples (more often in non-fiction texts
than in fiction texts). However the translator may also choose a different translation
which is more appropriate in the context. Translators make their own analysis of the
context and select a translation which best mirrors the meaning of the modal expres-
sion in the original text.
Obligation can for example also be expressed by the semi-modals have to, had
to, got to, need to as shown by the translations. Semi-modals are not full modals but
are verb constructions which have been moving along the path of grammaticaliza-
tion and have gradually acquired an auxiliary-like function (cf. Leech et al. 2009:
91). Other translation alternatives were modal adverbs (inevitably, necessarily),
modal adjectives (necessary, essential). The markers can be ‘strong’ (be compelled
to, be forced to) or ‘weak’ (had better, ought to, should). Must was also rendered as
an imperative with a ‘directive’ function.
The semi-modal have to (without a formal equivalent in Swedish) was found in
17% of the examples. The uses of had to can be syntactically motivated. Had to is
for example the past tense of must (and have to). (If I had conflated have to and had
18 K. Aijmer
to the frequency would have been even higher.) Other frequent obligation markers
are (have) got to, need to and should.
Several other expressions have different frequencies in fiction and non-fiction.
Need to, should and ought to are strikingly more frequent in non-fiction than in fic-
tion. Have got to, on the other hand, occurs above all in fiction.
4.2 Swedish Obligation Markers in a Translation Perspective
The Swedish modal auxiliaries meaning obligation in my material are måste, få

(may), ska/skulle (shall, should), bör/borde (ought to) (see Table 4). They are
referred to as deontic modal auxiliaries in the Swedish reference grammar Teleman
et al. (1999). The most frequent obligation marker is måste. The Swedish obligation
markers have a different lexical origin than the English markers. Få and ska/skall
(unlike must/måste) have other modal meanings as well which do not appear in the
translations of must. Få is a modal auxiliary with the meaning of permission (‘may’)
which has acquired the meaning of obligation and ska/skall can be ambiguous
between deontically modal meaning and future meaning.
Få was more frequent in fiction than in non-fiction while ska(ll) appears more
often in the more formal non-fiction texts. Table 4 shows the correspondences of
must only. If I had looked instead at the translations of have to I would have found
some differences.1
Table 4 Correspondences of English must in the ESPC (EO-> ST)

Fiction Non-fiction Total
måste 90 (63.8%) 234 (80.1%) 324 (74.8%)
få (‘get’, ‘may’) 28a (19.9%) 18b (6.2%) 46 (10.6%)
ska(ll) 9 18 27
bör/borde (‘should’) 2 5 7
vara tvungen, tvingas (‘be obliged’) 3 1 4
imperative 2 – 2
kan – 2 2
tarvas (‘need’) 1 – 1
är en nödvändighet (‘is a necessity’) – 1 1
Ø 4 10 14
other 2 3 5
Total 141 292 433
18 examples were negated
a
15 examples were negated

b
1
Have to was for example translated into behöva (‘need to’) in three examples.
The translation paradigms provide only raw data. They do not provide any infor-
mation about the contexts in which must or its alternatives are chosen as a transla-
tion. In the following sections I will discuss must and its most frequent variants have
to and need and make comparisons with the Swedish correspondences. The
following research questions will be asked: Are the factors determining the distribu-
tion and uses of obligation markers in English and Swedish the same? Are the fac-
tors the same in fiction and non-fiction?
5 Obligation Markers in English and Swedish Fiction
5.1 English Obligation Markers in Fiction
Must and måste can be regarded as ‘close relatives’ but they were not always trans-
lated into each other. Must was translated as måste in 74.8% of the cases but the
correspondence in the other direction was much lower (because of the existence of
English variants such as have to). In this section I will discuss must, have to and
need to as competitors in the fiction texts.
Must and have to often overlap in meaning. For example, the translator may have
chosen must but could also have opted for have to without any difference in mean-
ing. However, there are some contexts where must and have to seem to be doing
different things. With a first person subject the speaker is strongly involved in the
verbal action:
(1) Your mother is lucky she did not choose to eat corned beef on a Saturday
night. On Saturday nights we are extremely busy. Now I must go. A nurse
will be coming along soon.” (ST1)
Er mor kan skatta sig lycklig att hon inte valde en lördagkväll för att äta
corned beef. På lördagkvällarna är vi ytterst upptagna. Nu måste jag gå.
Det kommer snart en sjuksyster.” (ST 1T)
When have to is used the obligation requiring an action from the speaker is imposed
by external circumstances (non-deontic meaning). In (2) the speaker has been
watching the galleries for a long time and now feels obliged by the look of them to
‘work up to them’.
(2) Galleries are frightening places, places of evaluation, of judgment.

I have to work up to them. (MA1)
Gallerier är skrämmande platser, platser för värdering, för bedömning.
Jag är tvungen att hetsa upp mig för dem. (MA1T)
However although have to is not deontic it can be used instead of must with a first
person subject to soften the imposition of the action on the hearer. Will (‘ll) in com-
bination with have to makes the imposition more vague and less strong by placing
the action in the future:
20 K. Aijmer
(3) I ‘ll have to think on it and perhaps take a few soundings before I decide
where I can best place it. (FF1)
Jag måste fundera på det och kanske höra mig för här och var innan jag
bestämmer vart jag ska skicka det. (FF1T)
Obligation expressions have been associated with ‘performativity’ and situations

where a person is in control of the verbal action. Must with a second person subject
was used for positive actions such as ‘exhortation’ or admonition (expressing the
speaker’s strong wish that the action will take place).
(4) You must allow me this chance in Provence to make up my mind. (BR1)
Du måste låta mig få den här chansen att bestämma mig i Provence. (BR1T)
In (5) the speaker is using have to rather than must because it is less impositive
and therefore more polite. Have to treats the action as negative (face-threatening)
and therefore in need of hedging. Placing the imposition in the future (you’ll have)
is another hedging strategy (cf. I’ll have to):
(5) Reliving, mentally, the events of three days earlier, Andrew said “You ‘ll
have to make allowance for my having been a little dazed at the time.” (AH1)
Andrew gick i tankarna igenom händelserna tre dagar tidigare och sade: “Du
måste tänka på att jag var litet förvirrad just då.” (AH1T)
However in other examples have to does not overlap with the deontic must but refers
to external circumstances (it is important or crucial that you hurry if you’ll get the
colour off the hair):
(6) Matilda said, “I ‘d give it a good wash, dad, if I were you, with soap
and water.
But you ‘ll have to hurry.”(RD1)
Matilda sa: “Om jag vore som du så skulle jag gå och tvätta igenom det
ordentligt, pappa, med tvål och vatten.
Men du blir tvungen att snabba på.”(RD1T)
Out of 34 examples with you as the subject 19 were translated by a generic pronoun
in Swedish making the obligation more vague or general (expressing little speaker
involvement).
(7) “It ‘s electronic,” Annette said weakly.

“You have to put in the right numbers…” (DF1)
“Den är elektronisk”, sa Annette med svag röst.
“Man måste använda de rätta siffrorna…” (DF1T)
In (8) the translator has used behöver (‘need’) to mark what needs to be done (put-
ting less imposition on the hearer):
(8) You only have to drive through the West Midlands to see that if we are in the
Super-League of top industrial nations, somebody must be moving
the goalposts. (DL1)
Man behöver bara köra genom West Midlands för att se att någon måste
ha flyttat på målsnöret för att placera oss i superligan av
industrinationer. (DL1T)
The obligation markers can come with a certain ‘evaluative prosody’ depending on
whether they are associated with something positive or negative (good or bad, desir-
able or undesirable) (Partington 2015).
When the subject has no control over the action have to can come to express
evaluation rather than obligation (Myhill and Smith 1995). In example (9), for
example, have to is chosen to suggest that sitting in the front is something
negative:
(9) He gets carsick and I do not, which is why he has to sit in the front. (MA1)
Han blir bilsjuk och det blir inte jag, det är därför han måste sitta i framsätet.
(MA1T)
In (10) the big bad wolf has to go somewhere else for his dinner (against his will).
(10) The big bad wolf has to go somewhere else to get his dinner; these little
piggies are home free.” (SK1)
Den stora stygga vargen får leta efter sin middag någon annanstans, dom
tre små grisarna har klarat sej.”(SK1T)
Need to and should (or ought to) and their Swedish correspondences encode a
weak deontic meaning (the speaker is open to the possibility that the obligation may
not result in an action). Unlike must these markers do not involve self-imposition (in
the first person) but communicate the speaker’s felt needs to do something
(participant-internal meaning). In non-fiction texts on the other hand need to and
should (ought to) were more frequent and sometimes translated with måste (signal-
ling strong obligation) (Sect. 6.1).
In the following sentence need to conveys that the subject did not feel the need
to sit down:
(11) “She did n’t need to sit. (PDJ1)

“Hon behövde inte sitta. (PDJ1T)
22 K. Aijmer
Need to can also signal the speaker’s positive attitude to the carrying out of the
action. With a generic second person subject need to can, for example, be inter-
preted as a recommendation:
(12) All you need to do is be prudent and not go there again. (RR1)
Allt man behöver göra är att vara försiktig och inte gå dit igen. (RR1T)
In (13) the speaker (a morgue attendant) uses need to rather than must or have to
with directive force:
(13) “We ‘ll need to know what arrangements you want made,” he said. (SG1)
“Vi behöver få veta hur ni vill arrangera begravningen”, sade han. (SG1T)
The authority imposed by the obligation marker is softened by the use of we (rather
than I) and by placing the time when the speaker needs to know in the future (cf. the
use of need to in non-fiction in Sect. 6.1).
5.2 Swedish Obligation Markers in Fiction
The Swedish modal auxiliaries meaning obligation in the data analysed are måste,
få (‘may’), ska/skulle (‘shall’, ‘should’), bör/borde (‘ought to’) (referred to as deon-
tic modal auxiliaries in the Swedish reference grammar Teleman et al. 1999). Få is
also an auxiliary with the meaning permission (=may) and ska/skall has developed
future meaning. Translations can show whether they have been interpreted as hav-
ing an obligation meaning.
According to Teleman et al. (1999: 296), få can have ‘approximately the same
meaning as måste in situations where it is clear that the action referred to in the
sentence is not in the hearer’s interest’ (my translation). This makes it different from
permission (the action is in the hearer’s interest). Let us consider some example
sentences with obligation få and their translations into an obligation marker in
English:
If the subject is the second person the verb has the illocutionary force of a
speaker-initiated directive. In (14) får conveys that the hearer does not intend to
open the gate willingly (it is not in his interest to do so):
(14) Nu kom Torsten ut i Johans synfält.

Han hade en kratta i handen och han gick och drog upp ränder i
gårdsgruset.
— Du får öppna grinn! ropade Vidart.
— Håll käft! skrek Torsten. (KE1)
Then Torsten came into Johan’s line of vision.
He had a rake in his hand and started raking the yard gravel.
“Open the gate!” Vidart shouted.
“Quiet!” Torsten yelled. (KE1T)
The constraint imposed can be associated with something negative. ‘Going to

hospital’ is regarded as something bad (unpleasant) and the directive therefore as
open to objections:
(15) Men han hade varit medvetslös en god stund och Birger ville inte ta nån
risk.— Du får åka till lasarettet, sa han.
Det hade Vidart ingenting emot.(KE1)
But he had been unconscious for quite a while, so Birger was taking no risks.
“You must go to hospital,” he said.
Vidart had no objections, but he was worried about the milk. (KE1T)
In (16) the action is treated as unwelcome to the hearer (‘you must show me the
harbour even if it involves some extra effort for you’). Få is therefore used with
persuasive force:
(16) - Gärna, svarade MacDuff på min inbjudan.

Men först får du visa mig hamnen.
Om jag inte har sagt det förut så är jag lots till yrket.
Hamnar är min speciella passion och hobby. (BL1)
MacDuff accepted my invitation.
“But first,” he said, “you must show me the harbour.
If I have n’t told you before, I ‘m a pilot by profession.
So I have a special interest in harbours.” (BL1T)
In all the examples of få some kind of negative evaluation takes place. Forgiving and
forgetting (an injustice) are a necessary evil if one is to survive.
(17) För man får glömma och förlåta om man ska överleva och förresten hade
priset på potatisen stigit till nästan två kronor för en tunna. (KE2)
One must forgive and forget if one is to survive, not to mention that the
price of potatoes had risen to nearly two kronor a barrel. (KE2T)
Behöva (‘need to’) is found in different patterns with different meanings.2 When
the subject is the first person the verb refers to a need felt by the speaker:
(18) Men jag behöver prata med dig några minuter. (HM2)
“I need to talk to you about something.” (HM2T)
The source of the need can be internal or external. In (19) the translator has chosen
have to indicating that the source is external (for example that the speaker needs the
cassette to make recordings) and to soften the imposition of the action:
(19) - Jag behöver ta med mig kassetten, sa han. (HM2)

“I ‘ll have to take the cassette with me,” he said. (HM2T)
Behöver was found as a translation equivalent of have to but not of must.

2
24 K. Aijmer
In (20) behöver man makes the utterance into a recommendation (translation: ‘you
have to’):
(20) Dom här plastmattorna behöver man bara skölja lite. (SC1)
You only have to wipe these plastic ones.” (SC1T)
Ska(ll) is also used in specific contexts of usage. As a deontic modal auxiliary

ska/skall grammaticalizes a degree of modal desirability (Teleman et al. 1999: 312).
It is used when the speaker makes a commitment constraining his/her future action.
It often involves power (the speaker has authority over the hearer and is in a position
to exercise control). In (21) the speaker is a parent speaking to a child:
(21) Du ska hem och äta! (ARP1)

“You must go back and eat. (ARP1T)
The ‘manufacturer’ tells the employee:
(22) - Du ska alltid ha en lista över personliga tillhörigheter i väsklocket. (RJ1)

“You must always keep a list of personal belongings taped to the inside
of your suitcase. (RJ1T)
The obligation can also be anchored in a certain social or functional norm (duty,
custom, order, normality, appropriateness) (Teleman et al. 1999: 316). In (23) the
reference is to what is important or essential:
(23) Det viktiga är inte att bestämma tidpunkten, knappt ens att resa.
Det viktiga är att man kan resa när tiden är inne.
Men förberedelserna skall vara genomtänkta. (BL1)
The essential thing is not to determine a time to leave, scarcely even
to make a voyage at all; it is being able to leave when the right
time for departure comes.
But the preparations must be carefully made. (BL1T)
To sum up, must expresses strong obligation associated with the speaker’s author-
ity (deontic meaning). Have to was used in several different contexts besides
expressing participant-external obligation compelled by the circumstances. It was
used instead of must in some contexts to express more politeness. Have to was also
used with a loss of the obligation meaning to negatively evaluate an action. Swedish
få (originally with permission meaning) was used with obligation meaning alternat-
ing with måste. In all the examples with få some kind of negative evaluation was
expressed. Swedish ska(ll) makes explicit deontic meanings where the source norm
involves personal or institutionalized authority. Need to and Swedish behöva are
used for favourable evaluation with a weak deontic meaning.
Figure 1 summarises the meanings of the modal markers of obligation in fiction
texts.
Obligation
participant- participant-
internal external
must deontic imposition

need to (exhortation (by circumstances,
admonition norms)
self-imposition)
evaluation evaluation strong weak

negative positive
must should ’ll have to

have to
have to ought to
Swe får
Fig. 1 The meanings of modal obligation markers in fiction
6 Obligation Markers in English and Swedish Non-fiction
The distribution and use of obligation markers is closely associated with genre or
text type. It is therefore interesting to study them in as many different text types as
possible. The non-fiction texts were atypical in that epistemic meaning was rare (cf
Sect. 4). Moreover must/måste was more frequent than in fiction. As in fiction a
large number of (grammaticalized or lexical) expressions were used to express
obligation.
Must and måste were the most frequent obligation markers in non-fiction. Have
to is ‘marked’ in non-fiction where it is ranked below must. On the other hand need
to, should (and ought to) were strikingly more frequent in non-fiction than in
fiction.
6.1 English Obligation Markers in Non-fiction
In the non-fiction texts must, have to, should and need to are used in a similar way.
They can for example have either strong or weak impositive force depending on the
context and they can express the speaker’s favourable or unfavourable attitude to the
realization of the verbal action. Prosodies can change depending on the syntactic
environment of the marker as well as the discourse type or genre the obligation
markers appear in. This is particular clear when we make a comparison between
26 K. Aijmer
fiction and non-fiction as in this work.In non-fiction texts speakers/writers use obli-
gation markers primarily as ‘an engine of persuasion’ (Partington 2015: 280). The
markers are directed towards an event in the future which is evaluated either posi-
tively or negatively and they are used by the speaker as a strategy in order to influ-
ence a potential audience. The evaluative meaning of the obligation marker depends
on the meaning of the marker (need/behöva, should/bör have for example positive
meaning) or on extralinguistic features of the discourse. Political discourse is char-
acterized by a number of special features. According to Lewis (2015: 171), ‘it is
often very carefully crafted, every nuance being analysed, and is designed for a
wider audience than the immediate hearers; it aims to impress and persuade and
may have a hortatory function; it has a ceremonial function that favours rhetorical
routines; and above all it deals largely with unrealized affairs.’
In (24) the Swedish translator has indicated (by means of bör ‘should’) his/her
interpretation that granting periods of rest and adequate breaks will have a positive
effect on ensuring the safety and health of Community workers:
(24) Whereas, in order to ensure the safety and health of Community workers,
the latter must be granted minimum daily, weekly and annual periods
of rest and adequate breaks; whereas it is also necessary in this context
to place a maximum limit on weekly working hours; (EEA1)
För att trygga hälsa och säkerhet för arbetstagare inom gemenskapen bör
arbetstagarna ges dygnsvila, veckovila och semester av en viss minsta
längd samt tillräckliga raster. I detta sammanhang är det även nödvändigt
att sätta en övre gräns för veckoarbetstiden. (EEA1T)
Not surprisingly it is often difficult to decide whether an obligation marker is

used with a positive or negative value or when ‘its evaluative potential’ is switched
off (cf Partington 2015: 289). Example (25) represents a less typical example of
must since it refers to what is negative for the subject (he must take responsibility
for the failure of the Allied army):
(25) Montgomery himself must accept responsibility for one major Allied
misfortune at this time: he asked for, and received, the support of the US
First Army to secure his right flank. (MH1)
Montgomery får själv ta på sig ansvaret för ett av de allierades stora
misslyckanden vid denna tidpunkt. Han begärde och fick stöd av USA:s
Första armé för att säkra sin högra flank. (MH1T)
We need to is especially appropriate to express that the action imposed represents

a desired goal. The translation (måste) shows that in non-fiction the obligation
imposed is interpreted as strong (rather than as a weak) recommendation:
(26) We need to see these plans implemented as quickly as possible. (EBOW1)

Vi måste förverkliga dessa planer så fort som möjligt. (EBOW1T)
Table 5 Must , have to, need and Swedish måste with different types of subject
must have to need to måste
Collective we 48 39 10 60
Collectives (countries, institutions) 47 13 4 29
Abstract nouns 66 10 5 48
Passives 144 5 13 52
I 7 11 12
You 3 4 –
Other (including it, there, this) 31 4 1 1
Generic pronoun (one, they, everyone, people Swe ‘man’) 11 1 1 32
TOTAL 357 85 234 234
As shown by this example the obligation markers can also combine with other rhe-
torical strategies such as the use of ‘impersonalization’ (see Table 5). We in example
(26) is the collective or vague ‘we’ referring to ‘we in this country’, ‘we in the
European Union’, etc. When the grammatical subject was not we it was for example
a third person subject with a passive construction. Other examples are collective
nouns such as ‘Countries of the European Union’ or ‘Swedes living and working
abroad’. Abstract nouns are for example ‘the development of trade’, ‘evaluation’.
Table 5 only includes examples from the category speeches in the European
Parliament and political debates in English and in Swedish:
The low frequency of you as subject is noticeable. Rather than saying ‘you must’
which has a strong impositive force, the collective we (eg we need) is used as a tactic
to soften the imposition. A comparison between English and Swedish shows that
Swedish texts use a generic pronoun man which only rarely has a correspondence in
English (one).
With a passive following the modal marker and a third person subject no direct
reference is made to the speaker and hearer.3 The evaluative potential of the marker
can be exploited for persuasive effects. In (27) the use of need to helps the interpre-
tation that the action (matching the flexibility of EU member states by certain crite-
ria) is judged to be favourable (needs to be done). The imposition is only expressed
weakly since it is not directed to a special individual.
(27) The flexibility for Member States needs to be matched by a range

of indicators to identify need. (EMCC1)
Flexibiliteten för medlemsstaterna måste matchas av en grupp indikatorer
som skall identifiera behovet. (EMCC1T)
With need to the obligation is also represented as being in the best interest
of ‘us’.
3
Compare also Nokkonen (2006: 60) who describes these uses ‘as cases which are still clearly
deontic, but they are not very subjective in nature’.
28 K. Aijmer
With have to and must the obligation can also refer to something which is
regarded as negative or unpleasant. According to Lewis (2015), have to (in political
debates) makes a negative evaluation of the verbal action. Here is an example from
my data:
(28) But we have to implement the nuclear as well as the fossil fuel provisions
of that agreement.
We are taking far too long to decide whether to support the completion
of the Khmelnitsky and Rovno reactors.
I can tell the House that the number of Russian scientists and engineers
in the Khmelnitsky area has greatly increased in recent weeks.
We have to make up our minds.
Are we going to complete those reactors to Western standards or are we
going to leave it to the Russians and let the Memorandum of Understanding
go down as a dead piece of paper? (EADA1)
Men vi måste genomföra både kärnkraftssidan och den fossila bränslesidan
av den överenskommelsen.
Vi tar alldeles för lång tid på oss för att besluta huruvida vi skall stödja
färdigställandet av Khmenilitskij- och Rovno-reaktorerna.
Jag kan berätta för parlamentet att antalet ryska vetenskapsmän och
tekniker i Khmenilitskijområdet har ökat betydligt under
de senaste veckorna.
Vi måste bestämma oss.
Skall vi färdigställa de här reaktorerna med väststandard eller skall vi
lämna det till ryssarna och låta avsiktsförklaringen bli ett
dött papper? (EADA1T)
In (28) the speaker is talking about our ambivalent attitude to nuclear power.
Implementing the provisions of the nuclear agreement is however a necessary evil.
We have to make up our minds although this is unwanted.
6.2 Swedish Obligation Markers in Non-fiction
The Swedish obligation marker måste is more frequent than must is in English
reflecting the fact that it has few competitors. Like must it co-occurs with imper-
sonal subjects (no special agent is intended). Depending on the context it can indi-
cate strong or weak obligation, participant-internal and participant-external
meaning, express ‘positive’ and ‘negative’ evaluation of the action imposed. The
reference of the grammatical subject is vague. However, it is possible to present the
obligation as not being in the best interest of the general public, workers, members
of the European Union etc.:
The strength of imposition (and evaluation) is not an inherent meaning of måste
but depends on the context:
(29) Det måste bli en omprövning av de traditionella attityderna gentemot äldre

och de roller som man vill ge dem. Speciellt gäller detta på
arbetsmarknadsområdet. (EISC1)
We need to review our traditional attitudes towards senior citizens and
rethink the roles we expect them to play in society. This applies particularly
to the world of work. (EISC1T)
The translator has chosen we need which suggests a positive evaluation. It is desir-
able that we (in the European Union) review our attitudes towards senior citizens.
Måste is vague between different types of evaluative prosody. In (30) the transla-
tor has interpreted the speaker’s attitudes to the activity variously by using either
have to or must. The imposed obligation will be unpleasant for those who prefer a
quick education to a broad education or life-long learning (have to). Must, on the
other hand, implies that integrating working life and education will be for the gen-
eral good.
(30) Den andra faktorn är att vi måste se till att skaffa en utbildning som går att
använda under lång tid när vi skaffar oss en utbildning. Det måste vara en
bred grundutbildning, eftersom samhället förändras i allt snabbare takt.
Det går inte att ha snabba utbildningar. Vidare måste det också vara ett
livslångt lärande. Arbetsliv och utbildning måste helt integreras. (EAND1)
The other factor is that we must ensure that when we obtain an education
we obtain one which can be used for a long time. There has to be a broad
basic education, because society is changing ever more rapidly. It is not
possible to have a quick education. Furthermore, there has to also be
life-long learning. Working life and education must be fully integrated.
(EAND1T)
In (31) the translator has rendered får as must (strong obligation):
(31) På nuvarande kunskapsnivå får frågan om hälsoeffekter betraktas som

obesvarad. (BJ1)
At the present level of knowledge, the question of health effects must be
considered unanswered. (BJ1T)
Får is generally negative as shown by the context. The imposition of the obliga-
tion will have a negative effect on ‘us’ (members of the European Union):
(32) Vi får räkna med att biståndet i krisländerna kommer att tvingas
fungera i en korrupt miljö under lång tid. (CO1)
We must accept that development assistance in crisis countries will have to
operate in a corrupt environment for a long time to come. (CO1T)
30 K. Aijmer
Seeing every technique in the light of other techniques is regarded as negative (there
is no other alternative; it follows that it is bad):
(33) I det senare fallet skapas ett metodkomplex ur en uppsättning tekniker

(där alltså enskilda tekniker mycket väl kan modifieras och där varje
teknik får ses i relation till varje annan) mot bakgrund av ett problem
och grundläggande föreställningar. (BB1)
In the latter case a methodological apparatus is created out of a set of
techniques (where every single technique may very well be modified,
and where every technique must be looked at in the light of all other
techniques) from the background of a problem and in the light
of basic conceptions. (BB1T)
Behöver can be used with the same meaning as ‘strong’ obligation markers:
(34) Under de första månaderna då risken för avstötning och infektion efter
organtransplantation är som störst behöver patienterna undersökas
polikliniskt en till två gånger per vecka. (ORG1)
During the first few months, when the risk of rejection and infection after
organ transplantation is greatest, the patients must be examined at the
outpatient clinic once or twice a week. (ORG1T)
While måste is vague between many different interpretations (it does not for
instance refer to a specific source) skall refers to an institutionalized source norm
such as legal regulations:
(35) Arbetslokal skall vara så utformad och inredd att den är lämplig från
arbetsmiljösynpunkt. (ARBM1)
Work facilities must be arranged and equipped in such a way as to provide
a suitable working environment. (ARBM1T)
Figure 2 summarises the meanings of the modal obligation markers in

non-fiction.
7 Conclusion
The study raises a number of issues having to do with genre and with on-going
changes in the modal system in English reflected in functional overlaps. In my data
the obligation markers were the same in fiction and non-fiction but we also saw
some genre preferences. Have to was for example more frequent in fiction than in
non-fiction (which is in line with its taking over some of the functions of must). In
both English and Swedish the obligation markers can be associated with specific
contexts of use. In fiction the obligation markers were frequent with first and second
person subjects. Must was used for exhortations (speech acts implying a high degree
Obligation
participant- participant-
internal external
––
deontic Imposition
(by circumstances,
norms)
evaluation evaluation strong weak

negative positive must
––
must must have to
have to need to need to
Swe får should Swe måste
Swe ska/ll ska/ll
behöver
Fig. 2 The meanings of the modal obligation markers in non-fiction
of insistence). Have to could be softened or hedged by will. It was therefore used if

the action imposed could be evaluated negatively (as going against the hearer’s will
or threatening the hearer’s face). Have to was used in many different contexts for
example with generic subjects and instead of must to evaluate something
negatively
Obligation markers such as must, have to, need to ‘carry with them a set of sug-
gestions on how to use them’ positively or negatively (Partington 2015: 292).
‘Prosodies’ can depend on the text type or genre; they can depend on the particular
marker or the environment in which the marker is used. In the non-fiction texts the
obligation markers are generally found in positive contexts (cf Lewis 2015).
Speakers use obligation markers to recommend an action as being for the ‘general
good’. By using impersonal structures with a collective we as the grammatical sub-
ject or an agentless passive the speaker can get the message across to the hearer with
maximum hedging. Such genre-related developments or uses can be viewed as ten-
dencies rather than rules.
Need to, should and their Swedish correspondences express ‘positive obligation’
(the action will be favourable to the hearer/audience if actualized). Must usually
pointed forwards to something desirable in the context of the EU debates. Have to,
on the other hand, is also found in negative contexts in the non-fiction data (the
obligation imposed by the circumstances leaves no choice which results in a nega-
tive evaluation). Swedish måste was used both about positive and negative obliga-
tion. In Swedish få was an alternative to måste when the imposition is not in the
hearer’s interest (from this follows that it is usually bad).
The method used in this study has involved going from form-to-function since I
have started with must and looked at its translation into another language. The trans-
lation approach gives a rich representation of the elements which are part of the
32 K. Aijmer
semantic field and their frequencies. A complementary approach which should be

explored in the future is to start with function and describe how this function is real-
ized by modal obligation markers in comparable registers in the two compared
languages.
References
Altenberg, B., & Aijmer, K. (2001). The English-Swedish parallel corpus: A resource for contras-
tive research and translation studies. In C. Mair & M. Hundt (Eds.), Corpus linguistics and
linguistic theory. Papers from the 20th International Conference on English Language Research
on Computerized Corpora (ICAME 20) Freiburg im Breisgau 1999 (pp. 15–33). Amsterdam/
Philadelphia: Rodopi.
Altenberg, B., Aijmer, K. and M. Svensson. 2001. The English-Swedish Parallel Corpus (ESPC).
Manual of enlarged version. http://www.sol.lu.se/engelska/corpus/corpus/espc.html#size
Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press.
Coates, J. (1983). The semantics of the modal auxiliaries. London: Croom Helm.
Leech, G., Hundt, M., Mair, C., & Smith, N. (2009). Change in contemporary English: a gram-
matical study. Cambridge: Cambridge University Press.
Lewis, D. (2015). A comparable-corpus approach to the expression of obligation across English
and French. Nordic Journal of English Studies, 14(1), 152–173.
Myhill, J., & Smith, L. A. (1995). The discourse and interactive function of obligation expressions.
In J. Bybee & S. Fleischman (Eds.), Modality in grammar and discourse (pp. 239–292).
Amsterdam: John Benjamins.
Nokkonen, S. (2006). The semantic variation of NEED TO in four recent British English corpora.
International Journal of Corpus Linguistics, 11(1), 29–71.
Nuyts, J. (2016). Analyses of the modal meanings. In J. Nuyts & J. van der Auwera (Eds.), The
Oxford handbook of modality and mood (pp. 31–49). Oxford: Oxford University Press.
Partington, A. (2015). Evaluative prosody. In K. Aijmer & C. Rühlemann (Eds.), Corpus pragmat-
ics. A handbook (pp. 279–303). Cambridge: Cambridge University Press.
Teleman, U., Hellberg, S., & Andersson, E. (1999). Svenska Akademiens grammatik. Stockholm:
Norstedt.
van der Auwera, J., & Plungian, V. A. (1998). Modality’s semantic map. Linguistic Typology, 2,
79–124.
English so and Dutch dus in a Parallel Corpus:
An Investigation into Their Mutual
Translatability
Lieven Buysse
Abstract English so and Dutch dus have been characterised as the highly frequent
prototypical markers of ‘result’ or ‘inference’ in their respective languages. This
study investigates the functional scope of both based on close scrutiny of the
bidirectional Dutch-English component of the Dutch Parallel Corpus, a 10 million
word sentence-aligned corpus of translated texts. Seven functions are identified in
the ideational, interpersonal and textual domains. The mutual correspondence of the
two markers is mapped in a combined quantitative and qualitative analysis of how
they are translated into the target language for each of their functions, as well as of
how they are backtranslated (e.g. which Dutch forms have so as their translation
equivalent in English?). The results show a high overall correspondence, yet with a
slight translation bias, in that the correspondence is considerably higher when so is
translated into Dutch than when dus is translated into English.
Keywords Pragmatic markers • Parallel corpus • Translation • Dutch • English •

so • dus
1 Introduction
Pragmatic markers are typically words or phrases that do not belong to the proposi-
tional message of an utterance, and are hence semantically and syntactically
optional, but contribute to it in various subtle ways, such as by expressing speakers’
attitudes to their interlocutors or to the message, or by making plain which relations
hold between an utterance and its co-text or context. Ever since Schiffrin’s (1987)
seminal work on “discourse markers” these linguistic items have featured highly on
the agenda of researchers in pragmatics and discourse analysis. One of the most
daunting challenges in the description of pragmatic markers is posed by their
polyfunctionality, which raises questions as to their functional scope as well as to
L. Buysse (*)
Faculty of Arts, KU Leuven, Brussels, Belgium
e-mail: lieven.buysse@kuleuven.be

DOI 10.1007/978-3-319-54556-1_3
34 L. Buysse
the (inter)relatedness of their functions. Translations of pragmatic markers have

been suggested as one way towards resolving such issues (see e.g. Aijmer and
Simon-Vandenbergen 2003; Aijmer et al. 2006; Johansson 2006; Degand 2009).
Although more and more languages have been involved in the domain, Dutch has
largely remained a blind spot (yet see more recent work by Hogeweg (2009) and
Niemegeers (2009) on the modal particles wel and maar, and Buysse (in press) on
the Dutch equivalents of English question tags). There are, nevertheless, linguistic
items in Dutch that closely resemble forms with at first sight similar functions in
other languages. One of these is dus, which appears to function in much the same
way as so does in English. The current study, therefore, sets out to juxtapose these
two forms in a bidirectional parallel corpus with Dutch and English source and tar-
get texts as an initial exploration into their functional similarities and differences.
Both so and dus have an ambiguous grammatical status (Biber et al. 1999;
E-ANS 2012). Like co-ordinating conjunctions, they can join two main clauses, as
in (1). Like linking adverbials, on the other hand, they can occur sentence-initially
as well as co-occur with other coordinators, as in (2) and (3) respectively.
(1) Ik ben niet veel geniale mensen tegengekomen in mijn leven, dus ik weet
waarover ik praat als ik hem geniaal noem. [‘…, dus I know what I’m
talking about when I call him genious.’]
I haven’t met many brilliant people in my life, so I know what I’m talking
about when I call him brilliant.1
(Fiction, grue-002593, Dutch-English)
(2) So we expect some understanding from ArcelorMittal in this respect too.

Dus ook op dat vlak verwachten wij enig begrip van ArcelorMittal.
[‘Dus also in that respect we expect some understanding from
ArcelorMittal.’]
(External communication, arc-002044, English-Dutch)
(3) That there, there were only the stones in, the walls to hear me – and herself,
who they kept dumb as a stone, and so could tell no-one.
Dat daar alleen de stenen in de muren me konden horen – en zijzelf, die
moet zwijgen als een steen, en dus niets kan doorvertellen. [‘…, en dus can’t
pass anything on.’]
(Fiction, wat-002588, English-Dutch)
The main formal difference between so and dus is the position they can occupy
in the clause. In all previous examples both take clause-initial position, which is the
1
Unless stated otherwise, all examples have been drawn from the Dutch Parallel Corpus (see Sect.
2). For these examples the source text fragment appears first and is followed by the target text frag-
ment. An additional literal translation has been provided in square brackets for the Dutch clause in
which the relevant marker occurs. Each example ends with the basic metadata in rounded brackets:
the text type, the text number in the corpus, and the translation direction).
English so and Dutch dus in a Parallel Corpus 35
sole position for so. Dus, however, also takes mid-position, as exemplified in (4),
and even occupies clause-final2 slots, as in (5).
(4) De hoge positie van Nederland wordt dus verklaard door het lage percentage
leerlingen dat onder niveau 2 scoort.
[‘The high position of the Netherlands is dus explained by the low
percentage of pupils that score below level 2.’]
(External communication, vla-001191, Dutch-English)
(5) Groot mag dus, als het maar opvalt en origineel is.
[‘Large is okay dus, as long as it stands out and is original.’]
(External communication, wst-000768, Dutch-English)
On a semantic level too the two markers show similarities and differences. They
have both been attributed the status of prototypical causal or resultative markers,
and have been attested as highly frequent items in spoken as well as written lan-
guage. So has particularly been studied as a pragmatic marker in discourse-functional
approaches (see e.g. Schiffrin 1987; Redeker 1990; Müller 2005; Lam 2009; Buysse
2012), whereas dus has received most attention from cognitive linguists (see e.g.
Pander Maat and Sanders 1995, 2000; Pander Maat and Degand 2001; Stukker et al.
2008), who have been particularly interested in mapping coherence relations in the
causal domain. One main assumption is clearly shared by both research strands, viz.
that causal markers can function in three domains: the ideational, interpersonal, and
textual domains.3 Ideational relations connect states of affairs in the world described
in the discourse, as in sentence (a) in (6), where so relates a state of affairs (he is
home) to another state of affairs from which it results (he is sick): John’s being home
was caused by his being sick. In the interpersonal domain markers relate “the illo-
cutionary meaning of one of the discourse units with the locutionary meaning of the
other” (Degand 2001: 79). In sentence (b), for example, the second proposition is a
claim inferred from the state of affairs expressed in the first segment: the speaker
claims that John is home on the basis of the observation that his lights are burning.
Textual relations, finally, are discourse-organising relations (e.g. a list or a digression),
which may also take the form of a speech act, as in sentence (c), where the first
proposition sparks a request for information in the second.
2
So too has been observed to occur in a position resembling clause-final slots, such as turn-final
position. In such cases so indeed does not explicitly preface a segment, but an implied segment can
be retrieved from the context (Schiffrin 1987; Müller 2005; Buysse 2014), which is clearly differ-
ent from clause-final tokens of dus.
3
The terminology used here is that of Halliday and Mathiessen’s (2004) metafunctions. Note that
many studies on dus have followed Sweetser’s (1990) terminology, distinguishing between con-
tent, epistemic and speech act domains (e.g. Pander Maat and Sanders 2000; Stukker et al. 2008),
and that other proposals have also been put forward such as Redeker’s (2006: 354) “components
of discourse coherence” (ideational, rhetorical, sequential). For our present purposes the finer
details of these approaches and their mutual differences are of minor relevance.
36 L. Buysse
(6) a. John is sick, so he is home.

b. John’s lights are burning, so he is home.
c. John’s lights are burning. So is he home?
(Schiffrin 1987: 211)
The distinction between sentences (a) and (b) corresponds to the well-established
theoretical division between semantic and pragmatic relations (cf. Van Dijk 1979).
Applied to so the distinction can be interpreted as one between so marking a relation
of ‘result’ versus marking one of ‘inference’ (Schiffrin 1987; Müller 2005; Buysse
2012).
As for the textual domain, quite a few studies on so have focused on particular
discourse functions or highly specific contexts. Johnson (2002), for example, looks
at so-prefaced questions in police interviews, whereas Norrick (2008) identifies so
as a conversational response token in a story-telling context, and Bolden (2009)
examines its potential to mark an utterance as having been on the conversational
agenda for some time. More comprehensive functional mappings of so, such as
those devised by Müller (2005), Lam (2009), and Buysse (2012), have exposed a
wide variety of textual functions, including marking a summary, signalling a return
to the main discourse unit (e.g. after a digression), initiating (a part of) a conversa-
tion, starting a new sequence in a story, and marking self-correction.
Observations of similar functions for dus have been scarce. Nevertheless, Evers-
Vermeul (2010) discusses two “discourse marker” uses of dus (2010: 153), both of
which are to do with information status. First, dus may indicate that the information
contained in the segment in which it occurs is already somehow available to the
interlocutor (Evers-Vermeul 2010: 161); second, it may have a double function of
marking a conclusion as well as marking that the conclusion is obvious (since it can
be inferred from the prior co-text) or logical (since it is the only sensible conclusion
one could draw) (2010: 167). In another study Degand (2011) identifies two meta-
discursive functions of clause-final dus in spoken Dutch, viz. reformulation and
floor-yielding, both of which have also been noted for so.
Similar to most other pragmatic markers in English, yet unlike dus in Dutch, so
does not only appear as a pragmatic marker, but may also appear, as Müller (2005)
points out, as an adverb of degree or manner (e.g. she’s so great), a pro-form (e.g. I
think so), in fixed expressions (e.g. and so on), and to express purpose (often in the
form of so that).4
In short, the different angles from which so and dus have been approached in
previous research do not allow for a systematic comparison of these two markers
that nonetheless appear to exhibit many formal and functional resemblances, not in
the least their status as prototypical markers of ‘result’ or ‘inference’ in their respec-
tive languages. The aim of the present study is, therefore, to map functional simi-
4
Note, though, that diachronic research has shown that dus used to have an anaphoric function with
a meaning similar to thus or in this way, which gradually got lost between the 16th and 19th cen-
tury (Evers-Vermeul 2010).
larities and differences between so and dus by looking into their mutual translatability
in a parallel corpus.
Section 2 describes the data used for this investigation. In Sect. 3 the functional
translation correspondence of dus and so is discussed by focusing on each of the
markers’ functions as attested in the corpus. A quantitative analysis of the corre-
spondence between dus and so is provided in Sect. 4, with conclusions drawn in
Sect. 5.
2 Data
The data for this study have been extracted from the bi-directional Dutch-English
component5 of the Dutch Parallel Corpus or DPC (Macken et al. 2011; Paulussen
et al. 2013), a sentence-aligned corpus of translated texts. The texts that constitute the
DPC were published between 1997 and 2009, and belong to five text types: fictional
and non-fictional literature, journalistic texts, instructive texts, administrative texts,
and external communication (such as press releases, brochures and corporate maga-
zines). The corpus has been balanced proportionally for translation direction as well
as for text type, resulting in 500,000 words for each text type for each translation
direction (e.g. Dutch-English and English-Dutch), which amounts to a corpus of 5
million words for the purposes of this study. All instances of so and dus were extracted
automatically from the corpus (together with the aligned target-text sentences), and
subsequently checked manually to remove any double entries and irrelevant tokens.
The latter especially pertained to those instances where so is an adverb of manner or
degree, a pro-form, part of a fixed expression, or a marker of purpose (see Sect. 1). As
this is a bi-directional corpus, different angles can be looked at. It is not only possible
to consider how so has been translated from English into Dutch (and dus from Dutch
into English), but also backtranslations have been taken into account, which means
that I have searched for so in English target texts and traced the correspondences for
these tokens in the Dutch source texts (and vice versa for dus).
The main drawback of working with a corpus of translated texts is their inherent
bias for written registers while pragmatic markers are rather more typical of spoken
registers. As Johansson (2006) aptly points out, though, the target texts are the result
of a thorough process of translation in which translators have independently inter-
preted source texts, and “[w]hat we are studying is the result of this interpretation
(and recreation) process” (2006: 117). This can shed light on how a pragmatic marker
functions in the source language, and how this function can be conveyed in the target
language. Moreover, many texts in the DPC may have appeared in written mode, but
were either meant to be spoken or otherwise reflect spoken language. Many admin-
istrative texts, for example, are transcripts from meetings at the European Parliament
5
The DPC also has a bidirectional Dutch-French component, but this was not included in the pres-
ent study.
38 L. Buysse
or official speeches by government ministers. The fiction component of the corpus in

its turn contains many instances of direct speech (which has already proved a valu-
able resource in studies by Aijmer and Simon-Vandenbergen 2003; Johansson 2006;
Degand 2009; Denturck and Vandepitte 2009), as do the journalistic texts, many of
which are interviews or columns. It should also be noted that, contrary to markers
with a predominant interpersonal function (such as well or you know), so and dus are
highly frequent both in spoken and written language. Nonetheless, it should be
borne in mind that (i) some functions of these pragmatic markers may not be attested
in the present corpus because they are particular to specific spoken registers, and that
(ii) the texts show a bias towards more formal registers.
3 Functional Translation Correspondence of Dus and So
In the DPC 697 tokens of so and 1229 of dus fulfilling a pragmatic marker function
have been identified and analysed. These tokens fall into seven functional categories
(Table 1), each of which will be discussed in this section. The categories are largely
those identified in Buysse (2012) for so, adjusted on the basis of the present corpus
analysis as the original classification was based on spoken data whereas the DPC
consists mainly of written data. One category has been added that is specific to dus,
viz. Reiteration.
3.1 Mark a Result
Schiffrin (1987: 191) describes so as a “marker of result”, clearly indicating that this
meaning relation is at the heart of so’s functional spectrum. Subsequent investiga-
tions (Fraser 1990; Müller 2005; Lam 2009; Buysse 2012) have confirmed this. Dus
has been claimed to be more typical of “epistemic” relations (i.e. inferential rela-
tions) than of “content causal” (i.e. resultative) relations (e.g. Pander Maat and
Sanders 2000; Pander Maat and Degand 2001; Stukker et al. 2008, 2009), although
Table 1 Functions of so and dus in the Dutch Parallel Corpus

so dus
Function N % N %
1. Mark a result 218 31.3 196 16.0
2. Mark inferential relations 98 14.1 338 27.5
3. Draw a conclusion on a textual level 162 23.2 320 26.0
4. Boundary marking 96 13.8 41 3.3
5. Start a new sequence 63 9.0 34 2.8
6. Elaboration/restatement 60 8.6 265 21.6
7. Reiteration 0 0.0 35 2.8
Total 697 100.0 1229 100.0
it does occur in such resultative contexts as well. In (7), for example, Broccoli’s
poor vision was caused by the fact that he was not wearing his glasses, and in (8) the
fact that they cannot chop wood is caused by the absence of chainsaws.
(7) Broccoli had zijn bril niet op, dus hij kon niet goed zien. [‘…, dus he could
not see well.’]
Broccoli wasn’t wearing his glasses, so he couldn’t see much.
(Fiction, gru-002593, Dutch-English)
(8) They have no chainsaws, so they cannot chop wood.

Ze hebben geen kettingzagen, dus kunnen ze geen hout hakken. [‘…, dus
they can’t chop any wood.’]
(Journalistic texts, ind-001746, English-Dutch)
Stukker et al. (2008) contend that when dus is used outside of its habitual epis-
temic (or inferential) context the rhetorical effect of “speaker foregrounding” (2008:
1306) is produced: since epistemic relations inherently involve cognitive processes,
in using dus the speaker/author indicates that s/he is somehow involved in the estab-
lishment of the relation between the segments. For example, in excerpt (7) the two
segments are clearly causally related, yet by the mere use of dus the perspective
shifts subtly in that the second segment could be viewed as the speaker/author’s
personal observation at the time of speaking/writing rather than as an objective
report of a past event. Similarly, in (8) the dus-prefaced proposition is presented as
the speaker/author’s observation rather than as an objective statement of fact.
The ‘resultative’ category takes up 28.6% (N = 93) of all tokens of so that have
been translated into Dutch, and 33.6% (N = 125) of its backtranslations.6 The rates
for dus are considerably lower, with only 13.8% (N = 124) of its translations into
English and 21.6% (N = 72) of its backtranslations. On the whole, so and dus are by
far each other’s preferred translation correspondents in the resultative category
(Table 2).7 It would seem that this correspondence is tighter when English is the
source language: if zero correspondence is ignored (viz. all source text tokens that
do not have a correspondent in the target text and vice versa), a majority of
‘resultative’ tokens of so are translated with dus, and two thirds of target text tokens
of ‘resultative’ dus have so as their correspondent; with translations in which Dutch
6
For want of precise numbers as to the overall size of the various components of the DPC (e.g.
translations Dutch-English, translations English-Dutch, etc.) absolute numbers cannot be normal-
ized to e.g. 1000 words. Instead the percentages have been calculated within each component.
7
Tables 2, 3, 4, 5, 6, 7 and 8 always mention the three most frequent correspondents. The other
markers are summarized in a single line, indicating how many other markers there are and what the
total frequency of this group of markers is. For example, in Table 2so is translated by 8 other mark-
ers than those making up the top three, totalling 14 tokens of such other markers. If several markers
in the top three have the same frequency, they are all mentioned in the same rank. For example,
Table 3 indicates that daarom and zodat each occur seven times (“2x7”), and that makes them the
second most frequent correspondents of so in Dutch target texts. The percentages are based on the
total number of tokens minus zero correspondences.
40 L. Buysse
Table 2 Top three of correspondents of ‘resultative’ so and dus in translations and backtranslations
in absolute numbers (N) and adjusted percentages (excluding tokens with zero correspondent)
so> Dutch target text Dutch source text >so
marker N Adj.% marker N Adj.%
dus 40 54.1 dus 43 37.7
daarom (‘that’s why’), zodat (‘so 2× 2 × 9.5 daarom (‘that’s why’) 20 17.7
that’) 7
en (‘and’) 6 8.1 zodat (‘so that’) 9 8
8 other 14 18.9 9 other 42 37.2
zero 19 zero 11
Total 93 Total 125
dus> English target text English source text >dus
so 43 45.3 so 40 65.6
therefore 23 24.2 therefore 6 9.8
consequently 7 7.4 as such, hence, since, 4 × 4 × 3.3
thus 2
10 other 22 23.2 7 other 7 11.5
zero 29 zero 11
Total 124 Total 72
serves as the source language the shares are still very high (37.7% and 45.3%) but
also considerably lower than when English is the source language.
3.2 Mark Inferential Relations
An inference can be paraphrased as: “from state of affairs X I conclude the follow-
ing: Y” (Buysse 2012: 1768). It has been well established that both so and dus are
considered the prototypical markers of inference in their respective languages (see
Sect. 1). For example, in (9) the speaker/author deduces from the state of someone’s
fingernails in the first segment that this person cannot hold an occupation on the
land or in a factory in the second segment. The difference with a ‘resultative’ rela-
tion is clear: the fact that someone is not working on the land or in a factory is not
caused by their long fingernails.
(9) Fingernails rather long, so you’re not on the land or in a factory.

Vrij lange nagels, dus u werkt niet op het land of in een fabriek.
[‘…, dus you don’t work on the land or in a factory.’]
(Fiction, wat-002589, English-Dutch)
Dus has the ability to occur in a particular context that is alien to so, by stating
an inferential conclusion that is presented as obvious in that the reader/interlocutor
is expected to rely on shared background knowledge to retrieve the grounds on
which the conclusion is based. Evers-Vermeul (2010) refers to this function of dus
as that of an “accessibility marker” (2010: 171), labelling the stated conclusion as
accessible to the hearer/reader regardless of its occurrence in the discourse. For

example, in (10) dus is meant to have the recipient of the message infer from what
they know of the Iraq situation (after the invasion by the so-called coalition of the
willing) that the conditions for a successful British approach were never met.
Interestingly, the marker was added by the translator in this case.
(10) While the Americans were unleashing mayhem to the north, the British
were methodically applying Lugard-style colonialism in Basra. They
formed alliances with sheikhs, bribed warlords and won hearts and minds
by going unarmoured. There was optimism in the air. British policy
demanded one thing, momentum towards local sovereignty and early
withdrawal. There was no such momentum.
Terwijl de Amerikanen herrie schopten in het noorden, pasten de Britten in
Basra heel methodisch oude koloniale principes toe: ze gingen allianties aan
met plaatselijke sjeiks, kochten krijgsheren om en wonnen de sympathie van
de bevolking door zich in ongepantserde voertuigen op straat te begeven. De
Britse aanpak vereiste maar een ding: dat er werk gemaakt werd van Iraakse
soevereiniteit en van een vroege terugtrekking der troepen. Maar dat is
erdusnooit van gekomen. [‘But that dus never happened.’]
(Journalistic texts, sta-002559, English-Dutch)
In (11) this inferential prompt is even more outspoken. The excerpt has been taken
from a statement by the Dutch Prime Minister, responding to the murder of a contro-
versial politician earlier that day. Halfway through the statement he points out that it
reflects his personal sentiments (rather than his Cabinet’s). Dus presents this state-
ment as obvious because it can be inferred from, for example, the tone the speaker
has used so far or from the situation in which the statement is being delivered.
(11) Maar dat alles schiet natuurlijk door je kop op een moment, zoals nu, dat dat
nieuws tot je komt. Dat je er steeds meer van doordrongen bent van wat er in
Nederland is gebeurd. In Nederland, een verdraagzaam land, met natuurlijk
politieke tegenstellingen, zoals in iedere democratie. Dat is democratie. Maar
wel met respect voor elkaar, respect voor elkaars mening. Respect voor
elkaars mening houdt natuurlijk ook in dat je elkaar daarop kunt bestrijden,
maar met woorden, niet met kogels. Wat hier gebeurd is, is onbeschrijflijk.
Dit zijndusmijn persoonlijke ontboezemingen. [‘These are dus my personal
sentiments.’] Ik kan het niet anders zeggen. Ik ben er kapot van.
But it all runs through your mind at a moment like this, when you hear news
like this. As it begins to sink in that this has happened in the Netherlands, a
tolerant country, with differences of political opinion, of course, like any
democracy. That is the nature of democracy. But with respect for each other,
respect for each other’s views. Respecting each other’s views means of
course that you can come into conflict with each other, but with words, not
with bullets. What has happened here is indescribable. These are my
personal feelings. I cannot say it any other way. I am devastated.
(Administrative texts, kok-001321, Dutch-English)
42 L. Buysse
Table 3 Top three of correspondents of ‘inferential’ so and dus in translations and backtranslations
dus 32 76.2 dus 35 85.4
Daarom (‘that’s why’), 3× 3 × 4.8 dan ook (‘as a result’) 2 4.9
daardoor (‘because of that’), 2
dan (‘then’)
en (‘and’), op dezelfde wijze (‘in 4× 4 × 2.4 ook (‘also’), vandaar dat 4 4 × 2.4
this way’), waardoor (‘by 1 (‘hence’), zo (‘well’), ×
which’), wat maakt (‘which zodat (‘so that’) 1
makes’)
zero 9 zero 6
Total 51 Total 47
therefore 88 47.6 so 32 50.8
so 35 18.9 therefore, thus 2 2 × 15.9
×
10
thus 32 17.3 and 3 3 × 4.8
16 other 30 16.2 8 other 8 12.7
zero 58 zero 32
Total 243 Total 95
The inferential category accounts for 15.7% (N = 51) of source text tokens of so,
and 12.6% (N = 47) of its backtranslations. These ratios are considerably higher for
dus, with 27.1% (N = 243) of its source text tokens indexing an inferential relation
as well as 28.5% (N = 95) of its backtranslations. As Table 3 shows, dus is almost
the exclusive correspondent for so, both for its translations into Dutch and for its
backtranslations. A similar situation holds for the backtranslations of dus, but not
for the inferential tokens of dus that are translated into English, where so changes
places with therefore. Interestingly, 23.9% (N = 58) of inferential tokens of dus are
not translated into English at all, and 33.7% (N = 32) of its backtranslations were
added by the translator as there is no corresponding marker in the English source
text. This is less the case for so with 17.7% (N = 9) of translations and 12.8% (N =
6) of backtranslations not having a correspondent.
3.3 Draw a Conclusion on a Textual Level
Apart from marking an inferential type of conclusion between two propositions, so

and dus may also draw a conclusion based on larger stretches of discourse, indicat-
ing that the grounds for the claim contained in the segment that they mark should be
retrieved from a group of segments in the prior co-text. This may take the form of
(i) a summarizing conclusion stating the upshot of that section of the discourse, (ii)
an opinion that rests on preceding argumentation, or (iii) a request motivated in the
prior discourse.
First, the summarizing function has been attributed to so in prior research (cf.
Redeker 1990; Müller 2005; Buysse 2012), and can be confirmed for dus as well in
the DPC. The idea that a successful approach to the climate issue rests on two pillars
has been developed at length in the prior co-text of example (12). To bring this
section of the text to a close this argumentation is summarized to a single claim,
marked with dus.
(12) Naar mijn overtuiging is energie de alfa en omega van de discussie. Energie
is de motor achter ontwikkeling. De drijvende kracht achter een beter leven
voor honderden miljoenen mensen. In developing countries weten grote
groepen mensen met hard werken een betere toekomst te verwerven voor
zichzelf en hun kinderen. Zij ontsnappen aan de armoede en de ellende. Dat
gaat niet zonder energie. Het International Energy Agency schat dat het
energiegebruik van de developing countries de komende halve eeuw met
230 procent zal stijgen. Zij zijn dan goed voor meer dan de helft van het
wereldwijde energiegebruik. Elke succesvolle aanpak van het
klimaatprobleem dientdusgebaseerd te zijn op twee pijlers: energie is de
sleutel en nauwe samenwerking tussen developed en developing countries
is cruciaal.
[‘Any successful approach to the climate problem ought to be dus based on
two pillars: …’]
In my opinion, the whole discussion revolves around energy. Energy is the
driving force behind development. The key to a better life for hundreds of
millions of people. In developing countries, large numbers of people work
extremely hard to secure a better future for themselves and their children.
That’s how they escape poverty and hardship. But it all costs energy. The
International Energy Agency expects that total energy consumption in the
developing world will rise by 230% over the next 50 years. That will be
more than half of total global consumption. So, any successful approach to
climate change must be built on two central ideas: one: energy is the key,
and two: close cooperation between developed and developing countries is
essential.
(Administrative texts, bal.-001248, Dutch-English)
The second type of textual conclusion does not summarize what precedes as
much as it takes the prior co-text as the grounds to voice an opinion and thereby end
a section or turn. Excerpt (13) has been drawn from the proceedings of the European
Parliament. In a debate on immigration policy an MEP describes the situation and
ends with a so-prefaced segment stating that countries ought to work together to
address the challenges sketched in the prior co-text.
44 L. Buysse
(13) As Mr. Duquesne implied in his contribution, if the recent happenings have
shown us anything, it is that we cannot turn a blind eye to events around the
world and hope that they will go away. The problems of people wanting
asylum, the situation and the plight of the people of poor and troubled
countries all over the world are our concerns and they manifest themselves
on our doorsteps, on our shores and in our parliaments if we do not address
them. (…) This is the plight of desperate people seeking desperate
measures to start a new life. But these people are not resorting to this sort
of action lightly; they are escaping from terror, war, torture, rape, vile
regimes posing as governments and, of course, in some cases, poverty.
Sothere can be no more appropriate time for countries to be working
together to confront these humanitarian challenges.
De heer Duquesne zei het al in zijn bijdrage: als de recente gebeurtenissen
ons iets hebben geleerd, is het dat we gebeurtenissen in de wereld niet
zomaar kunnen negeren en dan maar hopen dat het probleem vanzelf
verdwijnt. De problemen van asielzoekers, de erbarmelijke omstandigheden
waarin mensen in arme en noodlijdende landen over de hele wereld
verkeren, zijn ook onze zaak en als we er niets aan doen, zullen we er van
dichtbij, aan onze eigen kusten en in onze eigen parlementen, mee worden
geconfronteerd. (…) Zo wanhopig zijn deze mensen dat ze op deze
hachelijke wijze een nieuw leven willen beginnen. Maar deze mensen doen
dit niet zomaar; ze zijn op de vlucht voor terreur, oorlog, martelingen,
verkrachting, verachtelijke regimes die zich regering noemen, en in
sommige gevallen natuurlijk ook armoede. Er isdusgeen beter moment
voor landen om deze humanitaire problemen met vereende krachten aan te
pakken. [‘There is dus no better moment for countries to address these
humanitarian problems with joint forces.’]
(Administrative texts, erp-000443, English-Dutch)
Third, in interaction so can also mark a speech act of request (cf. Fraser 1990;
Schiffrin 1987; Müller 2005), in which case it relates the request (or, by extension,
a directive) to a preceding motivation or justification. In the DPC dus takes on this
role as well, but contrary to the other two types of textual conclusion, so and dus are
rarely each other’s correspondents in this function. Apart from zero correspondence,
therefore is the most likely alternative in English, and in Dutch dan ook (literally
‘then also’, which translates best into English as ‘therefore’ or ‘hence’) and daarom
(‘that is why’) stand out.
Typically, a lengthy turn is rounded off with a call to (specific members of) the
audience to perform an action based on the argumentation developed in the prior
co-text, as in (14) and (15), both of which have also been taken from a parliamentary
debate. The speech act tends to be explicitly marked by phrases such as I urge you/
ik dring er bij u op aan, I call upon you/ik doe een beroep op u, I ask/ik vraag, etc.
(14) I am glad that the rapporteur has eventually agreed that we need a
compromise on agriculture, so I urge you all to vote for Amendment No 11.
Ik ben blij dat de rapporteur er uiteindelijk mee heeft ingestemd dat er een
compromis nodig was met betrekking tot de landbouw. Ik dring erdan
ookbij u op aan om vóór amendement 11 te stemmen. [‘I urge dan ook you
to vote in favour of amendment 11.’]
(Administrative texts, erp-000447, English-Dutch)
(15) Daarom, collega’s moeten wij ervoor zorgen dat wij voldoende hulp aan
deze regio bieden. En wat doen we? In de begroting 2002 schroeven we de
begrotingsmiddelen voor deze regio terug. Ik roepdusde collega’s van de
Begrotingscommissie en de hele plenaire vergadering op om die begroting
weer recht te zetten en aan Centraal-Azië te geven waar het recht op heeft.
[‘I call dus on the colleagues of the Budgets Committee and the entire
plenary session to rectify this budget and give Central Asia what it is
entitled to.’]
That is why we need to ensure that we lend sufficient support to that region.
And what do we do? In the 2002 budget, we cut back the budgetary
resources allocated to that region. Ithereforeurge the MEPs of the
Committee on Budgets, together with the entire plenary session, to rectify
that budget and to give Central Asia what it is entitled to.
(Administrative texts, erp-000450, Dutch-English)
Both so and dus mark a textual conclusion in over one fifth of their translations
into the other language (with respectively 26.5%, N = 86, and 27.5%, N = 246). In
backtranslations the shares for this function amount to over one quarter (20.4%, N
= 76 for so; 22.2%, N = 74 for dus). The unchallenged preferred correspondent for
so in translations into Dutch as well as backtranslations is dus (Table 4). This is
reciprocated for the backtranslations of dus, but not for its translations into English,
where therefore ranks highest, accounting for a majority of translations, followed by
over one fifth for so.
46 L. Buysse
Table 4 Top three of correspondents of so and dus as textual conclusion markers in translations
and backtranslations in absolute numbers (N) and adjusted percentages (excluding tokens with
zero correspondent)
dus 42 59.2 dus 45 66.2
daarom (‘that’s why’) 11 15.5 daarom (‘that’s why’) 12 17.7
dan ook (‘as a result’) 8 11.3 dan ook (‘as a result’) 8 11.8
7 other 10 14.1 3 other 3 4.4
zero 15 zero 8
Total 86 Total 76
therefore 106 52.2 so 42 68.9
so 45 22.2 thus 8 13.1
thus 17 8.4 therefore 3 4.9
17 other 35 17.2 6 other 8 13.1
zero 43 zero 13
Total 246 Total 74
3.4 Marking Boundaries Between Discourse Sections
Boundary markers are signposts for and between larger units in the discourse, and
as such help the recipient of the message to follow the thread of the discourse. In the
DPC we can distinguish three types of boundary markers: (i) pivotal transitions
between adjacent sections, (ii) return to the main discourse unit, and (iii) introduce
questions.
First, so and dus can mark boundaries between adjacent discourse sections,
introducing sentences that serve a transitional or pivotal goal between two larger
discourse segments. For example, in (16) the author rounds off his discussion of a
first problem in his exposé with a transitional sentence, marked by dus/so, before
moving on to issues of secondary importance, whereas in (17) a new section is
started with a transitional sentence that refers back to the previous section.
(16) Als we diverse kustlijnen in Europa bekijken (…) dan liggen op een afstand
van een paar honderd kilometer twee, misschien wel zes, zeven, acht hele
grote havens waar de concurrentie vooral voortspruit uit die afstand tussen
de havens. Die is namelijk kort. (…) Dit leidt echter tot een valse
concurrentie waarvan we eigenlijk niet gediend zijn. Het gaatdusin eerste
instantie om dat probleem. [‘It concerns dus in first instance this problem.’]
If we look at the various coastlines in Europe (…), there are two, maybe
six, seven or eight major ports located within a few hundred kilometres of
each other, where competition is mainly generated from that small distance
between the ports. (…) However, this leads to distorted competition which
does not really benefit us. Sothat is the first problem.
(Administrative texts, erp-000458, Dutch-English)
(17) Eerst even een misverstand rechtzetten: het Nederlands is geen kleine taal.
Het is de moedertaal van ruim 15 miljoen Nederlanders en 6 miljoen
Vlamingen; en nog eens 400.000 Surinamers maken er dagelijks gebruik
van. Er zijn ruim 6.000 talen in de wereld en op de ranglijst staat het
Nederlands tussen de 45ste en 50ste plaats. (…) Zij plaatsen het Nederlands
zelfs in de top twintig.
[section heading] Een gezonde taal
Het Nederlands verkeertdusin stralende gezondheid. Toch is er zorg nodig
als we dit zo willen houden, want talen kunnen snel terrein verliezen.
[‘Dutch is dus in radiant health.’]
Let us begin by correcting a misconception: Dutch is not a small language.
There are more than 6000 languages in the world and Dutch is ranked
somewhere between 45th and 50th (…) They place Dutch in the top 20.
[section heading] A healthy language
So, Dutch is in fine fettle. However, care is required if we want to keep it
that way, since languages can easily lose ground.
(Journalistic texts, vla-002265, Dutch-English)
At the start of a new section of a text, so is either translated by dus or not trans-
lated at all. Dus, on the other hand, has a wider range of correspondences besides so,
such as then – as in (18) – thus, therefore, clearly, and it can be seen that.
(18) ELAt zit er warmpjes in met zijn risicokapitaal, van zaaigeld tot
risicokapitaal. Dat geld wordt beheerd door mensen met talent om jonge
bedrijven te begeleiden in hun groei. (…) Gedurende vele jaren, zeker
sedert de “golden sixties”, vestigden zich honderden buitenlandse bedrijven
tussen Leuven-Eindhoven en Aken. Die beweging is niet stilgevallen.
[section heading] Topklasse
ELAt beschiktdusover vele troeven om een belangrijke rol te spelen in het
wereldwijde internationale net van kenniseconomieën. [‘ELAt possesses
dus many assets in order to play an important role in the worldwide net of
knowledge economies.’]
ELAt is awash with capital, from seed money to risk capital. The money is
managed by talented individuals with a view to supporting young
companies in their growth phase. (…) Over a period of many years, and
particularly since the “golden sixties”, hundreds of foreign companies have
set up shop in the area between Leuven-Eindhoven and Aachen, and the
trend continues apace.
[section heading] Top class
ELAt,then, has many assets enabling it to play a major role in the global
international network of knowledge economies.
(Journalistic texts, vla-002265, Dutch-English)
Second, dus and so can act as “pop marker[s]” (Polanyi and Scha 1983: 265),
which indicate a transition to a “main idea unit” (Schiffrin 1987; Müller 2005;
Buysse 2012) or a “return to a main point” (Lam 2010: 662), whereby a relation is
48 L. Buysse
indexed between two non-adjacent discourse segments. In (19) the narrator of the
story gets distracted by his memory of another character’s scent and elaborates on
that before returning to the main focus of the narration.
(19) Ze stelde voor iets praktisch te doen. “Misschien kunnen we koekjes

bakken,” zei ze en masseerde haar linkervoet. “Heb jij weleens koekjes
gebakken?” “Nee,” zei ik, “nog nooit. Mijn moeder bakt af en toe, maar
nooit koekjes.” Hoewel ze voor zover ik wist niet rookte, rook mijn
toekomstige vrouw naar sigaretten en feesten. Het was een indringende
lucht die om haar heen hing. Ik hield dat toen voor de geur van existentiële
eenzaamheid. Die existentiële eenzaamheid is natuurlijk flauwekul, maar
de geur ervan niet, die bestaat. Soms ruik ik hem weer. De geur van rook,
zweet, een vleugje urine, de geur van feesten die te lang zijn doorgegaan,
de gastvrouw ligt allang in bed, maar de laatste gasten maken nog altijd
geen aanstalten om te vertrekken. Sommigen vertrekken nooit, ja, een half
leven later. Duswe zijn koekjes gaan bakken. [‘Dus we baked cookies.’]
“Maybe we could bake cookies,” she said, massaging her left foot. “Have
you ever baked cookies?” “No,” I said, “I never have. My mother bakes
things sometimes, but never cookies.” She didn’t smoke as far as I knew,
but my wife-to-be always smelled of cigarettes and parties. It was a
penetrating odour that dung to her. Back then I liked to think it was the
smell of existential loneliness. Existential loneliness is a lot of hooey, of
course. But not the smell, that really exists. Sometimes I’ll still catch a
whiff of it. The smell of smoke, sweat, a hint of urine, the smell of parties
that have gone on too long, the hostess long retired but the last die-hards
making no move to leave. Some never leave, at least not until half a lifetime
later. Andsowe baked cookies.
(Fiction, gru-002592, Dutch-English)
Third, so prefaces questions that have a boundary marking function, such as

follow-up questions in an interview, as in (20), or questions marking a transition
between two sections of the discourse, as in (21) where so marks the transition
between a background section of a speech and the main part in which different
policy measures are expounded. In none of these is so translated by dus.
Correspondents include: dan (‘then’), en (‘and’), de vraag is nu (‘the question now
is’), and nu (‘now’).
(20) And yet in his own life he goes to great lengths to avoid company, even
though he does get lonely. “If I didn’t I’d be superhuman. I’m sure even the
Pope gets lonely.” Sowhy does he choose to be alone? “Well, you see, I
consider that to be a privilege. (...)”
Maar in zijn eigen leven heeft hij er alles voor over om gezelschap te
vermijden, ook al voelt hij zich wel degelijk eenzaam. “Anders zou ik een
supermens zijn. Ik weet zeker dat zelfs de paus eenzaam kan zijn.” Waarom
wil hijdanalleen zijn? “Wel, ik beschouw dat als een voorrecht. (...)" [‘Why
does he dan want to be alone?’]
(21) In Nederland zijn we ervan overtuigd dat ook de overheid één van de
partners is bij het opvoeden van kinderen. Deze houding betekent een breuk
met het verleden. Waar in het verleden de autonomie van het gezin bijna
onaantastbaar leek, kiest de huidige regering voor een andere benadering.
(…) Het belang van ieder kind. Dus van de 95% van de Nederlandse
kinderen waar het goed mee gaat en die tevreden zijn met hun leven. Maar
ook van de 5% die het moeilijk hebben. Wat doen wedanzoal? Om te
beginnen stellen we het kind centraal. [‘What sort of things do we dan do?’]
My government believes that the state also has a responsibility when it
comes to raising children. That’s a major change in attitude. In the past,
family autonomy was largely unquestioned. But the present government
wants to change that. (…) I’m talking about all children’s interests. Not just
the 95% of Dutch children who are doing well and are happy. But also the
5% who have problems. Sowhat are we doing? To start with, we are putting
children first.
(Administrative texts, bal.-001241, Dutch-English)
The only context in which dus surfaces in an interrogative clause is when the
question merely seeks confirmation of the inference the speaker/author has drawn
from the preceding co-text, as illustrated in (22).
(22) “Enkele jaren geleden stelden wij vast dat wij al 30 jaar teveel
verschillende onderdelen verwerkten in onze vrachtwagens. Dat was duur
en onpraktisch, dus hielden wij ons designsysteem kritisch tegen het licht
en bedachten wij een nieuw en beter alternatief. Vandaag produceren wij
bijvoorbeeld drie verschillende cabines, maar daarin monteren wij steeds
dezelfde spoiler.”
Het gaatdusom een vorm van standaardisering? [‘It concerns dus a form of
standardisation?’]
“We realised that for more than 30 years we were using too many different
components in our trucks. This was expensive and impractical, so we took
a critical look at our design system and worked out a better one. Today, for
example, we produce three different base cabs but they all have the same
windscreen.”
Sostandardisation is the name of the game?
(External communication, arc-002053, Dutch-English)
The boundary marking function has been attested in 11.7% (N = 38) of transla-
tions of so and 15.6% (N = 58) of backtranslations. The shares for dus are consider-
ably more modest with 3.7% (N = 33) of translations and 2.4% (N = 8) of
backtranslations exhibiting this function. The overview of correspondents for so and
dus in Table 5 indeed suggests that so is used more often for boundary marking than
dus. Interestingly, almost half of boundary marker tokens of so in an English source
text do not have a correspondent in a Dutch target text and at the same time so was
used as a boundary marker 22 times in an English translation without there being a
50 L. Buysse
Table 5 Top three of correspondents of so and dus as boundary markers in translations and
backtranslations in absolute numbers (N) and adjusted percentages (excluding tokens with zero
correspondent)
dan (‘then’) 11 55.0 dus 12 33.3
dus 6 30.0 dan (‘then’) 10 27.8
daarom (‘that’s why’), zo (‘well’), de 3× 3 × 5.0 nu (‘now’), zo 2 × 2 × 11.1
vraag is nu (‘the question now is’) 1 (‘well’) 4
0 other 0 0.0 4 other 6 16.7
zero 18 zero 22
Total 38 Total 58
so 12 48.0 so 6 75.0
therefore 5 20.0 then 2 × 2 × 12.5
1
thus 2 8.0
6 other 6 24.0
zero 8 zero 0
Total 33 Total 8
Dutch correspondent in the source text. This might point at a tendency whereby
boundaries between large sections of discourse are marked more explicitly in
English than in Dutch.8 The overall numbers for this category are fairly small, how-
ever, so caution should be taken in drawing any firm conclusions.
3.5 Start a New Sequence
So holds the potential to mark sequential relations “between successive elements in

a chain of events” (Redeker 1990: 373–374), and our analysis shows that dus does
so too. Excerpt (23) has been taken from an interview with a Holocaust survivor,
who recounts how she became an interpreter for the British army. The propositions
preceding the so-prefaced clause do not contain the explicit cause or reason under-
lying the speaker’s becoming an interpreter. She also skips a few essential steps in
the process, such as her actually taking the decision to come forward as an inter-
preter and the army’s decision to accept her as one.
8
Similar observations have been made about the correspondence between Swedish and English
(see Altenberg 2007).
Table 6 Top three of correspondents of ‘sequential’ so and dus in translations and backtranslations
dus 18 66.7 dus 7 31.8
zo (‘well’) 3 11.1 dan (‘then’), en (‘and’), zo (‘well’) 3× 3 × 13.6
3
dan (‘then’) 2 7.4 daarom (‘that’s why’), toen (‘then’) 2× 2 × 9.1
2
4 other 4 14.8 2 other 2 9.1
zero 9 zero 5
Total 36 Total 27
so 7 87.5 so 18 90.0
therefore 1 12.5 and, instead 2× 2 × 5.0
1
zero 4 zero 2
Total 12 Total 22
(23) Renata, who could speak English, became an interpreter with the British
army, and suggested her sister enlist as well. “She said to me, ‘Why don’t
you become an interpreter too?’ I said: ‘I can’t speak English.’ She said, ‘It
doesn’t matter.’ SoI became an interpreter and we were part of the British
army.”
Renata, die Engels kende, werd tolk voor het Britse leger en stelde haar zus
voor zich ook aan te melden. “Ze zei me: ‘Waarom word je ook geen tolk?’
Ik zei, ‘Ik ken geen Engels.’ Ze zei dat dat niets uitmaakte. Duswerd ik tolk
en we maakten deel uit van het Britse leger." [‘Dus I became an
interpreter…’]
Sequential tokens account for 11.1% (N = 36) of translations of so into English

and 7.3% of backtranslations (N = 27), and for 1.3% (N = 12) of translations of dus
and 6.6% (N = 22) of backtranslations. Dus and so are each other’s preferred cor-
respondents in this category, as Table 6 shows. Other markers used by translators
are typical examples of sequential markers, such as dan (‘then’) and en (‘and’).
3.6 Elaboration/Restatement
‘Elaboration’ occurs when “one clause elaborates on the meaning of another by

further specifying or describing it” (Halliday and Matthiessen 2004: 396), and so
has indeed been witnessed in prior research to fulfil this function (Schiffrin 1987;
52 L. Buysse
Rendle-Short 2003; Müller 2005; Buysse 2009, 2012; and Lam 2009, 2010). In (24)
the information on the company’s result is further enhanced by indicating that it is
similar to that of the third quarter of the year before. The use of so in English and
dus in Dutch suggests that this information can be inferred from the wide context of
the company’s figures (albeit not from the preceding co-text). Similarly, the addition
that a diagnosis can now be made without using film in (25) can be inferred from the
context (e.g. one’s knowledge of current practice as well as from the obvious fact
that a computer screen does not require printing) but is worth mentioning as it is
likely to be one of the main advantages of the new technology.
(24) Assuming the value of the US dollar does not further decline relative to the
euro, the Company expects to achieve an operating result before
amortization of consolidation goodwill between euro 7 and 12 million,
socomparable to the third quarter of 2002.
Uitgaande van de veronderstelling dat de waarde van de US dollar niet
verder zakt in verhouding tot de euro, verwacht de groep een operationeel
resultaat vóór afschrijving van de consolidatiegoodwill tussen 7 en 12
miljoen euro, dusvergelijkbaar met dat van het derde kwartaal van 2002.
[‘…, dus comparable to that of the third quarter of 2002.’]
(External communication, bco-002443, English-Dutch)
(25) Next to this there are the graphics controllers that feature state-of-the-art
technology for the rendering of reliable and accurate images for diagnosis
on the screen, sowithout using film.
Verder zijn er ook de grafische borden die instaan voor heel nauwkeurige
en volledig betrouwbare beelden gebruikt voor diagnose op het scherm,
duszonder gebruik van film. [‘…, dus without use of film.’]
(External communication, bco-002368, English-Dutch)
An ‘elaborative’ context in which dus in particular occurs quite commonly is

when the elaborative segment takes the form of a paraphrase of the prior segment,
as in (26).
(26) Het betreft tevens een vrij soepele formule: gedurende de eerste 6 maanden
van elk jaar (dusvan 1 januari tot 30 juni) kunt u de terugbetaling vragen
van uw Record-aandelen. [‘…(dus from 1 January to 30 June)…’]
In addition, it is a relatively flexible formula: during the first 6 months of
each year (i.e.from 1 January to 30 June) you can apply to redeem your
Record shares.
(External communication, ing-001886, Dutch-English)
Of all tokens of so that were translated into Dutch 6.5% (N = 21) had an ‘elabora-
tive’ function as well as 10.5% (N = 39) of its backtranslations. The picture looks
altogether different for dus, which has 23.7% (N = 212) of its translations in an
‘elaborative’ role and 15.9% (N = 53) of its backtranslations. When so is translated
Table 7 Top three of correspondents of so and dus with an elaborative function in translations and
backtranslations in absolute numbers (N) and adjusted percentages (excluding tokens with zero
correspondent)
dus 10 71.4 dus 10 30.3
bijvoorbeeld (‘for example’), en (‘and’), zo 4 × 1 4 × 7.1 zo (‘in this way’) 8 24.2
(‘in this way’), zodoende (‘as such’)
daarom (‘that’s 3 9.1
why’)
0 other 0 0.0 10 other 12 36.4
zero 7 zero 6
Total 21 Total 39
therefore 52 35.6 so 10 38.5
i.e. 25 17.1 thus 6 23.1
in other words 15 10.3 hence 3 11.5
21 other (incl. so, N = 10) 54 37.0 7 other 7 26.9
zero 66 zero 27
Total 212 Total 53
into Dutch, dus is by far its most frequent correspondent, but this certainly does not
hold when ‘elaborative’ dus is rendered into English, with so accounting for not
even 4% of corresponding tokens (Table 7).
3.7 Reiteration
There is one functional category that only involves dus, viz. when it marks (a part
of) an utterance as having been stated before at some point in the prior co-text. This
function differs from others in that it does not resume a topic that was temporarily
suspended (as a pop marker would), restate a claim (as with elaboration/restate-
ment) or mark a claim that is to be inferred from the prior co-text. Rather, it subtly
indicates to the reader/hearer that the writer/speaker somehow finds it relevant to
deliberately reiterate a proposition that has already been stated before. For example,
in (27) a section in which a sentence from an Old Dutch manuscript is analysed is
followed by a section on the origins of the letters of the alphabet. In the former sec-
tion the author has already commented on the spelling of w, which also has rele-
vance for the latter section. In the list of examples in that section brief reference is
made to the aforementioned case of w. Dus prompts the reader to recall this example
from its earlier discussion.
54 L. Buysse
(27) Verder maakt het woord ‘uuerk’ duidelijk dat de letter ‘w’ ontstaan is door
een samenvoeging van twee keer een ‘u’. [‘that the letter ‘w’ originates
from a combination of ‘u’ twice’.] In het Engels noemen ze de ‘w’ ook
‘double u’. De ‘w’ kwam pas in de elfde eeuw in gebruik. (...)
[section heading] Herkomst van onze alfabetletters
Veel van onze hoofdletters zijn ontstaan uit tekeningen (pictogrammen). De
M bijvoorbeeld is (…) Maar er zijn ook letters uit andere letters ontstaan.
Zo is de G ontstaan uit de C; er is simpelweg een streepje bijgezet. En de
W isdustwee keer een U. [‘And W is dus U twice.’] De U zelf is trouwens
een geronde V.
The word ‘uuerk’ (‘werk’ in modern Dutch) clearly shows that the letter
‘w’ has its origins in ‘u’ written twice and joined together, as is implied by
the English name for the letter, ‘double u’. The ‘w’ did not come into use
until the eleventh century. (…)
[section heading] Origin of the letters of the alphabet
Many of our capital letters come from drawings (pictograms). The M, for
example, (…) But there are also letters that have come from other letters.
For example, the G came from the C; a little dash has simply been added.
And the W is U written twice. The U itself, incidentally, is a rounded V.
(Non-fictional literature, ons-000476, Dutch-English)
This function is not very common, as it only accounts for 2.9% (N = 26) of
source text tokens of dus into English and 2.7% (N = 9) of backtranslations. Over
two thirds of these tokens do not have a correspondent in English (18 and 6 tokens,
respectively). In translations therefore occurs 3 times, i.e. 2 times, and as, but, and
to do this once each; in backtranslations and, instead and therefore occur once each.
4 Q
uantitative Analysis of Correspondence between So
and Dus
4.1 Correspondents
The overall correspondence between so and dus differs depending on the translation
direction (Table 8). In total 148 tokens of so are rendered in Dutch with dus, which
amounts to nearly 46% of source text tokens of so and slightly over 44% of target
text tokens of dus. These numbers are even considerably higher, with 59.7% and
60.9% respectively, when zero correspondences are ignored. Clearly, when decid-
ing on a Dutch correspondent for so, translators opt for dus in a majority of cases.
In absolute terms the numbers are similar when Dutch source texts are translated
into English, with 152 tokens of dus translated into English, but in relative terms the
differences are more outspoken: almost 41% of target text tokens of so are transla-
tions of dus, which compares neatly with the number for the English-Dutch
Table 8 Overall top three of correspondents of so and dus in translations and backtranslations in
absolute numbers (N) and adjusted percentages (excluding tokens with zero correspondent)
dus 148 59.7 dus 152 48.6
daarom (‘that’s why’) 22 8.9 daarom (‘that’s why’) 38 12.1
dan (‘then’) 13 5.2 en (‘and’), zo (‘well’) 2 × 14 2 × 4.5
22 other 65 26.2 30 other 95 30.3
zero 77 zero 59
Total 325 Total 372
therefore 278 41.5 so 148 60.9
so 152 22.7 thus 26 10.7
thus 63 9.4 therefore 21 8.6
48 other 177 26.4 23 other 48 19.8
zero 226 zero 91
Total 896 Total 333
translation direction, but only a mere 17% of source text tokens of dus are translated
with so. Again these rates are somewhat higher if zero correspondences are not
taken into account (48.6% and 22.7%, respectively).
The absence of a correspondent for about one quarter of tokens – 23.7% for so
and 25.2% for dus – is in line with previous findings on the translation of pragmatic
markers and causal connectors (e.g. Bazzanella and Morra 2000; Aijmer and
Altenberg 2002; Altenberg 2007, 2010), and can be explained by their syntactic and
semantic optionality. Interestingly, though, 27.0% of target text tokens of dus do not
have a correspondent in the English source texts and have, therefore, been added by
the translator. This compares to a much lower rate of 15.9% of target text tokens of
so.
The preference for therefore over so may be explained by two main factors. First,
many of the texts included in the DPC have been taken from a fairly formal context.
As therefore is considered more formal than so and is certainly more typical of writ-
ten language (Biber et al. 1999: 887), translators may have felt it more often appro-
priate to translate dus with therefore than with so.
Second, the preferred position of dus more closely resembles that of therefore
than that of so, which may have led translators more easily to the former than to
the latter. In over two thirds of cases (68.3%, N = 612) dus occurs in mid-position
in Dutch source texts, compared to only 27.9% of times (N = 250) in clause-initial
position and 3.8% (N = 34) in final position. Of the 278 tokens of dus that have
been translated with therefore 225, amounting to 80.9%, take mid-position in the
source text.
56 L. Buysse
4.2 Functions
Figure 19 provides an overview of the functional distribution of so and dus for each
translation direction. Ideational and interpersonal relations are clearly at the heart of
so and dus, as they predominantly mark ‘resultative’ and ‘inferential’ relations.
Taken together these take up 40–50% of all tokens. There are, however, also func-
tional differences between so and dus.
First, the hierarchy between the ‘resultative’ and ‘inferential’ functions appears to
buttress, on the one hand, Schiffrin’s (1987) characterization of so as a “marker of
result” (1987: 191), and on the other hand, the status awarded to dus by Stukker et al.
(2008) of the prototypical marker of “epistemic causal relations” (2008: 1304). Whereas
28.6% of translations of so and 33.6% of its backtranslations have a ‘resultative’ func-
tion, the shares for dus are considerably lower, at 13.8% and 21.6% respectively. The
opposite holds for ‘inferential’ tokens: 27.1% of translated tokens of dus and 28.5% of
its backtranslations are ‘inferential’, compared to 15.7% and 12.6% for so.
Second, the incidence of tokens serving a boundary marking function is consider-
ably higher for so than for dus: 11.7% of source text tokens and 15.6% of target text
tokens of so, as compared with 3.7% and 2.4% for dus. As suggested in Sect. 3.4, one
100%
90%
80%
70%
60%
50%
40%
30%
20%
10%
0%
so > DU dus > EN DU > so EN > dus
REIT 0.0 2.9 0.0 2.7
EL/RE 6.5 23.7 10.5 15.9
SEQ 11.1 1.3 7.3 6.6
BM 11.7 3.7 15.6 2.4
CONC 26.5 27.5 20.4 22.2
INF 15.7 27.1 12.6 28.5
RES 28.6 13.8 33.6 21.6
Fig. 1 Functions of so and dus in translations and backtranslations in percentages
The absolute numbers can be found in the discussion of each function (see Sect. 3).
9
factor in this might be a more outspoken tendency to mark boundaries between large
sections of discourse in English than in Dutch, given the relatively high number of
zero correspondents for so. Another reason is the limited employability of dus for
one of the three types assumed under the category, viz. introducing questions.
A third and final major observation is the high rate of tokens of dus elaborating
or restating a prior segment (22% in source texts and 15.9% in target texts), com-
pared to more modest rates for so (6.5% and 10.3%, respectively). This textual
function has a strong affinity with inferential conclusions – as both entail a process
of inferential deduction – making it a natural extension of the functional scope of a
prototypical inferential marker like dus.
4.3 Text Types
The distribution of so and dus across text types is shown in Table 9, which indicates
that they are both highly common in administrative and journalistic texts. The same
holds for the text type literature (30.1% for so and 27.9% for dus), but there is a dif-
ference in the share taken up by the two kinds of literature in the corpus: fictional
and non-fictional literature are almost on a par for so (13.5% and 16.6%, respec-
tively), whereas fictional literature has a low share for dus (3.9%) compared with
non-fictional literature (24.0%). External communication accounts for 11.2% of
tokens of so and 19.5% of tokens of dus. These markers rarely occur in instructional
texts (1.0% and 1.1%, respectively).
An in-depth analysis of all frequencies of so and dus for each text type falls beyond
the scope of this article. Some observations are nonetheless worth pointing out. The
shares of the text types in resultative and inferential tokens of so and dus largely mir-
ror the shares in the overall incidence of these markers, presented in Table 9. For all
other categories, however, those text types that offer a more productive environment for
so and dus to fulfil a specific function exceed the weight that they might be expected
to have based on the overall occurrence of the markers in the text type.
Although administrative texts only take up slightly over one fifth of overall
tokens of so and dus (Table 9), this text type accounts for 51.9% of tokens of so
Table 9 Tokens of so and dus according to their incidence in each text type
so dus
Text type N % N %
Administrative texts 157 22.5 279 22.7
External communication 78 11.2 240 19.5
Instructions 7 1.0 14 1.2
Journalistic texts 245 35.2 353 28.7
Literature: Fiction 94 13.5 48 3.9
Literature: Non-fiction 116 16.6 295 24.0
Total 697 100.0 1229 100.0
58 L. Buysse
marking a textual conclusion and 36.6% of dus in this function. Clearly, administra-
tive texts such as speeches and lengthy interventions in parliamentary sessions are
well-suited for so and dus to introduce an opinion at the end of an argumentative
stretch of monologic discourse, to preface a request that rests on such argumenta-
tion or state a summarizing upshot of the prior co-text.
Similarly, journalistic texts dominate the boundary marking function with 50.0%
for tokens of so and 43.9% for tokens of dus, as compared with overall shares for
this text type of 35.2% and 28.7%, respectively. This too can be attributed to the
nature of the text type: questions abound in, for example, interviews, but journalists
also often make use of pivotal sentences as rhetorical signposts for the reader.
Fictional texts equally take up a larger share of boundary marking tokens of so and
dus (24.0% and 14.6%, respectively) than they do of these markers in general
(13.5% and 3.9%, respectively), which is particularly due to the capacity of so and
dus to signal a return to the main discourse unit. The shares of all other genres
within this function are, on the other hand, lower than their overall shares, with non-
fiction reaching a bottom 1.0% for so and 14.6% for dus (as opposed to 16.6% and
24.0% overall, respectively).
The sequential category is also dominated by tokens from journalistic texts
(49.2% for so and 58.8% for dus), as such texts often feature stories with a sequen-
tial structure. At the other end of the spectrum administrative texts (3.2% and 2.9%)
and external communication (4.8% and 11.8%) do not contain as many sequential
tokens of so and dus as they do of these markers on the whole.
Many elaborative instances of so and dus, finally, have been taken from jour-
nalistic texts (35.0% and 24.9%), in line with their overall share in the corpus
(35.2% and 28.7%), but the category external communication (21.7% for so and
27.9% for dus) clearly outnumbers its shares in their overall incidence (11.2%
and 19.5%), which can particularly be attributed to the many instances of finan-
cial reports and press releases where specific phrases are further spelt out, as
illustrated in (24).
In sum, so and dus occur in all text types represented in the DPC, but journal-
istic and administrative texts together account for the majority of tokens, whereas
their incidence is marginal in instructive texts. For the functions that are closest to
the functional core of so and dus, viz. resultative and inferential uses, each text
type accounts for a share that closely resembles its share in the overall frequency
of the markers. The other functions, however, are more specific and therefore bet-
ter suited in certain environments that are more common to one text type than to
another.
5 Conclusion
The scrutiny of 1926 tokens of English so and Dutch dus in a bi-directional parallel
corpus has revealed that the functional scope of these markers is highly similar.
Apart from one, quantitatively marginal category (viz. reiteration) all seven
functions that have been attested apply to both markers. Within this functional spec-
trum the incidence of functions tends to vary. So appears to prefer ‘resultative’ over
‘inferential’ relations, and dus does the opposite, thereby confirming traditional
characterisations of the prototypical uses of so and dus. In the textual domain so is
predominantly a marker of boundaries between larger sections of the discourse,
whereas dus indexes elaboration or restatement.
The mutual correspondence of so and dus is overall quite high, although there is
a “translation bias” (Altenberg 1999: 258), in that the degree of correspondence is
considerably higher when so is translated into Dutch than when dus is translated
into English. This could be explained by three factors.
First, the incidence of so is in part determined by that of its most prominent rival
marker, viz. therefore, which can take on most of the roles played by so albeit in a
more formal context. In the relatively formal register in which the texts that make
up the corpus can be situated, translators have often chosen the more formal candi-
date. In this respect, Altenberg’s (1999) suggestion that “asymmetrical correspon-
dence” (1999: 259) between causal markers may be due to a difference in the
markers’ stylistic status in the two language systems appears to hold.
Second, a similar rival marker for dus is lacking in Dutch. The closest candidate
is daarom (‘that’s why’), but its incidence in the DPC comes nowhere near that of
dus. Stukker et al. (2008) contend that daarom is largely confined to “content causal
relations” – i.e. ‘resultative’ relations – for which the DPC proffers further evidence.
Although dus can more often be observed in a ‘resultative’ context than Stukker
et al. might suggest, daarom cannot be witnessed to enjoy the same degree of free-
dom and largely sticks to ‘resultative’ contexts.
Third, there may well be a tendency to mark certain relations more explicitly in
Dutch than in English. We should always bear in mind that the target texts are the
result of a process of translation. If a translator encounters a token of so in an
English text s/he may instantly think of dus as the default translation option, but this
also counts for the opposite translation direction. This makes it all the more remark-
able that dus has more often been added to target texts by translators without there
being a source text correspondent. This could point at a tendency in Dutch to mark
inferential relations more explicitly than in English, and with dus in particular,
which helps to account for the observed translation bias.
It should be stressed that the overview of the functional correspondence between
so and dus laid bare in this investigation is not fully comprehensive. The Dutch
Parallel Corpus suffers from one insurmountable drawback: being a parallel corpus
it is based on written texts, thereby excluding spontaneous speech. Consequently, a
number of functions that have been attested in prior research (e.g. floor-holding and
floor-yielding tokens of so; see e.g. Müller 2005) could not be observed in this
study. Future contrastive studies that concentrate on comparable rather than parallel
corpora could complement the findings for parallel corpora. The remarkable func-
tional similarities (and subtle differences) between so and dus will hopefully also
spark an interest in more such comparative studies, so that eventually a cross-
linguistic map of pragmatic markers in the ‘resultative’/‘inferential’ domain (e.g.
French donc, Spanish pues, German also) may be drawn up.
60 L. Buysse
References
Aijmer, K., & Altenberg, B. (2002). Zero translations and cross-linguistic equivalence: Evidence
from the English-Swedish Parallel Corpus. In L. E. Breivik & A. Hasselgren (Eds.), From the
COLT’s mouth… and others’. Language corpora studies in honour of Anna-Brita Stenström
(pp. 19–41). Amsterdam: Rodopi.
Aijmer, K., & Simon-Vandenbergen, A.-M. (2003). The discourse particle well and its equivalents
in Swedish and Dutch. Linguistics, 41, 1123–1161.
Aijmer, K., Foolen, A., & Simon-Vandenbergen, A.-M. (2006). Pragmatic markers in translation:
A methodological proposal. In K. Fischer (Ed.), Approaches to discourse particles (pp. 101–
114). Amsterdam: Elsevier.
Altenberg, B. (1999). Adverbial connectors in English and Swedish: Semantic and lexical corre-
spondences. In H. Hassalgård & S. Oksefjell (Eds.), Out of corpora. Studies in honour of Stig
Johansson (pp. 249–268). Amsterdam: Rodopi.
Altenberg, B. (2007). The correspondence of resultive connectors in English and Swedish. Nordic
Journal of English Studies, 6, 2–25.
Altenberg, B. (2010). Conclusive English then and Swedish då: A corpus-based contrastive study.
Languages in Contrast, 10, 102–123.
Bazzanella, C., & Morra, L. (2000). Discourse markers and the indeterminacy of translation. In
I. Korzen & C. Marello (Eds.), Argomenti per una linguistica della traduzione. Notes pour une
linguistique de la traduction. On linguistic aspects of translation (pp. 149–157). Alessandria:
Edizione dell’ Orso.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Grammar of spoken and
written English. Harlow: Longman.
Bolden, G. B. (2009). Implementing incipient actions: The discourse marker ‘so’ in English con-
versation. Journal of Pragmatics, 41, 974–998.
Buysse, L. (2009). So as a marker of elaboration in native and non-native speech. In S. Slembrouck,
M. Taverniers, & M. Van Herreweghe (Eds.), From will to well. Studies in linguistics offered to
Anne-Marie Simon-Vandenbergen (pp. 79–91). Gent: Academia Press.
Buysse, L. (2012). So as a multifunctional discourse marker in native and learner speech. Journal
of Pragmatics, 44, 1764–1782.
Buysse, L. (2014). ‘So what’s a year in a lifetime so.’ Non-prefatory use of so in native and learner
English. Text & Talk, 34, 23–47.
Buysse, L. In press. Question tags in translation. An investigation into the translatability of English
question tags into Dutch. To appear in: Languages in Contrast (accepted).
Degand, L. (2001). Form and function of causation. A theoretical and empirical investigation of
causal constructions in Dutch. Leuven: Uitgeverij Peeters.
Degand, L. (2009). On describing polysemous discourse markers: What does translation add to the
picture? In S. Slembrouck, M. Taverniers, & M. Van Herreweghe (Eds.), From will to well.
Studies in linguistics offered to Anne-Marie Simon-Vandenbergen (pp. 173–183). Gent:
Academia Press.
Degand, L. (2011). Connectieven in de rechterperiferie. Een contrastieve analyse van dus en donc
in gesproken taal. Nederlandse Taalkunde, 16, 333–341.
Denturck, K., & Vandepitte, S. (2009). The translation of stance indexes: Causal connectors Dutch
want and dus and their French and English correspondents. In S. Slembrouck, M. Taverniers,
& M. Van Herreweghe (Eds.), From will to well. Studies in linguistics offered to Anne-Marie
Simon-Vandenbergen (pp. 185–197). Gent: Academia Press.
E-ANS. (2012). Algemene Nederlandse Spraakkunst. http://ans.ruhosting.nl/index.html. Accessed
28 February 2015.
Evers-Vermeul, J. (2010). Dus vooraan of in het midden? Over vorm-functierelaties in het gebruik
van connectieven. Nederlandse Taalkunde, 15, 149–175.
Fraser, B. (1990). An approach to discourse markers. Journal of Pragmatics, 14, 383–395.
Halliday, M. A. K., & Matthiessen, C. M. I. M. (2004). An introduction to Functional Grammar.
London/New York: Arnold.
Hogeweg, L. (2009). The meaning and interpretation of the Dutch particle wel. Journal of
Pragmatics, 41, 519–539.
Johansson, S. (2006). How well can well be translated? On the English discourse particle well and
its correspondences in Norwegian and German. In K. Aijmer & A.-M. Simon-Vandenbergen
(Eds.), Pragmatic markers in contrast (pp. 115–137). Oxford/Amsterdam: Elsevier.
Johnson, A. (2002). So…?: Pragmatic implications of so-prefaced questions in formal police inter-
views. In J. Cotterill (Ed.), Language in the legal process (pp. 91–110). Hampshire: Palgrave
MacMillan.
Lam, P. W. Y. (2009). The effect of text type on the use of so as a discourse particle. Discourse
Studies, 11, 353–372.
Lam, P. W. Y. (2010). Toward a functional framework for discourse particles: A comparison of well
and so. Text and Talk, 30, 657–677.
Macken, L., De Clercq, O., & Paulussen, H. (2011). Dutch parallel corpus: A balanced copyright-
cleared parallel corpus. Meta, 56, 374–390.
Müller, S. (2005). Discourse markers in native and non-native English discourse. Amsterdam/
Philadelphia: John Benjamins.
Niemegeers, S. (2009). Dutch modal particles maar and wel and their English equivalents in dif-
ferent genres. Translation and Interpreting Studies, 4, 47–66.
Norrick, N. R. (2008). Negotiating the reception of stories in conversation: Teller strategies for
modulating response. Narrative Inquiry, 18, 131–151.
Pander Maat, H., & Degand, L. (2001). Scaling causal relations and connectives in terms of
speaker involvement. Cognitive Linguistics, 12, 211–245.
Pander Maat, H., & Sanders, T. (1995). Nederlandse causale connectieven en het onderscheid tus-
sen inhoudelijke en epistemische coherentie-relaties. Leuvense Bijdragen, 84, 349–374.
Pander Maat, H., & Sanders, T. (2000). Domains of use or subjectivity? The distribution of three
Dutch causal connectives explained. In E. Couper-Kuhlen & B. Kortmann (Eds.), Cause, con-
dition, concession, contrast. Cognitive and discourse perspectives (pp. 57–82). Berlin: Mouton
de Gruyter.
Paulussen, H., Macken, L., Vandeweghe, W., & Desmet, P. (2013). Dutch parallel corpus: A bal-
anced parallel corpus for Dutch-English and Dutch-French. In P. Spyns & J. Odijk (Eds.),
Essential speech and language technology for Dutch (pp. 185–199). Heidelberg: Springer.
Polanyi, L., & Scha, R. J. H. (1983). The syntax of discourse. Text, 3, 261–270.
Redeker, G. (1990). Ideational and pragmatic markers of discourse structure. Journal of Pragmatics,
14, 367–381.
Redeker, G. (2006). Discourse markers as attentional cues at discourse transitions. In K. Fischer
(Ed.), Approaches to discourse particles (pp. 339–358). Amsterdam: Elsevier.
Rendle-Short, J. (2003). ‘So what does this show us?’: Analysis of the discourse marker ‘so’ in
seminar talk. Australian Review of Applied Linguistics, 26, 46–62.
Schiffrin, D. (1987). Discourse markers. Cambridge: Cambridge University Press.
Stukker, N., Sanders, T., & Verhagen, A. (2008). Causality in verbs and in discourse connectives:
Converging evidence of cross-level parallels in Dutch linguistic categorization. Journal of
Pragmatics, 40, 1296–1322.
Stukker, N., Sanders, T., & Verhagen, A. (2009). Categories of subjectivity in Dutch causal connec-
tives: A usage-based analysis. In T. Sanders & E. Sweetser (Eds.), Causal categories in dis-
course and cognition (pp. 119–171). Berlin: Mouton de Gruyter.
Sweetser, E. E. (1990). From etymology to pragmatics. Metaphorical and cultural aspects of
semantic structure. Cambridge: Cambridge University Press.
Van Dijk, T. (1979). Pragmatic connectives. Journal of Pragmatics, 3, 447–456.
What English Translation Equivalents Can
Reveal about the Czech “Modal” Particle prý:
A Cross-Register Study
Michaela Martinková and Markéta Janebová
Abstract According to Czech monolingual dictionaries, prý [allegedly, they say, it

is rumoured, X claims] is a polysemic particle with two senses: in the first one, a
modal particle with the meaning of uncertainty and doubt which is caused by the
fact that the information is only second-hand; in the second meaning, prý introduces
somebody else’s direct reported speech. English correspondences of the particle in
three registers of the parallel corpus InterCorp (fiction, journalistic texts and spoken
language), however, only confirm its function of an evidential; doubt and uncer-
tainty are not inherent to the meaning of prý but may arise in the context through the
process of “invited inference” (see e.g. Traugott and Dasher 2002). The default
reporting function is always blocked if the addressee of the reporting event is pres-
ent in the original interchange, i.e. the information is not new. Though this applies
to all three registers alike, there is variation when it comes to the predominant type
of evidentiality: while in the texts of Fiction the dominant function of prý is quota-
tive, in the other registers the source of the information is left unexpressed in the
majority of tokens.
Keywords Register • Subjectification • Epistemic modality • Evidentiality •

Particle prý • Invited inference • Czech • Parallel corpus InterCorp
1 State of the Art
According to the Dictionary of Standard Czech (Slovník spisovné češtiny [SSČ]

2009), prý is a polysemic particle with two senses: in the first one, a modal particle
with the meaning of uncertainty and doubt which is caused by the fact that the infor-
mation is only second-hand:
M. Martinková (*) • M. Janebová

Department of English and American Studies, Faculty of Arts, Palacký University,
Olomouc, Czech Republic
e-mail: michaela.martinkova@upol.cz

DOI 10.1007/978-3-319-54556-1_4
64 M. Martinková and M. Janebová
(1) Je prý nemocen.

[be:3SG PART ill]
‘PRÝ [I hear] he is ill.’
In the second meaning, prý introduces somebody else’s direct reported speech.
This is exemplified in (2):
(2) Přišel k nám Jan. Prý dělej, jdeme do kina

[came:PTC.M to us Jan:M.NOM. PART hurry.up:IMP, go:PRS.1:
PL to cinema]
‘John came to us. PRÝ [he said] hurry up, we’re going to the cinema.’
The question to be asked here is whether prý carries the meaning of doubt and
uncertainty in all of those cases in which it does not introduce direct speech.
According to SSČ it indeed does, and so it does according to the major grammar
books: Komárek et al. (1986, 232), for example, discuss prý in the sections on epis-
temic modality, alongside modal particles whose function is to evaluate the degree
of certainty of the content of the text or a part of it, i.e. epistemic particles.
In an entry on reported speech in the Encyclopedia of Czech (Encyklopedický
slovník češtiny; Grepl [2002, 375]), however, utterances with prý are considered to
be a special type of reproducing an original utterance, alongside direct and indirect
speech, and no meaning of uncertainty or doubt is mentioned. This is in agreement
with the etymology of the word: historically, prý goes back to the transitive verbum
dicendi praviti [to say], which was a full-fledged verb with rich inflection.1 Prý was
originally the 3rd person singular or aorist form of this verb (praví [say:PRS.3SG]
or pravi [say:AORIST]), which later underwent phonetic reduction, via the stages
of praj and prej (Machek 2010, 481), and lost all inflections. The latter form (prej)
is still very frequent in Common Czech, a variety considered as non-standard
(Krčmová 2002, 81).
Grepl notes (2002, 375) that in sentences with prý the original utterance can
either remain unchanged, or its form can be modified in a way that makes it similar
to indirect speech. For convenience, we will distinguish here between prý introduc-
ing direct and indirect speech. Sentence (3) is an example of direct speech as what
we have here is the exact wording of the original utterance Přivezu ti nějaký dárek
‘I will bring you a present’: the verb is in the first person singular form and its
understood subject is the first person singular pronoun, whose referent is “the origi-
nal speaker” (for the terminology, see Huddleston and Pullum 2002, 1023). The
pronominal clitic ti [you] is attached to the verb:
1
The verb still exists today, but Czech monolingual dictionaries (e.g. SSČ) mark it as bookish and
obsolete.
What English Translation Equivalents Can Reveal about the Czech “Modal” Particle… 65
(3) Prý přivezu ti nějaký dárek.

[PART bring:PF.PRS.1SG you:DAT.SG some:ACC present:ACC]
‘PRÝ I will bring you a present.’
In contrast, sentence (4) exemplifies indirect speech: it adopts the perspective of

the “reporter”, i.e. only the content of what was originally uttered is reported, not
the form. First, there is a change in the use of the deictic expressions, namely that
the pronominal clitic changes to mi [me] (reference is still made to the “reporter”,
who was the addressee in the original interchange) and it is attached to prý; unlike
in (3), prý is embedded in the sentence. Second, the verb is used in the third person
singular form and its understood subject is a third person personal pronoun, which
again stands for the “original speaker” (the source of the reported information):2
(4) Prý mi přiveze nějaký dárek.

[PART me:DAT bring:PF.PRS.3SG some:ACC present:ACC]
‘PRÝ he will bring me a present.’
As demonstrated in (5), prý introducing indirect speech is not limited to the sen-
tence initial position, but it can appear in the middle field as well:
(5) Přiveze (prý) mi (prý) nějaký dárek.

[bring:PF.PRS.3SG (PART) me:DAT (PART) some:ACC present:ACC]
‘He will bring (PRÝ) me (PRÝ) a present.’
A systematic corpus-based study of the positions of prý is, however, missing,3 and
corpus-based analyses of prý are scarce. In their study of the collocational profile of
prý in the monolingual written SYN2000 corpus of Czech, Hoffmanová and
Kolářová (2007, 101) note a high frequency of prý in journalistic texts and also
briefly mention the important role prý has in the rendering of dialogues in fiction. In
their study of the adverb údajně [allegedly] in journalistic texts (the AnoPress data-
base), Hirschová and Schneiderová (2012, 2) observe a reporter’s distance from the
reported facts not only for údajně, but also for prý. Importantly, they are the first
ones to discuss both expressions in the context of evidentiality, the evidence being
a verbal report from somebody else.4 Arguably, by making it explicit that the
2
Sentence (4) allows for another interpretation, in which the source of the reported information is
unknown; this will be dealt with later in this section.
3
There is only a 1951 study by Trávníček.
4
Grepl and Karlík (1998, 485) do not use the term evidentiality yet, but they identify a difference
between epistemic particles such as možná [maybe] and those which introduce someone else’s
opinion: while both are considered to mark a speaker’s stance, in the latter case by making it
explicit that the presented information comes from elsewhere the speaker avoids any responsibility
for it.
information comes from elsewhere, the journalists try to avoid responsibility for the
truth of the reported statement, or to show disagreement with it (Hirschová and
Schneiderová 2012, 2).
Though according to the authors both prý and údajně are evidentials of the hear-
say type,5 where, as they argue, the source of the reported information is not known
(ibid.), their analysis presents numerous examples in which the source is present.
This is in agreement with Aikhenvald (2004, 178),6 who mentions several languages
in which “[t]he same evidential may combine the meanings of a reported and a
quotative”. Evidentiality which prý marks is thus – if we borrow Aikhenvald’s
(2004) terminology – both reported (i.e. the authorship is not specified) and quota-
tive (i.e. the author is introduced). In other words, example (4) quoted above and
repeated here for convenience as (6) can be interpreted as either “He said he would
bring me a present”, or as “It was said he would bring me a present”:
(6) Prý mi přiveze nějaký dárek

[PART me:DAT bring:PF.PRS.3SG some:ACC present:ACC]
‘PRÝ he will bring me a present.’
The question arises, however, as to how to account for the meaning of “uncer-
tainty” or “doubt” expressed by prý introducing indirect speech, mentioned in the
Czech linguistic sources quoted above, namely SSČ and Komárek et al. (1986). In
general, these types of meaning belong to the domain of epistemic modality (cf. also
Komárek et al. 1986), which, according to Lyons (1977, 805) “express[es] different
degrees of commitment to factuality”.
Aikhenvald (2004) is quite explicit when it comes to the issue of evidential mark-
ers expressing modality: in her opinion, evidentiality “covers the way in which the
information was acquired, without necessarily relating to the degree of speaker’s
5
Like Aikhenvald (2004), Hirschová and Schneiderová (2012) maintain that evidentials are non-
truth-conditional. It has been argued, however, that evidential and epistemic adverbials need not be
non-truth-conditional; see e.g. Ifantidou-Trouki (1993) and Papafragou (2006). Ifantidou applies
the embedding tests of truth-conditionality on the hearsay adverbial allegedly; in her test, allegedly
contributes to the proposition expressed, and the same can be said about the Czech adverb údajně,
which translates allegedly. On the other hand, prý does not seem to pass the test, cf. her example If
the cook has allegedly poisoned the soup, the police should make an inquiry, which is acceptable
when translated into Czech by means of údajně (i), but not with prý (ii):
(i) Pokud kuchařka údajně otrávila polévku, měla by policie zahájit vyšetřování.
‘If the cook ÚDAJNĚ poisoned the soup, the police should start the inquiry.’
(ii) ?Pokud prý kuchařka otrávila polévku, měla by policie zahájit vyšetřování.
?‘If the cook PRÝ poisoned the soup, the police should start the inquiry.’
The difference between údajně and prý is worth more attention, but it is beyond the scope of
the present study.
6
It has to be remembered, though, that in Czech, as in other Slavic languages, evidentiality is not
a grammatical category; Hirschová and Schneiderová (2012) talk about lexical means of express-
ing evidentiality.
certainty concerning the statement or whether it is true or not” (2004, 3); she goes
on to add that evidentials “may acquire secondary meanings – of reliability, proba-
bility, and possibility (known as epistemic extensions), but they do not have to”
(2004, 6).
An opposite view is endorsed by Palmer (1986, 51ff.), who discusses evidential-
ity in the chapter on epistemic modality. According to him, presenting the informa-
tion as something a speaker has been told about is one of at least four ways in which
a “speaker may indicate that he is not presenting what he is saying as a fact”.
Ultimately, as he argues, all the four ways (the others are speculation, deduction and
appearance) “are concerned with the indication by the speaker of his (lack of) com-
mitment to the truth of the proposition being expressed”. More specifically, by using
an evidential, the speaker “provide[s] an indication of the degree of commitment”
and “qualifies” the proposition “in terms of the type of evidence” he or she has
(1986, 54). In this sense, says Palmer, evidentials are “subjective” (ibid.).
The relationship between modality and evidentiality is a strongly debated issue
(see e.g. Chafe and Nichols 1986, Willett 1988, and Dendale and Tasmowski 2001);
clearly, as Plungian (2001, 354) remarks, epistemic modality is “a domain where
evidential and modal values overlap” because it is concerned with the probability of
a proposition (P), which indicates that “the speaker has no direct knowledge of P”.
As Traugott (1989, 33) notices, “epistemics and evidentials share a great number of
similarities in their semantic development, and the histories of items in the one
domain can illuminate the histories of items in the other. Naturally, though, it may
be useful in some other endeavor, such as a fine-grained analysis of modality, mood,
and data-source/authority to distinguish epistemics and evidentials.”
This paper attempts at such a fine-grained analysis. We will investigate whether
the meaning of uncertainty or doubt is encoded in the meaning of prý, i.e. whether
prý in reporting indirect speech either (a) always carries the meaning of uncertainty
or doubt (SSČ and Komárek et al.), or (b) whether perhaps two autonomous senses
can be recognized for prý introducing indirect speech, or (c) whether uncertainty
and doubt are only epistemic overtones, which prý introducing indirect speech may
but need not carry.
We will argue for the last option. At the same time, we will resort to Traugott’s
concept of subjectification (e.g. 1989, 1995, 2010) as a “a pragmatic-semantic pro-
cess” and “a gradient phenomenon, whereby forms and constructions that at first
express primarily concrete, lexical, and objective meanings come through repeated
use in local syntactic contexts to serve increasingly abstract, pragmatic, i nterpersonal,
and speaker-based functions” (1995, 32). In her 1995 paper, Traugott discusses
examples “that correlate with change of grammatical status from main verb con-
structions not merely to auxiliaries (i.e. reduced verbs), but to discourse particles
with quasi-adverbial properties” (1995, 36); these examples include, among others,
quotative like, be going to, I think and let’s. Typically, Traugott argues, such changes
“involve a shift from relatively objective reference to use as markers of discourse
reference, i.e. they acquire a metalinguistic function of creating text and signalling
information flow” (1995, 39). This, as we believe, is what happened to prý, when it
developed from the lexical verb praviti meaning “to say”.
Following Johansson (2007), who argues that “in monolingual corpora we can
easily study forms and formal patterns, but meanings are less accessible”, we will
look at prý through the lens of another language. Since “one of the most fascinating
aspects of multilingual corpora is that they can make meanings visible through
translation” (Johansson 2007, 57), we turn to the English-Czech and Czech-English
sections of InterCorp (through the search engine KonText),7 a multilingual parallel
corpus of texts written or transcribed in 39 languages (as of 2016), all of which are
aligned with their Czech counterparts. Our aim is to investigate the functions of prý8
via its English correspondences in English source and target texts belonging to three
different registers.
As a starting point of the corpus analysis, we are interested to see whether the
source of the reported information (original speaker) is left unexpressed, as the
dictionary definition of prý seems to suggest (Sect. 3), and whether there is a differ-
ence between registers in this respect (Sect. 4). Several methodological issues will
be raised. Ultimately, however, we are interested in knowing in which registers the
correspondences of prý explicate the function of prý as an evidential marker of
reported information or whether it is rendered as an epistemic marker expressing a
lack of the speaker’s commitment to the factuality of the proposition (i.e. uncer-
tainty or doubt). Section 5 thus delves deeper into the status of the component of
doubt in the meaning of prý. More specifically, we question the definition in the
dictionaries which see doubt as an inherent part of its meaning. Here we try to out-
line the pragmatic mechanisms by which the modal overtones of doubt about the
truth of the reported facts arise.
2 Data and Methodology
All studies of register variation in parallel corpora are restricted to the types of texts
available in the corpora, which, in turn, are restricted to texts that are often trans-
lated. According to Hasselgård (2010), “this precludes the study of many types of
text, such as conversation, daily newspapers, and academic prose”. Our study
focuses on three registers which are available in InterCorp: fiction, journalistic texts
and spoken language. Unfortunately, journalistic texts are only represented by the
PressEurope database, which does not contain any news reporting, and spoken lan-
guage in InterCorp is far from spontaneous: our data come from the Proceedings
from the European Parliament (Europarl) and from subtitles taken from the Open
Subtitles Database. In addition, the data representing each register vary in size (the
PressEurope subcorpus is the smallest and the subcorpus of Subtitles translated
7
Both InterCorp and KonText were created at Charles University in Prague. https://ucnk.ff.cuni.cz/
intercorp/?lang=en
8
Prý, not the Common Czech prej, was selected for the analysis, since its usage is not restricted to
informal registers.
Table 1 The size of the subcorpora (in text positions, TPs)

Czech is target language (TL) Czech is source language (SL)
Fiction 5,085,844 943,125
PressEurope 281,461 59,111
Subtitles 43,579,345 511,641
Europarl 15,038,876
from English by far the largest; see Table 1) and come from different periods: the
PressEurope texts in InterCorp cover the period between 2009 and 2014 and the
Europarl texts date from 2007 to 2011. Our subcorpus of Fiction was created manu-
ally to include only books published after 1950 and to ensure that not more than two
novels per author are included. As far as the subcorpora of Subtitles are concerned,
since InterCorp only provides information about the year in which the original lan-
guage version of the film was released (no information is provided for Czech as the
target language), and since the number of Czech original films in InterCorp is very
low, all Subtitle data available were included in the subcorpus.
This brings us to two more problems regarding comparability of the subcorpora:
first, Czech as a small language is the source language of a much lower number of
texts than English, and so subcorpora consisting of Czech target texts are always
larger than subcorpora of Czech source texts. This applies to all subcorpora, as
demonstrated in Table 1 above. Second, there is a problem with the concept of
“original language”: while in our subcorpus of Fiction and of Subtitles Czech or
English is always the language of the original, the same cannot be said about the
PressEurope subcorpus; InterCorp does not provide information about the language
of the original text for the texts included in the PressEurope database. Europarl
(which is not annotated for the original language either) questions even the concept
of the source language: as Gast and Levshina (2014, 377–378) argue, “until 2003
the texts were translated directly from the source languages into any of the target
languages. From 2003 onwards … all languages were first translated into English
and then into the relevant target language”.9 What makes this even more compli-
cated is the fact that a large proportion of the Europarl texts are not even annotated
for the source language, which means, for example, that a potential subcorpus of
Czech translations from English source texts only has 9,284 text positions (TPs) and
prý does not occur in it. To obtain a sensible amount of Europarl data for analysis,
following Gast and Levshina (2014),10 we resorted to a methodologically problem-
atic solution not to differentiate between Czech as source and target language and to
include even translations from other languages. This explains the single number in
the last row in Table 1 above.
9
We hear that this practice is, however, abandoned if an interpreter between e.g. German and
Czech is available; the translation then goes directly from German to Czech (Šárka Timarová, pers.
comm.).
10
Gast and Levshina (2014, 379) argue that the “translations of the EUROPARL corpus are of a
very high quality and certainly come close to that ideal”, namely the ideal of “near equivalence” in
a translation corpus.
3 Downloading and Sorting the Data
A simple Word form query (with Case unmatched) was used to retrieve all tokens of
prý from the individual subcorpora; absolute and normalised (instances per million
words, ipm) frequencies of prý are presented in Table 2.
It follows from Table 2 that Czech target texts (TTs) in the subcorpus of Fiction
and of PressEurope texts show significant translation effects, namely a lower nor-
malised frequency of prý in TTs than in the source texts (STs). This is statistically
more significant in the Fiction texts (LL 101.08, p < 0.0001) than in PressEurope
(LL 44.25, p < 0.0001); no statistical significance is observed for the Subtitles (LL
1.40, p > 0.05).11 These quantitative differences, however, by no means indicate that
the data are not reliable,12 as only two incorrect translations into Czech were identi-
fied. Incidentally, in both of them the Czech translation suggests that the original
author is also the reporter, which Czech does not allow; prý introducing indirect
speech can only be used to report someone else’s words:
(7) But I only thanked her and said no, that I wished to be on my own.
[EnCz.Fict:BJ_S].13
Jen jsem jí ale poděkoval a odmítl jsem, že prý chci být sám.
‘I only thanked her and refused, [saying] that I PRÝ want to be on my own.’
Table 2 Absolute and normalised frequencies of prý in the subcorpora

Czech is TL Czech is SL
Absolute Normalised Absolute Normalised
frequency frequency (ipm) frequency frequency (ipm)
Fiction 285 56 155 164.4
PressEurope 6 21.3 20 338.4
Subtitles 3,895 89 54 105.5
Europarl Absolute frequency: 51
Relative frequency: 3.4
11
For the statistics, we used Andrew Hardie’s online calculator available at http://corpora.lancs.
ac.uk/sigtest/ (p-value returned by the Fisher exact text). Naturally, it could not be applied to the
Europarl data, where Czech originals and Czech translations are not distinguished.
12
Compare Altenberg and Granger (2002, 40): “Translation effects, whether induced by the source
language or universal strategies, are seldom violations of the target language system in profes-
sional translations, but quantitative deviations from the target language norm . . . . As such they are
of course eligible as potential translation equivalents.”
13
The legend in square brackets indicates source language, target language, subcorpus and abbrevi-
ated title (in this order, where applicable). For the abbreviations and list of titles quoted in this
paper, see Appendix 1 and 2.
All tokens of prý found in Fiction, PressEurope, Europarl and source texts of
Subtitles, and a random sample of 200 tokens of prý in target texts of Subtitles14
were sorted and subjected to scrutiny. Section 3.1 demonstrates with concrete (but
randomly selected) examples which types of English correspondences were counted
as those in which the source of the reported information (original speaker) is
unknown (not expressed), while Section 3.2 presents the types of correspondences
identifying the original speaker. Section 4 then provides a detailed comparison of
the registers.
3.1 Source (Original Speaker) Unknown
The correspondences of prý with no reference to the original speaker range from
clauses with nouns such as word or rumour, clauses with the evidential verb seem,
verbs think, suppose and guess, evidential adverbs and the evidential semi-auxiliary
be supposed to15 to reporting clauses with verba dicendi and verbs referring “to the
receptive end of the communication process” (Quirk et al. 1985, 181). Furthermore,
the reporter may seek for the confirmation of the validity of the reported
statement.
Sentence (8) exemplifies the noun word in the correspondence, sentence (9) a
clause with the verb seem, (10) a clause with the verb suppose, and in (11) the
reporter seeks a confirmation of the validity of the reported statement:
(8) Word is that Randy, the Boy Wonder, is convinced that he can turn the center
into a hot acquisition target that will attract one of the big pharmaceutical
companies. [En-Cz.Fict:KJA_FA]
Zázračný chlapec Randy je prý přesvědčený, že může ústav změnit ve
velepřitažlivý cíl investorů a přilákat jednu z největších farmaceutických
firem.
‘The wonder boy Randy is PRÝ convinced that …’
(9) So when Willem began hitting Catharina it seems Tanneke got in between
them to protect her. [En-Cz.Fict:CT_GP]
Takže když Willem začal Catharinu mlátit, Tanneke prý vběhla mezi ně, aby
ji chránila.
‘So when Willem began hitting Catharina, Tanneke PRÝ got in between
them to protect her.’
14
The examples from Czech STs come from films released in the period between 1955 and 2010,
and from the Czech subtitles of films released in English between 1915 and 2012 (but only five
tokens of the two hundred are pre-1950, and 172 are post-1980).
15
Chafe (1986) lists evidently, apparently, and be supposed to as hearsay evidentials.
(10) On a clear day I suppose it is possible to see both ranges.

[En-Cz.Fict:SAR_HT]
Obě pohoří je prý vidět, pokud není zataženo.
‘Both ranges are PRÝ to be seen when it is not overcast.’
(11) Now tell me you’re married. [En-Cz.Subt:TD]

Můj drahý Armande, prý jsi ženatý.
‘My dear Armand, PRÝ you are married.’
The most frequent adverbs are the evidential ones, namely apparently, allegedly,
supposedly, and reportedly:
(12) Apparently Beata had not gone to America at all. [Cz-En.

Fict:VM_VDC]
Beáta prý do Ameriky vůbec neodletěla.
‘Beata PRÝ had not gone to America at all.’
Then there is the evidential “semi-auxiliary” be supposed to:
(13) He’s supposed to be after me. So McGonagall reckons he might have sent
it. [En-Cz.Fict:RJK_PA]
Jde mi prý po krku, takže McGonagallovou napadlo, že mi Kulový blesk
možná poslal on.
‘He is PRÝ after me, so …’
Sentences (14)–(18) exemplify the cases of reporting clauses in which the source of
the reported information is underspecified or entirely left out. In (14) and (15) the
reporting verb has a general subject argument, namely the generically used 3rd per-
son plural pronoun they and the noun people; most typically, however, the reporting
verb is used in the passive, as in (16):
(14) They say his father was a fisherman. [En-Cz.

Fict:HE_OMS]
Jeho táta prý byl rybář.
‘His father PRÝ was a fisherman.’
(15) People sometimes tell me I’ve missed out on life because I never married
and had children. [En-Cz.Fict:IK_AFW]
Prý jsem o mnoho přišel, protože jsem se neoženil a neměl jsem děti.
‘PRÝ I’ve missed a lot because . .
(16) Though he was said to be in his mid-sixties, he didn’t look to be any older
than her fifty-year-old father. [En-Cz.Fict:RF_HS]
Třebaže mu prý je kolem pětašedesáti, nevypadal o nic starší než její

padesátiletý otec.
‘Though he PRÝ is about sixty-five, …’
Sentence (17) exemplifies the tokens in which the subject argument of the reporting
verb in the passive voice is the reporter (addressee in the original interchange):
(17) I’m told that the initial tests have gone very well. [En-Cz.
Fict:KJA_FA]
Klinické testy prý zatím probíhají velice dobře.
‘The clinical tests PRÝ are going very well.’
Finally, (18) demonstrates a reporting clause with a communication verb refer-

ring “to the receptive end of the communication process” (Quirk et al. 1985, 181),
namely the verb hear:
(18) I hear that Dubrovnik is the most beautiful city in the world. ..
[En-Cz.Fict:SAR_HT]
Dubrovník je prý nejkrásnější město na světě …
‘Dubrovnik is PRÝ the most beautiful city in the world …’
All linguistic expressions (words, phrases and clauses) exemplified in (8)–(18)

directly correspond to prý, that is, they are its direct equivalents. We will call such
correspondences “direct correspondences” and we will differentiate them from
cases in which prý is used, but, on top of that, the English reporting expressions
have their own translation equivalents as well. This is the case of sentence (19),
where the phrase the rumor is translated as tvrdí se [it is claimed], while prý still
occurs:
(19) The rumor is that Lily and James Potter are – are – that they’re – dead
[En-Cz.Fict:RJK_PS]
A tvrdí se, že Lily a James Potterovi jsou jsou – že prý jsou mrtví.
‘And it is claimed that Lily and James Potter are are – that they PRÝ are
dead.’
Correspondences such as (19), where prý is in fact added (or omitted) in the transla-
tion, since the Czech sentence contains another overt marker of indirect reporting
(one which has its own counterpart in English), will be referred to as “indirect cor-
respondences”. These, in turn, will be kept separate from the zero correspondences
“proper”. The term zero correspondence will only be used for cases such as (20), in
which no direct or indirect correspondence of prý can be identified:
(20) A Japanese team has arrived in Skardu and they’re paying 6 dollars a day.
[En-Cz.Subt:K2]
Japonský tým přijel do Skardu a prý platí 6 dolarů denně.

‘A Japanese team has arrived in Skardu and PRÝ they’re paying 6 dollars
a day.’
3.2 Source (Original Speaker) Known
On the basis of the data, we distinguish two types of direct correspondence in sen-
tences with the known source. First, prý corresponds to an English reporting clause
with a noun or pronoun in the subject argument of the reporting verb; this noun or
pronoun refers to the original speaker:
(21) He said he could bring the rest later. [En-Cz.

Fict:SD_SC]
Pro zbytek si prý zajede později.
‘The rest PRÝ [he] will bring later.’
(22) (My husband doesn’t know about. .. you know…) and your mother says
she’s not going to tell your father, either. [En-Cz.Fict:DC_CW]
A tvoje matka to prý otci taky nepoví.
‘And your mother PRÝ will not tell your father either.’
Second, the source of the reported information is introduced in the phrase

according to:
(23) According to the Bolivians, it was a routine stop, and when they discovered
Mathis’ body, Bond disarmed and shot them. [En-Cz.Subt:QS]
Když prý bolivijská policie našla Mathisovo tělo, začal Bond střílet.
‘When PRÝ the Bolivian police discovered Mathis’ body, …’
Then there are the cases of indirect correspondence: the English clause or phrase
containing reference to the source of the reported information has its own counter-
part, which occurs alongside prý. This is the case of (24) and (25), where prý occurs
in the final clause of a (very long) reported complex, while the reporting clause has
its own counterpart. In (25), this involves the loss of the original sentence boundary;
in the English translation the sentences are joined:
(24) Miss Vavasour insisted that his daughter and her family should all stay for
lunch, that she would cook a chicken… [En-Cz.Fict:BJ_C]
Slečna Vavasourová nedala jinak, než že jeho dcera musí i s celou rodinou
zůstat na oběd, že prý upeče kuře…
‘Miss Vavasour insisted that his daughter and her family should all stay for
lunch, that PRÝ she would cook a chicken …’
(25) The secretary reported that Mr. Uzel had turned up and was maintaining a
vigil in the corridor outside my study, doggedly waiting to see me.
[Cz-En.Fict:SV_SP]
Sekretářka mi hlásí, že hajný Uzel vartuje na chodbě přede dveřmi mé
pracovny. Nedal prý se odbýt.
‘The secretary reports to me that gamekeeper Uzel is maintaining a vigil
in front of the door outside my study. He PRÝ wouldn’t be got rid of.’
In (26) the English reporting clause (introducing the source of the reported informa-
tion) is even found in a larger preceding context, outside the sentence boundary:
(26) (“He thinks it’s going to storm,” Rachel explained when the meeting
was over.) “He says you can go, but he will not send a guide. It’s too
dangerous.” [En-Cz.Fict:GJ_T]
Říká, že můžete odejít, ale průvodce s vámi poslat nechce. Prý by to bylo
příliš nebezpečné.
‘[He] says you can go, but he does not want to send a guide. PRÝ it would
be too dangerous.’
In (27) and (28) the source of the reported information is inferred from the imme-
diately preceding context. It was a participant in the original interchange, which
directly precedes:
(27) (Tracy was on the other line. She was very upset.) Becky has taken a sudden
turn for the worse and has been moved to the ICU. [En-Cz.Fict:CR_T]
Becky se prý náhle zhoršila, a tak ji převezli na JIPku.
‘Becky PRÝ got suddenly worse and so they moved her to the ICU.’
(28) (This is the boarding house Albert. Send us someone at once. A lodger has
gone mad.) Sorry? How do we know? How is he? [Cz-EN.Subt:PSP]
Prosím? Jak se to jeví? Jak se … jak prý se to jeví?
‘Sorry? How does it show? How … how PRÝ does it show?16
In (29), the reporter is the agent of the verbal event in the secondary predication
after have, i.e. it is semantically present:
(29) – Karen, I’ve had those images of the creature analyzed. – What is it? – It’s
something new, but gorilla-like. [En-Cz.Subt:C]
– Karen, nechal jsem analyzovat záběry těch bytostí. – Co je to? – Prý je
to něco nového, prý něco jako gorila.
‘– Karen, I’ve had those images of the creature analysed. – What is it? –
PRÝ it’s something new, PRÝ something like a gorilla.’
This is a telephone conversation between the owner of the boarding house and the police, but we
16
can only hear the owner.

4 Prý Across the Registers
Figure 1 suggests a difference between the Fiction texts, where the source of the
reported information tends to be expressed (60% of correspondences of prý in
Czech target texts and 57.4% of correspondences of prý in Czech source texts), and
the other subcorpora, in which it is left unexpressed in the majority of tokens; this
is most evident in the Europarl corpus.
4.1 Prý in the Fiction Subcorpus
A closer look at the Fiction subcorpus reveals that if the source of the reported infor-
mation is expressed, the most frequent English correspondence of prý is a reporting
clause with a non-generic subject: 61 tokens (21.4%) of prý in Czech target texts
(TT prý) correspond directly to a reporting clause whose subject is a specific noun
or pronoun referring to the original source. For the Czech source texts the percent-
age is even higher (36.1%, 56 tokens). The most frequent verb is the verb say, which
covers 82% of these correspondences of TT prý (50 of the 61) and 67.9% of these
correspondences of ST prý (38 of the 56). Among the remaining tokens in Czech
source texts where prý has a direct correspondence with a reporting clause in which
an explicit reference to the source of the reported information is made, however,
there are nine tokens of the verb claim, as in (30). This might suggest that the trans-
lators do sometimes try to express the reporter’s lesser commitment to the truth of
the reported statement.17 This issue will be readdressed in Section 5.
100%
90%
80% 66 114
70% 11 119
36
60%
44
50%
40%
30% 89 171
9 Source unknown
20% 81
18 Source known
10% 7
0%
TT
ST
TT
RL
ST
TT
PA
Cz
Cz
Cz
Cz
Cz
RO
N
S
N
PE
LE
LE
IO
EU
RO
TI
CT
IT
IT
C
BT
BT
FI
FI
E
SS
SU
SU
E
PR
Fig. 1 Correspondences of prý in individual subcorpora according to the presence/absence of the

source of the reported information
17
Oxford Advanced Learner’s Dictionary (8th ed.) defines claim as “to say that sth is true although
it has not been proved and other people may not believe it”.
Cz ST Cz TT
100 200
80 Zero 150 Zero
33 1 correspondences correspondences
60 3 110
100 16
Indirect 23 Indirect
40 correspondences correspondences
56 62 50
20 Direct 61 75 Direct
correspondences correspondences
0 0
Source Source Source Source
known unknown known unknown
Fig. 2 Correspondences of ST prý and TT prý in Fiction (in absolute numbers)
(30) He claimed to have read somewhere (but more likely the possibility just
occurred to him) that lung cancer was infectious, and he was constantly
making a scene about my endangering our child. [Cz-En.Fict:KP_SZS]
Dočetl se prý kdesi (ale spíš si usmyslel), že rakovina plic je nakažlivá …
‘He read PRÝ somewhere (but more likely he just took it into his head)
that …’
A difference between the correspondences of ST prý and TT prý in Fiction can be

observed in that ST prý has a much higher proportion of correspondences with an
evidential adverb (29 tokens, which cover 18.7% of all correspondences of the ST
prý) than TT prý (ten tokens, 3.5% of all correspondences of the TT prý). This
might be due to the fact that Czech-English dictionaries (cf. Fronek 2000) offer
evidential adverbs as the first translation equivalent.
Finally, Fig. 2 brings a summary of the types of correspondences of ST prý and
TT prý in Fiction, suggesting a higher proportion of direct correspondences in
Czech source than target texts. In other words, prý tends to be more often added in
translation than omitted. That is to say, if zero and indirect correspondences are
counted together, they cover 23.9% (37 tokens) of the correspondences of ST prý
and 52.6% (149 tokens) of the correspondences of TT prý.
4.2 Prý in the Subtitles Subcorpus
The subcorpus of Subtitles shows a different picture (see Fig. 3): first, the absolute
number of the tokens of ST prý is rather low, namely 54, and the source of the
reported information is known only in 33.3% of the correspondences (18 tokens).
TT prý has a higher correspondence with the known source (40.5%; 81 tokens) than
ST prý, but the unknown source still prevails. The most frequent correspondences in
the Subtitles are communication verbs such as hear (e.g. [31]) and understand,
which cover ca. 22% of correspondences of prý (12 out of the 54 tokens of ST prý
and 43 out of the 200 tokens of TT prý).
Cz ST Cz TT
40 140
35 6 120
30 Zero 15 Zero
4 correspondences 100 6 correspondences
25
80
20 Indirect 18 Indirect
60
15 9 26 correspondences 98 correspondences
10 40 63
5 9 Direct 20 Direct
correspondences correspondences
0 0
Source Source Source Source
known unknown known unknown
Fig. 3 Correspondences of ST prý and TT prý in Subtitles (in absolute numbers)
(31) I hear you’re looking for me. [EN-Cz.Subt:SFA]

Prý se po mně sháníš.
‘PRÝ you’re looking for me.’
A difference can be observed between STs and TTs, namely that TT prý has a
high percentage of direct correspondences with an English reporting clause intro-
ducing the original speaker (63 tokens, i.e. 31.5%), higher than ST prý (9 tokens, i.e.
16.7%) and even higher than TT prý found in Fiction (21.4%). This might be due to
the fact that in Subtitles, where space is very limited, prý as a three-letter word is
considered to be a useful tool as an equivalent of a whole clause (compare also the
fact that in the Subtitles where Czech is the target language prý has the highest rela-
tive frequency of the target texts of all subcorpora; ipm 89). In direct correspon-
dences of TT prý it is again reporting clauses with the verb say that dominate
(covering 40 out of the 63 tokens of a reporting clause with a non-generic subject as
a direct correspondence of prý, i.e. 63.5%), and five out of the nine direct correspon-
dences of ST prý contain a reporting clause with the verb say, as in (32):
(32) He said you’re not to get close to the window! [Cz-En.Subt:PVV]

Nemáš prý chodit k oknu!
‘You are not supposed to PRÝ get close to the window!’
The original speaker is sometimes known because the reported utterance immedi-
ately precedes; if the addressee was also present in the original interchange (which
they were both in [33] and [34]), the reporting function is blocked because the
information is not new to them. In such cases, prý expresses a strong detachment of
the reporter from the reported information and the whole statement is ironic:
(33) – [Daughter:] Daddy, do be reasonable! – [Father:]And saucy you are, too!

So I’m not reasonable. [Cz-EN.Subt:BJJK]
–Tatínku, s Tebou není rozumná řeč! – Tak ty budeš ještě drzá! Se mnou
prý není rozumná řeč!
‘–Daddy, you aren’t talking reasonably. – So now you will be saucy on top
of everything! I PRÝ am not talking reasonably!’
(34) – Well, uh … the way I see it, this is a pretty big favor … – Some big favor.
I could operate that goddamn thing. [EN-Cz.Subt:RLD]
– Prý velká laskavost … – Sám bych uměl obsluhovat tu blbou pec.
‘– PRÝ a big favor …’
Unlike in Fiction (Fig. 2), ST prý has a lower proportion of direct correspon-

dences than TT prý. In other words, in Subtitles, prý tends to be more often omitted
in translation than added. That is to say, if zero and indirect correspondences are
counted together, they cover 35.2% of the correspondences of ST prý and 19.5% of
the correspondences of TT prý. This is perhaps what one could expect in the corpus
of Subtitles, where each word matters: additions are not welcome.
4.3 Prý in the Europarl Subcorpus
The Europarl data show not only the lowest relative frequency of prý (3.4), but also
the lowest percentage of correspondences revealing the original speaker, namely
13.7% (seven tokens). In addition, the context suggests a detachment of the reporter
from the reported statement:
(35) (I offer, as an example of such dogmas, the recent article by Václav Klaus
advising us how to overcome this financial crisis by temporarily softening
social, environmental and health standards) because, he says, these
standards obstruct rational human behaviour. [Cz-En.Europarl]
…protože prý tyto standardy brání racionálnímu lidskému jednání.
‘…protože PRÝ these standards obstruct rational human behaviour.’
As for the source unknown, the most frequent correspondence in Europarl is a

reporting clause with the verb in the passive form (24 tokens, 46.2%), followed by
evidential adverbs (13 tokens, 25%), most typically allegedly (six tokens, 11.8%);
in some cases, there is an explicit detachment from the reported information:
(36) It is said that Kosovo does not set a precedent. (That is a mistake …)
[Cz-En.Europarl]
Kosovo není prý žádný precedent. (Ale to je omyl …)
‘Kosovo PRÝ is no precedent. (But that is a mistake …)’
As Fig. 4 shows, zero correspondences proper are not found at all and indirect cor-
respondences cover 19.6% (10 tokens) of all correspondences of prý in Europarl,
which is a proportion very close to the proportion of zero and indirect
Fig. 4 Correspondences Europarl

of prý in Europarl (in 50
absolute numbers)
40 4
Zero
correspondences
30
Indirect
20 40 correspondences
10 6 Direct
1 correspondences
0
Source Source
known unknown
correspondences of TT prý in the subcorpus of Subtitles. This might be due to the

fact that many of the tokens of prý do not come from Czech source texts, but from
translations from English and other languages.
4.4 Prý in the PressEurope Subcorpus
The PressEurope data are the least reliable, since there are only six tokens of prý in
Czech target texts and 20 tokens of prý in Czech source texts. Among the latter, nine
were classified as introducing the source; however, five of them (including sentence
[37]) come from one section of one text, concluded by “at least, so says Karel Kříž”
(a journalist quoted in the text):
(37) And the same fate awaits the Czech Republic, whose capital is rapidly being
drained away. [Cz-En.PressEurope]
Kolonií je asi Česká republika, protože z ní prý teče kapitál jako z
vodovodní trubky.
‘… the Czech Republic, because PRÝ capital is rapidly being drained away.’
That is to say, texts included in PressEurope are special in that they do not report but
argue, namely for or against the contents of other articles (such as the one written
by Karel Kříž). It thus turns out to be impossible for us to prove or disprove
Hirschová and Schneiderová’s (2012) hypothesis that prý is used by journalists to
avoid responsibility for what they are reporting. What was confirmed, on the other
hand, is a detachment from the reported information. In correspondences with a
reporting clause introducing the original speaker, a wider context proves the reported
statement to be either false, or at least open to discussion:
(38) The Judicial Council denied her sickness benefits, alleging that she was
faking her illness. (Shortly afterwards, she fell into a coma and died of
heart failure.) [Cz-En.PressEurope]
Fig. 5 Correspondences PressEurope

of prý in PressEurope (in 12
absolute numbers)
10
Zero
8 correspondences
7
6 Indirect
7
4 correspondences
2 4 Direct
2 correspondences
0
Source Source
known unknown
Soudní rada jí zamítla příspěvky k nemocenské – prý proto, že simuluje.

‘The Judicial Council denied her sickness benefits, PRÝ because she is
faking her illness.’
Figure 5 suggests a high proportion of indirect and zero correspondences (the latter
include three tokens of a missing sentence). The numbers are, however, too low to
allow for any conclusions.
5 Discussion and Theoretical Implications
We believe that the correspondences of prý reveal different dominant functions of

prý in the registers analysed. In the texts of Fiction, correspondences of prý intro-
duce reference to a specific source in the majority of cases. Its primary function can
thus be regarded as quotative. The PressEurope texts, on the other hand, are highly
polemic, which means that the authors either make reference to other texts, or they
express doubt about the reliability of the reported information. The latter is what
they share with the Europarl texts, which have the smallest proportion of tokens
with a known source; moreover, if the source is known, it is usually mocked. The
subcorpora of Subtitles, which, along with Europarl, are the closest approximation
to spoken language of all the data in InterCorp, also show a lower proportion of cor-
respondences with the known source.
This does not mean, however, that prý is never used to cast doubt on the reported
information in the texts of Fiction. As we observed in Section 4.1, sometimes the
verb claim is used in the reporting clause equivalent to prý, more frequently for ST
prý than for TT prý. The reason might be that in the Czech STs, the texts include
novels by Kundera and Škvorecký, which mock the communist regime:
(39) (Only once in a while I am upset by some political outrage or other, like
Sakharov,) who claims that American workers are better off than Soviet
workers. (American workers may have higher wages – and the freedom to
fight for them in continual struggles – but I know very well how workers in
your country live in security, in peace, how they are cared form every way
by the state.) [Cz-En.Fict:SJ_EHS]
Americký dělník se prý má líp než sovětský.
‘The American worker is PRÝ better off than the Soviet one.’
Detachment from the reported proposition can also be seen in other cases; it is a
wider context which makes it explicit:
(40) The absurd thing is, it was these very lines that some of the critics beat me
over the head with. They said the heroes of my Monologues were simply
the bourgeoisie in proletarian dress. [Cz-En.Fict:SJ_EHS]
Prý hrdinové mých Samomluv jsou buržousti převlečení za proletáře.
‘PRÝ the heroes of my Monologues are simply the bourgeoisie [pejorative]
in proletarian dress.’
(41) But the true source of poetry, I was told by this comrade-person, is not, as I
wrote, Beauty, but Class Hatred, Class War. [Cz-En.Fict:SJ_EHS]
Pramenem poesie však prý není, jak jsem napsal já, Krása, řekl mi soudruh
člověk, ale Třídní Nenávist, Třídní Boj.
‘The true source of poetry, however, PRÝ is not, as I wrote, Beauty, said
this comrade-person, but Class Hatred, Class War.’
This doubt or lack of commitment to the truth of the proposition, however, cannot
be regarded as an inherent part of the meaning of prý because it is not obligatory:
examples can be found (even in Škvorecký and Kundera) where there is no doubt
expressed or implied, as in (42):
(42) Those beautiful, faded eyes from Kiruna are constantly watching me. Her
grandfather, she said, used to own iron mines in Kiruna, a city of the
midnight sun. [Cz-En.Fict.:SJ_EHS]
Vytrvale mě pozoruje krásně vyšisovanýma očima z Kiruny, kde prý měla
kdysi dědečka majitelem železných dolů, v tom městě půlnočního slunce.
‘She constantly watches me with the beautifully faded eyes from Kiruna,
where PRÝ she once had a grandfather, owner of iron mines, in that city
of the midnight sun.’
The same can be said about the subcorpus of Subtitles, where we observed a lower
proportion of correspondences with a reference to the original speaker. Even in
tokens with correspondences where no specific source is expressed and with zero
correspondences no component of doubt is necessarily present. In fact, in the fol-
lowing examples the reporters commit themselves to the factuality of the
proposition:
(43) They said you’d last another two weeks, but we can’t wait that long,
Mrs. Stubb! [Cz-En.Subt: PVV]
Říkali, že prý vydržíte asi tak čtrnáct dnů, jenomže my nemůžeme tak
dlouho čekat, paní Stubová!
‘They said you’ll PRÝ last another fourteen days, but we can’t wait that
long, Mrs. Stubb!’
(44) Life is beautiful, so enjoy it. I’m coming for you in a year. [Cz-EN.Subt:
CVP]
Život je prý krásný, tak si ho užívejte, přesně za rok si pro vás přijdu!
‘Life is PRÝ beautiful, so enjoy it. I’m coming for you in a year!’
Most importantly, the overall results show that typical counterparts of prý are not
modal markers (modal verbs/adverbs/adjectives), but a reporting clause (most fre-
quently with the verb say), communication verbs such as hear, evidential adverbs
such as apparently, allegedly, supposedly and reportedly, or the evidential semi-
auxiliary be supposed to. Importantly, these correspondences can be found in both
of the major categories, i.e. in tokens with the source known and unknown alike.
Therefore, it can be concluded that doubt and uncertainty are not an inherent part
of the meaning of prý, but its overtones: in other words, they may be inferred from
the context. The mechanism which triggers such an interpretation, we believe, is
pragmatic inferencing. This process, which can be called “invited inference” (see
e.g. Traugott and Dasher 2002) has, according to Traugott, “a cognitive-
communicative motivation”: it is “the attempt on the speaker’s part to increase the
informativeness to the interlocutor of what is being said” (1995, 49). This is in
accordance with Aikhenvald (2004, 164), who observes that “[n]ot every reported
evidential implies that the information is unreliable”. Speakers “may choose to
employ the reported evidential for two reasons: firstly, to show his or her objectiv-
ity; that the speaker was not the eyewitness to an event and knows about it from
someone else. Secondly, as a means of ‘shifting’ responsibility for the information
and relating facts considered unreliable” (2004, 180): in the latter case, the reported
evidential gains an “epistemic extension”. Such an extension, as we have seen, is
contextually bound.18
The present-day function of prý is that of signalling information flow in that the
speaker either directly reports an utterance or signals to the hearer that the source of
information is external to the speaker. In this respect, we can speak of the polysemy
of prý in terms of the direct and indirect reporting functions (see Section 1 for the
18
In a similar vein, Bybee et al. (1994, 180) argue that “an indirect evidential, which indicates that
the speaker has only indirect knowledge concerning the proposition being asserted, implies that the
speaker is not totally committed to the truth of that proposition and thus implies an epistemic
value”. In other words, “the implication is definitely an epistemic one – that the speaker does not
vouch unconditionally for the accuracy of the information” (1994, 203).
grammatical differences). Polysemy, as Traugott (2010, 32) put it, is typical of sub-
jectification: “by hypothesis most new semantic developments emerge as polyse-
mies, pragmatic to begin with, then semantic”. As our data show, however, it is not
justified to speak of the polysemy (i.e. polysemy of the evidential and modal use) of
prý with the indirect reporting function, because doubt and uncertainty are not part
of its conventional meaning. Prý rather seems to behave like, for example, in fact in
Aijmer and Simon-Vandenbergen’s (2004) analysis. Here the authors argue that the
uses of in fact are “pragmatic implicatures which are conventionalised to a greater
or lesser extent, as some contextual meanings are more frequent and more conven-
tionalised than others” (Aijmer and Simon-Vandenbergen 2004, 1788).
It is interesting to notice under which contextual circumstances the pragmatic
implicatures arise. In recent years, subjectivity is discussed alongside intersubjectiv-
ity (see e.g. Davidse et al. 2010). In contrast to subjectivity, intersubjectivity marks
the speaker-addressee relationship in a more prominent way. Through the invited
inference, prý can function as a marker of intersubjectivity, which is described as
“the explicit expression of the SP[eaker]/W[riter]‘s attention to the ‘self’ of addressee/
reader in both an epistemic sense (paying attention to their presumed attitudes to the
content of what is said), and in a more social sense (paying attention to their ‘face’ or
‘image needs’ associated with social stance and identity)” (Traugott 2003, 128). In
this respect, Traugott mentions some uses of hedges such as well or perhaps (2010,
37). When applied to the usage of prý in cases such as (11) (repeated here as [45]),
by using prý the speaker invites the addressee not only to infer that he or she does not
commit to the factuality of the reported statement, but also to confirm it:19
(45) Now tell me you’re married. [En-Cz.Subt:TD]

Můj drahý Armande, prý jsi ženatý.
‘My dear Armand, PRÝ you are married.’
As for the “social” sense, we observed several cases, which, however, express the
opposite of “paying attention” to the face of the addressee. In (46) Draco Malfoy is
humiliating Harry Potter by making him recognize the fact that he fainted:
(46) (As Harry stepped down, a drawling, delighted voice sounded in his ear.)
“You fainted, Potter? (Is Longbottom telling the truth? You actually
fainted?”) [En-Cz.Fict:RJK_PA]
Tys prý omdlel, Pottere?
‘You PRÝ fainted, Potter?’
In (33), repeated here as (47), the father reports what his daughter has just said to
him, using prý. Just as it is infelicitous for the speaker to report (indirectly) his or
19
However, we cannot go as far as claiming that prý has undergone the process of “intersubjectifi-
cation”, which, according to Traugott (2010), follows, or arises from, subjectification. Traugott
(2010, 37) makes a distinction between intersubjectivity and intersubjectification along the follow-
ing lines: “If it is derivable from the context, it is only a case of increased pragmatic intersubjectiv-
ity. In other words, there may be more addressee-oriented uses, but unless a form–meaning pair has
come to code intersubjectivity, we are not seeing intersubjectification”.
her own utterance with prý (see example [7] in Section 3), it is normally pragmati-
cally infelicitous for the speaker to report the hearer’s utterance to the hearer or the
audience who participated in the original interchange; when it does happen, the
sentence expresses a negative attitude, criticism or irony.
(47) – Daddy, do be reasonable! – And saucy you are, too! So I’m not
reasonable. [Cz-EN.Subt: BJK]
Tatínku, s Tebou není rozumná řeč! – Tak ty budeš ještě drzá! Se mnou
prý není rozumná řeč!
‘– Daddy, you aren’t talking reasonably. – So now you will be saucy on top
of everything! I PRÝ am not talking reasonably!’
The same applies to (48), where the speaker mocks the original speakers in front of
them:
(48) You littluns started all this, with the fear talk. Beasts! Where from?
[En-Cz.Fict:GW_LF]
Vy mrňousi jste to všechno začali tím ustrašeným žvaněním. Že prý jsou
tu obludy! Kde by se tu vzaly?
‘You littluns started all this, with the fear talk. [That] PRÝ there are beasts
here! Where would they come from?’
However, as Aikhenvald (2004, 183) notices, reported evidentials can (with proper
intonation or gestures) also be used ironically even “when used in a statement that
both the speaker and the hearer know to be true”, i.e. even if they have no “overtones
of unreliable information” (2004, 184). This is exemplified not only in (46) above, but
also in (49), which presents a dialogue between Nicolas Cage as the FBI agent Stanley
Goodspeed and Sean Connery as John Patrick Mason (currently a prisoner), who has
been told before that there is a serious problem and the FBI needs his help urgently:
(49) – [Goodspeed:] I’m Stanley Goodspeed. – [Mason:] But of course you are.
– [Goodspeed:] Of course I am. Huh. – [Mason:] And you have an
emergency. – [Goodspeed:] That’s right. [En-Cz.Subt: TR]
– Jsem Stanley Goodspeed. – Ale jistě. – Jistě. – A prý máte problém.
– Ano.
‘… And PRÝ you have an emergency.’
This again confirms that no modal overtones are encoded in the meaning of prý.
6 Conclusions
This paper investigated the dominant functions of the Czech (arguably modal) par-
ticle prý, which is used to introduce reported speech, both direct and indirect (the
latter function is referred to as evidential). In order to do this, we looked at prý
through the lens of another language – English – and we focused on three registers,
namely fiction, journalistic texts (represented by PressEurope texts) and spoken lan-
guage (represented by Subtitles and Europarl texts) available in the parallel corpus
InterCorp.
As a starting point of the corpus analysis, we were interested to see whether the
source of the reported information (original speaker) is indeed left unexpressed, as
the dictionary definition of prý suggests. There turns out to be a difference between
Fiction, where the source of the reported information tends to be expressed (60% of
correspondences of prý in Czech target texts and 57.4% of correspondences of prý
in Czech source texts), and the other subcorpora, in which it is left unexpressed in
the majority of tokens; this is most evident in Europarl.
What follows is that in the texts of Fiction, the dominant function of prý is quota-
tive. Czech and English are two languages in which evidentiality is not grammati-
calised, but both have lexical markers of evidentiality. Our analysis of the
correspondences of prý in Fiction reveals that prý in the reporting function is more
often added than omitted. This might suggest that in Czech there is a stronger ten-
dency to mark the external information source than in English – even if there is an
evidential marker such as the verb say, in Czech there is a tendency to reinforce it
lexically with prý (our indirect correspondences). In the remaining registers, prý
was mainly a reported evidential (the source was not known).
Ultimately, we wanted to prove or disprove the dictionary definition of prý,
according to which it is a modal particle expressing doubt whenever it does not
introduce direct speech. It turned out that in all registers alike, regardless of whether
the source of information was known or unknown, prý hardly ever corresponded to
a modal marker. In the majority of cases, it was rendered as an evidential marker:
the most typical correspondences included the verb say and evidential adverbs.
In the cases in which there was a lack of speaker commitment to the factuality of
the proposition (i.e. uncertainty or doubt), it was contextually bound, which is why
we concluded that the uncertainty or doubt are only epistemic overtones (“invited
inferences” in the sense of Traugott and Dasher [e.g. 2002]).
We noted that the rise of prý can be described in terms of subjectification in the
sense of Traugott (e.g. 1995) – ultimately, prý is used as a “marker of discourse
reference”; i.e. it has acquired “a metalinguistic function of creating text and signal-
ling information flow” (1995, 39). We also noticed cases in which the default report-
ing function was blocked because the addressee of the reporting event was present
in the original interchange, i.e. the information was not new. A special case of this
occurs when the addressee of the reporting event is the original source: we classify
such cases as instances of the interpersonal function of prý; the sentences express a
negative attitude, criticism or irony. In some of these cases, the modal overtones of
doubt are not present at all.
We believe that the conclusions we draw from our approximations to spoken
language are plausible for spoken language in general. To really confirm this, we
would need a corpus of interpreted spontaneous dialogues. What urgently calls for
investigation now is the diachrony of the process of subjectification of prý.
Appendices
Appendix 1: InterCorp Texts: Fiction
[Cz-EnFict:KP_SZS] Kohout, Pavel. Sněžím. Zpověď Středoevropanky. 1993.

Translated by Neil Bermel as I am Snowing: The
Confessions of a Woman of Prague. 1995.
[Cz-En. Škvorecký, Josef. Příběh inženýra lidských duší. 1992.
Fict.:SJ_EHS] Translated by Paul Wilson as Engineer of Human Souls.
1984.
[Cz-En.Fict:SV_SP] Stýblová, Valja. Skalpel, prosím. 1981. Translated by John
Newton as Scalpel, Please. 1985.
[Cz-En. Viewegh, Michal. Výchova dívek v Čechách. 1995.
Fict:VM_VDC] Translated by A.G. Brain as Bringing Up Girls in
Bohemia. 1996.
[En-Cz.Fict:BJ_S] Banville, John. The Sea. 2005. Translated by Richard
Podaný as Moře. 2006.
[En-CzFict:CR_T] Cook, Robin. Toxin. 1998. Translated by Miroslav Jindra
as Toxin. 1998.
[En-CzFict:CT_GP] Chevalier, Tracy. Girl with a Pearl Earring. 1999.
Translated by Ivana Breznenová as Dívka s perlou. 2000.
[En-Cz. Day, Cathy. The Circus in Winter. 2004. Translated by
Fict:DC_CW] Milena Pellarová and Šimon Pellar as Cirkus v zimě.
2005.
[En-Cz.Fict:GJ_SL] Grisham, John. The Street Lawyer. 1998. Translated by
Jan Jirák as Advokát chudých. 1998.
[En-Cz.Fict:GJ_T] Grisham, John. The Testament. 1999. Translated
by Vladimír Panoš as Poslední vůle. 1999.
[En-CzFict:GW_LF] Golding, William. Lord of the Flies. 1954. Translated by
Heda Kovályová as Pán much. 1968.
[En-Cz. Hemingway, Ernest. The Old Man and the Sea. 1952.
Fict:HE_OMS] Translated by František Vrba as Stařec a moře. 1957.
[En-Cz. Ishiguro, Kayuo. An Artist of the Floating World. 1986.
Fict:IK_AFW] Translated by Jiří Hanuš as Malíř pomíjivého světa. 1999.
[En-Cz. Krentz, Jayne Ann. Falling Awake. 2004. Translated by
Fict:KJA_FA] Hana Krejčí as Zajatci snů. 2006.
[En-CzFict:RF_HS] Roth, Philip. The Human Stain. 2000. Translated by Jiří
Hanuš as Lidská skvrna. 2005.
[En-CzFict:RJK_PA] Rowling, J. K. Harry Potter and the Prisoner of Azkaban.
1999. Translated by Vladimír Medek as Harry Potter a
vězeň z Azkabanu. 2001.
[En-Cz.Fict:RJK_PS] Rowling, J. K. Harry Potter and the Philosopher’s Stone.

1997. Translated by Vladimír Medek as Harry Potter a
kámen mudrců. 2000.
[En-CzFict:SAR_ Siddonsová, Anne Rivers. Hill Towns. 1993. Translated
HT] by Hana Parkánová-Whitton as Bezpečné výšiny. 2004.
[En-Cz.Fict:SD_SC] Steel, Danielle. Second Chance. 2004. Translated by
Dana Lagronová as Druhá šance. 2005.
Appendix 2: InterCorp Texts: Subtitles
[Cz-En.Subt: Byl jednou jeden král (There Once Was a King…). 1955. Dir.
BJJK] Bořivoj Zeman.
[Cz-En.Subt: Čert ví proč (The Devil Knows Why). 2003. Dir. Roman
CVP] Vávra.
[Cz-En.Subt:PSP] Pension pro svobodné pány (Pension for Single Gentlemen).
1967. Dir. Jiří Krejčík.
[Cz-En.Subt: Pane, vy jste vdova! (You’re a Widow, Sir!). 1970. Dir.
PVV] Václav Vorlíček.
[En-Cz.Subt:C] Congo (Kongo). 1995. Dir. Frank Marshall.
[En-Cz.Subt:K2] K2 (K2). 1991. Dir. Franc Roddam.
[En-Cz.Subt:QS] Quantum of Solace (Quantum of Solace). 2008. Dir. Marc
Forster.
[En-Cz. The Return of the Living Dead (Návrat oživlých mrtvol).
Subt:RLD] 1985. Dir. Dan O’Bannon.
[En-Cz.Subt:SFA] Shrek Forever After (Shrek: Zvonec a konec). 2010. Dir.
Mike Mitchell.
[En-Cz.Subt:TD] The Duellists (Soupeři). 1977. Dir. Ridley Scott.
[En-Cz.Subt:TH] The Rock (Skála). 1996. Dir. Michael Bay.
References
Aijmer, K., & Simon-Vandenbergen, A.-M. (2004). A model and a methodology for the study of
pragmatic markers: The semantic field of expectation. Journal of Pragmatics, 36, 1781–1805.
Aikhenvald, A. Y. (2004). Evidentiality. Oxford: Oxford University Press.
Altenberg, B., & Granger, S. (2002). Recent trends in cross-linguistic lexical studies. In
B. Altenberg & S. Granger (Eds.), Lexis in contrast: Corpus-based approaches (pp. 3–48).
Bybee, J., Perkins, R., & Pagliuca, W. (1994). The evolution of grammar: Tense, aspect, and
modality in the languages of the World. Chicago: Chicago University Press.
Chafe, W. (1986). Evidentiality in English conversation and academic writing. In W. Chafe &
J. Nichols (Eds.), Evidentiality: The linguistic coding of epistemology (pp. 61–273). Norwood:
Ablex.
Chafe, W., & Nichols, J. (Eds.). (1986). Evidentiality: The linguistic coding of epistemology.
Norwood: Ablex.
Davidse, K., Vandelanotte, L., & Cuyckens, H. (Eds.). (2010). Subjectification, intersubjectifica-
tion and grammaticalization. Berlin: De Gruyter Mouton.
Dendale, P., & Tasmowski, L. (Eds.). (2001). On evidentiality. Amsterdam: Elsevier. Special issue
of Journal of Pragmatics 33(3).
Fronek, J. (2000). Velký česko-anglický slovník. Prague: LEDA.
Gast, V., & Levshina, N. (2014). Motivating W(h)-clefts in English and German: A hypothesis-
driven parallel corpus study. In A.-M. de Cesare (Ed.), Frequency, forms and functions of cleft
constructions in Romance and Germanic. Contrastive, corpus-based studies (pp. 377–414).
Berlin: De Gruyter Mouton.
Grepl, M. (2002). Reprodukce prvotních výpovědí. In P. Karlík, M. Nekula, & J. Pleskalová (Eds.),
Encyklopedický slovník češtiny. Prague: Nakladatelství Lidové noviny.
Grepl, M., & Karlík, P. (1998). Skladba češtiny. Olomouc: Votobia.
Hasselgård, H. (2010). Parallel Corpora and contrastive studies. In: R. Xiao (Ed.), Proceedings of
the international symposium on Using Corpora in Contrastive and Translation Studies 2010
Conference (UCCTS2010). http://www.lancaster.ac.uk/fass/projects/corpus/
UCCTS2010Proceedings/papers/Hasselgard.pdf. Accessed 1 July 2015.
Hirschová, M., & Schneiderová, S. (2012). Evidenciální výrazy v českých publicistických textech
(případ údajně–údajný). In Gramatika a korpus/Grammar and Corpora 2012. http://www.ujc.
cas.cz/miranda2/export/sitesavcr/data.avcr.cz/humansci/ujc/vyzkum/gramatika-a-korpus/pro-
ceedings-2012/konferencni-prispevky/HirschovaMilada_SchneiderovaSona.pdf. Accessed 1
July 2015.
Hoffmanová, J., & Kolářová, I. (2007). Slovo prý/prej: možnosti jeho funkční a sémantické dife-
renciace. In F. Štícha & J. Šimandl (Eds.), Gramatika a korpus/Grammar and Corpora 2005.
Prague: Ústav pro jazyk český Akademie věd České republiky.
Huddleston, R., & Pullum, G. K. (2002). The Cambridge grammar of the English language.
Cambridge: Cambridge University Press.
Ifantidou-Trouki, E. (1993). Sentential adverbs and relevance. Lingua, 90, 69–90.
Johansson, S. (2007). Seeing through multilingual Corpora. On the use of Corpora in contrastive
studies. John Benjamins: Amsterdam.
Komárek, M., Kořenský, J., Petr, J., Veselková, J., et al. (1986). Mluvnice češtiny 2. Tvarosloví.
Prague: Academia.
Krčmová, M. (2002). Čeština obecná. In P. Karlík, M. Nekula, & J. Pleskalová (Eds.),
Encyklopedický slovník češtiny. Prague: Nakladatelství Lidové noviny.
Lyons, J. (1977). Semantics. Cambridge: Cambridge University Press.
Machek, V. (2010). Etymologický slovník jazyka českého. Prague: Nakladatelství Lidové noviny.
Oxford Advanced Learner’s Dictionary. (2010). 8th ed. Ed. A. S. Hornby. Oxford: Oxford
University Press.
Palmer, F. R. (1986). Mood and modality. Cambridge: Cambridge University Press.
Papafragou, A. (2006). Epistemic modality and truth conditions. Lingua, 116, 1688–1702.
Plungian, V. A. (2001). The place of evidentiality within the universal grammatical space. Journal
of Pragmatics, 33(3), 349–357.
Quirk, R., Greenbaum, S., Leech, G., & Svartvik, J. (1985). A comprehensive grammar of the
English language. London: Longman.
Slovník spisovné češtiny pro školu a veřejnost [SSČ]. (2009). Prague: Academia.
Trávníček, F. (1951). Mluvnice spisovné češtiny II. Skladba. Prague: Slovanské nakladatelství.
Traugott, E. C. (1989). On the rise of epistemic meanings in English: An example of subjectifica-
tion in semantic change. Language, 65, 31–55.
Traugott, E. C. (1995). Subjectification in Grammaticalization. In D. Stein & S. Wright (Eds.),
Subjectivity and subjectivisation: Linguistic perspectives (pp. 31–54). Cambridge: Cambridge
University Press.
Traugott, E. C. (2003). From subjectification to intersubjectification. In R. Hickey (Ed.), Motives

for language change (pp. 124–139). Cambridge: Cambridge University Press.
Traugott, E. C. (2010). (Inter)Subjectivity and (Inter)subjectification: A reassessment. In
K. Davidse, L. Vandelanotte, & H. Cuyckens (Eds.), Subjectification, intersubjectification and
grammaticalization (pp. 29–71). Berlin: De Gruyter Mouton.
Traugott, E. C., & Dasher, R. (2002). Regularity in semantic change. Cambridge: Cambridge
University Press.
Willett, T. L. (1988). A cross-linguistic survey of the grammaticization of evidentiality. Studies in
Language, 12(1), 51–97.
Corpora
Czech National Corpus – InterCorp. Institute of the Czech National Corpus, Prague. http://www.
korpus.cz.
Czech National Corpus – SYN2000. Institute of the Czech National Corpus, Prague. http://www.
korpus.cz.
Modal Adverbs of Certainty in EU Legal
Discourse: A Parallel Corpus Approach
Magdalena Szczyrbak
Abstract Modal adverbs of certainty are commonly found in argumentative writ-

ing, where they operate as stance markers and engagement devices guiding the read-
ers towards the author’s intended interpretation. Such is also the case with legal
opinions which abound in instances of explicit authorial marking, although, natu-
rally, author visibility levels vary from language to language. This chapter examines
the use of modal adverbs of certainty in judicial argumentation as attested by the
English and Polish versions of the Opinions of Advocates General which represent
EU legal discourse. To this end, a parallel corpus approach is used to explore “pro-
totypical” meanings and context-dependent renderings of selected English adverbs
in the Polish language and to investigate the effect that omissions of these adverbs
have on the argumentative force of the translated texts. The study not only shows
conventional and ad hoc meanings of the adverbs analysed, but also reveals differ-
ences between the author visibility levels and the rhetorical force of the original
English texts and that of their translated Polish equivalents.
Keywords EU legal discourse • Judicial opinions • Legal argumentation • Modal

adverbs • Parallel corpora • Stance
1 Introduction
Although traditional studies into modality focus on modal auxiliaries, more recent
approaches recognise the interplay of lexical means including modal nouns, adjec-
tives and adverbs. Modal adverbs of certainty, it can be argued, attest to the dialogic
orientation of discourse, since they help speakers and writers to “contest, refute, or
build an argument toward alternative or different conclusions” (Traugott 2010: 15).
This being the case, modal adverbs are used for rhetorical purposes and they serve,
among other goals, to foreground stronger arguments and to background alternative
voices. This, in turn, makes them a useful rhetorical device which is frequently
deployed in argumentative writing. This chapter examines the use of modal adverbs
M. Szczyrbak (*)
Institute of English Studies, Jagiellonian University, Kraków, Poland
e-mail: magdalena.szczyrbak@uj.edu.pl

DOI 10.1007/978-3-319-54556-1_5
92 M. Szczyrbak
of certainty in legal argumentation drawing on bilingual data from the Opinions of

Advocates General representing EU legal discourse. More precisely, the corpus data
are used to explore both “prototypical” meanings and context-dependent renderings
of English adverbs in the Polish language and to investigate the translation choices
made by legal translators working in a multilingual setting.
2 Modal Adverbs and Stance
Modal adverbs of certainty, whose role in discourse goes far beyond that of marking
varying degrees of certitude, are inextricably linked to stance and argumentation.
On the one hand, as epistemic stance devices, they “can mark certainty (or doubt),
actuality, precision, or limitation” as well as “indicate the source of knowledge or
the perspective from which the information is given” (Biber et al. 1999: 972). On
the other, they “are indexically related to variables in the social situation and are
associated with types of social activity, with social roles and with power” (Simon-
Vandenbergen and Aijmer 2007: 5). Put differently, they are linked to cultural and
social dimensions including social acts, activity types, social identity and relation-
ships (Simon-Vandenbergen and Aijmer 2007: 55–56). As for the conceptualisation
of stance, the term lends itself to a variety of (often complementary or overlapping)
interpretations, given that it can be expressed by a multitude of linguistic and para-
linguistic resources. Starting with Biber et al. (1999: 966), stance is defined as “per-
sonal feelings, attitudes, value judgments, or assessments”. It is also theorized as
“the space in language where literal, figurative, and functional meanings intersect”
(Precht 2003: 239) or, elsewhere, as a situational dimension which encompasses
“types of (epistemic or affective) attitude and degrees of affective intensity or
strength of commitment” (Aijmer 2007: 330). The role of paralinguistic elements in
expressing stance, in turn, is recognised by Chindamo et al. (2012), for whom com-
municative stance denotes an “attitude which, for some time, is expressed and sus-
tained interactively in communication, in a unimodal or multimodal manner.”
Another approach sees stance as “a linguistically articulated form of social action”
(du Bois 2007: 139), whereby social actors, using covert communicative means,
simultaneously evaluate objects, position themselves and others, and align with
other subjects (du Bois 2007: 163).1 Yet another view – which is most relevant to the
current study – is expounded by Hyland (2005), studying the resources which aca-
demic writers employ in order to express their positions and to connect to readers.
Unlike du Bois, whose definition of stance includes the mutual positioning of
1
In agreement with this view, various interactional practices and linguistic resources have been
explored to date, including, for instance, the role of I guess in conversational stancetaking
(Kärkkäinen 2007), digressing (Kärkkäinen 2012), positioning and alignment in news interviews
(Haddington 2007), resonance in storytelling (Niemelä 2011), challenging the prior speaker and
tag questions (Keisanen 2006, 2007) as well as repetition and returning to prior talk (Rauniomaa
2007, 2008).
Modal Adverbs of Certainty in EU Legal Discourse: A Parallel Corpus Approach 93
subjects, Hyland situates stance (i.e. an attitudinal dimension) alongside engage-

ment (i.e. an alignment dimension), explaining that while stance “includes features
which refer to the ways writers present themselves and convey their judgements,
opinions, and commitments” as well as “intrude to stamp their personal authority
onto their arguments or step back and disguise their involvement” (Hyland 2005:
176), engagement refers to the ways in which “writers acknowledge and connect to
others, recognising the presence of their readers, pulling them along with their argu-
ment, focusing their attention, acknowledging their uncertainties, including them as
discourse participants, and guiding them to interpretations” (Hyland 2005: 176).
One of the ways in which writers can effectively engage in a dialogue with read-
ers is the use of adverbials, including modal adverbs of certainty, which are one of
the most frequently discussed devices speakers and writers use to express epistemic
attitudes and to negotiate viewpoints across various settings (see, e.g., Chafe 1986;
Hoye 1997; Biber and Finegan 1988, 1989; Hyland 1994; Salager-Meyer 1995;
Downing 2009; Adams and Quintana-Toledo 2013). However, the treatment of
adverbs of certainty in the literature is not consistent and they are classified in dis-
parate ways reflecting the scholars’ varied approaches and interests.
To begin with, grammar books describe adverbs of certainty under various labels.
Quirk et al. (1985: 615), for instance, use the term content disjuncts to refer to
adverbials (placed alongside style disjuncts) which “make observations on the
actual content of the utterance and its truth conditions” and, in particular, the “degree
of or conditions for truth of content” or the “value judgment of content.” Of special
interest for the purposes of this study is the subclass of content (or attitudinal) dis-
juncts which represent comments on the degree of truth or truth conditions and,
more specifically, which express conviction (“as a direct claim” or “as an appeal to
general perception”) or some degree of doubt (Quirk et al. 1985: 620). Further, fol-
lowing syntactic rather than semantic criteria, the authors contrast disjuncts with
subjuncts (including emphasizers) and conjuncts; however, these distinctions are
not always precise. Biber et al. (1999: 853–857), on the other hand, classify stance
adverbials as those which have the “primary function of commenting on the content
or style of a clause or a particular part of a clause”, proposing three semantic catego-
ries: epistemic, attitude and style, the first of which is further subdivided into “doubt
and certainty,” “actuality and reality,” “source of knowledge,” “limitation,” “view-
point or perspective” and “imprecision.” Importantly, Biber et al. (1999: 857–858),
recognise the difficulty of drawing exact distinctions between certain classes of
adverbials which perform multiple functions. Finally, Huddleston and Pullum
(2002: 576) speak of “VP-oriented adjuncts” and “clause-oriented adjuncts” plac-
ing modal adjuncts within the latter of the two categories. Putting much emphasis
on the position-meaning correlation, Huddleston and Pullum stress that modal
adjuncts, which, they claim, can only have epistemic meanings, express four degrees
of certainty. It should be noted, however, that their groupings of strong and less
strong adverbs are not entirely consistent with the classification proposed by Quirk
et al. and discussed above.
Semantic and syntactic considerations aside, several recent studies centre on the
rhetorical or interpersonal functions of modal adverbs instead. For instance,
94 M. Szczyrbak
Simon-Vandenbergen and Aijmer’s (2007) corpus-based account of modal adverbs

of certainty highlights their dialogic potential and connection with the speakers’
social roles. In the authors’ view, when interpreted through the lens of rhetoric,
modal adverbs serve as stance markers and as such, they are used towards interper-
sonal ends, i.e. for alignment with, or disalignment from other utterances. In their
investigation, Simon-Vandenbergen and Aijmer (2007: 84) propose a division into
epistemic, evidential, expectation and speech act adverbs, admitting, however, that
the boundaries between individual classes are sometimes fluid.2 In a more finely-
grained distinction – considering the co-occurrence of features such as position,
modal status, heteroglossic functions, discourse functions, indexical stance and reg-
ister (Simon-Vandenbergen and Aijmer 2007: 279–280) – the scholars suggest that
the four groups of adverbs should be subdivided into clusters. To provide several
examples, the cluster obviously – of course – naturally comes to mean “as you and
I know,” whereas certainly – definitely – clearly denotes the “speaker’s committed
pronouncement” (Simon-Vandenbergen and Aijmer 2007: 317). Interestingly, cer-
tainly and of course are found in concessive contexts, with the first adverb fore-
grounding certainty in contrast with uncertainty and the latter foregrounding
common knowledge in contrast with new information (Simon-Vandenbergen and
Aijmer 2007: 317). As for the varied degrees of certitude, the pair surely – no doubt
literally means a high degree of certainty, but, as the scholars explain, it has devel-
oped a weaker epistemic stance, whereas obviously and evidently literally indicate
the presence of evidence, but, likewise, have developed a sense of “apparently” or
“as evidence seems to suggest” (Simon-Vandenbergen and Aijmer 2007: 316). In
sum, what the authors emphasise in their study is that the epistemic meanings of
adverbs need to be distinguished from their rhetorical functions and, further, that the
deployment of such adverbs reflects the multiplicity of perspectives and voices
which are built into the text (cf. Bakhtin 1981, White 2003) inasmuch as it mirrors
the type of social activity and the arguer’s social role and power.
The pragma-dialectic study authored by Tseronis (2009), in turn, focuses on the
discursive role of stance adverbs in qualifying standpoints, describing them as “pre-
sentational devices for managing the burden of proof.” In his view, qualifying a
standpoint, that is an assertive illocutionary act, can be understood in two ways,
namely “quantitatively, as adding an extra element that is structurally detached and
peripheral to the main constituents of a sentence, and qualitatively as adding some
extra meaning to the core meaning conveyed by parts of the sentence or by the sen-
tence as a whole” (Tseronis 2009: 25). He goes on to say that such peripheral ele-
ments “supply extra information that is not directly essential to the understanding of
the core meaning of what is being said” but which “is required in order to facilitate
the understanding of what is said and/or to relate what is said to the context in which
it is said” (Tseronis 2009: 26). In other words, qualifying comments can be
conceived of “as being either about the propositional content of the assertive speech
act or about the assertive speech act as a whole” (Tseronis 2009: 31), or more pre-
2
Cf. the concept of prototypicality and prototype theory, as proposed by cognitive linguists (see,
e.g., Geeraerts 2006).
cisely, they can express commitment to the propositional content, convey evaluation
of the propositional content or convey information about the performance of the act
as a whole (Tseronis 2009: 34). What is more, in line with pragma-dialectic assump-
tions, Tseronis (2009: 41) argues that standpoint qualification can be analysed as
part of strategic manoeuvring which language users implement in order to clearly
mark a difference of opinion, while promoting their own interests in the discourse.
Yet, it must be added that, however revealing, Tseronis’s study – following Biber
et al.’s (1999) classification of stance adverbs – aims to provide a theoretical tool for
analysing argumentative discourse rather than to account for the social or cognitive
reasons for qualifying standpoints in argumentation (Tseronis 2009: 12).
The interactional potential and the pragmatic reading of modal adverbs in one
specific discourse, i.e. in legal discourse, are, on the other hand, explored in
Szczyrbak (2014), where it is shown that in the legal context, modal adverbs serve
to foreground and background differing legal arguments and interpretations, on the
one hand, and to demonstrate power and authority, on the other. Again, it is con-
tended that – seen as a site of multiple voices – both spoken and written legal genres
can be approached as polyphonic or heteroglossic. In terms of frequencies, the find-
ings reveal that of course is by far the most common modal adverb in spoken genres
(especially in adversarial proceedings), whereas indeed is most frequently deployed
in written genres (including Opinions of Advocates General) (Szczyrbak 2014: 92).
Analysing the rhetorical effect of selected adverbs, Szczyrbak (2014: 98) also points
out that while of course and certainly are linked to politeness and solidarity, indeed
and clearly are associated with power and authority.3 What the aforementioned
study also shows is that modal adverbs are systematically interwoven into larger
argumentative schemata. Remarkably, in judicial argumentative patterns – incorpo-
rating both the arguer’s actual standpoint and alternative built-in voices – Concessive
sequences comprising claims, acknowledgments (i.e. moves in which the arguer
partly concedes an opposing viewpoint) and counterclaims4 are especially
noticeable.
To conclude this section then: there is a clear link between modal adverbs of cer-
tainty and stance, the principal assumption being that these language devices allow
speakers and writers to engage in a dialogue and to evaluate other standpoints.
Therefore, building on previous studies, in the current investigation I will take the
research into the usage of modal adverbs further by looking at their stancetaking
potential in forensic argumentation, on the one hand, and by examining their canoni-
cal and less obvious meanings, on the other. However, rather than treat stance and
engagement as complementary notions, as proposed by Hyland (2005), I will
conceive of stance as incorporating intersubjective positioning and audience involve-
ment features, among which modal adverbs of certainty play a prominent role.
3
It is also demonstrated that although of course often serves as a solidarity device, it can also be
used to assert authority and superiority of knowledge (Szczyrbak 2014: 97).
4
See Couper-Kuhlen and Thompson (2000) and Barth-Weingarten (2003) for a detailed descrip-
tion of this analytical model.
96 M. Szczyrbak
3 Aims, Methodology and Data
The present study aims to investigate the role of modal adverbs of certainty in the
Opinions of Advocates General and to explore their polysemy based on the transla-
tion patterns found in English and Polish data. In particular, an attempt will be made
to answer the following questions:
1. What conventional and context-specific meanings of modal adverbs of certainty
are revealed by the bilingual data under study?
2. How frequent are omissions in the translations of these adverbs and what effect
do these omissions have on the argumentative force of the translated texts?
Since in addressing the above issues corpus data will be used, it must be remem-
bered that various types of corpora (e.g. bilingual or multilingual) are widely applied
in contrastive and translation studies for theoretical or practical purposes. While
theoretically-oriented research investigates the manner in which the same ideas are
transmitted in various languages, practically-oriented explorations aim, for instance,
to develop machine translation and computer-assisted translation systems. It should
also be highlighted that, as held by Grisot and Moeschler (2014: 13), corpora allow
“the researcher to uncover on the one hand, what is probable and typical and, on the
other hand, what is unusual about the phenomenon considered.”
At this point, a note clarifying the meaning of parallel corpora is in order, espe-
cially given that there is some confusion related to this term.5 The terminology
adopted for the purposes of the current study is in line, for instance, with Baker
(1999), Hunston (2002) and McEnery and Xiao (2007), who draw a distinction
between a comparable corpus and a parallel corpus. In this approach, a comparable
corpus is defined as one with “the same proportions of the texts of the same genres
in the same domains in a range of different languages in the same sampling period”
(McEnery and Xiao 2007: 20). Thus, the subcorpora composing a comparable cor-
pus are not translations but rather, they use the same sampling frame and show
“similar balance and representativeness.” As regards the definition of a parallel cor-
pus, in the case of which the sampling period is irrelevant, the same linguists hold
that the term refers to “a corpus that contains source texts and their translations” and
which can be either bilingual or multilingual (McEnery and Xiao 2007: 19). What
is more, as McEnery and Xiao (2007: 19) see it, parallel corpora can be uni-
directional, bi-directional or multi-directional, the latter including texts which are
written simultaneously in different languages. Going further, McEnery and Xiao
(2007: 20) subdivide parallel corpora into general and specialised ones, stressing
that specialised parallel corpora (including, for instance, contract law texts) are par-
ticularly useful in domain-specific translation research.
5
For a discussion on the various labels used to describe different types of multilingual corpora, see
McEnery and Xiao (2007).
Understandably, regardless of their subtype, parallel corpora offer possibilities

of monolingual and cross-linguistic analyses of various discourse-pragmatic phe-
nomena and this applies to the usage of modal adverbs, too. At this point it might
be reiterated that a cross-linguistic approach to the study of modal adverbs, involv-
ing translations and back-translations, enables scholars to establish meaning rela-
tions within the semantic field of uncertainty (Simon-Vandenbergen and Aijmer
2007: 1) by reflecting the meaning of the adverbs in the other language. In a wider
perspective, the usefulness of a parallel corpus lies, then, in the possibility for the
researcher of discovering translations which “constitute paradigms representing a
broad spectrum of meanings”; to “get more correspondences or meanings than if
we consult a dictionary or use introspection” and to “get information about what
meanings of the source item are most frequent or salient” (Aijmer 2007: 333). This
view is followed in the current study aimed at revealing correspondences between
selected English modal adverbs and their Polish equivalents, as used in argumenta-
tive legal writing.
Turning now to the corpus design, the material used for analysis consists of 30
Opinions of the Advocates General at the European Court of Justice, issued between
2011 and 2013, comprising about 576,000 words. In order to discover subtle mean-
ing distinctions and to arrive at conventionalised and context-bound readings of the
adverbs under scrutiny, I have drawn data from aligned corpora and compared trans-
lations from English into Polish. In other words, I have used a parallel corpus (con-
taining source texts in English and their Polish translations) which is specialised
(representing one legal genre, i.e. Opinions of Advocates General), bilingual
(including English and Polish data) and uni-directional (containing translations
from English into Polish). The English texts of the Opinions were written by a
native speaker of English,6 whereas the Polish texts were translations made by
Polish professionals translating the texts of the Opinions into their mother tongue.
The genre of Opinion was used, since it was believed that this form of legal writ-
ing would be rife with persuasive devices and author visibility marking. This was
attributed to the fact that the Opinions serve primarily “to persuade the Court that
the solution proposed is well founded from a legal point of view and [that] the
court’s rulings should be based on it” and, further, “to persuade the litigants that the
rulings of the Court which follow are based on a thoroughly and justly argued legal
Opinion, and therefore, are the right decisions” (Salmi-Tolonen 2005: 66). As for
the classification of modal adverbs of certainty adopted in the present study, it fol-
lows Simon-Vandenbergen and Aijmer (2007), who distinguish four clusters: epis-
temic (e.g. certainly, definitely, indeed), evidential (e.g. obviously, clearly, plainly),
expectation (e.g. of course, naturally, inevitably) and speech act adverbs (e.g. admit-
tedly, undeniably, indisputably).
6
As there is only one British Advocate General at the ECJ, the Opinions used to compile the corpus
were written by one person. However, this fact appears to have no bearing on the results, since the
focus of the analysis is on the translation process.
98 M. Szczyrbak
Finally, it should also be noted that the range of the data used in the study was
limited and that, therefore, further research is needed for valid generalisations to be
made. Still, despite this limitation, it is believed that they offer insight into the poly-
semy of modal adverbs of certainty and that they can therefore be relevant to future
investigations focusing on other languages, discourses or genres.
4 Results and Discussion
At the outset of the investigation, the most frequent modal adverbs in the English
subcorpus were identified and then their Polish equivalents in the Polish subcorpus
were determined. As a result of the frequency count, the following modal adverbs
were identified as most common: indeed (83 tokens), necessarily (36 tokens), not
necessarily (35 tokens),7of course (35 tokens), clearly (32 tokens)8 and obviously (18
tokens). All the other adverbs which had fewer than 10 occurrences were excluded
from the analysis. For the individual translations and their frequencies, see Table 1.
In the remainder of this section I will illustrate, through examples, the usefulness
of parallel corpora in exploring the polysemy of English modal adverbs of certainty,
assuming that they can provide insight into what might remain unnoticed if only
monolingual corpora were consulted.
4.1 The Case of ‘Indeed’
As shown above, indeed was by far the most frequent modal adverb of certainty in
the corpus. Fourteen different translations of indeed were recognised in the Polish
data and as many as 16 omissions. Overall, indeed was found: (1) to co-occur with
the Concessive relation9; (2) to mark rhetorical emphasis or (3) to operate as a dis-
course marker. A relatively frequent co-occurrence pattern was that of the emphatic
do followed by indeed (9 occurrences), linked to Concession and associated with
acknowledgments. Example (1) below illustrates such an acknowledgment, sig-
nalled with the concessive whilst (choć in Polish), where the stress introduced by
indeed is strengthened by the emphatic do. Here, the arguer concedes that other
parts of the Framework Decision include references to national law, but, at the same
time, she claims that there is no such mention in the excerpt under consideration.
7
For the purpose of the analysis, this category subsumes instances of negation + necessarily (e.g.
not necessarily, cannot necessarily, without necessarily, etc.).
8
In total, there were 78 occurrences of clearly including its non-modal use as an adverb of
manner.
9
Following the convention found in Barth-Weingarten (2003), whenever capitalised, Concession
refers to the discourse-pragmatic relation, but when written with a lower-case letter, it denotes the
interclausal relation.
Table 1 Polish translations of selected English adverbs in the data

No. of
English adverbs tokens Polish translations
INDEED 83 w istocie ( as a matter of fact ) (23)
rzeczywiście (indeed/really) (16)
Ø [omission] (16)
bowiem (for/because) (8)
istotnie (in fact) (4)
faktycznie (actually/in fact) (3)
w rzeczywistości (in reality) (3)
w gruncie rzeczy (essentially) (2)
w rzeczy samej (in fact) (1)
wręcz (downright) (1)
jednak (however) (1)
właśnie (just) (1)
nawet (even) (1)
przyznać należy (admittedly/it must be admitted) (1)
z kolei (in turn) (1)
NECESSARILY 36 Ø [omission] (8)
z konieczności (out of necessity) (6)
koniecznie (necessarily) (5)
w sposób nieunikniony (unavioudably/in an unavoidable
manner) (4)
musieć (have to/must) (4)
w sposób konieczny (necessarily/in a necessary manner)
(2)
w sposób oczywisty (evidently, in an evident manner) (1)
nieuchronnie (inevitably) (1)
w konsekwencji (as a consequence) (1)
automatycznie (automatically) (1)
bezwzględnie (unconditionally) (1)
żeby trzeba było (in order to have to) (1)
NOT NECESSARILY 35 niekoniecznie ( not necessarily ) (20)
nie musi (wcale) (not have to (at all)) (11)
nie zawsze (not always) (1)
może nie być (may not be) (1)
wcale nie (not at all) (1)
Ø [omission] (1)
OF COURSE 35 oczywiście ( of course/obviously ) (30)
naturalnie (naturally) (2)
z pewnością (with certainty) (1)
rzecz jasna (needless to say) (1)
Ø [omission] (1)
(continued)
100 M. Szczyrbak
Table 1 (continued)
No. of
English adverbs tokens Polish translations
CLEARLY 32 w sposób oczywisty/w oczywisty sposób ( evidently, in
an evident manner ) (6)
wyraźnie (plainly/expressly) (5)
oczywiście (of course/obviously) (4)
najwyraźniej (apparently/most obviously) (3)
Ø [omission] (3)
oczywisty (obvious/evident) (2)
bez wątpienia (undoubtedly/without a doubt) (2)
jasno (plainly) (2)
naturalnie (naturally) (1)
właśnie (just) (1)
bezwzględnie (unconditionally) (1)
nie ma wątpliwości (there is no doubt) (1)
w sposób wyraźny (plainly/in an express manner) (1)
OBVIOUSLY 18 oczywiście ( of course/obviously ) (11)
oczywisty (obvious/evident) (2)
Ø [omission] (2)
w oczywisty sposób/w sposób oczywisty (evidently, in an
evident manner) (2)
wprost (simply) (1)
(1)
ENG:
The objective pursued by the Framework Decision has already been
identified: the enforcement of financial penalty decisions through mutual
recognition. (14) The term ‘court having jurisdiction in particular in
criminal matters’ used in Article 1(a)(iii) plays a crucial role in determining
the scope of the Framework Decision, because it defines a category of
financial penalty decision that benefits from mutual recognition and hence
enforcement. Whilst other parts of the Framework Decision do indeed
cross-refer to national law, (15) here there is no such mention.
POL:
Cel decyzji ramowej został już wskazany: wykonywanie orzeczeń
nakazujących uiszczenie kary o charakterze pieniężnym w drodze
wzajemnego uznawania (14). Wyrażenie „sąd właściwy także w sprawach
karnych” zawarte w art. 1 lit. a) pkt iii) odgrywa kluczową rolę przy
określaniu zakresu decyzji ramowej, ponieważ określa ono kategorię
orzeczeń nakazujących uiszczenie kary o charakterze pieniężnym
korzystających z wzajemnego uznawania, a w konsekwencji – wykonania.
Choć w innych przepisach decyzji ramowej rzeczywiście występują
odesłania do prawa krajowego (15), to omawiany przepis ich
nie zawiera. [OAG_7]
Likewise, (2) shows how indeed is deployed in combination with clear to highlight
this part of the argument which is conceded (“that Mrs McCarthy could stay in the
United Kingdom on her own”) and how this acknowledgment is contrasted with the
contested part of the argument (“it is less clear whether the Court considered the
detailed implications”).
(2)
ENG:
0 [IMPLIED CLAIM]
X’ [ACKNOWLEDGMENT]
Whilst it is indeed clear that Mrs. McCarthy could stay in the United
Kingdom on her own by virtue of her nationality and that she was not
being deprived of a right to move under EU law by denying her husband
derived rights as a third country national family member,
Y [COUNTERCLAIM]
it is less clear whether the Court considered the detailed implications.
Perhaps the short answer was simply ‘EU law can’t help: try the ECHR’.
POL:
0 [IMPLIED CLAIM]
O ile rzeczywiście jest bezsporne, że S. McCarthy sama posiadała prawo
pobytu w Zjednoczonym Królestwie z uwagi na swoje obywatelstwo,
jak również że nie pozbawiano jej prawa do przemieszczania się na
gruncie prawa Unii poprzez odmowę jej mężowi prawa pobytu jako
obywatelowi państwa trzeciego będącemu członkiem rodziny,
Y [COUNTERCLAIM]
o tyle jest już mniej oczywiste, czy Trybunał przeprowadził analizę
szczegółowych implikacji. Niewykluczone, że odpowiedzią jest po
prostu: „Prawo Unii nie może nic zdziałać: spróbujcie w Europejskim
Trybunale Praw Człowieka”. [OAG_3]
As can be seen, both in (1) and (2) indeed is translated conventionally as

rzeczywiście (indeed/really), although in the whole corpus w istocie (as a matter of
fact) proved to be a more frequent choice (23 to 16 tokens, respectively), followed
by less common equivalents such as istotnie (in fact), faktycznie (actually/in fact) or
w rzeczywistości (in reality), and a few more translations attested by single occur-
rences (for a complete list see Table 1). As for the rhetorical appeal of the English
and Polish versions, accordingly, it should be noted that in (1) the emphatic do is left
untranslated, since it has no formal counterpart in Polish. Thus, the Polish text
appears less forceful. With regard to the translation of clear in (2), on the other
hand, it may be speculated that the co-occurrence of bezsporne (undisputable/
incontestible) and oczywiste (obvious/evident) creates a more powerful effect than
would be the case if the conventional translation of clear as jasny or oczywisty was
repeated.
102 M. Szczyrbak
A somewhat unexpected discovery was the translation of indeed in the form of

bowiem (for/because), found in the case of sentence-initial occurrences of indeed (8
tokens) and, as can be justifiably claimed, linked to its discourse-organising func-
tion. In fact, this observation corroborates what Simon-Vandenbergen and Aijmer
(2007: 281) say about sentence adverbs, namely that “[w]hen separated from the
rest of the sentence by a pause or comma these peripheral positions signal their
loose connection to the clause and their discourse marker status.” This is illustrated,
for instance, in (3), where indeed is found in sentence-initial position. It can also be
observed that there is no corresponding structural element in the Polish version,
where the power-neutral bowiem is integrated into the sentence instead.
(3)
ENG:
I see no basis for saying that, in such circumstances, the EU citizen
should be required temporarily to sacrifice his right to a family life
(or, put slightly differently, that he should be prepared to pay that price in
order subsequently to be able to rely on EU law as against his own
Member State of nationality). Indeed, under Directive 2004/38, family
members are entitled to accompany the EU citizen immediately to the
host Member State. Directive 2004/38 does not make their entitlement to
that derived right conditional on a minimum residence requirement for
the EU citizen. Rather, the conditions applicable to the dependents vary
with length of residence in the territory.
POL:
Nie widzę żadnych podstaw do twierdzenia, że w takich okolicznościach
od obywatela Unii można wymagać tymczasowego poświęcenia prawa
do życia rodzinnego (albo, ujmując rzecz nieco odmiennie, że powinien
on być przygotowany na zapłatę tej ceny za możliwość powołania się w
terminie późniejszym na prawo Unii względem państwa członkowskiego,
którego obywatelstwo posiada). Zgodnie bowiem z dyrektywą 2004/38
członkowie rodziny uprawnieni są do towarzyszenia obywatelowi Unii
bezpośrednio w państwie członkowskim pochodzenia. Dyrektywa 2004/38
nie uzależnia ich ewentualnego uprawnienia do prawa pochodnego od
wymogu minimalnego pobytu dla obywatela Unii. Przeciwnie, warunki
mające zastosowanie względem osób pozostających na utrzymaniu mogą
różnić się w zależności od długości pobytu na terytorium. [OAG_3]
We may wonder what effect the insertion of bowiem in the excerpt in (3) has on
the interpretation of the relation holding between the sentence with indeed and the
preceding one. In the English excerpt, indeed indicates a kind of sequential relation-
ship between the sentences and it may well be paraphrased as “what is more” (cf.
Aijmer 2007: 332). In addition, it “signals that what follows is not only in agree-
ment with what precedes, but is additional evidence being brought to bear on the
argument” (Traugott and Dasher 2002: 164 quoted in Aijmer 2007: 332). The Polish
translation, however, is not entirely consistent with the source text, since the use of
bowiem in the Polish sentence suggests that according to the author of the Opinion,
EU citizens should not be required to sacrifice their right to family life and that this
fact follows from Directive 2004/38 under which “family members are entitled to
accompany the EU citizen immediately to the host Member State.” In addition, in
the source text, indeed is not used to mark causality; rather, it adds emphasis and has
a discourse marker status. Finally, the authority associated with the English indeed
is no longer detectable in the Polish wording. Interestingly, the cross-checking of
the English correspondences of bowiem found in the Polish data suggests that bow-
iem is sometimes inserted in the Polish translation to mark cohesion, even where
there is no direct equivalent in the English source text.
In a similar vein, omission of indeed in the Polish version of the Opinion seems
to lessen the rhetorical force of the translated text and, potentially, its ability to influ-
ence the reader’s attitude and beliefs. During the analysis, several patterns became
visible. Firstly, it was observed that sentence-initial occurrences of indeed were
sometimes left untranslated (7 tokens), as in (4). It must be admitted, however, that
although these omissions accounted for almost 50% of all omissions of indeed,
sentence-initial uses of this adverb were more frequently rendered in Polish as w
istocie (as a matter of fact) or bowiem (for/because) discussed above.
(4)
ENG:
The objective of those articles was to protect shareholders and creditors
from market behaviour that might reduce a company’s capital and falsely
raise its share price. That objective is not defeated by a company acquiring
its own shares where a legal obligation requires it to do so. Indeed, as the
Portuguese Government and the Commission rightly point out, Article
20(1)(d) specifically permits Member States to allow a company to acquire
shares ‘by virtue of a legal obligation’ without having recourse to the
procedures laid down in Article 19.
POL:
Celem tych artykułów była ochrona akcjonariuszy i wierzycieli przed
zachowaniami rynkowymi, które mogą zmniejszyć kapitał spółki lub
sztucznie podwyższyć cenę akcji spółki. Z celem tym nie jest sprzeczne
nabycie przez spółkę jej akcji w wykonaniu obowiązku przewidzianego
prawem. [OMISSION] Jak trafnie wskazały rząd portugalski i Komisja,
art. 20 ust. 1 lit. d) pozwala państwom członkowskim na nabycie akcji
właśnie „w wykonaniu obowiązków ustawowych”, bez konieczności
stosowania procedur przewidzianych w art. 19. [OAG_5]
Secondly, the strategy of omission was seen also in the case of parenthetical uses
of indeed, most notably in the structures: and indeed, or indeed and though indeed,
as illustrated in (5), in which “the right to impose criminal sanctions” is no longer
emphasised in the Polish text, unlike the English original.
104 M. Szczyrbak
(5)
ENG:
Article 25 merely confirms that the administrative measures and
sanctionsthat it requires Member States to impose are ‘without prejudice
to their civil liability regime[s]’ (or indeed to their right to impose
criminal sanctions).
POL:
Artykuł 25 jedynie potwierdza, że środki i sankcje administracyjne,
których nakładania wymaga on od państw członkowskich, pozostają, bez
uszczerbku dla ich systemu odpowiedzialności cywilnej” (lub
[OMISSION] ich prawa do nakładania sankcji karnych). [OAG_5]
In the Polish version of the Opinion in (6), similarly, there is no equivalent of

indeed, and the rhetorical effect is lost also through the choice of the much weaker
Polish adjective dolegliwe (bothersome) which is a not-so-perfect equivalent of the
much stronger and value-laden English adjective repugnant (with odrażający
(abhorrent) being the conventional translation in Polish).
(6)
ENG:
It follows from the references there to ‘sufficiently serious’, ‘severe
violation’ and ‘accumulation … which is sufficiently severe’ that not
every violation of human rights (repugnant though it indeed may be)
will fall to be considered as an ‘act of persecution’ for the
purposes of Article 9.
POL:
Z zawartych w nim wyrażeń: „wystarczająco poważne”, „poważne
naruszenie” i „kumulacja […] naruszeń […], które są wystarczająco
poważne” wynika, że nie każde naruszenie praw człowieka
(niezależnie od tego, jak [OMISSION] może być dolegliwe) można
uznać za kwalifikujące się jako „akt prześladowania” do celów art.
9 dyrektywy. [OAG_9]
In sum, the contrastive analysis has shown that indeed can adopt different meanings
and that these meanings are not always interchangeable. It was also demonstrated
that during the translation process the rhetorical force of arguments may be affected
due to the omission of this adverb – which itself can be interpreted as another mean-
ing – or through the choice of non-conventional equivalents of the co-occurring
adjectives.
4.2 The Case of ‘(Not) Necessarily’
Let us now turn to (not) necessarily – listed by Simon-Vandenbergen and Aijmer

(2007) among expectation adverbs – which was attested by 71 tokens in the corpus.
For the purpose of the study, necessarily (36 tokens) and negation + necessarily (35
tokens) were listed as two separate categories. In the case of necessarily, there were
eight omissions in the Polish data,10 as well as the following translations: z
konieczności (out of necessity), koniecznie, w sposób nieunikniony (unavoidably/in
an unavoidable manner) and musieć (have to/must). Example (7) illustrates a typi-
cal translation of necessarily as koniecznie, reflecting “[t]he unavoidable nature of
the information” marked by this adverb (Simon-Vandenbergen and Aijmer 2007:
38) and something “necessitated by circumstances,” rather than expressing the writ-
er’s subjective (and gradable) commitment (Simon-Vandenbergen and Aijmer 2007:
188).
(7)
ENG:
I therefore have little difficulty in agreeing with the majority of the
submissions to the Court on this question that Article 5(2)(a) of the
Directive covers only analogue to analogue copying. The word
‘photographic’ necessarily requires optical input of an analogue
original, and the need for paper or a similar output medium means
that the output must also be analogue.
POL:
Z tego względu nie mam wielkich trudności, aby zgodzić się ze
zgłoszonym Trybunałowi w zakresie tego pytania stanowiskiem
większości, zgodnie z którym art. 5 ust. 2 lit. a) dyrektywy obejmuje
tylko kopiowanie „z formatu analogowego na analogowy”. Słowo
„fotograficzna” koniecznie wymaga optycznego wprowadzenia
oryginału w formie analogowej, a potrzeba posłużenia się papierem lub
podobnym nośnikiem wyjściowym oznacza, że etap wyjścia musi
dotyczyć formy analogowej. [OAG_12]
The effect of the omission of necessarily in the translated text, as compared with the
original, can in turn be observed in (8) and (9). Accordingly, the Polish wording in
(8), i.e. “miał on wiedzę” (he was aware), lacks any equivalent unit signalling the
writer’s epistemic stance conveyed in the English text by necessarily,11 similarly to
(9), in which the deontic modalisation expressed by necessarily is no longer present
in the Polish unmodalised statement “a w konsekwencji arbitralny” (and hence
arbitrary).
Remarkably, it was the most frequent translation strategy in the case of this adverb.
10
As pointed out by Simon-Vandenbergen and Aijmer (2007: 188), epistemic uses of necessarily
11
and inevitably are infrequent.

106 M. Szczyrbak
(8)
ENG:
I am also far from certain that he would necessarily have been aware
of the (limited) possibilities of applying to this Court for legal aid.
POL:
Daleka jestem również od pewności, że miał on [OMISSION] wiedzę
na temat (ograniczonych) możliwości zwrócenia się do Trybunału
o pomoc prawną. [OAG_7]
(9)
ENG:
In order to avoid this logical conundrum, most legal residence tests
specify a fixed (and hence necessarily arbitrary) ‘qualifying’ period
of presence before residence is achieved. There is no objective difference,
however, between presence the day before and presence the day after
the magic figure is attained.
POL:
W celu uniknięcia tej łamigłówki logicznej większość kryteriów
prawnych zamieszkania przewiduje określony (a w konsekwencji
[OMISSION] arbitralny) okres „kwalifikacyjny” obecności, zanim
nastąpi zamieszkanie. Nie ma jednak żadnej obiektywnej różnicy
pomiędzy obecnością w dniu poprzedzającym magiczną cezurę a
obecnością w dniu następującym po niej. [OAG_7]
In contrast to the translations of necessarily, less variety was observed in the case
of not necessarily, with 20 instances of the prototypical niekoniecznie, 11 attesta-
tions of nie musieć (not have to) and only one omission. While the translations of
necessarily emphasised inevitability or necessity, Polish renditions of not necessar-
ily revealed the writer’s epistemic stance, as in (10). At this point it might also be
remarked, following Simon-Vandenbergen and Aijmer (2007: 190), that since nega-
tion presupposes its counterpart in the discourse, not necessarily marks the counter-
ing of an expectation based on the writer’s own experience or logical assumptions.
This was clearly reflected by the Polish translations such as, for instance, nie musi
(wcale) (not have to at all) or nie zawsze (not always).
(10)
ENG:
Where one physically resides is a question of fact. However, the place
where a person actually lives or is registered as living may not
necessarily be the place at which a Member State defines, as a matter
of law, that person to have his permanent residence or domicile.
POL:
Fizyczne miejsce zamieszkania jest kwestią z zakresu okoliczności
faktycznych. Jednakże miejsce, w którym dana osoba faktycznie
zamieszkuje lub jest zameldowana jako zamieszkała, może
niekoniecznie być miejscem, które państwo członkowskie określa na
gruncie prawa jako miejsce jej stałego zamieszkania. [OAG_11]
Overall, the study has shown that the modalisation expressed by necessarily was
sometimes lost in the Polish version of the Opinions, even though it must be
acknowledged that in the majority of occurrences Polish equivalents of necessarily
were identified in the text. These translations, as predicted, conveyed the meaning
of external necessity and inevitability. On the other hand, in the case of not neces-
sarily, the translations confirmed the meaning of counterexpectancy and, signifi-
cantly, only one occurrence of this adverb was left untranslated.
4.3 The Case of ‘Of Course’
As shown by earlier studies (see, e.g., Simon-Vandenbergen 1992; Simon-

Vandenbergen and Aijmer 2007; Simon-Vandenbergen et al. 2007), the recruitment
of of course in interaction is clearly linked to politeness and power relations. In the
dataset analysed, more often than not (30 occurrences out of 35), of course was
translated conventionally as oczywiście, with only one omission found in the Polish
subcorpus. Alternative translations included naturalnie (naturally), z pewnością
(with certainty) and rzecz jasna (needless to say). Predictably, of course was found
both in sentence-initial and sentence-medial positions. What resurfaced most visi-
bly in the analysis of this adverb was the co-occurrence with Concessive patterns
and the as-everybody-knows meaning.
Turning to (11), we can see how of course is deployed for alignment with an
opposing view. The adverb is incorporated into a broader argumentative schema, in
which the jurist first provides her preferred argument, that is she counters the propo-
sition advanced earlier by the other party, and then she goes on to voice her partial
agreement (“it is, of course, true that…”) only to return to the standpoint she
expressed in the first move. Unlike high-stake face-to-face encounters (e.g. those of
politicians), in this context, where no immediate response from the interlocutor is
expected, the face-saving role of of course seems less prominent. Yet, this “authori-
tative backgrounding device” (Simon-Vandenbergen and Aijmer 2007: 221) serves
to balance acknowledgments and counterclaims, i.e. preferred and dispreferred
moves. As such, it can be viewed as a politeness and solidarity-building device.
108 M. Szczyrbak
(11)
ENG:
Y [COUNTERCLAIM]
The difficulties associated with preferring a uniform interpretation
over one that defers to national law in its definition of that provision are,
in my view, more theoretical than real.
It is, of course, true that each Member State has its own particular
structure of courts; and that neither this Framework Decision nor any
other has thus far attempted any degree of harmonisation in that field.
Y’ [RETURN TO COUNTERCLAIM]
However, I point out that, from a practical point of view, whether a
‘court having jurisdiction in particular in criminal matters’ is interpreted
as an autonomous concept or interpreted by reference to the law of the
issuing State makes no actual difference to the court in the executing
State. It is still faced with the basic problem that it is (probably)
unfamiliar with the court structure of the issuing State. It may therefore
be unable, without making further enquiries, to satisfy itself whether
ornot the court in the issuing State satisfies that definition.
POL:
Y [COUNTERCLAIM]
Trudności związane z przedkładaniem jednolitej wykładni nad
wykładnią,która odsyła do prawa krajowego w celu zdefiniowania
owego przepisu, mają charakter bardziej teoretyczny niźli rzeczywisty.
Pozostaje oczywiście prawdą, że każde państwo członkowskie
posiada swój własny ustrój sądów, a także że ani niniejsza decyzja
ramowa, ani pozostałe decyzje ramowe dotychczas nie podejmowały
próby dokonania harmonizacji w tym zakresie.
Y’ [RETURN TO COUNTERCLAIM]
Zwracam jednakże uwagę, iż z praktycznego punktu widzenia to,
czy wyrażenie „sąd właściwy także w sprawach karnych” jest
interpretowane jako autonomiczne pojęcie, czy też w drodze odesłania
do prawa państwa wydającego, nie stanowi faktycznie żadnej różnicydla
sądu w państwie wykonującym. Sąd ten dalej stoi przed zasadniczym
problemem (prawdopodobnej) nieznajomości ustroju sądów w państwie
wydającym. Zatem bez zasięgnięcia bardziej szczegółowych informacji
może on nie być w stanie przekonać się, czy sąd w państwie wydającym
spełnia tę definicję. [OAG_7]
The “as-everybody-knows meaning” of of course, on the other hand, is illus-

trated in (12). The beginning of this excerpt can be glossed over as “everybody
knows that the Court is not in a position to decide which interpretation is correct,”
likewise the second use of of course can be reworded as “it goes without saying that
the necessary condition for the request to be treated as valid is that it has been com-
pleted…” Again, of course is used for interpersonal ends and it operates as an
engagement device which shows that the Advocate General – to use Hyland’s (2005)
words – recognises the presence of readers and tries to connect to them and to pull
them along with her argument.
(12)
ENG:
This Court cannot, of course, say which interpretation is correct but it
seems to me that neither would be inconsistent with Articles 7 and 22
of Directive 92/12 – provided, of course, that (i) the request is treated
as valid once it has been completed and (ii) the relevant provisions are
sufficiently clear to ensure that whatever procedure is applied complies
with the requirements of legal certainty.
POL:
Trybunał nie może z pewnością rozstrzygnąć, która interpretacja jest
prawidłowa, lecz moim zdaniem ani pierwsza, ani druga nie są sprzeczne
z art. 7 i 22 dyrektywy 92/12, oczywiście pod warunkiem że i) wniosek
jest uznany za prawidłowy po uzupełnieniu go oraz ii) odpowiednie
przepisy są wystarczająco jasne, aby zapewnić, że niezależnie
odstosowanej procedury odpowiada ona wymogom pewności
prawa. [OAG_14]
Finally, it should be noted that unlike indeed and necessarily in the case of which
“the authorial imprint” was lost in the translation process, of course was almost
always translated and oczywiście was the translators’ preferred choice.12
4.4 The Case of ‘Clearly’
The translations of clearly which were found in the corpus suggested the following
meanings: (1) “obviousness resulting from accessible evidence”, such as in w oczy-
wisty sposób (evidently/in an evident manner) or oczywiście (of course/obviously)
and (2) authority and conviction, as indicated by the translations wyraźnie (plainly/
expressly) and najwyraźniej (apparently/most obviously).13 By analogy to of course,
the adverb clearly was used both sentence-initially and sentence-medially.
12
It is interesting to note that in the case of Swedish, Dutch and German correspondences of of
course, the most frequent translations, as attested by Simon-Vandenbergen and Aijmer (2007: 342-
343), i.e. naturligtvis, natuurlijk and natürlich, respectively, are conventional equivalents of the
English naturally, which suggests that “naturalness” or the fact of being “expected and accepted”
is the most salient meaning of of course. This, however, is not corroborated by the Polish data
analysed here, where only two instances of of course were translated as naturalnie (naturally).
13
On the other hand, the non-modal use of clearly, typical of legalese and linked to explicitness (as
in clearly defined or clearly indicate) was translated as wyraźnie (plainly/expressly) or jasno
(plainly).
110 M. Szczyrbak
As pointed out above, the obviousness indicated by clearly was mirrored by the
Polish translation oczywiście (of course), which is shown in (13) below, whereas
authority and conviction based on accessible evidence were conveyed by w sposób
wyraźny (in a clear manner), as in (14).14 Interestingly enough, although the latter
translation seems to indicate an adverb of manner,15 its sentence-initial occurrence
in English, though not marked off by a comma, excludes this possibility.
(13)
ENG:
I can accept that a measure which reduces the amount of duty payable
on the purchase of a new principal residence is likely to facilitate
moving in general, and that that may include moving closer to one’s
place of work, with the health and environmental benefits attendant
thereon. But that begs the question: why not facilitate, in the same way,
moving into (or out of) the Flemish Region (which would clearly be
beneficial in order to limit cross-border commuting)? The disputed
measure, however, links availability of the offset to sequential purchases
within the Flemish Region.
POL:
Jestem w stanie przyjąć, że środek zmniejszający kwotę opłaty należnej
przy zakupie nowej nieruchomości stanowiącej główne miejsce
zamieszkania może ogólnie ułatwiać przenoszenie się, co może
obejmować przenoszenie się bliżej miejsca pracy danej osoby z
towarzyszącymi temu korzyściami dla zdrowia i środowiska. Rodzi to
jednak pytanie: dlaczego nie ułatwiać w ten sam sposób przenoszenia
się do Regionu Flamandzkiego (lub poza ten region) (co byłoby
oczywiście korzystne dla ograniczenia dojazdów transgranicznych)?
Sporny środek łączy jednakże dostępność możliwości odliczenia z
kolejnymi zakupami w Regionie Flamandzkim. [OAG_29]
(14)
ENG:
Clearly there are points of similarity between the contested measures
in those cases and the present matter: Indeed, the Commission alleges
discrimination and restriction of Treaty freedoms in all three.
POL:
W sposób wyraźny pomiędzy spornymi środkami w tychże sprawach
oraz w obecnej sprawie istnieją elementy podobieństwa: Komisja zarzuca
dyskryminację i ograniczenie swobód traktatowych we wszystkich
trzech sprawach. [OAG_29]
14
Cf. the most common German translations of clearly and of course, that is deutlich and natürlich,
respectively (Simon-Vandenbergen and Aijmer 2007: 331, 343), which indicate the difference
between the two adverbs. In the Polish translations analysed here, this difference is less obvious.
15
Only one such translation was attested by the data.
For illustrative purposes, omissions of clearly in the Polish text are shown in (15)
and (16) below. Again, the absence of the Polish equivalent in the translation results
in the unmodalised statements “it has to look” and “the General Court has,”
respectively.
(15)
ENG:
In order for a national court to do this effectively, it clearly has to
look beyond the wording of the Decree.
POL:
Aby sąd krajowy mógł to skutecznie rozważyć, powinien [OMISSION]
kierować się czymś więcej, niż tylko brzmieniem dekretu. [OAG_24]
(16)
ENG:
In any event, the General Court clearly has ‘full jurisdiction’ for the
purposes of Article 6(1) ECHR (not to be confused with the EU concept
of unlimited jurisdiction to review financial penalties).
POL:
W każdym razie Sąd [OMISSION] posiada „pełne kompetencje
orzecznicze” w rozumieniu art. 6 ust. 1 EKPC (nie należy tego
mylić z unijnym pojęciem „nieograniczonego prawa orzekania” w
zakresie kontroli kar finansowych). [OAG_30]
4.5 The Case of ‘Obviously’
The last adverb to be discussed in this chapter is obviously, which was translated
chiefly as oczywiście (of course). Alternative translations included the adjective
oczywisty (obvious) as well as the adverbials w oczywisty sposób (evidently/in an
evident manner) and wprost (simply). With regard to omissions, only two instances
were found. As for position in the sentence, the adverb occurred mostly medially
and once initially. The core meaning of obviously as borne out by the Polish data
was that of “obviousness,” rather than its evidential status. A point worth noting
here is that in the dataset analysed, both of course and obviously had the same Polish
counterpart, i.e. oczywiście, as its preferred translation (see (12) and (17)). This is in
contrast with what Simon-Vandenbergen and Aijmer (2007: 219–220) say about the
differences between of course and obviously. In their view, although both adverbs
share the backgrounding function, of course means “as everybody knows or should
know” or “according to expectations,” whereas obviously means “as evidence
shows” or “as knowledge of the world shows.” Thus, of course is more forceful and
authority-oriented, while obviously is evidential and does not necessarily imply the
hearer’s knowledge. This, however, was not so manifest in the Polish data, in which
112 M. Szczyrbak
the aspect of “obviousness” stood out, rather than the evidential status of the adverb.
It can be posited then that while English distinguishes between of course and obvi-
ously, linked to “expectation” and “evidence,” accordingly, such a distinction
appears to have less or no relevance in Polish, where only oczywiście is used.
(17)
ENG:
KME asks the Court of Justice to replace the General Court’s appraisal
by KME’s preferred test. Not only is that inadmissible but the General
Court’s appraisal is obviously correct and KME obviously wrong.
POL:
KME zwraca się do Trybunału o zastąpienie oceny dokonanej przez
Sąd preferowanym przez KME kryterium. Jest to nie tylko
niedopuszczalne, ale też ocena Sądu jest oczywiście prawidłowa, podczas
gdy ocena KME jest oczywiście błędna. [OAG_30]
5 Concluding Remarks
As the study of bilingual data has shown, modal adverbs of certainty are polyse-
mous, with more conventional meanings being enriched with ad hoc readings. It is
clearly seen that apart from reflecting the author’s varied degrees of certainty, the
adverbs are used for rhetorical and argumentative purposes, that is to convey autho-
rial stance and to dialogue with alternative standpoints. To be precise, the analysis
confirmed the canonical meanings of indeed, necessarily, not necessarily and of
course, as reflected in the Polish translations w istocie, z konieczności, niekoniec-
znie and oczywiście, respectively. At the same time, somewhat surprisingly, it was
established that indeed was translated non-conventionally as bowiem (for/because),
which did not fit under the general meaning of this adverb. It was also remarkable
to observe that sentence-initial and parenthetical occurrences of indeed were often
left untranslated. Similarly, omission was the most common strategy in the case of
necessarily. Not necessarily, conversely, was almost always retained in the transla-
tion, and so was of course, performing the role of a backgrounding device or a soli-
darity marker. As expected, both of course and indeed were found in Concessive
contexts, in which they prefaced disagreement. Obviousness, in turn, seemed salient
in the case of clearly, which conveyed authority and conviction, too. At the same
time, it was noted, the distinction between of course and clearly appeared to be less
visible than was the case in the translations into other languages.16 Finally, the
16
Cf. Simon-Vandenbergen and Aijmer (2007).
Polish translations of obviously pointed to obviousness as its core meaning, rather

than the evidential status of this adverb. In addition, the meanings conveyed by of
course and obviously were expressed chiefly by one Polish adverb, that is oczywiście,
which blurred the difference between the as-everybody-knows meaning and the as-
evidence-shows reading palpable in English.
Another point to note is that the degree of persuasiveness of the original Opinions
and that of their Polish translations was not always the same. That is to say, it could
be felt that the argumentative appeal of the Polish texts was often weaker than in the
original opinions. For instance, it was observed that the author’s presence and per-
sonal authority, which were tangible in the English texts thanks to the use of the
emphatic and authority-oriented necessarily and indeed (additionally strengthened
by the co-occurring adjectives), were often absent in the Polish version. What is
more, no differentiation was made in Polish between of course and obviously; there-
fore, authority and power linked to of course were no longer to be seen in the trans-
lated text. On the other hand, with regard to of course and not necessarily, where the
number of omissions was the lowest, the rhetorical effect in English and in Polish
appeared to be the same. Finally, it can be argued that the virtual non-occurrence of
certainly and no doubt (with only 8 tokens each) was a meaningful absence, since
the deployment of these adverbs might justifiably be expected in authority-based
legal writing. This can, however, be attributed to the fact that although (most) cer-
tainly and no doubt are theoretically high certainty markers, speakers and writers
often use them when their views are challenged and when they, in fact, are not cer-
tain at all.
To conclude, although the study presented here does not make any claim to being
exhaustive, it sheds light on how modal adverbs of certainty are deployed for argu-
mentative purposes in English and in Polish. Naturally, to establish sound corre-
spondences between English modal adverbs and their equivalents in other languages,
and for the findings to be generalisable, aligned corpora representing more lan-
guages would have to be analysed and the translations would best be investigated
bi-directionally (e.g. from English into Polish and from Polish into English).17
Further still, to see if any of the identified uses are typical of the legal context, data
drawn from various legal genres would have to be confronted with data taken from
a balanced reference corpus comprising various text types and communicative con-
texts. What can, however, be seen already at this stage of the analysis is that even
though all the legal texts composing the acquis communautaire and written in the
official languages of the EU Member States are regarded as “equal” and authentic,
there are noticeable differences between the author visibility levels and the rhetori-
cal force of the original English texts and that of their translated Polish
equivalents.
17
Conveniently, in the case of EU legal discourse, multilingual corpora representing the official
languages of the EU Member States are freely available to an analyst.
114 M. Szczyrbak
References
Adams, H., & Quintana-Toledo, E. (2013). Adverbial stance marking in the introduction and conclu-
sion sections of legal research articles. Revista de Lingüística y Lenguas Aplicadas, 8, 13–22.
Aijmer, K. (2007). Modal adverbs as discourse markers: A bilingual approach to the study of
indeed. In J. Rehbein, C. Hohenstein, & L. Pietsch (Eds.), Connectivity in grammar and dis-
course (pp. 329–344). Amsterdam/Philadelphia: John Benjamins.
Baker, M. (1999). The role of corpora in investigating the linguistic behaviour of professional
translators. International Journal of Corpus Linguistics, 4, 281–298.
Bakhtin, M. M. (1981). The dialogic imagination. In M. Holquist (Ed.), Four essays by
M.M. Bakhtin (C. Emerson & M. Holquist, Trans.). Austin: University of Texas Press.
Barth-Weingarten, D. (2003). Concession in spoken English. On the realisation of a discourse-
pragmatic relation. Tübingen: Narr.
Biber, D., & Finegan, E. (1988). Adverbial stance types in English. Discourse Processes, 11, 1–34.
Biber, D., & Finegan, E. (1989). Styles of stance in English: Lexical and grammatical marking of
evidentiality and affect. Text, 9, 93–125.
Biber, D., et al. (1999). Longman grammar of spoken and written English. Harlow: Longman.
Chafe, W. L. (1986). Evidentiality in English conversation and academic writing. In W. L. Chafe
& J. Nichols (Eds.), Evidentiality: The linguistic coding of epistemology (pp. 261–272).
Norwood: Ablex.
Chindamo, M., Allwood, J., & Ahlsén, E. (2012). Some suggestions for the study of stance in com-
munication. 2012 ASE/IEEE International Conference on Social Computing and 2012 ASE/
IEEE International Conference on Privacy, Security, Risk and Trust (pp. 617–622).
Couper-Kuhlen, E., & Thompson, S. A. (2000). Concessive patterns in conversation. In E. Couper-
Kuhlen & B. Kortmann (Eds.), Cause, condition, concession, contrast: Cognitive and dis-
course perspectives (pp. 381–410). Berlin/New York: Mouton de Gruyter.
du Bois, J. W. (2007). The stance triangle. In R. Englebretson (Ed.), Stancetaking in discourse:
Subjectivity, evaluation, interaction (pp. 139–182). Amsterdam/Philadelphia: John Benjamins.
Downing, A. (2009). Surely as a marker of dominance and entitlement in the crime fiction of
P.D. James. Brno Studies in English, 35(2), 79–92.
Geeraerts, D. (2006). Prospects and problems of prototype theory. In D. Geeraerts (Ed.), Cognitive
linguistics: Basic readings (pp. 141–167). Berlin/New York: Mouton de Gruyter.
Grisot, C., & Moeschler, J. (2014). How do empirical methods interact with theoretical pragmat-
ics? The conceptual and procedural contents of the English Simple Past and its translation into
French. In J. Romero-Trillo (Ed.), Yearbook of Corpus Linguistics and Pragmatics 2014: New
empirical and theoretical paradigms (pp. 7–33). Dordrecht: Springer.
Haddington, P. (2007). Positioning and alignment as activities of stancetaking in news interviews.
In R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction
(pp. 283–317). Amsterdam/Philadelphia: John Benjamins.
Hoye, L. (1997). Adverbs and modality in English. Essex: Longman.
Huddleston, R. D., & Pullum, G. K. (2002). The Cambridge grammar of the English language.
Hunston, S. (2002). Corpora in applied linguistics. Cambridge: Cambridge University Press.
Hyland, K. (1994). Hedging in academic writing and EAP textbooks. English for Specific Purposes,
13, 239–256.
Hyland, K. (2005). Stance and engagement: A model of interaction in academic discourse.
Discourse Studies, 7(2), 173–192.
Kärkkäinen, E. (2007). The role of ‘I guess’ in conversational stancetaking. In R. Englebretson
(Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction (pp. 183–219).
Amsterdam/Philadelphia: John Benjamins.
Kärkkäinen, E. (2012). On digressing with a stance and not seeking a recipient response. In
E. Kärkkäinen & John du Bois (Eds.), Stance, affect, and intersubjectivity in interaction:
Sequential and dialogic perspectives. Special issue of Text and Talk, 32(4), 477–502.
Keisanen, T. (2006). Patterns of stance staking: Negative yes/no interrogatives and tag questions in
American English conversation. Acta Universitatis Ouluensis, B71. Oulu: Oulu University
Press. http://urn.fi/urn:isbn:9514280393. Accessed 14 Jan 2014.
Keisanen, T. (2007). Stancetaking as an interactional activity: Challenging the prior speaker. In
R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interaction
(pp. 253–281). Amsterdam/Philadelphia: John Benjamins.
McEnery, A., & Xiao, Z. (2007). Parallel and comparable corpora: What are they up to? In
G. Anderman & M. Rogers (Eds.), Incorporating corpora: Translation and the linguist.
Translating Europe (pp. 18–31). Clevedon: Multilingual Matters.
Niemelä, M. (2011). Resonance in storytelling: Verbal, prosodic and embodied practices of stance
taking. Acta Universitatis Ouluensis, B95. Oulu: Oulu University Press. http://urn.fi/
urn:isbn:9789514294174. Accessed 14 Jan 2014.
Precht, K. (2003). Stance moods in spoken English: Evidentiality and affect in British and
American conversation. Text, 23(2), 239–257.
Quirk, R., Greenbaum, S., Leech, G., & Svartik, J. (1985). A comprehensive grammar of the
Rauniomaa, M. (2007). Stance markers in spoken Finnish: minum mielestä and minusta in assess-
ments. In R. Englebretson (Ed.), Stancetaking in discourse: Subjectivity, evaluation, interac-
tion (pp. 221–252). Amsterdam/Philadelphia: John Benjamins.
Rauniomaa, M. (2008). Recovery through repetition. Returning to prior talk and taking a stance in
American-English and Finnish conversations. Acta Universitatis Ouluensis, B85. Oulu: Oulu
University Press. http://urn.fi/urn:isbn:9789514289248. Accessed 14 Jan 2014.
Salager-Meyer, F. (1995). I think that perhaps you should: A study of hedges in written scientific
discourse. Journal of TESOL France, 2, 127–143.
Salmi-Tolonen, T. (2005). Persuasion in judicial argumentation: The opinions of the Advocates
General at the European Court of Justice. In H. Halmari & T. Virtanen (Eds.), Persuasion
across genres. A linguistic approach (pp. 59–101). Amsterdam/Philadelphia: John Benjamins.
Simon-Vandenbergen, A.-M. (1992). The interactional utility of of course in spoken discourse.
Occasional Papers in Systemic Linguistics, 6, 213–226.
Simon-Vandenbergen, A.-M., & Aijmer, K. (2007). The semantic field of modal certainty. A
corpus-based study of English adverbs. Berlin/New York: Mouton de Gruyter.
Simon-Vandenbergen, A.-M., White, P., & Aijmer, K. (2007). Presupposition and ‘taking-for-
granted’ in mass communicated political argument. An illustration from British, Flemish and
Swedish political colloquy. In A. Fetzer & G. E. Lauerbach (Eds.), Political discourse in the
media (pp. 31–74). Amsterdam/Philadelphia: John Benjamins.
Szczyrbak, M. (2014). Of course, indeed or clearly? The interactional potential of modal adverbs
in legal genres. SKASE Journal of Theoretical Linguistics , 11(2), 90–102. http://www.skase.
sk/Volumes/JTL26/pdf_doc/05.pdf. Accessed 20 Jan 2015.
Traugott, E. C., & Dasher, R. B. (2002). Regularity in semantic change. Cambridge: Cambridge
University Press.
Traugott, E. C. (2010). Dialogic contexts as motivations for syntactic change. In R. A. Cloutier,
A. M. Hamilton-Brehm, & W. A. Kretzschmar (Eds.), Variation and change in English gram-
mar and lexicon (pp. 11–27). Berlin/New York: Mouton de Gruyter.
Tseronis, A. (2009). Qualifying standpoints. Stance adverbs as a presentational device for manag-
ing the burden of proof. Utrecht: LOT.
White, P. (2003). Beyond modality and hedging: A dialogic view of the language of intersubjective
stance. Text, 23(2), 259–284.
Primary Sources
Opinions of Advocates General at the European Court of Justice. Downloaded from: http://curia.
europa.eu/jcms/jcms/j_6/. Accessed 10 Jan 2014.
Part II
Contrastive Analysis with Comparable
Corpora
Adverbial Clauses in English and Norwegian
Fiction and News
Hilde Hasselgård
Abstract This paper considers the placement of adverbial clauses in English and
Norwegian with regard to their form, meaning, information status and semantic
relation to the matrix clause proposition. The study is based on comparable original
texts in both languages, representing two registers: fiction and news reportage. End
position of adverbial clauses is most common in both languages, with initial posi-
tion as an alternative in many cases. Positional freedom is found to differ greatly
between finite and non-finite clauses, and also across different semantic types of
adverbial clauses. For those types of adverbial clauses that vary across positions,
mostly time and contingency clauses, information status (new vs. anchored) is
found to have some influence. Iconic order was found to be less important, but was
more noticeable in fiction than in news. The placement of adverbial clauses seems
to be guided by similar principles in both languages. Register differences are identi-
fied in both languages, but they do not show consistent patterns.
Keywords Adverbial clauses • Information structure • Iconic word order • Adverbial

placement • Register • English • Norwegian
1 Introduction
This study presents a comparison of adverbial clause placement in English and

Norwegian with regard to form, meaning, information structure, text strategy and
register. The syntax of both languages allows an adverbial clause to occur before or
after the matrix clause, and more marginally, clause-medially within the matrix.
However, similar syntactic possibilities do not automatically entail similar patterns
of use: the languages may exploit the possibilities in different ways due to differ-
ences in their preferred syntactic and discourse-organizational patterns. One aim is
therefore to find out whether the principles governing the placement of adverbial
H. Hasselgård (*)
Department of Literature, Area Studies and European Languages, University of Oslo,
Oslo, Norway
e-mail: hilde.hasselgard@ilos.uio.no

DOI 10.1007/978-3-319-54556-1_6
120 H. Hasselgård
clauses differ between English and Norwegian. Previous cross-linguistic studies of

adverbial clauses have found that their placement varies across languages, see e.g.
Diessel (2001) and Hetterle (2015), but neither of these studies includes Norwegian.
Secondly, since grammar has been shown to vary across registers (e.g. Biber et al.
1999; Diessel 2005), another research question is whether the placement of adver-
bial clauses is different between registers. The study is primarily based on compa-
rable corpora of original fiction and news texts in English and Norwegian; see Sect.
3. The analysis involves a number of factors believed to have an impact on adverbial
clause placement, both within and across registers and languages. These include the
form and meaning of the adverbial clause; e.g. whether it is finite or non-finite and
whether it denotes time, contingency, manner, etc. At the level of discourse, the
investigation concerns the extent to which the placement of adverbial clauses is
affected by information structure and finally whether the ordering of the adverbial
clause and the matrix can be said to reflect the text-strategic principle of experiential
iconicism (e.g. Enkvist 1981).
The article is structured as follows: after an overview of basic definitions (Sect.
2), relevant previous research on the topic (Sect. 3) and the material and method of
the study (Sect. 4), the main analysis is presented in Sects. 5, 6 and 7. Section 5 is
concerned with lexicogrammatical features of adverbial clauses and adverbial
placement at sentence level. Particular attention is given to placement in relation to
the syntactic realization of adverbial clauses (Sect. 5.2) and the semantic type of
adverbial clause (Sect. 5.3). Section 6 discusses adverbial clause placement in rela-
tion to information structure, Sect. 7 looks into the role of experiential iconicism in
the ordering of clauses, and Sect. 8 offers a summary of findings and some conclud-
ing remarks.
2 Some Basic Definitions
Adverbial clauses are defined by Hetterle (2015: 2) as “clausal entities that modify,
in a very general sense, a verb phrase or a main clause and explicitly expresses a
conceptual-semantic concept such as simultaneity, anteriority, posteriority,
causality or conditionality”. In more traditional terms they are subordinate
clauses, finite and non-finite, which have the function of (adjunct) adverbial in a
matrix clause (e.g. Biber et al. 1999: 194). Finite adverbial clauses in both English
and Norwegian are typically marked by “a subordinator indicating the relationship to
the main clause” (ibid.; see also Faarlund et al. 1997: 800). English non-finite adver-
bial clauses include infinitive clauses, participle clauses (−ing and -ed) and verbless
clauses, as well as a category of ‘prepositional clauses’, i.e. a clause governed by a
preposition.1 Norwegian non-finite clauses in the present material are invariably
1
The reason for regarding such constructions as clauses rather than phrases is that they invariably
contain a proposition and are also clause-like in their positional preferences; see Hasselgård (2010:
37).
Adverbial Clauses in English and Norwegian Fiction and News 121
prepositional clauses, typically an infinitive governed by a preposition as in (1).

English prepositional clauses typically contain -ing participles, as shown in the idi-
omatic translation in (1). Norwegian prepositional clauses, unlike English ones, can
also be finite as illustrated in (2)2; hence they are divided into prepositional non-
finites and prepositional finites.
(1) Vi satt uten å si noe. (LSC2)

Lit: “We sat without to say anything”.
We sat there without saying anything. (LSC2T)
(2) Til tross for at han hadde drukket konjakk, ønsket han å være usynlig. (OEL1)
Lit: “In spite of that he had drunk cognac, wished he to be invisible”
In spite of the cognac he wanted to be invisible. (OEL1T)
The adverbial clauses are furthermore classified semantically into the following cat-
egories: time, space, manner, contingency, respect, and comparison; see Hasselgård
(2010: 39). Contingency clauses comprise adjuncts of condition, concession, cause
and purpose (ibid.); see examples (6)–(11) in Sect. 5.3.
Adverbial positions are classified as in Biber et al. (1999: 771) and Hasselgård
(2010: 41 ff), into initial, medial and end position. Initial is the position before the
matrix clause, as in (2) and (3); medial position is after the subject, but before any
postverbal obligatory element of the matrix clause, as in (4); and end position is
after the matrix clause, as in (1) and (5). The same positions are identified for both
languages. For definitions concerning information structure and text strategy, see
Sects. 6 and 7.
(3) Unless something’s done about her she’ll end up like her mother.
(ICE-GB: W2F)
(4) A 19th-century ornithologist, Robert Gray, when visiting the island in the
1860s, described an occasion on Ailsa Craig when he disturbed the puffin
population. (ICE-GB: W2C)
(5) Josh nodded before straightening up away from the gate. (ICE-GB: W2F)
2
Examples (1) and (2) come from the English-Norwegian Parallel Corpus (ENPC). In ENPC
examples the original is given first. Norwegian examples are followed by a word-for-word transla-
tion, while the published (idiomatic) translation is followed by a tag ending in -T.
122 H. Hasselgård
3 P
revious Work on Adverbial Clauses in English
and Norwegian
A number of studies, including reference grammars, describe aspects of adverbial

clause placement in English. The literature on Norwegian is more scant, as are con-
trastive studies of English and Norwegian. This section gives a brief overview of
some relevant studies.
Biber et al. (1999: 830 ff) observe that English adverbial clauses are common in
both initial and end position, while medial position is a highly marked choice. End
position is identified as the unmarked position for all types of non-finite adverbial
clauses (ibid.: 831). Most types of finite adverbial clauses also prefer end position,
but display more variation than non-finite clauses, depending on semantic type. The
factors believed to influence the placement of adverbial clauses are cohesion, infor-
mation structuring, discourse framing, and structural considerations (i.e. “the length
of the adverbial clause and whether or not it is located within another dependent
clause”, p. 837). Initial finite clauses are said to frequently contain given informa-
tion and precede a matrix clause containing new information (p. 835). Adverbial
clauses with new information, on the other hand, “tend to be in final position”
(ibid.).
The order of constituents – including clausal ones – in a sentence may be influ-
enced by what Enkvist (1981) refers to as “experiential iconicism”, a text strategy
whose purpose is “to make the text mimetic of experience” (1981: 101). In other
words, the order of clauses, according to the principle of experiential iconicism, can
mimic the order of events in the world, and is therefore also referred to as “natural
order”.
Altenberg (1987), in a study of causal ordering strategies, finds that causal
clauses are rare in initial position, particularly in conversation. Instead the cause-
effect (or cause-result) order is typically realised by a main clause where causality
is left implicit and a following clause starting with so (1987: 56). Thus, “natural
ordering rarely affects because-clauses” (ibid.: 58).
Ford (1993) studies temporal and conditional clauses in conversation. She argues
that conditional clauses are frequent in initial position due to their discourse-
organizational work. She points out that adverbial clauses in initial and end position
have different textual functions. Temporal clauses in initial position signal back-
grounded information. On the other hand, “final adverbial clauses specify main
clause meaning, but do not participate in information patterning strategies as do
initial adverbials” (1993: 134).
Diessel (2005) studies conditional, temporal and causal clauses in three registers
of English: conversation, fiction and science. He finds the semantic types of clauses
to have different positional tendencies, with conditionals being initial in the major-
ity of cases (67–73%), while the other two are sentence-final in the majority of cases
(34–41% of temporal clauses are initial, and 1–27% of causal clauses). The registers
also differ markedly, especially with regard to the position of causal clauses, which
are most frequent in clause-initial position in science texts and least in conversation
(see Diessel 2005: 454). Diessel argues that the placement of adverbial clauses is
governed by three competing forces: processing (which favours final placement
(ibid.: 459)), semantics (acknowledging that “different semantic types of adverbial
clauses differ in their distribution”, (ibid.: 465)), and discourse-pragmatic factors,
including information structure and iconicity, which can often explain the choice of
initial position (ibid.). In a follow-up study, Diessel (2008) looks specifically at how
the placement of temporal clauses introduced by when, after, before, once, and until
may be determined by “iconicity of sequence” (passim). Iconicity is found to have
a “strong and consistent effect on the linear structuring of complex sentences with
temporal adverbial clauses” (2008: 483), but this factor is more clearly visible with
initial than with final adverbial clauses. The placement of an adverbial clause is also
found to be influenced by its length relative to the main clause as well as by the
conjunction introducing it (ibid.: 484).
Thompson et al. (2007: 271 ff) discuss initial adverbial clauses as a means of
cohesion both within and across paragraphs. In both cases an initial adverbial clause
is cohesive by means of back-reference to the previous sentence or paragraph.
However, initial adverbial clauses are also said to be “bidirectional, linking what has
gone before to what is to come” (2007: 296). Conversely, the information encoded
in a postposed adverbial clause “may be significant, closely parallel to that encoded
in clauses in coordination”, and an adverbial clause in end position may even “con-
vey globally crucial information and mark a turning point or peak” (ibid.).
Hasselgård (2010) studies adjunct adverbials in general, and makes particular
note of adjuncts realized by clauses. 74% of adverbial clauses are found in end posi-
tion, 24% in initial position, and 2% in medial position (2010: 87). The semantic
type to occur most frequently in initial position is contingency (ibid.), followed at a
distance by time. However, the same two categories are also the most common ones
in end position (ibid.: 136), reflecting that time and contingency are the most com-
mon meanings conveyed by adverbial clauses overall. It is suggested that
adverbial clauses are placed initially if they do one or more of the following discourse jobs:
(i) provide a setting / frame of reference for the following clause(s); (ii) provide a relevant
and/or necessary restriction on the validity of the matrix clause proposition; (iii) provide a
link to the preceding discourse by means of given information or cohesive devices (2010:
91).
Wichmann and Kertz (2013) study the placement of concessive clauses in

English in relation to formal and discourse-pragmatic variables as well as process-
ing factors. They conclude that “the two strongest predictors in our study, are
semantic or discourse organizational in nature” (2013: 19); in particular, an adver-
bial clause is likely to be fronted if it contains an anaphoric item (ibid.: 20).
Processing factors, though significant, are found to play only a subsidiary role when
weighed against other variables.
No studies have been found that explicitly contrast initial and end position of
adverbial clauses in Norwegian. However, the Norwegian reference grammar
(Faarlund et al. 1997) remarks on the typical positions of some adverbial clause
types, identified by the conjunction used to introduce them. Causal clauses intro-
124 H. Hasselgård
duced by fordi (‘because’) are said to occur in initial position if they convey presup-
posed information and in end position if they convey new information (1997: 1036).
Purpose clauses introduced by slik at (‘so that’) are typically in end position while
those introduced by for at (‘for that’) can vary between the positions (ibid.: 1040 f).
No positional tendency is noted for conditional clauses in general, but it is claimed
that conditionals marked by inversion rather than a subjunction are always initial
(ibid.: 1046).
Fossestøl (1980: 280 ff) discusses the relationship between the temporal sequence
of events and the linear sequence of clauses, noting that adverbial clauses with fordi
(‘because’) tend to be sentence final, thus reversing the temporal sequence of the
cause and consequence. However, he does not offer a detailed study of adverbial
clause placement, but simply puts forward some principles of text organization.
Meier (2001) is a contrastive study of causal subordination in English and
Norwegian based on the English-Norwegian Parallel Corpus (ENPC). Meier found
that clauses introduced by because and its closest Norwegian counterpart fordi are
typically found in end position while clauses introduced by other causal subordina-
tors (English as, since and Norwegian siden, ettersom) are more likely to occur in
initial position. This is linked to the information typically conveyed by such clauses
as well as the range of pragmatic functions typically served.
Hasselgård (2014a) investigates the discourse functions of initial adjunct adver-
bials in English and Norwegian, based on the same material as the present study
(see Sect. 4). Initial adjuncts are found to be more frequent in Norwegian than in
English, partly as a consequence of a generally higher frequency of adjuncts. Initial
placement of adjuncts seems to be less marked in Norwegian, and initial adjuncts
are commonly used for discourse linking.
Hasselgård (2014b) studies conditional clauses in English and Norwegian on the
basis of the non-fiction part of the ENPC. Conditionals are most frequently found in
initial position in both languages, but in original texts, end position is more common
in Norwegian than in English. This is linked to the division of conditionals into
open, hypothetical and pragmatic (p. 192 f.): in particular, open conditionals are
more frequently sentence-final in Norwegian than in English. The similarity between
the languages is, however, extensive enough for the position of the conditional
clause to be changed very rarely in translation between the languages (p. 198).
Diessel (2001: 433 f), in a typologically oriented study, argues that the placement
of adverbial clauses in languages that use both initial and final position varies with
the meaning and function of the clauses, and to some extent with the choice of sub-
ordinator. Hetterle (2015: 121–127) makes similar observations on the positions of
adverbial clauses in a number of languages (not including Norwegian or other
Scandinavian languages).
As will have been noted, all the studies point to variation in adverbial clause
placement according to the semantic type of adverbial clause, information structure,
and discourse coherence. While English adverbial clauses have been extensively
studied and fairly well described, the contribution of the present study will be the
language comparison and the results for Norwegian.
Based on the previous studies, the following findings can be expected for the
present one:
• Initial placement of adverbial clauses will be more frequent in Norwegian than
in English, partly as a consequence of an overall higher frequency of adverbial
clauses, and partly because of different positional preferences between the lan-
guages (Hasselgård 2014a, b).
• News will use initial position more often than fiction (Hasselgård 2014a).
• Different syntactic types of clauses will have different positional preferences
(Diessel 2005; Hasselgård 2010). In particular, non-finite clauses will have less
freedom of position. The preferences may vary between languages and
registers.
• Different semantic types of clauses will have different positional preferences
(Diessel 2005, 2008; Hasselgård 2010). For example, conditional and causal
clauses will prefer initial and end position, respectively (Hasselgård 2014b;
Altenberg 1987). The preferences may vary between languages and registers.
• Adverbial clauses containing given information are more likely to be sentence-
initial; those containing new information are more likely to be sentence-final
(Ford and Thompson 1986; Diessel 2005; Hasselgård 2010).
• Experiential iconicism/iconic order (Enkvist 1981; Diessel 2008) is likely to
influence the order of subordinate and matrix clause with the possible exception
of causal clauses introduced by because/fordi; cf. Fossestøl (1980) Altenberg
(1987), and Meier (2001).
4 Material and Method for the Current Study
The English material has been culled from the British component of the International
Corpus of English (ICE-GB), and is a subset of the material used for the study of
adjunct adverbials in Hasselgård (2010). The Norwegian fiction texts come from the
English-Norwegian Parallel Corpus (ENPC), while the Norwegian news texts are a
collection of news articles sampled from various online newspapers in March 2011;
see Table 1 and the list provided in the references section for details. The adverbial
clauses were extracted and analysed manually. A subset was used for the case stud-
ies of information structure and experiential iconicism (Sects. 6 and 7). Table 1 also
shows the frequency of adverbial clauses per 10,000 words, which gives an
Table 1 Corpus composition

N of adverbial Adv clauses per
Corpus Source Words clauses 10,000 words
English fiction ICE-GB 10,000 146 146
English news ICE-GB 10,000 118 118
Norwegian fiction ENPC 24,800 229 92.3
Norwegian news Online newspapers 11,000 156 141.8
126 H. Hasselgård
30
25.4
25 21.4
20 18.1 18
14.8
15 Mean sentence length
10.1 10.3 9.6
10 Adv clauses per 100 sentences
5
0
English fiction English news Norwegian Norwegian
fiction news
Fig. 1 Mean sentence length and frequency of adverbial clauses per 100 sentences across lan-
guages and registers
indication that in English such clauses are more frequent in fiction, whereas in
Norwegian they are more frequent in news.
However, as occurrence per number of words is not an ideal measure for the fre-
quency of adverbial clauses, the number of adverbial clauses per 100 orthographic
sentences was also calculated. The number of sentences in the Norwegian material
was calculated with WordSmith Tools (Scott 2014), while for the ICE-GB texts the
number of ‘text units’ given for each corpus text was used. The mean sentence length
is practically identical between English and Norwegian, but the registers differ in both
languages, with sentences being almost twice as long in news as in fiction (see Fig. 1).
This indicates that sentence complexity is greater in news, which correlates with a
markedly higher frequency of adverbial clauses per 100 sentences in news than in fic-
tion, as shown by Fig. 1. Frequencies per 100 sentences highlight similarities between
the languages and differences between the registers, and thus give a different picture
than the calculation per 10,000 words reported in Table 1: in terms of frequency per
sentence Norwegian fiction has fewer adverbial clauses than English fiction, while
Norwegian news has more than English news. It should be noted, however, that the
opportunity of occurrence for adverbials is not the sentence, but the clause; thus fre-
quency per sentence is not an ideal measure either. The quantitative findings of this
study will therefore mainly be given in terms of raw frequencies or proportional dis-
tribution of adverbial clauses across positions within each subcorpus.
5 Positions, Forms and Meanings of Adverbial Clauses
5.1 The Placement of Adverbial Clauses
The hypothesis that Norwegian will use initial position more often than English is
at best only partially confirmed, as shown in Table 2: initial position is proportion-
ally more frequent in Norwegian fiction than in English fiction, but for news, the
opposite is the case. However, Fisher’s exact test shows that the cross-linguistic
Table 2 Frequency of positions of adverbial clauses in English and Norwegian fiction and news
E fiction E news N fiction N news
N % N % N % N %
Initial 27 18.5 27 22.9 57 24.8 33 21.2
Medial 1 0.7 2 1.7 2 0.9 0 0
End 118 80.8 89 75.4 171 74.3 123 78.9
Total 146 100 118 100 230 100 156 100
E/I ratio 5.4 4.4 4.0 3.7
25
20
15
initial
10
end
5
0
E fiction E news N fiction N news
Fig. 2 The percentage of sentences in each subcorpus that contain an initial or final adverbial
clause
difference is not statistically significant for either register.3 Table 2 may indicate that
the hypothesis of a (proportionally) more frequent use of initial position in news
than in fiction is correct for English, but not for Norwegian, though the apparently
different distribution of initial vs. end position between fiction and news is found to
be not statistically significant in both languages.
Figure 2 gives a different perspective on the frequencies, which alters the picture
to some degree. The figure shows the percentage of sentences in each subcorpus (cf.
Fig. 1) that contain an adverbial clause in initial and end position, respectively.
From this perspective, initial adverbial clauses are more frequent in news in both
languages, but so, it must be noted, are adverbial clauses in end position. For the
present, I will not pursue the calculations per sentence any further.
The findings presented here are inconclusive with regard to the hypotheses pre-
sented above. There is a higher frequency of initial adverbial clauses in news than
in fiction in both languages, but as this is matched by a higher frequency of final
clauses, the percentage of clauses in initial position is greater in news than in fiction
only in English. Contrary to expectation, Table 1 and Figs. 1 and 2 show adverbial
clauses to be less frequent in Norwegian than in English fiction. However, none of
the frequency differences observed between languages and registers have proved to
be statistically significant.
The calculation took only initial and end position into account.
3
128 H. Hasselgård
5.2 Placement and Syntactic Realization of Adverbial Clauses
It was predicted that non-finite clauses would have less positional freedom than
finite ones, and findings support this hypothesis. Figure 3 shows the percentage of
initial position for finite and non-finite clauses across languages and registers. Non-
finite clauses are consistently less frequent in initial position than finite ones across
languages and registers. The register difference is greater in Norwegian than in
English as regards non-finite clauses, but it is smaller for finite clauses.
The raw frequencies underlying Fig. 3 are shown in Table 3. The differences
between finite and non-finite clause placement are consistent across the material
and across different types of non-finite clauses. That is, the overwhelming prefer-
ence of non-finite clauses in both languages and both registers is for end position.
The differences in distribution between initial and end position are statistically sig-
nificant across the material, at p < 0.01 for English fiction and Norwegian news and
p < 0.0001 for English news and Norwegian fiction. Interestingly, prepositional
finites show the same tendency as non-finites: only one out of 14 such clauses in the
Norwegian material was found in initial position.
40
35.7
35 30.6
30 27.3
25
25
20 Finite
15 Non-finite
10 6.5
4.5 4.4
5 2.3
0
English fiction English news Norwegian fiction Norwegian news
Fig. 3 The percentage of clauses occurring in initial position (in contrast to end position)
Table 3 Positions of finite and non-finite adverbial clauses (raw frequencies)

Initial End Initial End Initial End Initial End
Finite clause 25 75 25 45 55 121 30 73
Prepositional finite 0 0 0 0 1 6 0 7
Non-finite clause 1 35 2 31 0 0 0 3
Prepositional non-finite 1 7 0 11 1 43 3 40
Verbless clause 0 0 0 1 0 0 0 0
27 117 27 88 57 170 33 123
5.3 Placement and Semantic Type of Adverbial Clause
Different semantic types of adverbial clauses have different positional preferences

although end position is the most common alternative for all of them, at least in English
(e.g. Hasselgård 2010). The same is expected to be the case in Norwegian. Medial posi-
tion is so rare in the material that its use is practically negligible (cf. Table 2); the remain-
der of this paper will therefore concentrate on initial and end position.
The following semantic types of adverbial clauses are found in initial position:
time, space, manner, contingency, and comparison. In end position the same types
are found, along with respect. These are illustrated in examples (6)–(11).
(6) Time: When he loses his temper with her she runs off (ICE-GB: W2F)
(7) Space: but he was better off where he was, keeping a low profile.
(ICE-GB: W2F)
(8) Manner: Hun så på klokken som om han skulle begynne med det samme.
(OEL1)
Lit: “She looked at the watch as if he should begin at once”
She looked at her watch as though he was going to begin right away.
(OEL1T)
(9) Contingency (reason, purpose, condition, concession): Dersom andre

teknologier holder mål, kan det bli vanskelig for regjeringen å ikke gå inn
med statlig støtte. (News: DAV3)4
Lit: “If other technologies are up to standard, it can be difficult for the
government to not go in with governmental support.”
(10) Comparison: He went on to say that rather than conducting a war of

attrition, BS should release Ravenscraig for sale. (ICE-GB: W2C)
(11) Respect: Han sier han ble kontaktet og advarte Andhøy mot å seile i
området ... (News: VG2)
Lit: “He says he was contacted and warned Andhøy against to sail
in the area…”
Table 4 shows the positional distribution of adverbial clauses according to

semantic subtype. Since the subcorpora differ in size and in the number of adverbial
clauses they contain, the positional distribution has been calculated proportionally,
as a percentage of the total number of clauses in each subcorpus. Time clauses
occupy initial position relatively frequently across the material, particularly in fic-
4
Examples from the Norwegian newspaper material are accompanied by a translation (produced
by the author) intended to show the structure of the original without being entirely literal.
130 H. Hasselgård
Table 4 The positional distribution of semantic subtypes of adverbial clauses across languages
and registers (percentages)
N = 145 N = 116 N = 227 N = 153
Time 9.7 38.6 6.9 30.2 17.2 37.4 6.5 21.3
Space 0.7 1.4 0.9 0.9 0 0.4 0 0.6
Manner 0 4.8 0 5.2 0.4 11.5 0.6 7.7
Contingency 8.3 35.2 15.5 39.7 7.0 21.6 14.4 35.5
Respect 0 0 0 0.9 0 2.6 0 11.0
Comparison 0 1.4 0 0 0.4 1.3 0 2.6
18.7 81.4 23.3 76.9 25 74.8 21.5 78.7
tion; in fact their proportion in Norwegian fiction is the highest in the material. In
news, contingency clauses are the most frequent type found in initial position in
both languages.
Most semantic types of adverbial clauses are rare or non-existent outside end
position in the present material. It is only with time and contingency clauses that
there seems to be a real choice between the positions – at least they are the only
categories that are frequent enough in both positions to allow a real comparison.
The focus of the next two sections will thus be on these two.
6 Adverbial Clause Placement and Information Structure
Studies of information structure typically distinguish between new information

(where a referent is not known to the reader/hearer) and given (old/known) informa-
tion. In spite of this apparently simple dichotomy, information status is notoriously
hard to assess, even when only nominal referents are investigated, as is commonly
the case (see e.g. Prince 1981). For example, givenness may be assessed on the basis
of both the textual and the situational context (Prince 1992). Furthermore, the sim-
ple given-new opposition does not work well for clauses, which are composite
structures in which the information is hardly ever all given (although it may be all
new). Thus, a slightly modified dichotomy was applied, inspired by Prince (1992)
and Kreyer (2007).
• New [N] means “discourse-new”, i.e. the clause conveys information not found
in the preceding text.
• Anchored [A] indicates that at least part of the content of the adverbial clause is
found in the preceding context. Generally, a clause was not considered anchored
if only one of its constituents could be classified as given information.
Note that only the text itself was used as a basis for determining information status;
Prince’s category of ‘situationally given’ (Prince 1992) was not applied, as such
givenness will vary across readers. There was no formal restriction on how far back
in the context one should look for given information, but in practice, given the win-
dow size of the context in the software used, the span was approximately ten sen-
tences (or s-units).
The typical pattern can be expected to be as in example (12), where the adverbial
clause in sentence (i) is anchored (marked as [A]) in the description of the farm
given in the previous context. The matrix clause is predominantly discourse-new
(marked as [N]), although ‘he’ refers anaphorically to ‘Prince Charles’. Note that
the initial anchored clause also gives a framework of interpretation for the rest of the
passage, by specifying the fundamental premise for the ensuing events. In sentence
(ii) the matrix contains references to both the farm and the sale implied in sentence
(i). The adverbial clauses are discourse-new. (i.e. while it can be inferred that a farm
that has become available is for sale, it cannot be inferred that this will happen
‘without going on the market’). Sentence (iii) is much like (ii) in that the matrix
contains references to the preceding context while the sentence-final adverbial
clause has discourse-new information.
(12) [Context: description of a farm next to a property owned by Prince

Charles] (i) Yesterday after the 80-acre farm became ‘unexpectedly
available’ [A],moves were being made to ensure he gets the chance to
buy it without fear of outside competition. (ii) It is being put up for sale
without going on the market and without being advertised [N]. (iii) Those
involved in the deal are keeping details secret to avoid putting the sale
in jeopardy [N]. (ICE-GB: W2C)
Example (13), on the other hand, is text-initial, so both the matrix and the adverbial
clause contain new information. However, it is the information in the adverbial
clause that is developed in the immediately following context, which makes end
position a natural choice.
(13) HEAD TEACHERS are planning to challenge a key part of the

government’s education reforms by opting out of the national
curriculum which lays down what children should learn. (ICE-GB: W2C)
On the basis of previous findings regarding adverbial placement in English

(Hasselgård 2010), it is expected that the news texts will pay more attention to infor-
mation structure, while the fiction texts will pay more attention to cohesion. Fig. 4
shows the distribution of anchored and new adverbial clauses in initial position.5
Since it is impossible to assess information value in a fully objective manner even
when only the textual context is considered, and I did not have the opportunity to
have the material assessed by a second analyst, the numbers may not be entirely
accurate. However, they show such a clear pattern that I believe they are reliable
enough for the present purposes.
5
Note that the study of information structure is restricted to time and contingency clauses, which
are the only ones to vary between initial and end position.
132 H. Hasselgård
Eng fiction 19 2
Eng news 21 4
Anchored
Nor fiction 24 8 New
Nor news 26 6
0% 20% 40% 60% 80% 100%
Fig. 4 Anchored and new information in initial position
Eng fiction 18 61
Eng news 21 57
Anchored
Nor fiction 14 60
New
Nor news 20 62
0% 20% 40% 60% 80% 100%
Fig. 5 Anchored and new information in end position
The numbers underlying Fig. 4 are small, and percentages may enlarge the dif-
ferences between languages and registers. However, the general trend is clear, and
the patterns in Fig. 4 support the main hypothesis about information structure: ini-
tial adverbial clauses are anchored in the majority of the cases, as illustrated by
sentence (i) in example (12). Anchored initial clauses mainly co-occur with either
discourse-new or anchored matrixes. Discourse-new initial adverbials, in contrast,
typically co-occur with discourse-new matrixes, e.g. in text-initial sentences. There
are more new initial clauses in Norwegian than in English, especially in fiction.
Figure 5 shows the distribution of anchored and new adverbial clauses in end
position, and gives an almost reverse picture of the patterns in initial position: the
information is discourse-new in 75–80% of the cases. Anchored adverbial clauses in
end position co-occur with anchored and new matrixes about equally often. There is
little difference between the languages. However, the registers differ: the proportion
of anchored adverbial clauses in end position is greater in news in both languages.
Information load thus seems to be a good predictor of adverbial clause placement.6
However, the apparently neat patterns involve a potential chicken-and egg problem:
since syntactic subordination may signal downgrading of information, the fact that
a proposition contains anchored information may cause the writer to encode it as a
subordinate clause and place it in initial position.
6
In fact, Fisher’s exact test shows it to be highly significant for the selection of position, at p <
0.0001 for all parts of the material.
In any case the investigation of information structure has shown that adverbial
clauses introducing new information are indeed more frequent in end position in
both languages, while those carrying information anchored in the preceding context
are more frequently initial. Similarly, clauses with information that is developed in
the following context are more likely to be final. However, the picture is not consis-
tent: anchored information can occur in end position – and there are more cases of
this than of new information in initial position.
7 Adverbial Clause Placement and Iconic Order
One of Enkvist’s (1981) text strategies is experiential iconicism, or ‘natural order’:

this principle (also advanced by other linguists, see e.g. Fossestøl (1980) Ramsay
(1987) and Hwang (1994)), predicts that events will be presented in the order in
which they occur. According to Diessel (2005: 463), “there is a tendency to arrange
clauses in an iconic order such that linear clause order reflects the temporal ordering
of the events they describe”. This is illustrated in (14), from Norwegian fiction (its
English translation, from the ENPC, closely follows the original): the arrival at the
steps is prior to the removal of hat and gloves.
(14) Da han kom fram til trappen, stanset han, tok av. seg pelsluen og hanskene.
(LSC2)
When he got to the steps he stopped and took off his fur hat and gloves.
(LSC2T)
(15) Lente jeg meg langt nok ut og så den andre veien, kunne jeg få et glimt av.
pissoaret nedenfor Fagerborg kirke. (LSC2)
If I leaned out far enough and looked the other way, I could get a glimpse
of the urinals down by Fagerborg Church. (LSC2T)
The principle of temporal iconicism may apply to clauses other than temporal ones
too, as illustrated by (15): the leaning out is not only a condition for seeing the uri-
nals, it also needs to be prior in time. Kortmann (1991: 137) discovered “marked
tendencies for adjuncts/absolutes expressing ‘time before’ or condition to precede
their matrix clause, and for those receiving a ‘time after’, result, purpose, contrast,
addition/accompanying circumstance or exemplification/specification interpreta-
tion to occur in final position”. In similar fashion one might expect conditions to
occur before consequences (Ford and Thompson 1986; Hasselgård 2014b), as in
(15), and cause to be mentioned before effect (although Altenberg (1987) and
Diessel (2008) have shown that this is not necessarily the case).
The analysis of iconicism was manual, based on close reading of each adverbial
clause in relation to its matrix clause. As in the study of information structure, only
time and contingency clauses were considered. Figure 6 shows the proportion of
clauses that reflect what will henceforth be referred to as ‘iconic order’. This order
134 H. Hasselgård
English fiction 57 14
English news 56 26
iconic
Norwegian fiction 46 21 reverse
Norwegian news 49 31
0% 20% 40% 60% 80% 100%
Fig. 6 The iconic principle in the order of adverbial and matrix clauses
is slightly more frequent in English than in Norwegian. As regards the register com-
parison, the iconic order is slightly more frequent in fiction than in news in both
languages, but not significantly so.7
Based on previous findings, e.g. Diessel (2008), it appears that a more fine-
grained semantic division of adverbial clauses is needed for a study of iconic order.
In particular, conditional and causal clauses should not be lumped together, as they
have very different positional patterns (Diessel 2008; Hasselgård 2014b). It is
important to note that iconic order works differently with different types of adver-
bial clauses. For time clauses, iconic order implies that the order of clauses mirrors
the temporal succession of events. Thus a temporal clause will precede its matrix if
it is about an event prior to the matrix event (and vice versa). For conditional clauses,
iconic order means that the protasis precedes the apodosis, i.e. the condition is men-
tioned first. For causal clauses iconic order implies that cause is mentioned before
effect. This might pull causal (because) clauses to initial position and defer purpose
and result clauses to end position.
Table 5 presents the placement of subcategories of contingency and time clauses.
Time clauses have been subdivided according to their temporal relationship with the
matrix clause; i.e. whether they refer to an event occurring before that of the matrix
clause (MC) after it, or simultaneously with it (cf. also Diessel 2008: 473). The
shaded cells mark iconic order; bold type marks the most frequent position.
As Table 5 shows, most time clauses occur in end position in both languages and
in both registers, whether they refer to an event that is prior to, simultaneous with,
or posterior to that of the matrix clause. There is thus no consistent reflection of
iconic order. However, a time adjunct that denotes an event prior to the one in the
matrix clause, as in (16), seems more likely to be initial than one that denotes an
event posterior to the matrix event. However, end position is more common even for
adverbial clauses denoting prior events; example (17) illustrates this.8 For temporal
clauses denoting an event that follows the one in the matrix clause, initial position
is unlikely, albeit not impossible.
7
Significance according to Fisher’s exact test: English news vs. English fiction: p = 0.1006;
Norwegian news vs. Norwegian fiction: p = 0.3894; Norwegian fiction vs. English fiction: p =
0.1235; Norwegian news vs. English news: p = 0.4114.
8
Diessel (2008: 474) reports a slight majority of initial placement of “prior” temporal clauses, and
of the temporal clauses placed in initial position, a clear majority reflect iconic order. However, the
adverbial clauses in end position do not reflect iconicity to the same extent (ibid.: 475).
Table 5 Adverbial clause meanings and iconic order (marked by shaded cells). Raw frequencies

Condition 9 6 13 8 7 2 15 12
Cause 0 2 0 7 2 4 1 14
Purpose 0 31 0 29 0 17 2 22
Result 0 0 0 1 0 0 0 1
Concession 1 2 3 1 1 1 3 1
Time before MC 2 2 5 10 8 13 5 15
Time after MC 2 14 0 5 2 11 1 2
Simultaneous 7 22 3 17 12 26 3 14
(16) Once that is achieved, he still faces the choice of whether to call a General
Election in June ... (ICE-GB:W2C)
(17) Men gamle kelner Olesen dukket opp da Helen kom inn i kafeen. (OEL1)
But the old waiter Olesen appeared when Helen came into the cafe. (OEL1T)
Conditional clauses are the only ones to consistently precede the matrix more often
than they follow it although end position is only slightly less common. The same
tendencies can be observed in both languages and both registers. Many of the
clause-final conditionals occur in dependent matrix clauses (as in (19)), especially
in Norwegian.
(18) Og hvis du ser dem i øyet, blir du gal. (TB1)

And if you look them in the eye you go mad. (TB1 T)
(19) Myndighetene har en plan for å motta innvandrere om de kommer, sa

innenriksminister Robert Maroni på en pressekonferanse. (News: DAV2)
Lit: “The authorities have a plan for receiving the immigrants if they come,
said the minister for domestic affairs RM at a press conference.”
Purpose and result clauses occur almost consistently in end position across the
material, in agreement with iconic order, as they convey a possible outcome of the
matrix clause situation. Note, however, that purpose clauses tend to be non-finite in
both languages, which is another strong reason why they should favour end posi-
tion, cf. Sect. 5.2. Examples are given in (20) and (21).
(20) Purpose: Fredsprisvinneren Muhammad Yunus går til retten for å påklage
avskjedigelsen fra Grameen Bank. (News: VL2)
Lit: “Peace prize winner M. Yunus goes to court to appeal against his
dismissal from Grameen Bank.”
136 H. Hasselgård
(21) Result: People would get full counselling before starting the process of
buying so that they were aware of the commitments of home ownership.
(ICE-GB: W2C)
Causal clauses occur predominantly in end position, thus violating iconic order.
However, this was expected on the basis of Altenberg’s (1987) and Diessel’s (2008)
findings as well as the predictions of Fossestøl (1980) and Faarlund et al. (1997).
The typical order is thus as shown in (22).
(22) Og det var. blitt for sent fordi pengene egentlig aldri hadde interessert ham.
(OEL1)
And it had been too late because the money had never really interested him.
(OEL1T)
8 Summary of Findings and Concluding Remarks
The present investigation has reaffirmed the fact that register is a factor that cannot
be ignored in studies of grammar and discourse organization. While this is becom-
ing an established truth in usage-based studies of English, it has as yet not been
visible in studies of Norwegian. Furthermore, the frequency information about
Norwegian adverbial clause placement has given a more accurate and nuanced pic-
ture of language use in this area than what has emerged from previous
descriptions.
The cross-linguistic comparison has shown that English and Norwegian are alike
in placing adverbial clauses predominantly in initial and end position while medial
position is rare. End position is the more common choice in both languages and in
both registers investigated. The first hypothesis presented in Sect. 4 was that
Norwegian would use initial position more often than English. The material showed
no consistent pattern: there was a greater proportion of adverbial clauses in initial
position in Norwegian fiction than in English fiction, but the other way round in the
news register. Thus, the register comparison also turned out to have conflicting
results: news has a greater proportion than fiction of its adverbial clauses in initial
position only in English. The hypothesis of news making more extensive use than
fiction of initial position for adverbial clauses was thus true only of contingency
clauses, not of temporal ones. No other semantic types were frequent enough to
show reliable patterns of variation between initial and end position.
It was clear that the syntactic type of an adverbial clause influences its position
in both languages: non-finite clauses occur less freely in initial position. Prepositional
finites (occurring in Norwegian only) follow the same positional tendencies as their
non-finite counterparts. Different semantic categories also have their own positional
preferences in both languages. The preferences are rather similar across languages
and registers. Contingency clauses are slightly more frequent in initial position in
news, and time clauses in fiction.
The study of information structure and iconic order concerned only time and
contingency clauses, as these were the only ones to be frequent enough in both ini-
tial and end position to study positional variation. The results show that adverbial
clauses containing anchored information are more likely to be sentence-initial, and
those with new information are more likely to be sentence-final. Initial clauses with
new information are likely to co-occur with new matrix clauses.
The principle of iconic order would predict that causes and conditions are men-
tioned before consequences and that temporal clauses are placed such that the order
of adverbial and matrix clause reflects the temporal succession of events. There was,
however, no clear evidence in the material that iconicism was vital to adverbial
clause placement, except possibly with regard to condition and purpose clauses,
which showed definite preferences for initial and end position, respectively. It is,
however, likely that the positional preferences of semantic categories are more
important than iconic order, since other semantic categories do not seem much
affected by iconicism.
The best predictors of adverbial clause placement thus seem to be finiteness and
semantic category. Among finite time and contingency clauses, information value is
also a good predictor of position. There were surprisingly few cross-linguistic dif-
ferences apart from frequency: Norwegian and English adverbial clauses seem to be
placed according to the same semantic and discourse-pragmatic principles.
The register comparison revealed the following tendencies: the frequencies of
adverbial clauses in both positions differed between registers but in opposite direc-
tions in English and Norwegian. Iconic order was slightly more frequent in fiction
in both languages. Anchored clauses were most common in initial position across
the material, but initial discourse-new clauses were more frequent in fiction than in
news in Norwegian, but more frequent in news than in fiction in English. Discourse-
new clauses were most common in end position in all the subcorpora but surpris-
ingly there was a slightly higher percentage of final anchored clauses in news (in
both languages).
The relatively inconclusive results, mainly due to the small size of the material,
call for further research into the positional variation of adverbial clauses across
languages and registers. Any further analysis of information structure and iconic
order would benefit from a larger sample as well as additional registers and a broader
text distribution.
References
Altenberg, B. (1987). Causal ordering strategies in English conversation. In J. Monaghan (Ed.),

Grammar in the construction of texts (pp. 50–64). London: Francis Pinter.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). Longman grammar of spo-
ken and written English. London: Longman.
Diessel, H. (2001). The ordering distribution of main and adverbial clauses: A typological study.
Language, 77(2), 433–455.
138 H. Hasselgård
Diessel, H. (2005). Competing motivations for the ordering of main and adverbial clauses.
Linguistics, 43(3), 449–470.
Diessel, H. (2008). Iconicity of sequence. A corpus-based analysis of the positioning of temporal
adverbial clauses in English. Cognitive Linguistics, 19, 457–482.
Enkvist, N. E. (1981). Experiential iconicism in text strategy. Text, 1(1), 97–111.
Faarlund, J. T., Lie, S., & Vannebo, K. I. (1997). Norsk referansegrammatikk. Oslo:
Universitetsforlaget.
Ford, C. E. (1993). Grammar in interaction. Adverbial clauses in American English conversations.
Ford, C. E., & Thompson, S. A. (1986). Conditionals in discourse: A text-based study from
English. In E. C. Traugott, A. ter Meulen, J. S. Reilly, & C. A. Ferguson (Eds.), On conditionals
(pp. 353–372). Cambridge University Press.
Fossestøl, B. (1980). Tekst og tekststruktur: veier og mål i tekstlingvistikken. Oslo:
Universitetsforlaget.
Hasselgård, H. (2010). Adjunct adverbials in English. Cambridge: Cambridge University Press.
Hasselgård, H. (2014a). Discourse-structuring functions of initial adverbials in English and
Norwegian news and fiction. In Lefer, M.-A. & S. Vogeleer (Eds.), Genre- and register-related
discourse features in contrast, Special issue of Languages in Contrast, 14(1), 73–92.
Hasselgård, H. (2014b). Conditional clauses in English and Norwegian. In H. P. Helland & C. M.
Salvesen (Eds.), Affaire(s) de grammaire (pp. 183–200). Oslo: Novus.
Hetterle, K. (2015). Adverbial clauses in cross-linguistic perspective. Berlin/Boston: de Gruyter
Mouton.
Hwang, S. J. J. (1994). Relative clauses, adverbial clauses, and information flow in discourse.
Language Research, 30(4), 673–705.
Kortmann, B. (1991). Free adjuncts and absolutes in English: Problems of control and interpreta-
tion. London/New York: Routledge.
Kreyer, R. (2007). Inversion in modern written English: syntactic complexity, information status
and the creative writer. In R. Facchinetti (Ed.), Corpus linguistics 25 years on (pp. 187–204).
Amsterdam: Rodopi.
Meier, E. (2001). “Since you mention it”: A contrastive study of causal subordination in English
and Norwegian. MA thesis, University of Oslo. www.hf.uio.no/ilos/forskning/prosjekter/sprik/
pdf/em/HovedoppgEinarMeier22.pdf
Prince, E. F. (1981). Toward a taxonomy of given–new information. In P. Cole (Ed.), Radical
pragmatics (pp. 223–255). New York: Academic Press.
Prince, E. F. (1992). The ZPG letter: Subjects, definiteness, and information-status. In W. C. Mann
& S. A. Thompson (Eds.), Discourse description: Diverse linguistic analyses of a fund-raising
text (pp. 295–326). Amsterdam: Benjamins.
Ramsay, V. (1987). The functional distribution of preposed and postposed IF and WHEN clauses
in written narrative. In R. Tomlin (Ed.), Coherence and grounding in discourse (pp. 383–408).
Amsterdam: Benjamins.
Scott, M. (2014). WordSmith Tools 6. Stroud: Lexical Analysis Software.
Thompson, S. A., Longacre, R. E., & Hwang, S. J. J. (2007). Adverbial clauses. In T. Shopen (Ed.),
Language typology and syntactic description. Volume II: Complex constructions (pp. 237–
300). Cambridge: Cambridge University Press.
Wiechmann, D., & Kerz, E. (2013). The positioning of concessive adverbial clauses in English:
Assessing the importance of discourse-pragmatic and processing-based constraints. English
Language and Linguistics, 17(1), 1–23. doi:10.1017/S1360674312000305.
Corpus Material
English-Norwegian Parallel Corpus (ENPC), excerpts from Toril Brekke, Jacarandablomsten/The

Jacaranda Flower (TB1), Lars Saabye Christensen, Jokeren/The Joker (LSC2) and Øystein
Lønn, Tom Rebers siste retrett/Tom Reber’s Last Retreat (OEL1), see www.hf.uio.no/ilos/eng-
lish/services/omc/enpc/
International Corpus of English, British component (ICE-GB).: www.ucl.ac.uk/english-usage/
projects/ice-gb/, texts W2C-001, 002, 015, 018, 020 (press reportage) and W2F-001, 002, 003,
007, 012 (fiction).
Norwegian newspapers – news articles from the online versions of some Norwegian national daily
newspapers 3 March 2011 (Dagsavisen, Aftenposten, VG, Vårt Land, Klassekampen, Nationen,
Dagens Næringsliv)
Coherence Relations and Information
Structure in English and French Political
Speeches
Diana Lewis
Abstract This study addresses the marking of additive coherence relations in

French and English political speeches. It is based on a balanced comparable corpus
of ministerial political speeches spanning the late 1990s and early 2000s. Additive
relations are expected to be the least marked relations, as where a discourse follows
on naturally from what has gone before, coherence is easily assured by continuity
intonation, a discourse continuity marker such as English ‘and’, or simple juxtapo-
sition. Density and variety of additive markers are found to be much greater in the
French speeches compared with the English, where additive relations are more
often left implicit, resulting in quite different discourse patterns. The role of mark-
ers is illustrated by a case study comparing the roles of en effet and its dictionary
equivalent indeed, which are found to function differently. The findings arguably
reflect the greater distance between literary and conversational French than is the
case for English. At the same time, the higher frequency of a number of the French
markers seems to go along with greater grammaticalization towards rhetorical, ‘pre-
sentational’ functions.
Keywords French-English • Discourse marking • Additives • Political discourse •

Grammaticalization • Bleaching
1 Introduction
As has been observed in a number of contrastive studies of French and English

(such as Chuquet and Paillard 1987; Guillermin-Flescher 1981), there appear to be
significant differences in the patterns of discourse marking between the two lan-
guages. There has been little agreement, however, on the nature of such differences.
While some have argued that markers of discourse coherence seem to be more nec-
essary in English than in French (v. Gallagher 1995; Mason 2001), others have
D. Lewis (*)
Department of English and Lerma Research Centre, Aix Marseille University,
e-mail: diana.lewis@univ-amu.fr

DOI 10.1007/978-3-319-54556-1_7
142 D. Lewis
claimed that French has a preference for a greater density of discourse marking
(e.g., Fetzer and Johansson 2010 on causation marking).
This paper takes a look at discourse marking in the genre of political speeches, a
genre of written-to-be-spoken language that is broadly-speaking persuasive in
intent. The study is based on a French-English comparable corpus of speeches.
The paper is organized as follows. Section 2 discusses the additive coherence
relation in the context of discourse coherence. Section 3 gives an overview of the
genre-specific comparable corpus on which the study is based – political speeches –
and describes the procedures. The findings on additive markers across the French
and English speeches are presented in Sect. 4. Section 5 focuses on the uses of two
additive markers that are commonly given as ‘dictionary equivalents’: French en
effet and English indeed. The implications of the findings are discussed in the con-
cluding Sect. 6.
2 D
iscourse Coherence, Information Structure and Additive
Relations
Discourse coherence concerns the level at which the speaker, putting together her
discourse, needs to enable the hearer to build an ongoing representation where each
upcoming ‘idea’ – theme or proposition – finds its place. Information structure
refers here to thematic progression, in the sense of structuring given and new infor-
mation, as well as informational salience: means used by the speaker to foreground
or background ideas, creating an information contour for the discourse.
Both coherence relations and information structure may be encoded in some
linguistic device (such as prosodic pattern, lexical expression/construction or syn-
tactic structure /construction), or may be left implicit for the hearer/reader to prag-
matically infer. Some particular linguistic device may mark simultaneously a
coherence relation and an information structural relation. In fact, some approaches
to discourse tie the two together so that each coherence relation has an inherent
information contour or grounding relation. This is the case, for instance, of
Rhetorical Structure Theory (RST) (Mann and Thompson 1986). Others, such as
Relational Discourse Analysis (RDA) (Oberlander and Moore 2001), distinguish
‘semantic’ coherence relations from ‘functional’ information structure.
Coherence relations (also known as discourse relations or rhetorical relations)
include such notions as ‘contrast’, ‘concession’, ‘result’, ‘elaboration’, ‘exemplifi-
cation’, ‘addition’, ‘justification’ and so on. They refer to the various ways in which
the segments (or groups of segments) of a text or discourse fit into the rest of the text
or discourse; that is, how each part relates to the parts that precede and follow it, and
thus contributes to the overall meaning of the text.
These types of meaning can themselves be thought of as propositional. (In fact,
they are referred to by Mann and Thompson (1986) as ‘relational propositions’, an
area of meaning that is relatively grammaticalized into particles and adverbs, but
Coherence Relations and Information Structure in English and French Political Speeches 143
Table 1 A partial simple model of discourse coherence relations

Consonant relations Dissonant relations
[other [other
addition cause relations] concession contrast antithesis relations]
Also,.. Because... although... But... on the
contrary, ..
For so that.. Even then ..
instance,..
which can also be ‘propositionalized’.) Attempts to draw up empirically satisfactory

taxonomies of coherence relations, using labels such as the ones above (contrast,
concession, etc.), have foundered on three main difficulties: the issue of constrain-
ing the number of relations, the degree to which the taxonomy is hierarchical and
the relationship between coherence and information structure. Moreover, each lan-
guage will have its own network of relations depending on the way relations are
typically drawn in the language in question. We do not adopt a taxonomic approach
here; descriptions of relations in Sects. 4 and 5 are not to be interpreted as labels
belonging to a particular taxonomy of predefined coherence relations, but simply as
indications of the types of meaning expressed in the corpus data.
For practical purposes, nevertheless, a working model is needed to delimit an
area for investigation. The approach adopted here is to view relations as a consonant-
dissonant cline from total or high consonance to low or zero consonance. High
consonance occurs where the ideas or sets of ideas expressed in consecutive dis-
course segments co-exist happily, being wholly compatible with one another (e.g.,
reformulation, exemplification). High dissonance occurs where adjacent discourse
segments express ideas that are wholly incompatible (e.g., polar opposites). (This
model is comparable to Murray’s (1997) model of continuous vs discontinuous rela-
tions; we prefer different terms to avoid confusion with Continuative relations,
which Murry subsumes along with causal relations under ‘continuous’). Table 1
illustrates such a simple partial model.
Relations may be explicitly marked or left implicit (v. Taboada 2009). Marking
takes many forms, more or less grammaticalized: syntactic pattern, subordinating
conjunction, non-subordinating conjunction, adverb, adverbial phrase, clause,
modal particle, and so on. Dedicated discourse markers are adverbial lexemes and
phrases such as however, even so, besides, for instance, moreover, and similar
expressions in other languages. A further function of many, if not all, discourse
markers is to signal the relative informational salience of the discourse segment they
attach to. They thereby help the hearer to appreciate the speaker’s evaluation of the
relative importance of the states of affairs related in the discourse. The expression
of discourse coherence is thus both subjective, indicating the speaker’s vision of
how the ideas expressed inter-relate, and intersubjective insofar as the speaker must
anticipate the expectations of the hearer.
The focus here is on the discourse marking of additive relations. An additive rela-
tion will be said to exist where a new idea in the upcoming discourse develops the
topic of the preceding discourse and is compatible with the preceding idea(s); sim-
144 D. Lewis
ply put, it is ‘more in the same vein’. (This use of ‘additive’ differs from that of
other authors such as Halliday (1994), for example.) The relation may be between
two states of affairs (‘content’ use) or between two speaker arguments (‘presenta-
tional’ use); often both types of relation obtain between two ideas (cf Hasselgård
2014: 72). A single occurrence of a discourse marker might therefore be interpreted
as encoding a state-of-affairs relation, an argumentational relation and an informa-
tion structural relation. In (1), for instance, What’s more can be interpreted as intro-
ducing an additional event and an additional speaker argument, as well as signalling
that the upcoming event/argument is more salient (rhetorically stronger for the
speaker) than the previous idea that it links to.
(1) if they had been cheating I would have known. What’s more , I would have
been the first to complain. [BNC CH7, newspaper]
The aim of the study is to compare the usages of additive coherence relation
markers by speakers of the political speech genre in the two languages and to iden-
tify potential discourse constructions built around an additive coherence relation.
Consonant relations in general are expected to be less marked (for example, by a
discourse marker) than dissonant relations. This is because ‘coherence’ in the lay
sense excludes incompatibility: the bare assertion of two apparently incompatible
ideas results in incoherence. Where a proposition may appear to the hearer to be
either at odds with what went before or irrelevant to it, some marker is called for to
at least acknowledge the counterexpectation. But where an idea follows on naturally
and unsurprisingly, it will usually be enough to use discourse continuity intonation,
a discourse continuity marker such as English ‘and’, or simple juxtaposition, for the
coherence to be understood. This can be seen from example (1), where the removal
of What’s more does not render the sequence incoherent. As Patterson and Kehler
point out, “the more difficult recovering the correct relation would be without a con-
nective, the more necessary it is to include one” (2013: 915). Additive markers are
therefore more optional than markers of other relation types.
This notion of uneven marking of relations is compatible too with the uniformity
of information density (UID) hypothesis, according to which predictability largely
explains variability in reduction. That is, the more predictable an upcoming item is,
the more likely it is to be reduced (phonetically, syntactically, discoursally) (Levy
and Jaeger 2007). Asr and Demberg (2012: 84) apply this hypothesis to discourse
marking and observe that easily inferable relations are on average marked more
ambiguously than relations which are less expected, in a fashion that arguably
reflects discourse-level information density smoothing.
3 Data: The Comparable Corpus of Political Speeches
The study is based on an English-French comparable corpus of around 760,000

words consisting of political speeches given during the late 1990s and early 2000s.
All the speeches are given by politicians in government in the course of their minis-
terial duties. The genre of ministerial speeches, in the European context, is a fairly
constrained one. The sociocultural parameters of the situations in which such texts
are produced are well-defined and similar across the two languages, so that identify-
ing comparable texts for a corpus is fairly straightforward. It is unidirectional public
language – produced by the specialist few (the political figures and their assistants)
and designed for reception by several constituencies, which can include, in addition
to the immediate (often specialist) audience, other politicians, other governments,
other institutions, the media and the wider public. A ministerial speech is typically
written to be spoken and contains a few thousand words at most. It expounds policy,
aims to impress and persuade, and seeks positive evaluation from its several audi-
ences. But its ceremonial role is also important: a speech is an integral part of many
ceremonial events and other regular gatherings in the calendar of each minister.
The comparable corpus on which the study of additive connectives is based con-
tains around 375,000 words in each language. Size-matching the parts of a compa-
rable corpus by number of words is, of course, a rough-and-ready way to proceed.
As is well known, typological differences mean that written French text tends to be
‘longer’ than written English text.1 For pairs of translated texts, for instance, the
French text tends to exceed the English by both mean word length and number of
words per sentence. The present corpus is no exception, with the mean length of
word in the French part 5.16 characters compared with 4.83 for the English part, and
mean sentence length 25.3 words in the French compared with 19.9 words in the
English. A more appropriate measure (one involving the number of opportunities
for the target constructions to occur – cf. discussion in Holmes 1994: 30) might be
the discourse segment or, for an unsegmented corpus, the sentence. But the English
speeches being on average notably longer than the French ones, by both word and
sentence counts, the smaller number of English speeches is somewhat counterbal-
anced by the larger number of English sentences (Table 2). Frequencies are given in
relation to word counts. Prosodic information is not available, the corpus speeches
being written versions only.
Table 2 The English-French comparable corpus of political speeches

French part English part
No. of words to nearest 000 372,000 384,000
No. of speeches 149 133
Average no. of words per speech 2497 2910
Average no. of sentences per speech 98 145
146 D. Lewis
4 Additive Marking across English and French Speeches
Starting from lists of potential additive markers in French and English, an overall
picture of marking was drawn up for the texts in the two languages. The lists of
markers were drawn up following consultation of a variety of sources: Danlos et al.
(2015), Roze (2009), the digital resource Dictionnaire des synonymes français,
Quirk et al. (1985) and Roget’s Thesaurus.
Discourse-connective and and et were excluded from the study as they typically
mark discourse continuity rather than addition, and often precede markers of other
coherence relations (cf. And yet, Et pourtant and so on). Donc and so were also
excluded for being still inherently causal, though both can arguably also mark dis-
coursal addition. The additive uses of the markers listed in Table 3 were counted.
Surprisingly, the frequency of besides, a fairly typical marker in English conver-
sation and other genres (cf. Hasselgård 2014), was zero. The most frequent 15
markers in each language are listed with their frequencies in Fig. 1.
French speeches clearly contain more frequent and more varied additive marking
than the English ones. Aussi, ainsi, également, enfin, en effet, par ailleurs, d’ailleurs,
de même, en outre, [et] puis, d’autre part all occur at more than 10 per 100 k words,
in an additive function, across a range of speakers. The English speakers, by con-
trast, rely largely on juxtaposition and on also; the only other frequent markers
being too, indeed, and as well. English additive discourse markers such as in addi-
tion, moreover, similarly, thus, further[more], likewise, what is more, in fact, in the
same way, here again, besides, etc. are rare (<10 per 100 k words). The use of the
French additive markers can be viewed as helping to create, or as reflecting, a par-
ticular style of parallelism, using additive marking to pile up consonant propositions
and create a layered, cumulative case. Each layer of ideas seems to add equal weight
to the overall argumentation, but may be internally structured into more salient and
less salient points. The English speakers, by contrast, rely more on juxtaposition
and structural similarity to create argumentation that is less explicitly cumulative.
Hobbs (1985) discusses parallelism as follows:
Considerations of coherence in general allow us to string together arbitrarily many parallel
arguments. But it is a convention of argumentation for there to be just three, and those
ordered by increasing strength. In political rhetoric, one also hears sequences of parallel
statements, but for maximum effectiveness, they should be more than just the semantic
parallelisms characterized by the theory of coherence. They should also exhibit a high
degree of lexical and syntactic parallelism. (Hobbs 1985: 27)
Table 3 Additive markers counted in the comparable corpus

French ainsi, aussi, d’ailleurs, d’autre part, dememe, dureste, dureste, effectivement,
également, en effet, en fait, en outre, en plus, enfin, ensuite, finalement, par ailleurs,
parallelement, pareillement, puis / et puis, qui plus est, surtout
English additionally, again, also, and of course, as well, besides, equally, further,
furthermore, here again, in addition, in fact, in the same vein, in the same way,
indeed, likewise, moreover, similarly, then, thus, too, too, what is more
effectivement
parallèlement
d'autre part
[et] puis
en outre
de même
d'ailleurs
ensuite
par ailleurs
surtout
en effet
enfin
également
ainsi
aussi
20 40 60 80 100 120 140 160 180 200
frequency per 100k words
what is more
then
likewise
furthermore
thus
again
similarly
moreover
and of course
equally
in addition
as well
too
indeed
also
20 40 60 80 100 120 140 160 180 200
frequency per 100k words
Fig. 1 Frequencies of the 15 most frequent additive markers in the French and English speeches
These devices are made quite explicit in the French speeches through both lexi-
cal and syntactic parallelism and the regular framing of arguments by discourse
markers. This kind of parallelism is exemplified in (2), which shows the coherence
markers (in bold, with the additives underlined) and the hierarchical structure
(indentation).
(2) L’euro n’est pas un joujou […]

L’euro c’est un projet politique […]
L’euro c’est aussi un enjeu économique que, là aussi , je résumerai
simplement […]
148 D. Lewis
L’euro c’est d’abord […] un facteur de stabilité […]

là aussi […]
L’euro c’est ensuite plus de sécurité […]
Enfin créer l’euro c’est donner à la monnaie européenne une masse
critique […]
Voilà pourquoi l’euro est une chance […]
N’est-ce pas d’ailleurs […] [Juppé, 27/11/996]
‘The euro is not a plaything …
The euro is a political project …
The euro is aussi an economic challenge that, là aussi, I will simply
summarize …
The euro is d’abord … a stabilising factor …
là aussi …
The euro ensuite means more security …
Enfin the creation of the euro provides European currency with a critical
mass …
That is why the euro is an opportunity …
Is it not true d’ailleurs …’
English speeches tend to manifest a different style of parallelism altogether, as in

(3). There is semantic parallelism here (underlined) in the repetition of the notion
‘impact on Britain’: a crescendo of impact from it matters through directly affect us
to enormous disruptive effect on us. And there is structural and semantic parallelism
(bold) in the three subordinate clauses even if …, whatever … and even if …. The
rhetorical cohesion is achieved without any discourse markers at all.
(3) It matters to Britain that EMU should succeed, even if we never join it.
The emergence of a euro-zone in the middle of our largest market, the
Single Market, will directly affect us in this country, whatever we do. We
want EMU to be solid, durable and stable because a euro-zone would
inevitably be our most important trading base. Already growth or recession
on the continent feeds quickly into the UK economy.
If a euro-zone failed, the disruptive effect on us would be enormous, even if
we were outside it.
[Clarke, 18/12/1996]
These speakers of French and English are using quite different rhetorical
templates.
5 En Effet and Indeed in the Political Speeches
Both en effet and indeed seem to be particularly typical of the genre of speeches.
And both, as sentence adverbials, are used overwhelmingly in the context of conso-
nant discourse relations. Both are anaphoric, dependent for interpretation on a
previous idea from a previous segment of discourse being accessible to the hearer.
They can both, therefore, be characterized as typical or ‘central’ additive markers.
Moreover, they are considered dictionary equivalents (e.g., Dictionnaire Le Robert
and Collins 2013). We shall see below, however, that although their functions over-
lap, they cannot be considered functional equivalents in the context of political
speeches.
To identify the probable functions of the discourse markers, the procedure was to
interpret, independently of the marker, the degree of coherence and the most plau-
sible type of relevance holding between the proposition in the host discourse seg-
ment and that in the previous discourse segment. This interpretation was then
compared to the interpretation with the marker.
5.1 Indeed
Indeed is conventionally described as a modal epistemic adverb of certainty, with a

transparent origin in the prepositional phrase in deed. Its development from PP to
discourse marker is traced in Traugott and Dasher (2002: 160–164). Indeed is asso-
ciated with formal registers and is more frequent (and differently distributed) in
writing than in speech (191pmw vs 166pmw in the British National Corpus). At
221pmw, then, the frequency in the political speeches is relatively high, in accor-
dance with the formality of the genre.
Núñez Pertejo (2008), working with the ICE-GB corpus, identifies three func-
tions of indeed: (i) as a speech-act adverb, (ii) as a narrow-scope adverbial modifier,
and (iii) as a discourse marker which confirms or reinforces a preceding argument
or assertion, this last being by far the most frequent use (2008: 725–731). Aijmer’s
(2008) analysis of indeed based on parallel corpora identifies it as a marker of
emphasis, of confirmation, and as focalizing or intensifying. And Aijmer (2007:
330) characterises indeed as further having “the social meaning speaker
authority”.
Table 4 shows the positions of indeed in the English speeches. The medial posi-
tion is limited to clauses with be or an auxiliary. (While pre-V position is frequent
for English -ly sentence adverbs like clearly and connectives like therefore, it seems
to be avoided for modal adverbs such as indeed, in fact, at least, after all, etc. used
as discourse markers. Where no other auxiliary is present, a do-construction is used.
Table 4 The position in the Position in the

sentence of occurrences of sentence % of occurrences
‘indeed’ in the English
Sentence-initial, 77
speeches (n = 85)
including 4 x And
indeed..
After Aux or be 15
After Adj or Adv 8
(constituent-final
position)
150 D. Lewis
Instances of such adverbials found pre-verbally (no Aux) in other corpora were
rarely connective.)
The different functions of indeed apparent in the corpus correlate closely with
position in the host (Table 5). Indeed in both final and medial positions is a modal
adverb. Final position corpus occurrences, after an AdjP or AdvP host, are all exem-
plars of the construction <very Adj|Adv indeed>, in which indeed combines with
very to indicate ‘extremely’ (4).
(4) … the rationale for having such a power is clear and we shall want to look
at it very closely indeed . [Lloyd, 09/06/1997]
In medial position (5), indeed stresses the veracity of its host where there may have
been doubt (cf. really, truly, definitely); it can be said to be counterexpectational.
(5) … the indications are that conditional fees are indeed widening access to
justice. [Hoon, 23/09/1997]
In this position indeed may also combine with an adversative marker to form a con-
cessive construction <indeed p, [adversative DM] q> as in (6), or with if to form a
concessive-conditional construction (7). In these constructions it can also be
described as counterexpectational.
(6) Companies are indeed observing those rules, but not always in a way which
positively informs shareholders and employees, or responds to their
concerns. [Becket, 04/03/1998]
(7) Who do you think should run such a bidding system, if indeed you are
persuaded by its attractions? [Aitkin, 15/03/1995]
Initial occurrences, by contrast, are all discourse connective; and by virtue of this
position, indeed acts as a presentative. The hosts are not all full clauses, as exempli-
fied in (8).
(8) Hong Kong stands as a monument to what the human spirit can, indeed will,
achieve. [Rifkin, 12/02/1997]
In the great majority of cases (v. Table 5), the indeed host is a wider, stronger
claim than the preceding one. The examples in (9) are typical. In contexts such as
(9a), the relation is usually expressed in French with au contraire (v. Lewis 2005:
45–46).
(9) a. NATO has not collapsed. Indeed – the best test of success – countries are
queuing for membership. [Portillo, 05/12/1995]
b. The new government in Britain has a clear plan about how it intends to
shape British foreign policy, and indeed to shape the world in which Britain
lives. [Symons, 10/10/1997]
c. Hong Kong, as so often in its history, has defied the pessimistic smart
Alecs. Indeed it has defied the odds. [Major, 04/03/1996]
In a few instances, the indeed host largely repeats the previous idea (10), or provides
some detail or additional information about it that exemplifies, clarifies or justifies
it (11). These contexts are the closest to the French contexts in which en effet is
found.
(10) IT offers immeasurable oppportunities to bring new, more efficient ways

of delivering public services shaped to meet the needs of the customer.
Indeed , IT presents an amazing opportunity to rethink fundamentally the
way that Government provides services [Freeman, 28/10/1996]
(11) … trade has always been the backbone of Anglo-Tunisian relations. Indeed,
our first formal treaty in 1662 was about commerce. [Hanley, 09/01/1997]
Table 5 summarizes the distribution of indeed in the corpus.

In the ICE-GB corpus, Aijmer (2008: 117–119) found indeed to be more than
twice as frequent in parliamentary debates and non-broadcast speeches (respec-
tively 80.9 and 61.9 per 100 kwords) as in other genres and in those contexts it typi-
cally conveys rhetorical strengthening: “The function of indeed in parliamentary
debates is to strengthen the assertion or argument (the rhetorical use of indeed) by
adding more certainty especially in the combination and indeed” (Aijmer 2008:
117). Simon-Vandenbergen and Aijmer (2007) likewise note that “‘x and indeed y’
… implies that y is informationally stronger than x” (2007: 120). The findings from
the present political speeches corpus bear this out, though the frequencies are much
lower at 22 per 100 kwords.
While operating as an epistemic modal adverb, emphasizing veracity, indeed,
like some other epistemic adverbs, can be used dialogically, persuasively, depend-
ing on the assumptions the speaker makes about the hearer’s state of knowledge and
Table 5 The functions of ‘indeed’ in the English corpus (n = 85)

Position Construction Function n= %
Host-final very {Adj|Adv} indeed Intensification 7 8
Medial Subj Aux indeed'V C Counterexpectation contexts 13 15
(post-Aux) (a) emphasis 8
(b) concession 3
(c) concessive-conditional 2
Initial p indeed q Rhetorical salience 65 77
(a) q is wider/stronger claim than p 54
(b) q gives detail of p 6
(c) unclear occurrences 5
152 D. Lewis
beliefs. This rhetorical function depends on indeed being seen in the wider context
of a discourse construction, where q is a wider or stronger claim than
p, set in a wider-still context of a thematic chunk of discourse.
5.2 En Effet
En effet in present-day French, as noted by Charolles and Fagard (2012: 137), is

used exclusively as a lexicalized particle, or ‘particule lexicalisée’, which functions
as a connective or discourse marker. It cannot be discourse-initial, but must have a
previous idea to refer back to. Like indeed, en effet as a connective goes back a long
way. It is attested already in the late fourteenth and early fifteenth centuries, consid-
erably earlier than other, similar connectives and, again like indeed, the earliest
widespread use seems to have been, as far as can be ascertained from available
sources, in legal prose and in records, followed by philosophical prose (Bertin 2002:
47–48). Bertin suggests that en effet may have been in competition in Middle French
with the declining epistemic adverb si, to which it offered a weightier and more
substantive alternative, and which it may have gradually replaced (2002: 48). It may
therefore be an example of the grammaticalization cycle whereby a highly gram-
maticalized form, become eroded and/or bleached, is overtaken by a periphrasis,
which then in turn undergoes further grammaticalization. En effet evolves from
high-certainty epistemic sentence adverb to connective, a typical development
cross-linguistically. It also occurs, from an early stage, as a complete dialogic turn
of confirmation, again like indeed.
In the Speeches corpus, en effet occurs overwhelmingly in post-verbal (post-
auxiliary where there is one) position (Table 6).
This post-verbal (or post-auxiliary) position is also the most frequent position
(over 70%) for two other high frequency discourse markers in the corpus: également
(87 per 100 kwords) and donc (122 per 100 kwords). By contrast, the less frequent
de même (15 per 100 kwords) in just over half its occurrences is in initial position,
just less than half being post-verb/auxiliary.
Using a corpus of literary texts, Schoonjans (2014) shows that, in declarative
sentences, this same post-verb/auxiliary position accounts for around 80% of high
frequency markers such as donc, seulement, quand même and tout de même.
Schoonjans likens this position in French to the well-known ‘middle field’ of
Table 6 The position in the Position in the sentence % of occurrences

sentence of occurrences of
Sentence-initial 21
‘en effet’ in the French
speeches (n = 210) Pre-subject, after a sentence- 3
initial adverbial
Post-subject, before the finite 2
verb
Post-verb/auxiliary 74
German modal particles (see also Schoonjans 2012 for similarities between French
and German particles). Given this kind of data, it looks as though these high-
frequency discourse markers may be at a relatively advanced stage of grammatical-
ization (in the broad sense), and may be part of an emergent discourse-level
schematic construction in which there is a post-verb/auxiliary ‘slot’ for the ana-
phoric marker.
In the speeches corpus, the en effet host is a full clause in every case, unlike
indeed. En effet occurs mainly in declaratives, but also in interrogatives. The
speeches being monologues, the interrogatives are, of course, rhetorical questions.
When the en effet host (the segment to which it attaches) contains a speaker-attitude
predicate, there can be some ambiguity as to whether the marker has (pragmati-
cally) scope over the speaker attitude, over the following proposition, or both. The
position of the discourse marker, along with the context, suggest that in most cases
it is at least the speaker attitude and often both (12).
(12) L’action du gouvernement repose sur l’ouverture d’un débat public. J’ai
en effet la conviction que les solutions ne peuvent être imposées d’en haut
à la société. [Jospin, 25/08/1997]
‘The government’s actions depend on setting up a public debate. I am
en effet convinced that solutions cannot be imposed on society from above.’
Previous work on en effet has identified a range of related functions, suggesting

that it is polysemous. Charolles and Fagard (2012), for instance, argue that uses of
en effet can be attributed to one of three functions: (i) confirmation of an idea
expressed in the preceding cotext, most often in dialogue; (ii) confirmation of an
expected event; (iii) justification or explanation of the previous idea. Rossari (2016)
argues that the Justificative use of en effet emerges from its dialogic use: “L’adverbe
signale l’approbation de ce qui a été énoncé précédemment et le segment p qui le
suit donne une raison de cette approbation” (2016) (‘The adverb signals approval of
what has been said previously and the segment p that follows gives a reason for this
approval’). Rossari (2016) further suggests that the dialogic origin of the Justificative
usage may be in a truncated concessive: “La valeur justificative propre à l’emploi de
en effet et effectivement dans certaines configurations monologiques coïncide avec
un schéma concessif tronqué” (‘The justificative sense of en effet and effectivement
in certain monologic contexts matches a truncated concessive schema’). A dialogic
concessive involves a three-element construction, , where an idea (p)
(attributed to the hearer or a third party) is acknowledged and confirmed (p1), but
dispreferred or considered not relevant by the speaker compared with some follow-
ing idea (q) that she wishes to promote (cf. Couper-Kuhlen and Thompson 2000).
En effet in the political speeches does seem to share with Concession the notions of
given information and of dialogic confirmation, as discussed below. But no evi-
dence of a dispreferred idea that might suggest truncated concession was found.
In the majority of cases of en effet in the political speeches corpus, the host picks
up on and elaborates in some way on the previous idea(s), to reformulate it (13),
justify having expressed it (14), or provide evidence that it is true (15). But en effet
154 D. Lewis
often occurs with a less specific elaborative relation, especially a move from the
general (in the previous idea) to the particular (in the host idea). This typically
involves reiterating the thematic element of the idea and providing greater detail
(16) and (17).
In (13) the same idea is expressed in both clauses. The effect of the discourse
marker is to emphasize their equivalence; without it, ‘a lot being at stake’ might
come across as stronger than ‘particularly important’.
(13) … ce texte dont nous débattons aujourd’hui revêt une importance

particulière. L’enjeu est. en effet de taille. [Guigou, 29/02/2000]
‘The text we are discussing today is particularly important. What is at stake
is en effet considerable’.
Example (14) illustrates the typical justification use, the en effet host being the jus-
tification for the speaker not going into detail.
(14) … le collectif prévoit une diminution voisine de 3,3 milliards d’euros par
rapport à la LFI, sur laquelle je ne m’étends pas: votre rapporteur général
a en effet décrit l’ensemble des évolutions prévues par ce collectif de
manière exhaustive dans son rapport écrit . [Mer, 29/07/2002]
‘… the revised budget involves a reduction of around 3.3bn euros from the
initial budget; I will not go into that in detail: your Rapporteur-general has
en effet described all the changes involved in the revision thoroughly in his
written report.’
In (15), evidence for the first assertion is presented in the second. At the same time,
the evidence provides a justification for making the first statement, so that evidence
and justification are closely linked.
(15) je sais qu’il n’est point nécessaire de vous convaincre que la recherche
universitaire doit aujourd’hui s’inscrire résolument dans un espace
européen. Votre colloque annuel qui s’est tenu voici 2 mois à Bordeaux
était en effet consacré pour une large part à la discussion de la
comunication de la Commission intitulée “ Vers un espace européen de la
recherche”. [Schwartzenberg, 18/05/2000]
‘I know you do not need to be convinced that university research today
must be firmly anchored in a European context. Your annual conference
held two months ago in Bordeaux was en effet largely devoted to
discussion of the Commission paper entitled “Towards a European research
area”’.
The en effet host in example (16) can be interpreted as elaborating in more detail on
what women point out; but also as explaining why new legislation is not the obvious
answer or justifying the speaker’s statement that it is not the obvious answer.
(16) Du point de vue même de ce que demandent les femmes, c’est-à-dire la

justice et l’égalité, la création de dispositifs légaux ne va pas de soi, elles
sont d’ailleurs nombreuses à le dire. Elles ne veulent pas en effet être des
“femmes alibis” qui seraient choisies sur d’autres critères que la
compétence. [Juppé, 11/03/1997] [116]
‘From the point of view of what women are demanding, that is to say justice
and equality, creating new legislation is not the obvious answer, as many of
them point out. They do not want en effet to be ‘token women’ selected on
criteria other than their competence’.
In all, around 5% of occurrences clearly involved Reformulation, 37% justifica-

tion and 3% evidence. Overall, 40% involved a move from a more general idea to a
more particular idea in the en effet host (17). Where a point the speaker wishes to
make is split in two, so to speak, into a topic-introducing segment and an explanatory
or enhancing segment, it is easy to see a dialogic echo, with a tacit response between
the two conjuncts, followed by en effet acting as affirmation (‘yes’) and an elabora-
tive, justificative, or explanatory sequel.
(17) a. Ce régime est plus sévère que celui de la loi de 1995. En effet , le seuil au
dessus duquel les condamnations à une peine d’emprisonnement avec
sursis simple ne sont pas amnistiées a été abaissé par rapport à la loi de
1995: il passe en effet de neuf mois à six mois. [Perben, 23/07/2002]
‘This regime is more severe than that of the 1995 act. En effet, the
threshold
beyond which ‘simple’ suspended sentences cannot be amnestied has been
lowered from that of the 1995 act: it has gone en effet from nine months to
six months’
b. … vous vous inscrivez dans une de nos plus anciennes traditions. C’est
en effet au milieu du XVIIIe siècle … que les premiers prix du concours
furent discernés. [Darcos, 02/07/2002]
‘… you are joining of one of our most ancient traditions. It was
en effet in the middle of the 18th century that the first competition prizes
were awarded.’
In one example in the corpus en effet might perhaps be interpreted as concessive

(18), but it is not clear. One interpretation of (18) is that the use of en effet conforms
to the elaborative pattern: the notion of paradox is introduced, then the paradox is
specified; the two elements comprising the paradox are marked by ‘d’un côté’ and
‘mais de l’autre’ which together frame the contrast. Another is that the two contrast-
ing elements are marked by ‘d’un côté’ and ‘de l’autre’, while ‘en effet … mais’
frames a concession.
156 D. Lewis
(18) J’ai évoqué tout à l’heure le paradoxe agricole de notre pays. Mais celui-ci
se double d’un paradoxe rural. D’un côté en effet , nous assistons à un
certain renouveau démographique de nos campagnes. Mais de l’autre, nos
compatriotes ruraux s’interrogent devant la méconnaissance par la France
urbaine de certaines spécificités de leur modes de vie …
[Gaymar,04/07/2002]
‘I spoke just now about the agricultural paradox in our country. But there is
also a rural paradox. On one hand, en effet, we are witnessing a certain
demographic renewal in the countryside. But on the other hand, our rural
compatriots are concerned that urban France is ignorant of the
particularities of their way of life …’
In several cases what is striking is the way en effet occurs as part of a series of
discourse markers that together create a rhetorical frame for a chain of interlinked
ideas, as seen in Sect. 4, each with its anaphoric marker. In (19) the en effet host is
a simple repetition, after a parenthesis, of a previous proposition (‘This law will be
exemplary’ – ‘Our future law will be exemplary’). To maintain coherence, it needs
to be marked as old information, the function of en effet here.
(19) Je ne souhaite pas que cette disposition …puisse masquer le fait que la
France, par l’adoption de ce projet de loi, sera l’un des pays les mieux
armés pour lutter contre la corruption internationale.
Je me prononcerai donc en faveur de l’amendement [1] …
Enfin , j’approuve également l’amendement [2] …
Notre future loi sera ainsi exemplaire, et je tiens une fois encore à remercier
votre Commission et Monsieur Jacky DARNE, votre rapporteur, pour son
utile contribution à l’élaboration de ce dispositif législatif.
Cette loi sera en effet exemplaire, d’abord par son effet dissuasif …
Elle traduira ainsi le souci de la France de combattre sans relâche ce fléau
économique et social que constitue la corruption nationale et
internationale.[Guigou, 29/02/2000]
‘I do not want this provision to be able to conceal the fact that France, in
passing this bill, will be one of the countries best equipped to fight
international corruption.
I will donc vote in favour of amendment [1] …
Enfin, I approve également of amendment [2] …
Our future law will ainsi be exemplary, and I would like once again to
thank your Commission and Mr Jacky Darne, your rapporteur, for their
useful contribution to the drafting of this legislative package.
This law will en effet be exemplary, first of all due to its disuasive effect …
It will ainsi answer France’s concern to fight relentlessly the economic and
social scourge of national and international corruption.’
For many occurrences, then, more than one relation plausibly holds between the
conjuncts; for others, there seems to be no relation other than continuity. We suggest
that the range of contexts in which en effet occurs in the political speeches genre
reflects its vagueness rather than polysemy. Across different context types, it implies
consonance and helps validate or in some way reaffirms the previous idea.
To summarize, en effet links its host segment to the previous segment, thereby
creating a two-segment discourse pattern. The en effet host expresses an idea that is
entirely consonant with the previous idea, which it reformulates or expands on with
a more particular, or, more rarely, a broader idea. There is a range of similar rela-
tions with which use of en effet is compatible, and its removal does not result in
incoherence. The frequency and contexts of en effet point to its being highly
bleached, and rather than consider that en effet is polysemous, it better fits these data
to characterize it as vague: we can hypothesize that these relations are contiguous in
conceptual space.
As mentioned above, en effet occurred in full clauses. In this genre, a theme is
typically introduced in general terms in one clause and then fleshed out or expanded
on in the next. Insofar as the en effet host provides the additional detail, it is infor-
mationally subordinate to the previous segment (a ‘nucleus-satellite’ relation typi-
cal of elaboration, in RST terms, or a ‘core-contributor’ relation in RDA terms).
Oberlander and Moore (2001) cite corpus studies showing that, in English at least,
a discourse marker is much less likely to be used when there is nucleus-satellite
(core-contributor) order, since this order is easy to process, and marking is superflu-
ous. All this suggests that there may be reasons other than coherence marking and/
or information structure marking for such frequent occurrence of en effet. And when
seen in wider rhetorical context, it appears that en effet forms part of a network of
markers providing thematic continuity and lending a particular rhetorical rhythm to
the discourse through parallelism.
Two discourse constructions for en effet can be identified in this genre: (i) and (ii) the more frequent where q is <Subj – V/aux – en effet –
Compl>. While the relation is the same for both (p is any proposition and q is pre-
sented as confirming or expanding on p), the information structure differs, reflecting
that of the higher-level constructions (i) and (ii) where q is <Subj –
V/aux – DM – Compl>. The regularity of the post-verb/auxiliary position, shared
with other very high-frequency connectives, suggests the second is the more
grammaticalized.
5.3 Comparison
Both en effet and indeed are modal adverbs that retain some epistemic sense but
have now taken on discourse structuring functions too. Both are found overwhelm-
ingly in contexts of elaboration in this genre.
Halliday describes ‘elaboration’ as where “one clause elaborates on the meaning
of another by further specifying or describing it” (1994: 225). In paratactic elabora-
158 D. Lewis
tion, the secondary (elaborating) clause may have one of three functions: (i) “to
restate the thesis of the primary clause in different words, to present it from another
point of view, or perhaps just to reinforce the message”, (ii) to develop the thesis of
the primary clause “by becoming more specific about it, often citing an actual exam-
ple” and (iii) to clarify the thesis of the primary clause, “backing it up with some
form of explanation or explanatory comment” (1994: 226). This sense of elabora-
tion comes close to matching the predominant political speech use of en effet, which
is found in all three contexts.
Connective indeed is used more narrowly, either to present a stronger version of
the same claim, or to make a further and stronger claim related to the first claim. Its
initial position and parenthetical syntax are typically presentative. There is thus a
significant difference in the information structuring functions of the two expres-
sions, en effet marking its host as old or given information (from a new aspect or in
more detail), while indeed introduces a new and more surprising claim
(counterexpectation).
A second difference, as we have seen, is that en effet appears to be more gram-
maticalized than indeed, which ties in with its much greater frequency and its
bleached semantics that allows it to occur in a wider range of contexts.
Finally, the markers should be seen in the context of the wider rhetorical patterns
of the genre. En effet contributes, along with other markers, to a pattern of parallel
ideas each explicitly linked to the previous discourse. The English speeches make
more use of juxtaposition, so that indeed does not function as part of a network of
markers.
6 Conclusion
It has been seen that the overall effect of use of additive markers in French political
speeches is to create even-paced stretches of discourse where each segment forms a
link in a well-constructed chain of arguments and where the hierarchical structures
(the rhetorical dependencies) are transparent and conventional. One of these con-
ventions is the regular, almost rhythmic use of additive discourse markers such as
également, de même, en effet, ainsi, all occurring overwhelmingly in the same post-
verb/auxiliary position in the host, acting as the ‘hooks’ attaching each segment to
the previous discourse in a series of parallelisms. Metaphorically-speaking, these
markers can be seen as pinning the content of the discourse to its rhetorical
backcloth.
The use of dedicated connectives – for coordination, subordination and discourse
connectivity – has been linked to literacy. Speakers conjoin fewer consituents than
writers. Non-literate languages rely heavily on juxtaposition and often lack gram-
maticalized coordination or acquire it through language contact (Mithun 1988). The
density of additive marking in the French speeches does convey a literary impres-
sion as well as a degree of formality that is less striking in the more conversational-
sounding English ones. This is no doubt a reflection of the greater distance between
literary and conversational French than is the case for English. Additive markers
combine with other coherence markers to form a network that knits the discourse
together into a tightly-structured whole. In the English speeches, by contrast, addi-
tive discourse relations are more often left implicit, and the resulting discourse is
more loosely woven.
As seen in Sect. 2, markers of consonant discourse relations are expected to be
relatively infrequent because discourse coherence can be established by juxtaposi-
tion within a logical ordering. These most frequent French markers, however, seem
to function in this genre as text-structuring devices marking information flow more
than as relational propositions. In political discourse, a regular filling of this French
discourse-marker ‘slot’ seems almost obligatory. The French markers are more fre-
quent, more bleached and arguably more grammaticalized than their English
counterparts.
Further research will need to situate discourse marking in this genre with respect
to other genres and discover to what extent the discourse constructions frequent in
political speeches are used across other genres, and how these constructions may be
evolving.
Note
1. Translation agencies regularly advise their clients that the ‘expansion rate’ in
translation from English to French is between 15 and 20% by word count. See,
for example, <http://www.kwintessential.co.uk/translation/articles/expansion-
retraction.html>,<http://translation-blog.trustedtranslations.com/prices-according-
to-source-word-count-2010-02-25.html>, <https://e2f.com/203/> and <http://
www.andiamo.co.uk/resources/expansion-and-contraction-factors>. Conversely,
translations from French to English are shorter by word count. Armstrong (2015)
discusses “the high expansion rate usually seen in translation from English to
French” (Armstrong 2015: 193).
References
Aijmer, K. (2007). Modal adverbs as discourse markers. A bilingual approach to the study of
indeed. In J. Rehbein, C. Hohenstein, & L. Pietsch (Eds.), Connectivity in grammar and dis-
course (pp. 329–344). Amsterdam: John Benjamins.
Aijmer, K. (2008). The actuality adverbs ‘in fact’, ‘actually’, ‘really’ and ‘indeed’ – Establishing
similarities and differences. In M. Edwardes (Ed.), Proceedings of the BAAL conference 2007
(pp. 111–120). London: Scitsiugnil Press.
Armstrong, N. (2015). Culture and translation. In F. Sharifan (Ed.), The Routledge handbook of
language and culture (pp. 181–195). London: Routledge.
Asr, F. T., & Demberg, V. (2012). On the Information conveyed by discourse markers. Proceedings
of the Workshop on Cognitive Modeling and Computational Linguistics, Sofia, 08 August 2013,
pp. 84–93. Association for Computational Linguistics.
160 D. Lewis
Bertin, A. (2002). L’émergence du connecteur en effet en moyen français. Linx, 46, 37–50.
Charolles, M., & Fagard, B. (2012). ‘En effet’ en français contemporain: de la confirmation à la
justification/explication. Le Francais Moderne, 80(2), 171–197.
Chuquet, H., & Paillard, M. ([1987] 1989). Approche linguistique des problèmes de traduction
anglais<>français, revised edition. Paris: Ophrys.
Couper-Kuhlen, E., & Thompson, S. A. (2000). Concessive patterns in conversation. In E. Couper-
Kuhlen & B. Kortmann (Eds.), Cause, condition, concession, contrast: Cognitive and dis-
course perspectives (pp. 381–410). Berlin: Mouton de Gruyter.
Danlos, L., Colinet, M., & Steinlin, J. (2015). FDTB1, première étape du projet « French Discourse
Treebank » : repérage des connecteurs de discours en corpus. Discours, 17 . doi:10.4000/
discours.9065.http://discours.revues.org/9065
Dictionnaire des synonymes français. Digital resource developed by the CNRS, University of
Lyon 1 and University of Caen. http://dico.isc.cnrs.fr/dico/fr
Dictionnaire Le Robert & Collins français-anglais et anglais-français. 2013. Editions Le Robert.
Fetzer, A., & Johansson, M. (2010). Cognitive verbs in context. In S. Marzo, K. Heylen, & G. de
Sutter (Eds.), Corpus studies in contrastive linguistics, special issue of the International
Journal of Corpus Linguistics, 15(2), 240–266.
Gallagher, J. D. (1995). L’effacement des connecteurs adversatifs et concessifs en français mod-
erne. In M. Ballard (Ed.), Relations discursives et traduction (pp. 201–220). Lille: Presses
Universitaires de Lille.
Guillermin-Flescher, J. (1981). Syntaxe comparée du français et de l’anglais. Problèmes de tra-
duction. Paris: Ophrys.
Halliday, M. A. K. (1994). An introduction to functional grammar. London: Edward Arnold.
Hasselgård, H. (2014). Additive conjunction across languages: ‘Dessuten’ and its correspondences
in English and French. In S. O. Ebeling, A. Grønn, K. Rå Hauge, & D. Santos (Eds.), Corpus-
based studies in contrastive linguistics. Oslo Studies in Language, 6(1), 69–89.
Hobbs, J. R. (1985). On the coherence and structure of discourse (Report No. CSLI-85-37, Center
for the Study of Language and Information). Stanford: Stanford University.
Holmes, J. (1994). Inferring language change from computer corpora. ICAME Journal, 18, 27–40.
Levy, R., & Florian Jaeger, T. (2007). Speakers optimize information density through syntactic
reduction. In B. Scholkopf, J. Platt, & T. Hoffman (Eds.), Proceedings of the twentieth annual
conference on neural information processing systems (pp. 849–856). Cambridge, MA: MIT
Press.
Lewis, D. M. (2005). Mapping adversative coherence relations in English and French. In K. Aijmer,
H. Hasselgård, & S. Johansson (Eds.), Contrast in context. Special issue of Languages in
Contrast, 5(1), 33–48.
Mann, W. C., & Thompson, S. A. (1986). Relational propositions in discourse. Discourse
Processes, 9(1), 57–90.
Mason, I. (2001). Translator behaviour and language usage: Some constraints on contrastive stud-
ies. Hermes, 26, 65–80.
Mithun, M. (1988). The grammaticalization of coordination. In J. Haiman & S. A. Thompson
(Eds.), Clause combining in grammar and discourse (pp. 275–329). Amsterdam: John
Benjamins.
Murray, J. D. (1997). Connectives and narrative text: The role of continuity. Memory and Cognition,
25(2), 227–236.
Nùñez Pertejo, P. (2008). The multifunctionality of ‘indeed’ in contemporary spoken and written
English. English Studies, 89, 716–736.
Oberlander, J., & Moore, J. D. (2001). Discourse cues: Further evidence for the core contributor
distinction. Cognitive Linguistics, 12(3), 317–332.
Patterson, G., & Kehler, A. (2013). Predicting the presence of discourse connectives. In Proceedings
of the conference on Empirical Methods in Natural Language Processing EMNLP, pp. 914–
923. Association for Computational Linguistics.
Roget’s Thesaurus. online at http://www.roget.org/
Rossari, C. (2016). L’approbation dans un dialogue devient-elle une concession dans un mono-
logue ? Etude de ‘certes’, ‘en effet’, ‘effectivement’, ‘d’accord’, ‘OK’. In L. Sarda, D. Vigier,
& B. Combettes (Eds.), Connexion et indexation. Ces liens qui tissent le texte. Mélanges pour
Michel Charolles. Lyon: ENS Éditions. http://books.openedition.org/ enseditions/6847.
Roze, C. (2009). Base lexicale des connecteurs discursifs du français. Master’s dissertation,
University of Paris Diderot.
Schoonjans, S. (2012). Topologie contrastive des particules de démodulation. Comparaison de
l’allemand et du français. Leuven Working Papers in Linguistics, 1, 62–76.
Schoonjans, S. (2014). Oui, il y a des particules de démodulation en français. CogniTextes, 11.
Simon-Vandenbergen, A.-M., & Aijmer, K. (2007). The semantic field of modal certainty. A
corpus-based study of English adverbs. Berlin: Mouton de Gruyter.
Taboada, M. (2009). Implicit and explicit coherence relations. In J. Renkema (Ed.), Discourse, of
course (pp. 125–138). Amsterdam: John Benjamins.
Traugott, E. C., & Dasher, R. (2002). Regularity in semantic change. Cambridge: Cambridge
University Press.
Part III
Contrastive Analysis Across Genres of
English
Callbacks in Stand-Up Comedy:
Constructing Cohesion at the Macro Level
Within a Specific Genre
Catherine Chauvin
Abstract The paper is a discussion of the type of cohesive devices that can be
found in stand-up comedy, focusing more specifically on callbacks. Other cohesive
devices are also mentioned so as to provide some background on how stand-up
comedy shows are structured. Stand-up comedy shows are indeed quite generally
ignored in the discussion of genre-related cohesion-building mechanisms, and this
paper aims at filling this gap. The paper uses as theoretical backdrop the functional
linguistics analyses of cohesion, as well as some of the discussions of topic continu-
ity and sequencing done in Conversation and Discourse Analysis. A short compari-
son with some of the devices used in literary narratives is also proposed, using the
tools of French structuralist narratology (Genette’s analepses, in particular), which
allows us to delve further into the specificities of the genre. It is shown that the call-
back technique used in stand-up comedy offers very interesting data on how a dis-
course can be made coherent at a macro level, vs. the inter-sentential one; such
techniques should therefore be included in the repertoire of cohesion-building
tokens when these are discussed across genres.
Keywords Stand-up comedy • Cohesion • Callbacks • Narratology • Narrative •

Genre
1 Introduction
The topic we will be dealing with in this paper is that of callbacks in stand-up com-
edy. We will be discussing what they are, the extent to which they may be said to be
constitutive of a specific genre, and, more particularly, what they can bring to the
analysis of the inventory of cohesion-building devices. Cohesion-building devices
have been vastly studied in linguistics, with, of course, Halliday and Hasan (1976)
constituting a seminal background study. Moreover, Conversation Analysis and
C. Chauvin (*)
Department of English, University of Lorraine, Nancy, France
e-mail: catherine.chauvin@univ-lorraine.fr

DOI 10.1007/978-3-319-54556-1_8
166 C. Chauvin
analyses of discourse and/or pragmatic markers have provided interesting hypoth-

eses on such questions. Cohesion-building devices that work at the macro level have
nonetheless tended to receive less attention than those that work at a micro, or
linear-sequential, level, even though classical rhetoric as well as contemporary nar-
ratology have dealt with devices that actually do work at the level of a whole “story”
rather than at the local level (cf. the forthcoming discussion of prolepsis and ana-
lepsis as found in Genette 1972). Yet, this was applied to literary or “traditional”
forms of narratives, not to comedy. Stand-up comedy is in fact largely missing from
theoretical discussion of coherence or of narrativity,1 and this constitutes a gap in
the literature, since it offers very interesting data that ought to be included in the
analyses. There is, therefore, a double aim to this study: (1) to show how cohesion
is built in stand-up comedy, and (2) to see what the inclusion of stand-up comedy
can bring to the wider debate, as it can be assumed that the analysis of practices that
are entrenched in a given genre may also illustrate general questions in interesting
ways, if only by enriching the typologies. We will therefore, after the introduction
in Sect. 1, first introduce the corpus in Sect. 2 and give a very brief overview of
stand-up comedy in Sect. 3. In Sect. 4, we will see that although stand-up comedy
shows tend to be intrinsically heterogeneous, there are devices and techniques that
are used to make them more cohesive. In Sect. 5, we will focus more precisely on
describing callbacks. Then in Sect. 6, we will discuss the types of conclusions that
can be drawn from callbacks as a practice: in Sect. 6.1, we try and see what call-
backs can bring to the study of cohesion marking, and in Sect. 6.2, deal with their
relation to narrativity (always keeping in mind, in this paper, its link to cohesion and
coherence issues); finally, in Sect. 6.3, we briefly examine the relations between
cohesion, narrativity and humour. We will finally propose a general discussion of
our results in Sect. 7.
2 The Corpus
The corpus is a series of full-length stand-up comedy shows, so that coherence can
be examined at the level of a whole show. We focus on the shows of British come-
dians performing in English, the earliest ones being performed in the 1990s and the
most recent one in 2016. A previous study on cohesion-marking in stand-up comedy
was published in 2015 (in French); references to it are included in this chapter.
1
There are a few studies that have been carried out; there are several student papers some of which
can be found online, sometimes without a clear author’s name (http://rudar.ruc.dk/, http://library.
binus.ac.id/; the topic of comedy seems to have become popular for both Master’s Theses and
seminars). Schwarz’s study (Schwartz 2010) does not specifically focus on cohesion; other studies
may not be strictly speaking linguistic, or not linguistic, but they may shed interesting light on the
genre and the problems we are discussing here (e.g., Glick 2007; Bolens 2015). The question of
narrativity will be dealt with here in relation to cohesion; other dimensions of narrativity may not
be included in the paper.
Callbacks in Stand-Up Comedy: Constructing Cohesion at the Macro Level Within… 167
The shows that are mentioned in the references (27 of them; see references) were
all watched with this question of cohesion-marking in mind, the relevant passages
being noted down, with a few notes on what seemed to be the technique involved.
The average length of a show is about one hour and a half to two hours. The shows
of Bill Bailey (four; see references) and Eddie Izzard (seven; see references, as well
as performed versions of Force Majeure (Reloaded) and Stripped) were more spe-
cifically focused upon, but attention was paid to varying the sources, and references
are made to the other comedians and shows, too (see the next sections). Another
source that was used as a complement to the shows themselves was texts about the
shows: certain interviews, in particular the DVD commentary to Force Majeure
which contains a few remarks on writing, and self-help books or documents written
for young comedians (again, see references), giving advice on how to write and
perform shows. Excerpts are quoted in Sect. 5.1.
Some shows were dealt with in greater depth because there were a number of
interesting examples, Bill Bailey and, in particular, Eddie Izzard, being astute users
of cohesion devices, and, maybe even more, of callbacks.2 Other shows, being more
or less entirely organized around one-liner jokes –Tim Vine’s shows are a case in
point–, are arguably less directly relevant to the study of cohesion-building mecha-
nisms. But, in fact, certain devices can still be present even in such apparently non-,
or a-, coherent routines,3 and a number of shows make use of certain or all of these
devices, as is illustrated by some of the examples used in this paper. Izzard’s shows
are long monologues in which the comedian goes from one topic to another in a
seemingly random way, and they are known for their surrealist streak (Izzard has
famously been called “the lost Python” by John Cleese). Bailey is also a musician,
so the shows alternate between monologues, video-based passages and musical
numbers. Eddie Izzard’s and Bill Bailey’s shows are, of course, therefore partly
idiosyncratic, because they are shows by specific comedians in specific contexts;
but even if they may use the mechanisms in specific ways, these mechanisms can be
found in other shows as well (see, again, the examples discussed in this paper), and
callbacks in particular can be deemed to be characteristic of the genre because: (1)
if some comics use them less often or less strikingly than others, a number of them
do make use of the device in some way or another, and (2) in the documents written
for would-be comedians, they are often mentioned as a good technique to use, as
will be seen in Sect. 5.1. It can be noted, to conclude this section, that a certain level
of idiosyncrasy probably has to be present in some way or another in comedy, as
probably is the case in all creative genres: creative genres do not, or even cannot,
aim at complete reproduction, because of their very creative nature. But a number
of codes also tend to be followed, and certain practices tend to be shared. We intro-
duce some of these in the next section.
2
Harry Hill is also reputed for his callbacks, but we have not got yet to analyzing his shows.
3
Certain one-liners follow each other thematically, for instance, which creates (usually very) short
sub-sections within the shows. Shows made entirely of a succession of one-liners, vs monologues,
are nonetheless the exception rather than the rule in our corpus (also see next section).
168 C. Chauvin
3 Some Introductory Notes on Stand-Up Comedy
We will only focus here on what is relevant to the analysis; Double (2014) can be
consulted for a more detailed presentation.
Stand-up comedy is a type of comedy that was for a long time particularly asso-
ciated with the English-speaking world, although it is now much more international.
Its birth and development in the U.S. and the U.K. has sometimes been linked to
such local traditions as that of the music hall, or of the vaudeville (Double 2014). It
started around the 1950s–60s in the U.S., and gained prominence in the 1960s–70s.
In the U.K., a creative alternative4 scene also emerged in the 1980–90s and, nowa-
days, many comedians have become household names, some of them (Eddie Izzard,
in particular) filling stadiums and arenas as big as Wembley and the O2.5 Some of
them are television personalities as well as performers. The comedians that are
quoted by others as “models”, or the founding fathers, so to speak,6 tend to be
American: Richard Pryor, for instance, is often quoted as a model by fellow come-
dians. But they may also, albeit less commonly, be British: Billy Connelly is thus
often quoted by fellow comedians as an inspiration. Since we will be focusing on
U.K. shows in this paper, we will be dealing with a relatively long-living, and active
scene.
A stand-up comedy show is a form of live humorous performance during which
the comedian is typically alone on stage, with very few or no accessories except a
microphone, and speaks on a variety of topics for a given length of time. There are
exceptions to this, since Bill Bailey, as was mentioned in the preceding section, is
also a musician and uses instruments on stage, and some (Jimmy Carr, Bill Bailey,
Dylan Moran) also use screens and projections. The use of props nevertheless tends
to be kept to a minimum. A succession of one-liners can constitute the whole of the
show (as in Tim Vine’s case), but more often than not, one-liners, if they are present,
are included in the structure of the show, or can be present in certain dedicated
sequences (Jimmy Carr), with other sequences combining videos and one-liners, or
interaction with the audience. A performance (a show, or gig) can last from a few
minutes, for collective shows and/or for newcomers, to somewhere between 1 and 2
h for an individual performance. During this time span, the comedian speaks, and
entertains the audience. The formats and types of humour are truly varied. Some of
the shows can be rude or crude, but clearly not all shows are: Izzard’s and Bailey’s
are not. There also is a tendency for observational humour to constitute the default
4
Although it is difficult to explain it in just a few lines, “alternative” – the word is used by the
comedians themselves – refers to a form of comedy that is (meant to be) different from what
existed before, in its contents (for instance, maybe, self-reference, i.e., using one’s own life as
comedy material, for U.S. performers in particular, but not only them) or form (improvisation;
performances in small venues such as pubs and clubs; humour is no longer based on jokes…).
5
Wembley Stadium in London sits 90,000 people and the O2, 20,000.
6
Or mothers, although a number of them are men. Examples of women performers are, for instance,
Ellen DeGeneres and Elaine May in the U.S.; in the U.K., Jo Brand and Sarah Millican, who are
mentioned in this paper, are also women.
basis of a show. The comedian either shares personal anecdotes with the audience
or points out “observed” facts (have you noticed that this always happens when..?),
focusing for instance on weird behaviour or surprising connections. As mentioned
in the preceding section, it is also important for each comedian to create their own
identifiable persona. This comedy form is therefore both varied and unified, but
there are strong marks that “delineate” it and signify that the show is a stand-up
show, like the microphone, the quasi-absence of props, the energetic monologue.
What can be said to be partly constitutive of the genre is also the relationship that
is built between the performer and his/ her audience. Unlike traditional theatre,
stand-up comedy generally does not recognize the existence of a “fourth wall”, the
invisible frontier that separates the stage from the rest of the room and assumes that
the space of the stage is also the space of fiction, a different world that has no direct
interaction with the floor. There is no symbolical opposition between the stage and
the floor in stand-up comedy, which means that interaction with the public is pos-
sible, although, again, the amount of interaction may depend on the personality and
style of the comedian. Some interaction can be forced upon the comedian and not
be elicited when a member of the audience directly hails the comedian (that is called
heckling). The more famous and renowned the comedian and the larger the room,
the less interaction there may also be and the more the space of the stage becomes a
symbolically separate space again. There is a connection between this aspect of
stand-up comedy and its content: what is often discussed on stage is supposed to be
“real life”, and normally not fiction, which is another interesting dimension of the
genre.
The subject matter of shows also clearly varies, as yet another specificity of the
genre that is directly relevant to this study is the fact that comedians move from one
topic to another with constant shifts. Ross Noble, for instance, is known for his
“tangents”. This means that the shows might not always be, or at least appear to be,
structured, which also makes the analysis of cohesion markers and coherence-
building central and interesting. Callbacks, which we will define further down, are
one of the ways in which a continued flow of heterogeneous topics can be made to
form a whole. But it is also clear that no stand-up comedy show is truly or evidently
a “whole”: it is often made up of successive parts that are clearly disconnected.
Performances are both collections of unrelated items and a specific kind of unit,
which we will now explore.
4 Cohesion-Building in Stand-Up Comedy: An Overview
This section will be an overview of some devices that are used in stand-up comedy
shows and can be considered to be cohesive devices. We will just present them here
so that they can provide the background against which callbacks have to be assessed.
The connection between callbacks and these cohesive devices will also be briefly
explored.
170 C. Chauvin
4.1 Building a Whole Linguistically: A Few Definitions
As is well known, cohesion has to do with the linguistic (or, sometimes, paralinguis-
tic; see below) features that make a text a text, and not just a collection of random
utterances, whether spoken or written. The term is therefore used in a similar way to
that found in Cohesion in English (Halliday and Hasan 1976), which famously
listed ways in which the very textuality of a text affects its structure: anaphora/ ref-
erence, ellipsis, substitution, conjunction, lexical cohesion. Cohesive devices are
linguistic devices, perhaps sometimes paralinguistic, too (again, see below), that are
there because a given utterance is part of a whole and so comes after something else
and before something else. Part of the difficulty may consist in drawing up a list of
such devices, although a number of studies have done that, in different ways (e.g.,
Halliday and Hasan 1976; Duchan et al. 1995). Another problem is to show why,
and how, a given device actually creates cohesion. Coherence is to be sought at a
semantic, or logical level; it has to do with how relevant and/or logically sound the
different connections that are made between the ideas, arguments or events are.
Cohesion-building devices may contribute to making a discourse more coherent,
but the connection between the two is notoriously complex: a cohesive text may be
entirely incoherent (?This house is blue, because it likes it), and it is possible to
build a very coherent reasoning with no known cohesive devices at all (He came. I
was happy. We all wept.) Links may be construed without them being necessarily
expressed, and be computed pragmatically, or discursively. The connection or
absence of connection between cohesion and coherence may come to play a role in
stand-up comedy, as it can be a source of humour (Chauvin 2015). This question
will be briefly taken up in Sect. 6.3. The ways in which a given series of utterances
is made to be a spoken text have also been studied in Conversation Analysis, and
will be used as background hypotheses in this study (cf. the presence and construc-
tion of topic continuity). The recognition and discussion of the existence of a level
of subordinate structures (cf. sequences) is also of relevance. A large number of
analyses of “pragmatic” or “discourse” markers have also resulted in the addition of
such elements to the list of cohesive devices; well, oh, so, etc. have all been studied
in relation to the building of cohesion, at the ideational but also the interpersonal
level (for instance, Schiffrin 1987). They will only be mentioned briefly here but a
few examples will be cited in relation to the main questions discussed in the paper.
Cohesion devices have been studied in a number of genres, and, partly, as was
said before, written texts, or conversation. Now, choosing the relevant framework
for stand-up comedy may be a question to be dealt with at the outset. Stand-up com-
edy is neither “text” nor “conversation” – it is prepared speech, but spoken, with
possible room for improvisation. The routines have a clear conversational dimen-
sion (see Sect. 3 and the absence of the fourth wall), too, but are also mostly mono-
logues. Narratives, which could perhaps be considered to be specific kinds of texts,
have been analysed separately by (literary) narralogists and linguists, but stand-up
comedy is not one organized narrative, and comedy shows do not belong to the type
of texts that have been described in such classical studies as that of fairy tales by
Propp 1970, or even the spoken narratives such as those analysed in Labov and
Waletzky 1967, Labov 1997, 2001, 2004, 2006, or such as the “frog narrative”
(Slobin 2005). The French structuralists, Barthes, Genette, Todorov in particular
(cf., for instance, Barthes 1966; Genette 1966, 1972; Todorov 1966, 1967, 1971),
have dealt with the organization of (again, literary) narrative in ways that turn out to
be partly relevant here, but probably not entirely. The humorous dimension of shows
may as well of course leave its mark on the type of devices that are used. These
links, and differences, have to be kept in mind.
4.2 Cohesion and Coherence in Stand-Up Comedy
Looking at stand-up comedy shows with the study of cohesion devices in mind
reveals a number of interesting facts. As was said before, one point of clear interest
is the fact that in stand-up comedy, there is very little –at least, obvious– topic con-
tinuity. A stand-up comedy show is not supposed to be “about” one given topic or
even one specific story; stand-up comedy routines are made up of a succession of
unrelated anecdotes with swift and regular shifts, so that they do not have an overt
coherent form. Some comedians may just keep it random, but a number of comedi-
ans do try and build something out of very unsystematic material.
A previous study of the question (Chauvin 2015) has allowed us to describe
some of the devices that are found in shows. Among the techniques that can be
found are:
–– Structural repetition;
–– Incremental topic shift;
–– The use of discourse markers;
–– NP-based topic introduction.
These devices may be used separately or conjointly; we will start by describing
them with the help of a few examples.
Structural repetition is very close to rhetorical (vs. linguistic) anaphora, where a
given word or structure is repeated at the beginning of each verse. In our case, each
sequence7 may be introduced with the same phrasing. Even though the sequences
have nothing or little to do with each other, repetition creates artificial cohesion and
makes the show “feel” like a whole. In Qualmpeddler (Chauvin 2015), Bill Bailey
introduces three very different sequences in his way (example 1):
(1) A. So I was in China…

B. So, anyway, I was in South Africa… (repeated after a parenthesis) So I was
in South Africa…
C. So I was in America, obviously…
We will not discuss how the notion of sequences can be applied to shows in detail here.
7
172 C. Chauvin
Incremental topic shift is something that according to Conversation Analysis can

be found in spontaneous conversation as well (see, for instance, the synthesis pro-
posed by Levinson in Levinson 1983). If the way in which shifts occur is examined,
the amount of topic discontinuity, which is present in stand-up comedy, can be seen
to be less than assumed. In stand-up comedy too, one topic can be –sometimes
loosely– connected to another, which in turn is connected to another, so that at the
macro level there is a great deal of discontinuity, but at the micro level there are
continuity links between topics. An example was discussed in (Chauvin 2015) from
Qualmpeddler (example 2). Bill Bailey introduces himself. This leads to the men-
tion of TV shows, which in turn brings about the mention of celebrities. This leads
to politicians as celebrities, and reality-TV celebrities. This is followed by an anec-
dote involving a reality-TV celebrity, which (because of its content) brings in a
discussion of cognitive dissonance and of the naming of body parts. If topics are
examined independently, they seem entirely random, but there are local connections
which account for the diversity of topics. Another example is the following excerpt
from Eddie Izzard’s Circle (example 3): a mention of the Pope is followed by a
discussion of the Beatles and a reflection on the British English phrase “It’s the
dog’s bollocks”, then the discussion goes back to popes. This does involve digres-
sions, but it is also the fact that the Pope at the time is called John-Paul that leads to
the Beatles (John Paul George Ringo), and as he discusses the fact that it sounds
“cool” he uses the phrase “the dog’s bollocks”, which he pauses to comment upon,
and finally he comes back to the names of popes. It may be slightly more random,
but it is still (often) locally connected. It can be noted that the presence of local links
may be an aesthetic device, as it may also be used to emulate “spontaneous” speech,
but, of course, it may simply also make the routine easier to memorize. Other tech-
niques may be involved. For instance, one topic may branch off into two. One thread
is followed, and then the one that was left aside is taken up again, so that the con-
nection is not local but has to be found at a preceding node in the show: Bill Bailey
does that several times in Dandelion Mind, where, for instance, he introduces the
topic of doubt, which then branches off into first a sequence on science, and then
another one on pictures and sculptures representing St Thomas, the doubter. When
a topic is introduced, a series of possible illustrations may also be proposed: Twitter
is mentioned, and then a series of anecdotes are mentioned that are related to his
own account, Obama’s account, and the Queen’s account (example 4). Connections
therefore can be strictly speaking local, but also start from hubs at certain moments
in the show.
Discourse markers. The use of discourse markers is not specific to stand-up com-
edy, but they are used in ways that also contribute to the structuring of the show,
whether their use is spontaneous on the comedian’s part –it just comes “naturally”
since it corresponds to a common context of use in English–, or whether it is inte-
grated into its structure –when they are gimmicks, or are parts of rehearsed intro-
ductory phrases or transitions. So seems8 to be frequently used to resume talk on a
These remarks are provisional.

8
previous topic after a digression. Right and anyway, or combinations, such as so

anyway, are also found. Jo Brand, for instance (example 5), seems to use right regu-
larly when an endpoint has been reached. Whether they are natural or used as a
device, their presence signals the sequencing to the audience, and they can therefore
be said to indicate topic continuity and function as sequence markers. An example
of a discourse marker used as a gimmick is so…yeah in early Eddie Izzard (exam-
ple 6). What was perhaps originally just an idiosyncrasy came to be a stylistic
marker used by imitators9 or even used self-referentially –i.e. it becomes typical of
a comedian’s style and can come to be used against the expectation that the audience
will recognize it. Like the “writing-on-hand” gesture,10 such markers can become
gimmicks an audience recognizes and therefore develop into a possible vector of
bonding (see Sect. 6.3), as well as “just” being discourse markers.
NP-based topic introduction. When a topic is abruptly introduced, it can be done
openly, typically by simply using a noun phrase, preceded or not by a discourse
marker: And – NP! So… NP! Eddie Izzard does this very often; in Circle, he does it
several times to introduce a new topic within the show (example 7):
(7) So yeah, now –The Pope. What’s going on there?
Force Majeure starts in a similar way: a first topic is openly introduced as a new
topic as he pretends to be thinking about a possible way to start (example 8):
(8) So where should we start this show? Ah, yes –human sacrifice! That’s a good
place to start.
These last two examples may seem to emphasize the fact that shows are basically
discontinuous, but they also mean that sequences may be organized around topics,
and routines organized thematically and formally. Announcing topics in this way
may be abrupt, but this procedure also draws the audience’s attention to topic intro-
duction or change, and functions as a warning. At the discourse level there may be
discontinuity, but this is partly amended at the interpersonal level, as the hearer is
warned that there is going to be an unexpected jump.
So all in all, in spite of its apparent randomness, stand-up comedy does make use
of cohesive devices. A number of these, if not all, may not be specific to the genre,
but the uniqueness of stand-up comedy may be found in the combination of such
features. Now callbacks may be more specific to the genre. Their nature and role
will now be examined.
9
Cf. Phill Jupitus in Quadrophobia (or other shows, cf. QI, Series 10, Episode 3) who uses yeah
–as well as Good thing, and True story– to imitate Eddie Izzard.
10
When the audience does not seem to respond to something, Izzard, speaking to himself, says
something along the lines of Do not ever mention that again, Never use these two together again,
and pretends to write it on his hand for future reference. This has become a well-known gesture and
is used across shows.
174 C. Chauvin
5 Callbacks in Stand-Up Comedy
Callbacks are a type of “device” that is, in fact, known to performers; the term is not
introduced theoretically from an outsider’s point of view. Now the term may also be
a cover term that refers to different types of techniques, and some of the implica-
tions of the use of such (a) technique(s) need to be delved into.
5.1 What Are “Callbacks”?
As was just said, callbacks may constitute one device or perhaps a family of devices.
A number of websites and books written for would-be or professional comedians
openly mention callbacks as being one of the ways in which a heterogeneous rou-
tine can be made to function as a whole:
A callback is a reference a comedian makes to an earlier joke in a set. Callbacks are usually
made in a different context and remind the audience of an earlier joke, creating multiple
layers and building more than one laugh from a single joke. When used at the end of a set,
callbacks can bring a comic’s routine full circle and give closure to the set. Also Known As:
call back Callbacks, Glossary of Comedy by Patrick Bromley, Comedians Expert, About :
Entertainment, http://comedians.about.com/od/glossary/g/callback.htm
Callback — A punchline that refers, or “calls back,” to a joke or premise from earlier in the
performance […]. One of the most reliable comedy tricks, a callback can elevate a marginal
joke to legendary. “And then he closed strong by tying it all together with a callback to his
opening joke about lupus.”http://www.creatingacomic.com/comedy-glossary/
Callbacks. A callback is when you call back, or mention again, something you brought up
earlier in the act. (Carter 2001)
Call Back. An invaluable trick of the comedian’s trade is the “callback.” Imagine a guest
coming out later in Conan O’Brien’s show wearing Google Glass; the host could get big
laughs by miming a punch. The writer’s version of a callback is a glancing reference to a
detail, metaphor or phrasing from earlier in the piece. The device flatters readers and adds
to the continuity of the work, so give it a try. Please. https://theamericanscholar.org/seven-
things-writers-can-learn-from-stand-up-comedians/#.VVjZerntnBo
The call back. A callback is a reference to something said earlier in a routine or sketch. The
reference is usually a previous joke, but stand-up comics often use callbacks after interact-
ing with the audience –an audience member’s name will be inserted into a later joke. For a
callback to work, the time between the original reference and the callback must be rela-
tively brief. Repeated callbacks can be used (but never more than three times, of course11).
(Helitzer and Shatz 2005: 247)
The author’s are probably thinking of the “rule of three” (three is linked to good rhythm) often
11
discussed by comedians. This will not be developed here.

Reincorporation. A routine – also known in the U.S. as callbacks12– in which a comedian

does a couple of gags about a certain subject, then moves on to another subject, and then
goes back to the initial subject intermittently. (Ritchie 2012: 12)
The authors of those practical guidelines say that callbacks create continuity. They
also often emphasize another function –they are supposed to produce laughter
because they build a connection with the audience (our bold characters):
Audiences like callbacks because repeated references cause them to feel as if they are
part of a shared experience. (Helitzer and Shatz 2005 : 247)
This gives the routine cohesion and involves the audience, who have to work out what the
comedian is on about. (Ritchie 2012: 12)
It is a clever and inclusive strategy which makes the audience feel more involved in the
show because they have to work it out. (Ritchie 2012: 121)
A slightly different explanation is mentioned in the following quotation, as it is

assumed that callbacks create a bond with the audience because they are thought to
be clever:
The laughter […] comes from […] the fact that the audience are kept on their toes, realis-
ing and appreciating the cleverness of the set structure and the fact that the comedian
has kept the joke going over a prolonged period of time. (Ritchie 2012: 121)
Despite the heterogeneous nature of shows, many practitioners do try to build in

some sort of structure. One of the ways they try to do that consciously is to use the
callback technique, which is supposed to be a good, or refined way of structuring a
show.
5.2 Callbacks in Stand-Up Comedy Shows
Callbacks therefore consist in re-injecting material that was mentioned earlier in the
show at a later point in the same show. Let us now describe this using a series of
attested examples.
5.2.1 Content-Based Callbacks
A classification of diverse cases can be proposed. Some callbacks can be said

to be based on the content of the show, “content” being taken in a broad sense, i.e.
something that was referred to or mentioned earlier in the show is taken up again
later in the same show. Callbacks of this type can be: character-based, topic/
12
Although the last quotation says that callback is not used in the U.K., many examples were
found. So reincorporation is just a different name for it.
176 C. Chauvin
anecdote-based, “emblem”-based (we provisionally propose this term and will

describe what it refers to), word- or word-form based, i.e. (openly) based on linguis-
tic material.
Character-based callbacks are typical of Eddie Izzard’s shows. A character-
based callback is a device that was apparently consciously introduced in earlier
shows (Definite Article).13 Eddie Izzard’s shows have a surrealist streak, as charac-
ters of all types and all periods meet across species and across historical eras (see
Glick 2007 for an application of chronotopes to Izzard’s shows). One of the ways in
which cohesion was built in his early shows was to bring characters that were men-
tioned before (e.g., a cat, a dinosaur) and have them all come back at the end of the
show, in the last scene. In more recent shows, this is can also be done at random
points in the show. In Stripped (example 9), a raptor and a cow driving a car reap-
pear at different moments, as well as a squirrel, jazz chickens (chickens who play
jazz), a giant squid, and Spartan sheep; and they all reappear at the end. In Force
Majeure (example 10), a passage dealing with Lord of the Rings includes a first
mention in passing of crack-smoking, which was referred to in a preceding section
about pipe-smoking. The mechanical chicken that is supposed to be Caesar’s advi-
sor suddenly reappears among the Lords of the Rings characters, as a very brief
reference is made to a rocket, which again points back to the crack-smoking passage
in which a deluded crack-addict suggested among other odd –and drug-induced,
stupid– things, to “staple [his] foreskin to a rocket”. This creates unity, and the type
of strangeness it may generate is also a trademark dimension of the shows (see Sect.
6.3).
Another way of recalling something is to mention again a topic that was used
before, or refer to an anecdote. These callbacks are therefore topic- or anecdote-
based. In Alan Davies (2013), Life is Pain (example 11), an anecdote referring to
former girlfriends is alluded to again later in the show, as well as one about being
unable to change one’s underwear as a child. It is only done twice, but it does con-
stitute a callback. In Phil Jupitus (2011), Quadrophobia (example 12), the idea that
new technologies have led to people finally exchanging more pointless information
than before leads him to say that friends just jocularly send insults to each other.
When a message is mentioned again later in the show, the insult is used again.
Another way in which reference to an anecdote can be made is through a combi-
nation of gestures, or gesture/ tone of voice/ word(s) that together are supposed to
refer to some previous part of an anecdote. A complex sign is used that refers to a
whole contextualized situation. This is what, for lack of anything better, we propose
to call provisionally an “emblem”. In Izzard’s Definite Article (example 13), Italians
are first mentioned, and as they are, the “Italian” character is made to say “Ciao!”
with a specific intonation/ accent, and a hand wave is made that is to be interpreted
as the wave of the Italian character from a scooter (Glick 2007 also discusses the
presence of such impersonation in Izzard’s shows; Bolens 2015 also mentions the
“kinesic” dimension of the shows). Later on, when something Italian is mentioned
again, the same “Ciao!” gesture/word/intonation/accent combination is used. (In
13
He mentions it in the commentaries of the Force Majeure DVD.
example (10), the rocket reference is also accompanied by a rapid upward gesture,
but the gesture, this time, does not replicate what was done for the first mention, but
is used to refer to the preceding passage visually as well as verbally, which is partly
different.) The next examples in which simple words are replicated may also some-
times be considered to be emblems when they are in fact a combination of intona-
tion/accent/word rather than just the repetition of a word.
A repeated word, or word form, can, thus, also be used. Bill Bailey, who uses
rhetorical anaphora in Qualmpeddler (see example 1), often also uses word- or
word-form-based callbacks. In these cases the word itself is used as a formal refer-
ence to an earlier part of the show; it is not (just) the content of it but also the phras-
ing that is brought to the fore. An example is “turns out” (example 14). Turns out is
first used in the anecdote that is told about the reality-TV star (see example 2).
When she realizes that the sun and the moon were not one and the same thing, she
is quoted as saying, Turns out they’re not the same thing. Bill Bailey then uses the
phrase “turns out” to introduce another statement, isolating turns out from the rest
of the sentence and making it stand out. Another example is dorsum (example 15):
the same anecdote about the reality-TV star leads to a discussion of the names of
body parts (see, again, example 2), and the name of the top of the foot (the rare,
technical term dorsum) is revealed and discussed. The word is re-injected again at
the end of the show in a reference to “the blood of Christ, the dorsum of Christ”.
Another example, this time not just word- but word-form-based, is the use of
hashtags and acronyms (example 16). In Qualmpeddler, Bill Bailey uses acronyms
early in the show, and then more acronyms are mentioned and created. And at the
end of the show, as a video about the (non-)appeal of the Christian faith to young
people is shown, it is made to conclude on a still that says “The Church OMG! #just-
prayin’”, the use of the hashtag being itself another callback, since at the beginning
of the show, #justsaying is used several times. In Force Majeure (example 17),
Eddie Izzard uses the French phrase Et voilà, explains that it can be used in all sorts
of contexts –for instance, he says as he pretends to be wiping his mouth with the
back of his hand, when you have just “vomited on the head of a child”–, moves on
to other things, and then later in the show concludes one of his sequences by saying,
Et voilà.
5.2.2 Audience-Based Callbacks
Although they may not entirely qualify as a different type of callback, audience-
based callbacks can be used when back references are not woven into the fabric of
the show, but based on something that comes from the audience (this is mentioned
in one of the quotations used in Sect. 5.1, Helitzer and Shatz 2005 : 247). Jimmy
Carr does it often (example 18), as well as Sarah Millican (example 19). The tech-
nique consists in engaging in talk with a member of the audience, preferably from
one of the front rows, or a heckler,14 and get their names and/or some information
14
See Sect. 3 for an explanation.
178 C. Chauvin
on who they are. Then the comedian re-uses their names or part of the information
that was given to them later in the show. The content of the information cannot be
anticipated, but the fact that something is asked of one member of the audience, and
memorized to be used again, has to be.
5.2.3 Delayed Closure
Delayed closure can be found in Bill Bailey’s shows, in particular Qualmpeddler,

but also Dandelion Mind. This may still count as a “callback” but in fact, we would
argue that it may in fact also constitute an independent device, perhaps to be classi-
fied within the broad category of callbacks. In Qualmpeddler it is used in relation to
one sequence –that of the owl, which we will explain now (example 20). Bill Bailey
is a frequent traveller and a lover of plants and animals, and an anecdote is told
about an incident that happened in China, where Bailey was apparently offered a
live owl at a restaurant to eat. He decides to buy the owl, but to release it again in the
open instead of killing it, and an adventure ensues in which he rents a car and tries
to find a place to release the owl. The anecdote suddenly comes to an –apparent–
end as the difficulties of getting the owl to fly away are mentioned. The owl is no
longer talked about and it seems that the conclusion of this sequence has been
reached. But at the end of the show, in the encore, a video showing the owl in full
flight is shown, which, in fact, actually constitutes the true (happy) ending of the
anecdote that was told before. This clearly is a cohesive device in which a sub-
section of the show that constitutes a narrative finally turns out to have been split in
two across the show, with the first apparent “conclusion” finally revealed as the
last-but-one step in the narration. The same technique is used again in Dandelion
Mind (example 21). Several apparently unimportant details that were included in
earlier parts of the show are brought back to mind. At one point in the show, a shop
assistant is incidentally said to have told the comedian that a given plant was “good
to stand next to”, which, as he points out, is a strange remark to make. At another
point in the show, he evokes the idea of wearing a bucket on one’s head when eating
Revels (which are sweets), so as not to guess the taste of their fillings thanks to their
colours. At the very end of the show, a video is shown in which he is in a park, stand-
ing by a plant and then sitting by it on a bench with a bucket on his head. As he is
standing with his plant next to him, he is also carrying plastic bags and ranting,
which is a reference to the “strange men carrying plastic bags shouting at cars at
crossroads” who were also mentioned in the course of the show.
5.2.4 Reference to Previous Shows as a Specific form of Callback
References to previous shows may also be used in specific cases; the comedian has
to be at least minimally famous and have formed a strong fan base in order to be
able to do that. This could be very close to the use of (external) cultural references,
except that the common culture that is recalled is that of the shows. Eddie Izzard
does it in Force Majeure, in which a retake of one of his famous routines, the “Death
Star canteen”, is proposed (example 22),15 and some characters that were just men-
tioned in passing but have become pet names for the audience, like “Mr. Stevens,
head of catering”, reappear and take center stage. This device can be seen as the
equivalent of “hit” songs in concerts, and the references are not entirely equivalent
to callbacks, but they partly have the same function(s) since they bring cohesion,
this time, to a whole series of shows rather than just one given show.
The use of structural repetition and of topic branching, may also be broadly
assimilated to callbacks, as they may also have the role of bringing something that
was mentioned before back to mind. The typology that we have just proposed shows
that there are, in fact, different forms of devices involved. A series of cases can be
described and distinguished that are both functionally close and formally different.
We will now examine some of the implications of the existence of such examples.
6 Callbacks, Cohesion, Narrativity, and Humour
6.1 Callbacks as a Cohesive Device
We will start by discussing what seem to be two possible consequences of the use
of callbacks in terms of cohesion. One has to do with the importance of the macro
level (and its interaction with the micro level); the other one also concerns the imbri-
cation of the local and the general level in terms of interpretation and
understanding.
It seems that one of the essential elements that callback techniques can bring to
the discussion of cohesion-building is the fact that they function at a macro level,
that of a whole “text”, i.e., here, the show, or sometimes (cf. example 22) several
shows, rather than at a linear-intersentential level. Although they do also function
more loosely, a number of traditionally recognized linguistic devices that are used
to create cohesion, like connectors (subordinators, coordinators) or anaphora create
local links between consecutive utterances, or between adjoining parts of a given
text. Of course, they may also join together elements that are further apart, or func-
tion at a level that is larger than that of individual consecutive sentences, but the type
of cohesion that is created here functions at the macro level and is not (entirely)
based on consecution.16
Callbacks also provide evidence of the fact that a full “text”, “conversation”, and
in this case, show, is not (just) a succession of utterances: the fact that they belong
to a whole is woven into their very structure. We may now wonder if in this case, a
This retake on a previous routine is what Bolens 2015 mostly focuses on.
15
Anaphors are of course known to (sometimes) function across whole paragraphs, and anaphor
16
chains also function across a whole paragraph, or text. What makes the kind of techniques we have
described before perhaps specific is that they necessarily function at the level of the show.
180 C. Chauvin
callback is truly linguistic in nature, i.e. if it affects the actual linguistic form of the
show, the language, the structure, the words that are used, or if the callbacks are
(“just”) an(other) story-telling technique. In fact, it has already been shown that the
very wording of the storytelling can be impacted. It can have an influence on the
turn of phrases that are used in the show (see examples 13–17), as is the case with
the word-based callbacks. It may also affect the way in which a given word or
phrase is uttered (example 13; maybe 14, 15, 17), as the intonation/accent/voice
quality may, for instance, mimic that of the previous use so as to make the connec-
tion clearer. The presence of the macro level has at least occasionally, and perhaps
more generally, a true impact on the form of the micro level.
Another impact, which has to do with cohesion but also narrativity and humour,
as will be seen later, is on how the interpretation of certain words or phrases is to
function. A word or phrase that is used as a callback is both present in a given con-
text, and made to refer to a preceding context. This may lead to an interpretation in
which several contexts are, in fact, merged, as opposed to forming an interpretation
that is made independently at a given point in discourse. A word or phrase that is
included in a callback is no longer a word that is used in a single utterance, or in a
single sequence; it is a word that is both used at a given moment and indexed to a
preceding context of use. Although this is something that, again, may be true of all
(or most?) words or phrases, it is specifically relevant in the case of callbacks, in
which the audience is openly encouraged to pay attention to connections. When
something is re-used in an alien environment, it is not just referred to: it may need
to be newly interpreted on the basis of where it occurs in the discourse as well as
what it refers back to. The use of OMG in The Church OMG (example 16) is thus
not interpreted in the same way after a first series of attacks against acronyms in the
first part of the show as it would have been had it been a first occurrence.
6.2 Callbacks and Narrativity
The preceding remarks can lead us to a short17 discussion of whether, because of

these characteristics, a “narrative” is created, or not. As was said before, a stand-up
comedy show is, and is not, one narrative, even though some shows may be closer
to narratives than others. A stand-up comedy show does not normally consist in the
telling of one anecdote that has a beginning, a middle, and an end. It is, at best, a
collection of short narratives. Eddie Izzard calls the different “parts” of the show
“vignettes” in one interview.18 Obviously, literary narratives and everyday-
conversation narratives are not necessarily linear either, and the more complex the
narrative, the less linear it might be. But the kind of organization, with a sequence
17
We will focus on what is directly linked to callbacks here. A more general discussion, which
would necessarily have to be more detailed, will have to be left for elsewhere.
18
In the commentary that is to be found in the Force Majeure DVD.
of events, that is typical of a narrative (cf. Labov and Waletzky 1967; Labov 1997,
2001, 2004, 2006) is generally not found at the level of the show.
The complexities of ordering in literary narratives were discussed by Genette
(1972) in particular, who noted that the temporal ordering of narratives was often
intricate. He discusses, for instance, the presence of prolepses, which hint at some-
thing that will happen later, and analepses, which refer back to something that was
mentioned before. He also mentions previous references to the question, such as the
use of the term Rückgriffe by Lämmert, for which Genette suggests a translation as
“retroceptions”.19 Analepses and Rückgriffe can in fact be reminiscent of what has
been discussed here under the term “callback” for stand-up comedy. But partly
because of the heterogeneity of the content of shows again, the comparison may not
be entirely viable as there may be no real sustained story line involved. The opposi-
tion that he also makes between homodiegetic and heterodiegetic devices (devices
that function within the story or are, basically, comments from the outside) would
have to be delved into in more detail to see to what extent it may apply to stand-up
comedy shows or not. Genette also argues that some (homodiegetic) analepses are
completive (Genette 1972), as they add something that was left unsaid; others are
called repetitive. Delayed closure (examples 20, 21) may thus be completive in
nature, although random material occurs between the two “parts” of a (sub-)story in
ways that it does not in a more traditional, self-contained narrative. In fact, a typol-
ogy of cases may be devised: sometimes, the material that is re-imported is part and
parcel of the new context in which it is used, which we could call grafting, as in the
Et voilà example in Force Majeure (example 17), where the fresh use of the phrase
does conclude the new section relevantly as well as being a callback. In the cases of
delayed closure (examples 20, 21), something is mentioned again but is not neces-
sarily woven into a new context; it is just a delayed continuation of one preceding
sequence. Now, whether they are graftings or maybe “just” re-injections, the pres-
ence of callbacks generates connections and creates parallels, which, even in the
absence of strong connections, can contribute to the creation of thematic or formal
motives: as he discusses the formal parallels between two sequences of Definite
Article, Glick (2007) uses the terms: “subtle poetic repeating patterns”. They con-
struct something that is, at least, a formal whole, in which an impression of unity is
created. And if interpretation, again, is to be sought at the general level and not just
linearly, this also goes towards creating a unified, though not unique, narrative as
well.
6.3 Callbacks, Cohesion, Narrativity, and Humour
Humour is obviously a central dimension of comedy shows, and the fact that the
shows are supposed to be funny can certainly have an impact on their form(s), and
vice versa. Although this theme will not be developed in full in this paper, it might
19
Bauformen des Erzählens, Stuttgart, 1955, 2nd part (Genette 1972: 95).
182 C. Chauvin
as well, or at least, be mentioned, since an analysis of comedy cannot leave the

humorous dimension entirely aside.
The interplay between cohesion and coherence can be made to have a comedic
use. It has now been repeatedly mentioned that stand-up comedy shows are gener-
ally speaking heterogeneous, and the use of cohesive devices does not mean that
coherence is, or is obviously, fully, present. This can be exploited comedically. For
instance, discourse markers that signal cohesion may be used in the absence of
coherence with a comedic effect. The presence of “concluding”, “therefore” so
when there is no open connection can be used to create a comedic effect (Chauvin
2015). When Eddie Izzard says so, or so yeah, and introduces a completely unex-
pected20 topic (as is the case, for instance, in example 721), this sometimes creates a
big laugh. The presence of a cohesion marker does not create the impression of
discontinuity, but can contribute to making it even more salient. The use of obvi-
ously, in So I was in America, obviously (example 1) when the connection is nothing
like obvious, can more evidently be perceived as funny for the same reason.
As was said before, the presence of callbacks and other cohesive devices can also
lead the audience to try and have a coherent approach to what is apparently incoher-
ent, which can contribute to creating meaning and can also make something more
humorous by forcing a merged interpretation. As acronyms become more and more
present, for instance, their very nature of acronyms becomes part of the content, and
some comedic effects can be based upon this. The characters that re-appear at dif-
ferent moments in Eddie Izzard’s shows, the raptor and the cow driving a car in
Stripped, the mechanical chickens in Force Majeure (cf. examples 9–10), make for
more cohesion, but their reappearance in a different, and unusual, situation can also
contribute to the creation of laughter.
Another dimension that may be thought of mainly by practitioners is the “bond-
ing” effect that is mentioned in the quotations cited in Sect. 5.1. The pleasure of
recognizing something that was heard before can create a form of relationship
between the audience and the comedian, and the audience and the content, as they
participate in it more actively. Although the bonding dimension may no longer
strictly be part of (content-based) cohesion, it may be another force at work that
makes comedians try and link things together, not just for the sake of creating cohe-
sion, but for other reasons, too. The bonding or the admiration that the presence of
a clever trick can generate is also an element that professional comedians may seek.
There may even be a commercial dimension that is potentially involved: some refer-
ences may be also expected by theatre-goers who want to listen to what they have
seen in previous shows, or on a DVD. The retake on the “Death Star Canteen”
20
We are not using the word “incongruous” or discussing the incongruity theory of humour on
purpose as this would require a specific discussion. Incongruity may create humour, but is, possi-
bly, not the sole source, or a necessarily straightforward source, of humour. We will therefore
deliberately not go into this debate here.
21
It is only partly the case in his example, as this is the beginning of a sequence, and so is known
to be used when a topic is introduced; this is arguably not a “therefore” so. The next example con-
sequently illustrates the kind of problem we are dealing with here in a clearer way.
(example 22) creates cohesion between shows (it is, partly, delayed closure at the
level of a series of shows), but it also serves other purposes: recognition, bonding,
with the comedian and between members of the audience that feel that they belong
to a group of people who know the same things. These dimensions, which are part
of the theatrical requirements, may also come into play, and lead to the use of cohe-
sive devices, not (just) for their own sake, but for other functional reasons of the
kind we have just mentioned.
7 Conclusion
The aim of this synthetic presentation of callbacks in stand-up comedy was to show
that it was interesting, and possibly important, to include the genre and the
technique(s) in the list of possible cohesive devices. Stand-up comedy has a number
of specificities that seem to make it interesting to study as a genre, in itself, and so
as to see what it can bring to more general debates. As far as cohesive devices are
concerned, we have emphasized the fact that callbacks induce a form of cohesion
that functions at the macro level, but with possible implications on the micro level,
too: cohesive devices can have an impact on the wording of certain forms, or on how
parts of the routines may have to be interpreted at both the local and the general
level. The ways in which shows can be considered to form, and not to form, a whole
has also been discussed. The fact that shows are both heterogeneous and partly
made to be homogeneous means that stand-up comedy offers very interesting
ground for the analysis of such questions, again, both within the genre, and more
generally. Some dimensions at work within stand-up comedy may, as was sug-
gested, be considered to be functional (e.g., how comedy shows are supposed to
function and “work”: bonding with, or within, the audience; possible reference to
past shows; and, of course, humour), which can be related to their presence within
the genre, with certain functional requirements leading to certain uses and practices.
More aspects could have been included, like the fact that the use of topic continuity
makes shows close(r) to “spontaneous” discourse but the use of isolated NPs and
structural anaphora are in fact more typical of formal discourse (Chauvin 2015).
The ways in which reference to something can be made to be more or less evident
can also be discussed further (cf. how “emblems” work, for instance, in a system-
atic, or less systematic, way); and the links between the different techniques and the
ways in which they are implemented by different comedians and in different shows
continue to be explored. Now as was seen, callbacks illustrate problems that can
overlap into literary studies, the analysis of comedy, and, of course, linguistics. It
therefore seems that such techniques clearly ought to be included in the repertoire
of cohesion-building devices when these are discussed across genres.
184 C. Chauvin
References
Barthes, R. (1966). Introduction à l’analyse structurale des récits. Communications, 8, 1–27.

Bolens, G. (2015). Les comédiens de stand-up et la preuve par le rire : le récit comme acte cognitif
dans Star Wars Canteen 1 & 2 d’Eddie Izzard. Cahiers de narratologie, 28. http://narratologie.
revues.org/7187
Carter, J. (2001). The Comedy Bible: From stand-up to Sitcom–The comedy writer’s ultimate how-
to guide. New York: Fireside.
Chauvin, C. (2015). Passer d’un thème à l’autre : Construction de la cohésion/ cohérence dans la
stand-up comedy. Etudes de stylistique anglaise 7, Traversées/ Crossings, 141–164.
Double, O. (2014). Getting the joke: The inner workings of stand-up comedy (2nd ed.). London:
Methuen Drama.
Duchan, J., Bruder, G. A., & Hewitt, L. E. (Eds.). (1995). Deixis in narrative: A cognitive science
perspective. New York: Lawrence Erlbaum.
Genette, G. (1966). Frontières du récit. Communications, 8, 164–172.
Genette, G. (1972). Figures III. Paris: Seuil, Collection “Poétiques”.
Glick, D. J. (2007). Some performative techniques of stand-up comedy: An exercise in the textual-
ity of temporalization. Language and Communication, 27, 291–306.
Halliday, M. A. K., & Hasan, R. (1976). Cohesion in English. London: Longman.
Helitzer, M., & Shatz, M. (2005). Comedy writing secrets. Cincinnati: Writer’s Digest Books,
F+W Publications, Inc.
Labov, W. (1997). Some further steps in narrative analysis. Journal of Narrative and Life History,
7, 395–415.
Labov, W. (2001). Uncovering the event structure of narrative. Georgetown University Round
Table 2001. http://www.ling.upenn.edu/~wlabov/uesn.pdf. Accessed 19 Nov 2016.
Labov, W. (2004). Ordinary events. In C. Fought (Ed.), Sociolinguistic variation: Critical reflec-
tions (pp. 31–43). Oxford: Oxford University Press.
Labov, W. (2006). Narrative preconstruction. Narrative Inquiry, 16, 37–45.
Labov, W., & Waletzky, J. (1967). Narrative analysis. In J. Helm (Ed.), Essays on the verbal and
visual arts (pp. 12–44). Seattle: U. of Washington Press. (Reproduced in 1997 in Journal of
Narrative and Life History, 7, 3–38.)
Levinson, S. (1983). Pragmatics. Oxford: Oxford University Press.
Murray, L. (2010 [2007]). Be a great stand up: Teach yourself. London: Hodder Education.
Propp, V. (1970). Morphologie du conte. Paris: Seuil.
Ritchie, C. (2012). Performing live comedy. London: Methuen Drama.
Schiffrin, D. (1987). Discourse markers. Cambridge: Cambridge University Press.
Schwartz, J. (2010). Linguistic aspects of verbal humour in stand-up comedy. PhD Universität der
Saarlandes.
Slobin, D. (2005). Relating narrative events in translation. In D. Ravid & H. B. Shyldkrot (Eds.),
Perspectives on language and language development: Essays in honor of Ruth A. Berman
(pp. 115–129). Dordrecht: Kluwer.
Todorov, T. (1966). Les catégories du récit littéraire. Communications, 8, 125–151.
Todorov, T. (1967). Littérature et signification. Paris: Larousse.
Todorov, T. (1971). Poétique de la prose. Paris: Seuil.
Web Pages
Alexander, C. J. Creating a Comic. Bombing, killing, and other occupational hazards of stand-up
comedy (blog about « breaking in stand-up comedy »), http://www.creatingacomic.com/
comedy-glossary/. Accessed 17 May 2015.
Bromley, P. Callbacks, Glossary of Comedy by Patrick Bromley, Comedians Expert, About :

Entertainment, http://comedians.about.com/od/glossary/g/callback.htm. Accessed 17 May
2015.
Callbacks (Comedy) – Wikipedia (English), http://en.wikipedia.org/wiki/
Callback_%28comedy%29. Accessed 17 May 2015.
Peck, J.. Secret Comedy Writing Technique – Callbacks, http://prohumorist.com/secret-comedy-
writing-technique-callbacks/. Accessed 17 May 2015.
Seven Things Writers can Learn from Stand-up Comedians., https://theamericanscholar.org/seven-
things-writers-can-learn-from-stand-up-comedians/#.VVjZerntnBo. Accessed 17 May 2015.
Reddit discussion of would-be comedians and fans on callbacks,http://www.reddit.com/r/Stand-up/
comments/29gch1/good_examples_of_callbacks/. Accessed 17 May 2015.
What is the best example of a callback in comedy ? – Quora,http://www.quora.com/What-is-the-
best-example-of-a-callback-in-comedy. Accessed 17 May 2015.
Corpus
Bailey, B. (2004). Part Troll. DVD Universal.

Bailey, B. (2008). Tinselworm, Live at Wembley. DVD Universal.
Bailey, B. (2010). Dandelion Mind. DVD Universal.
Bailey, B. (2013). Qualmpeddler. DVD Universal.
Brand, J. (2003). Barely Live. DVD Universal.
Carr, J. (2013). Laughing and Talking. DVD Channel 4.
Davies, A. (2013). Life is Pain. DVD 2 Entertain.
Djalili, O. (2009). Omid Djalili Live in London. DVD Anchor Bay.
Djalili, O. (2012). Tour of Duty. DVD Anchor Bay.
Izzard, E. (1996). Definite Article. DVD Universal.
Izzard, E. (1997). Glorious. DVD Universal.
Izzard, E. (1998). Dressed to Kill. DVD Universal.
Izzard, E. (2002). Circle. DVD Universal.
Izzard, E. (2003). Sexie. DVD Universal.
Izzard, E. (2009). Stripped. DVD Universal.
Izzard, E. (2013). Force Majeure. DVD Universal.
Izzard, E. (2015–2016). Force Majeure Reloaded (tour).
Jupitus, P. (2011). Quadrophobia. DVD Universal.
Lock, S. (2008). Sean Lock Live. DVD Universal.
Lock, S. (2010). Lockipedia Live. DVD Universal.
Mack, L. (2010). Going Out Live. DVD Anchor Bay.
Millican, S. (2011). Chatterbox Live. DVD Channel 4.
Noble, R. (2013). Mindblender. DVD Universal.
O’Briain, D. (2012). Craic Dealer. DVD Universal.
Vine, T. (2004). Live. DVD Entertain Video.
Vine, T. (2011). The Jokeamotive Live. DVD Spirit Entertainment Ltd.
Vine, T. (2010). Punslinger Live. DVD Spirit Entertainment Ltd.
Bush and Obama’s Addresses to the Arab
World: Recontextualizing Stance in Political
Discourse
Laura Hidalgo-Downing and Yasra Hanawi
Abstract The present article compares the stance styles in the two speeches
addressed to the Arab World by US Presidents George W. Bush in Abu Dhabi in
2008 and Barack Obama in Cairo in 2009. The main theoretical concepts addressed
are stance and recontextualization. Halliday’s (An introduction to functional gram-
mar, 2nd edn. Arnold, London, 1994) model of stance is adopted in the present
study to establish stance categories and degrees of subjectivity. Additionally, co-
occurrence with personal pronouns and negation is explored. Corpus methodology
and discourse pragmatic analysis are used in combination. The main claim is that
the higher frequency of markers of stance in Obama’s speech, in particular modal
verbs and negation and their co-occurrence with first person pronouns, evokes inter-
textually his predecessor’s speech and stance towards the Arab World, together with
commonly held assumptions about the relations between the US and the Arab
World. Results show significant differences in the choice and frequency of markers
of modality and negation and their co-occurrence with personal pronouns in the two
speeches. A different stance style characterizes each speech, with an effort in
Obama’s speech to recontextualize and reformulate the predominant discourse and
social practice in US foreign policy.
Keywords Political discourse • American English • Stance markers •

Recontextualization • Modality • Negation • Personal pronouns
L. Hidalgo-Downing (*) • Y. Hanawi

Department of English, Facultad de Filosofía, Letras Universidad Autónoma de Madrid,
Madrid 28049, Spain
e-mail: laura.hidalgo@uam.es; yasra59@hotmail.com

DOI 10.1007/978-3-319-54556-1_9
188 L. Hidalgo-Downing and Y. Hanawi
1 Introduction
The motivation for the present study is the socio-historical impact of the speech
delivered by US President Barack Obama to the Arab World in Cairo in 2009, early
on after his election as US president. His speech, entitled ‘A new beginning’ was
met with great expectations not only in Arab countries but also in Israel and the
whole Western world. According to US opinion polls from 2009, Obama’s speech
had a very positive overall world-wide reception, and created expectations of a new
era of international cooperation and stability. These expectations stood in stark con-
trast with the preceding two decades, which had witnessed the two Iraq wars, the
invasion of Afghanistan and the 9/11attacks of 2001. This preceding period of ten-
sion between the US and Arabic countries had taken place under the Presidency of
George W. Bush, who, after the 9/11 attacks declared the ‘War on terror’ against
what was called ‘the axis of evil’, constituted by Iran, Iraq and Afghanistan.
The present study explores the linguistic resources used by the two politicians to
index their political and personal stances with regard to the topic addressed and
their positioning towards their audience in the context of the Arab World and the
Middle East. The main argument is that the differences in stance styles which
emerge from the quantitative study point to the crucial role played by the analyzed
linguistic resources in the recontextualization of the political relations between the
US and the Arab World. In other words, in order to put forward his ‘New beginning’
in the relations between the US and the Middle East, Obama makes extensive use of
linguistic resources such as modality, the use of first person pronouns and negation,
which refer intertextually to previously held assumptions on these relations.
Negation, in particular, plays a crucial role in the modification and correction of
previous assumptions; additionally, Obama uses modality and first person pronouns
to engage with his audience and open up a common space of collaboration.
The objectives of the study are the following:
1. To identify the main types of stance following Halliday’s (1994) model of types
of subjectivity.
2. To measure the frequency and statistical significance of stance markers, specifi-
cally, personal pronouns, modality markers, mental verbs and negation by means
of corpus tools.
3. To explore and discuss the co-occurrence of the selected features and their dis-
course pragmatic functions as markers of stance in each speech.
4. To argue how the higher frequency of stance markers in Obama’s speech in com-
parison with Bush’s previous speech in the same context (The Arab World)
points at a strategy of intertextual recontextualization of previous discourses,
social practices and held assumptions about the relations between the US and the
Middle East.
Bush and Obama’s Addresses to the Arab World: Recontextualizing Stance… 189
2 Theoretical Background
The present study draws, first, on previous studies on the concepts of stance and
markers of modality and negation; second on the concept of recontextualization as
a process in which a piece of discourse refers intertextually to previous discourses
and social practices and to held assumptions on a particular topic (Linell 2009,
Semino et al. 2013); third, on the literature on political discourse and the signifi-
cance of uses of linguistic features such as modality and personal pronouns in the
indexing of stance (Boyd 2014a, b; Charteris-Black 2004, 2011; Chilton 2004;
Evans and Chilton (2010); Fairclough 1989, 1992, 2010; Fetzer 2013; Marín-Arrese
et al. 2013; Marín-Arrese 2015; Wilson 1990). Finally, corpus linguistic tools are
used to explore the frequency and co-occurrence of stance markers related to the
areas of indexicality, modality and negation, which are then discussed from a quali-
tative perspective (see, for example, Simon-Vandenbergen and Aijmer 2007; Biber
et al. 1999; Charteris-Black 2004; Rayson 2008).
Stance has been the focus of attention of numerous studies in discourse grammar
and discourse pragmatics (Halliday 1994; Biber and Finegan 1988; Biber et al.
1999; Englebretson 2007; Hunston and Thompson 2000; Marín-Arrese et al. 2013;
Martin and White 2005; Thompson and Alba-Juez 2014, among others). What
emerges from these studies is that there is a complex relation between the concepts
of evaluation, stance and positioning in discourse. Numerous proposals have been
put forward in order to identify categories of stance and the linguistic resources
which characterize different stance styles. Biber and Finegan (1988: 93) define
stance as ‘The lexical and grammatical expression of attitudes, feelings, judge-
ments, or commitments concerning the propositional content of a message’. In the
present study we address grammatical stance, in particular as described by Halliday
(1994). Halliday proposes a classification of stance types based on degrees of sub-
jectivity. Table 1 below shows a distinction between subjective explicit stance,
Table 1 Degrees of subjectivity in stance

Category Type of realization Example
Subjective
(a) explicit I/we (do not) think, believe, feel We believe that peace is possible.
I/we + be + (not) + attribute I’m (not) sure
(b) (Not) + can (cannot), may, might, could, will, You will have no better friend
implicit would, must than the United States of
America.
The people of the world can live
together in peace.
Objective
(c) explicit Probably, certainly, possibly + (not) Most naturalists probably don’t
know…
(d) It’s (not) likely, it’s (not) certain It is not likely that man should
implicit have succeeded in selecting…
Adapted from Halliday (1994: 355)
c haracterized by the use of first person pronouns and mental verbs (believe, know,
think), subjective implicit stance, marked by the use of modal verbs, objective
explicit stance, marked by the presence of stance adjectives and adverbs, and,
finally, objective implicit stance, indicated by the impersonal structure ‘it + be +
stance adjective’. We have included the negative form not in brackets as part of the
marking of stance because, as argued above, the frequency and distribution of nega-
tion across stance types is significant for the discourse pragmatic interpretation of
the differences between the two political speeches.
With respect to the relation between negation and stance, it is worth pointing out
that scholars who address the grammatical marking of stance focus on its realization
by means of modality (see Biber et al. 1999; Halliday 1994; Halliday and Matthiessen
2004; Hidalgo Downing and Núñez-Perucha 2013; Thompson 2004; Marín-Arrese
et al. 2013; Givón 1993). From this perspective, negation is considered by several
scholars as one of the language modalities; as explained by Givón, modalization in
language is a cline which goes from strong positive assertion, modality and irrealis,
presupposition, to strong negative assertion (adapted from Givón 1993: 170).
Negation and modality are particularly interesting because of the relative values
introduced by modal terms and because of the capacity of negation to evoke presup-
posed concepts and to introduce strong negative assertions.
Indeed, the use of negation is a well-known strategy in political discourse, in
which, as argued by Jordan (1998), two-part or three-part structures are used in
order to correct a previous assumption and pave the path for a new idea. A classic
example is the opening of Mark Anthony’s speech in Shakespeare’s play Julius
Caesar (Shakespeare 1991):
(1) Friends, Romans, countrymen, lend me your ears;

I come to bury Caesar, not to praise him (Act III, scene ii)
Example (1) illustrates a two-part structure in which the speaker uses negation to
defeat possible expectations held by his audience and correct them.
Du Bois’s notion of ‘the stance triangle’ is particularly significant for the under-
standing of stance as a dialogic and intersubjective phenomenon which underlies
the process of recontextualization discussed in the present paper. Du Bois argues
that we use language to establish relations with texts, with the topic at hand and with
other speakers (2007: 163). This concept of stance is based, as in other scholars (see
Martin and White 2005), on the dialogic view of discourse (Bahktin 1981). In this
sense, all discourses refer to previously produced texts and discourses intertextually.
The role of negation in this process is particularly significant, since in order to deny
an idea or defeat an expectation, the idea or expectation needs to be mentioned. In
the present article we argue that negation plays a crucial role in the process of recon-
textualization in political discourse as a social practice. Linell (2009) describes
recontextualization as a process in which language is re-used and adapted to new
contexts and situations, including new genres. He distinguishes three types of
recontextualization, intratextual, intertextual and interdiscursive. In the present
study we make use of the second type of recontextualization, intertextual
recontextualization (see also Boyd 2014a, b). From this perspective, Obama’s
speech in Cairo in 2009 can be seen in the light of a process of recontextualization
which involves a series of significant contextual changes in the genre of the presi-
dential address to a foreign community as a social practice: a change in the US
political program, which is the result of the change of president, a change in time,
from 2008 to 2009, and a consequent change in the socio-political context. Within
Critical Discourse Analysis and Political Discourse Analysis, this process has been
described as one of ‘re-imagining’ a social practice (Fairclough 2010; Boyd 2014a,
b). In the case analyzed in the present paper, recontextalization does not occur
across genres, but within the same genre by two different politicians at two different
moments in time. These differences in personal identity and time shape the recon-
textualization process as one in which what is re-imagined by the world community
is the relation between the US and Arab countries, and, consequently, the US for-
eign policy in international affairs. This is clearly consistent with the title of Obama’s
speech ‘A New Beginning’.
With regard to political discourse, we draw on studies which approach this type
of discourse as social practice, which consequently has ideological implications
(Boyd 2014a, b; Charteris-Black 2011; Chilton 2004, Fairclough 1989, 1992, 2010;
Wilson 1990). Though Bush’s and Obama’s discourses have been the object of
extensive study by numerous scholars (Boyd 2014b), the present article contributes
to current scholarship in this field of study by focusing on the two US Presidents’
approaches to the issue of the Arab World, a particularly conflictive one in the US
policy, and the way their speeches appeal intertextually both to the issue at hand and
to the assumptions and expectations of the audience they address.
The role of features such as modality, pragmatic markers, personal pronouns and
metaphor has been discussed by numerous scholars (see for example, Charteris-
Black 2004, 2011; Chilton 2004; Fetzer 2013; Boyd 2014a, b; Marín-Arrese et al.
2013; Marín-Arrese 2015). However, the role of negation has not received sufficient
attention as a strategy used by politicians to deny previous concepts and simultane-
ously introduce new ideas (for an example of this strategy in scientific discourse see
Hidalgo Downing 2014). A great part of the discussion in the present study focuses
on the interaction between modality, personal pronouns and negation, and on how
this interaction articulates the process of recontextualization in Obama’s speech.
3 Data and Method
3.1 Data
The data consists of the two speeches delivered by US Presidents George Bush Jr.
and Barak Obama to the Arab World. The former was delivered by President Bush
in Abu Dhabi on June 13th 2008 and is 3380 words long. The latter, entitled ‘A New
Beginning’, was delivered by President Obama in Cairo on June 4th 2009 and is
5871 words long. The socio-historical significance of the speeches has already been
described in the introduction to the present article.
3.2 Method
The methodology is a combination of quantitative corpus analysis and qualitative

discourse-pragmatic analysis. Once the relevant stance types have been identified as
explained in the section of Theoretical Background above, a search was made of the
relevant linguistic items for the selected categories in each political speech by using
a Concordancer (Monoconc). Frequency per 1000 words is calculated for each of
the selected categories and statistical significance is determined by means of a con-
tingency table χ2 statistical test. The degree of significance is established as signifi-
cant at p < 0.05 and as extremely significant at p < 0.01. The test is performed in the
case of stance categories and negation in relation to the whole corpus of each speech,
while in the rest of the categories the test is performed among the subtypes of a
category in order to determine the significance of the sub-categories (types of per-
sonal pronouns, types of modality). The reason for this is that in the case of nega-
tion, the different subtypes indicate preferences in register but not differences within
the functional and pragmatic category of negation in discourse; however, in the case
of personal pronouns and modality, the difference in the choice between first, sec-
ond and third person pronouns and types of modal verbs (epistemic and deontic) has
significant implications for the signaling of the positioning and stance of each poli-
tician. The statistical frequency of stance markers is complemented by the reference
to the keyness factor of selected lexical items, pronouns and modal verbs in each
speech as compared to the other speech. Keyword significance is established accord-
ing to the method proposed by Rayson (2008). This consists of a statistical method
which calculates the significance of a word in a corpus against its possible signifi-
cance in a reference corpus. In the present study the two speeches are used as cor-
pora and each is analyzed with reference to the other. The quantitative results are
followed by a discussion of the discourse pragmatic functions of significant catego-
ries selected from concordance lists of stance markers and keywords, namely, modal
verbs and adverbs, negation, personal pronouns, mental verbs, etc. More specifi-
cally, the modal verbs which were searched for are the following: can, cannot, may,
might, could, must, should, will, would. The modal adverbs and adjectives identified
were: possible, possibly, likely; the negative words/particles were: not, no, *n’t; the
mental verbs which were identified for were: know, believe, consider, feel. Finally,
the personal pronouns searched for were: I, we, our, you, your.
4 Results
Figure 1 below shows the frequency per 1,000 words of the stance categories fol-
lowing Halliday’s (1994) model: subjective explicit (SE), subjective implicit (SI),
objective explicit (OE) and objective implicit (OI), as explained in the section on
Theoretical Background above. The frequency per 1,000 words is calculated with
regard to the number of words in the whole corpus of each speech. The result of the
χ2 test is p < 0.005, which indicates that the overall differences with regard to stance
categories in each political speech are statistically significant.
Taking into consideration each stance category, the greatest difference is revealed
in the category Subjective Implicit stance, in which items are much more frequent
in Obama’s speech than in Bush’s. Additionally, a difference is observed in the pref-
erence of each politician in the objective stance category: while Bush’s speech
shows a preference for objective implicit stance, Obama’s speech shows a prefer-
ence for objective explicit stance.
Figure 2 below shows the various types of modality in the two political speeches:
epistemic and deontic modality. The frequency per 1,000 words is calculated within
the overall category of modal verbs in each corpus. The result of the χ2 statistical test
is p < 0.005, which indicates that the differences in each political speech are statisti-
cally significant.
Results in Fig. 2 reveal that Obama’s speech shows an overall higher frequency
of markers of modality, in particular of epistemic modality, followed by deontic
modality. Deontic modality shows a low frequency in Bush’s speech.
Fig. 1 Types of stance in Bush’s and Obama’s speeches (SE subjective, explicit, SI subjective,
implicit, OE objective, explicit, OI objective, implicit)
Fig. 2 Modality types in Bush’s and Obama’s speeches
Fig. 3 Negation and distribution of negation types in Bush’s and Obama’s speeches
Figure 3 above shows the frequency per 1,000 words of overall negation types
and the distribution of types of negation. Statistical significance is determined on
the total number of negative types in terms of frequency per 1,000 words in each
corpus.
Results in Fig. 3. show that the difference in the frequency of use negation in the
two speeches is extremely significant, with p < 0.001
Fig. 4 Distribution of personal pronouns in Bush’s and Obama’s speeches
Fig. 5 First person pronouns in Bush’s and Obama’s speeches
Figure 4 shows the frequency per 1,000 words of personal pronouns in the two
political speeches with regard to the total number of pronouns in each speech. The
statistical test shows differences are extremely significant, with p < 0.0001. Results
show Obama’s preference for first person pronouns and Bush’s preference for sec-
ond and third person pronouns.
Figure 5 shows the frequency and distribution of first person pronouns in Bush’s
and Obama’s speeches. Frequency per 1,000 words is calculated within the total
number of first person pronouns. The statistical test shows that results are extremely
Fig. 6 Second person pronouns in Bush’s and Obama’s speeches
Table 2 Keyword significance in Bush’s speech vs. Obama’s

Keywords in Bush’s vs. Obama’s Arab world speeches
Keyword N. % N.REF. % REF. LOGG
Your 44 1.37 4 0.07 + 65.30
You 47 1.46 11 0.19 + 48.89
East 20 0.62 2 0.04 + 28.96
Free 15 0.47 1 0.02 + 23.84
Middle 19 0.59 3 0.05 + 23.71
Their 37 1.15 19 0.34 + 20.42
Freedom 18 0.56 5 0.09 + 16.93
President 8 0.25 0 0.00 + 16.22
Liberty 7 0.22 0 0.00 + 14.20
Neighbors 6 0.19 0 0.00 + 12.17
Reconciliation 6 0.19 0 0.00 + 12.17
Societies 6 0.19 0 0.00 + 12.17
Stability 6 0.19 0 0.00 + 12.17
Terrorists 5 0.16 0 0.00 + 10.14
Democracy 12 0.37 4 0.07 + 9.95
significant, with p < 0.0001. While Obama uses the first person pronouns I and we
much more frequently than Bush, Bush shows a preference for the second person
pronoun you.
Figure 6 shows the frequency of second person pronouns per 1,000 words within
the total number of second person pronouns. The statistical test shows that
Table 3 Keyword significance in Obama’s speech vs. Bush’s

Keywords in Obama’s vs. Bush’s Arab world speeches
Keyword N. % N.REF. % REF. LOGG
Be 57 1.01 4 0.12 + 29.95
I 55 0.97 7 0.22 + 20.05
Must 32 0.57 2 0.06 + 17.68
But 35 0.62 3 0.09 + 16.64
Countries 16 0.28 0 0.00 + 14.42
Communities 14 0.25 0 0.00 + 12.62
Why 14 0.25 0 0.00 + 12.62
Our 64 1.13 15 0.47 + 11.30
d ifferences are extremely significant, with p < 0.0001. Bush’s speech has a much
higher frequency of use of second person pronouns.
As a complement to the quantitative results provided so far, keyword signifi-
cance of lexical words, personal pronouns and modal verbs in each speech is pro-
vided in Table 2 and Table 3 above.
Results in Table 2 show the keyword significance of lexical items, personal pro-
nouns and modal verbs in Bush’s speech. Results show that the second person pro-
nouns you and your are highly significant in terms of keyness, occupying the first
two places in the list. No other markers of stance discussed in the present study
occur as significant in terms of keyness. The lexical items, however, reveal prefer-
ences in Bush’s discourse which stand out against lexical choices in Obama’s
speech. For example, the terms free, freedom and liberty are highly frequent in
Bush’s discourse, in line with the title of his address. The term terrorist, though not
used frequently is a significant keyword, since this term is absent in Obama’s
speech.
Table 3 shows keyness significance of selected lexical items, pronouns and
modal verbs in Obama’s speech. Results reveal the keyword significance of the
stance markers I, first person pronoun, our, plural first person possessive pronoun,
but, concessive, and must, deontic modal verb.
5 Discussion
The present section provides a detailed discussion of the results in the preceding
section by analysing selected concordances from each of the political speeches.
Each of the categories is discussed in turn.
5.1 Stance Types in Bush’s and Obama’s Speeches
5.1.1 Subjective Explicit and Subjective Implicit Stance
The quantitative results show that President Obama’s speech has a significantly
higher frequency of markers of stance than Bush’s speech. In particular, Obama
makes use of subjective implicit stance, that is, a stance style that is characterized
by the use of modal verbs, followed by subjective explicit stance, which is charac-
terized by the use of first person pronouns and mental verbs. Bush, on the other
hand, shows a preference for subjective explicit stance, followed by subjective
implicit stance. In addition to this main difference in the stance preferences of the
two politicians, the distribution of pronominal forms and their co-occurrence with
the stance markers of modality, together with the frequency of negation, reveals
further differences in intersubjective positioning in the two politicians.
5.1.2 Subjective Explicit Stance: First Person Pronouns + Mental Verbs
Within the category of subjective explicit stance, Bush shows a preference for the
co-occurrence of the first person plural we (referring to the US government) and
mental verbs, as in examples (1) to (5) below:
(1) We [[believe]] that trade and investment is the key to the future of hope and
opportunity.
(2) We [[believe]] that stability can only come through a free and just Middle East.
(3) We [[believe]] that peace is possible, though it requires tough decisions.
(4) Yet we also [[know]] that for all the difficulties, a society based on liberty is
worth the sacrifice.
(5) We [[know]] that democracy is the only form of government that treats
individuals with the dignity and equality that is their right.
(6) We [[know]] from experience that democracy is the only system of government
that yields lasting peace and stability.
Obama, by contrast, shows a preference for the use of the first person pronoun I; the
main difference in the use of this pronoun by Bush and Obama is that while Bush
uses it to refer to his identity as President of the US, Obama uses it both as a referent
to his public identity as President but also as a referent to his personal history. The
use of I as referent to the public identity of the US Presidents is clear in the conven-
tional expressions of gratitude that characterize the openings and closings of politi-
cal speeches, as in examples (7) and (8), from Bush’s speech below:
(7) [[I]] am honored by the opportunity to stand on Arab soil and speak to the
people of this nation.
(8) And [[I]] appreciate the fact that your country sent a delegate.
The high frequency of use of the pronoun I in Obama’s speech shows the President’s
intention to position himself as personally committed to the task he has taken on
board towards the Arab world; this indicates personal involvement in addition to the
guarantee provided by the use of the public I for his positioning as US President.
This difference can be observed in examples (9) to (12), which illustrate the use of
private or personal I, and examples (13) and (14), which illustrate the use of public
or Presidential I (note that examples (9) and (10) do not illustrate co-occurrences
with mental verbs):
(9) Part of this conviction is rooted in my own experience. [[I]] am a Christian, but
my father came from a Kenyan family that includes generations of Muslims.
(10) We see it in the history of Andalusia and Cordoba during the Inquisition.
[[I]] saw it firsthand as a child in Indonesia.
(11) As a student of history, [[I]] also know civilization’s debt to Islam.
(12) [[I]] know, too, that Islam has always been a part of America’s story.
(13) The fourth issue that I will address is democracy. [[I]] know there has been
controversy about the promotion of democracy in recent years.
(14) Those are mutual interests. That is the world we seek. But we can only
achieve it together. [[I]] know there are many – Muslim and non-Muslim
– who question whether we can forge this new beginning.
(15) And [[I]] consider it part of my responsibility as President of the United

States to fight against negative stereotypes of Islam wherever they appear.
If we consider the uses of the pronoun we, it is worth pointing out that while Bush
makes use of this pronoun exclusively to refer to the US government, as in examples
(1) to (6) above, Obama uses we both to refer to the US government and to refer to
the American people and people from the Middle East countries, including Israel,
Palestine and other Arab countries, as in examples (16) and (17):
(16) These needs will be met only if [[we]] act boldly in the years ahead.
(17) Some suggest that it isn’t worth the effort – that [[we]] are fated to disagree,
and civilizations are doomed to clash.
Some examples may be ambiguous, as examples (18) and (19) below:
(18) The people of the world can live together in peace. We [[know]] that is God’s
vision. Now, that must be our work here on Earth. Thank you.
(19) So whatever we [[think]] of the past, we must not be prisoners of it. Our
problems must be dealt with through partnership.
5.1.3 Subjective Implicit Stance: Modal Verbs
As pointed out above, subjective implicit stance characterizes Obama’s speech,

which displays a significantly higher frequency of modal verbs. The modal verbs
which occur in the speeches of the two politicians are will, can, may, must and
should. The modal might is used once by Bush and the modal could is used on three
occasions by Obama. The modal will is similar in frequency in both politicians; the
use of this modal verb is illustrated by examples (20) to (30) below from Bush’s
speech, and in examples (31) to (35) from Obama’s speech. In the examples from
Bush’s speech, reference to the US government is realized by means of noun phrases
or the pronoun we as subjects/agents. These show their willingness to support both
Israeli and Arabic countries, a commitment to the process of reconciliation and a
prediction of better days to come for democracy in the Middle East:
(20) The United States [[will]] always stand with Israel in the face of terrorism.
(21) And as you build a Middle East growing in peace and prosperity, the United
States [[will]] be your partner.
(22) The United States [[will]] continue to support you as you build the institutions
of a free society.
(23) And by supporting the legitimate aspirations of both sides, we [[will]]

encourage reconciliation between the Israeli and Palestinian people.
(24) And when that good day comes, you [[will]] have no better friend than the
United States of America.
(25) The United States [[will]] help you build the institutions of democracy and
prosperity.
(26) And we [[will]] not abandon you to terrorists or extremists.

(27) To the leaders across the Middle East who are fighting the extremists:
The United States [[will]] stand with you as you confront the terrorists
and radicals.
(28) And as you struggle to find your voice and make your way in this world,
the United States [[will]] stand with you.
(29) And in a free and just society, individuals can rise as far as their talents
and hard work [[will]] take them.
(30) The day [[will]] come when the people of Iran have a government that
embraces liberty and justice.
The use of the modal will by Obama also shows commitment to the role of the US
government in the reconciliation process, but, additionally, it shows numerous co-
occurrences with the first person pronoun I, demonstrating Obama’s personal com-
mitment to this process of change.
(31) America [[will]] align our policies with those who pursue peace, and say in
public what we say in private to Israel.
(32) And we [[will]] also expand partnerships with Muslim communities to

promote child and maternal health.
(33) The sooner the extremists are isolated and unwelcome in Muslim communities,
the sooner we [[will]] all be safer.
(34) And [[I]] will host a Summit on Entrepreneurship this year to identify how
we can deepen ties between business leaders.
(35) That is why I ordered the removal of our combat brigades by next August.
That is why we [[will]] honor our agreement with Iraq’s democratically-
elected government to remove combat troops from Iraq.
Obama also makes use of will to make reference to the difficult issues that need to
be addressed by the US and the countries in the Middle East, as in example (36):
(36) It [[will]] be hard to overcome decades of mistrust, but we will proceed with
courage, rectitude and resolve.
While the modal verbs used by both politicians include epistemic will, may, might,
can, could and deontic can, must and should, there is a great difference in the fre-
quency of use of specific modal verbs by each politician. Thus, the modal verbs can,
must and should are significantly more frequent in Obama’s speech than in Bush’s.
The uses of these modal verbs are illustrated in examples (37) to (43) below from
Bush’s speech, together with occurrences of the negative form cannot (examples
(44) to (47):
(37) And in a free and just society, individuals [[can]] rise as far as their talents
and hard work will take them.
(38) All know the lasting stability that only freedom [[can]] bring.
(39) The Palestinian people aspire to build a nation of their own – where they
[[can]] live in dignity and realize their dreams.
(40) We believe that stability [[can]] only come through a free and just Middle
East.
(41) Power is a trust that [[must]] be exercised with the consent of the governed.
(42) And the people of the Middle East [[must]] continue to work for the day
where that is also true of the lands that Islam first called home.
(43) As we demand you open your markets we [[should]] open ours, as well.
The negative form of can, cannot, is used by Bush in co-occurrence with the second
person pronoun you as impersonal you, and to refer to third persons.
(44) You [[cannot]] build trust when you hold an election where opposition
candidates find themselves harassed or in prison.
(45) You [[cannot]] expect people to believe in the promise of a better future
when they are jailed for peacefully petitioning their government.
(46) And you [[cannot]] stand up a modern and confident nation when you do not
allow people to voice their legitimate criticisms.
(47) The terrorists and extremists [[cannot]] prevail.
Examples (48) to (54) illustrate the use of must in Obama’s speech, indicating an
effort on the part of the US and countries in the Middle East to carry out a joint
endeavor to achieve peace and stability. These examples additionally illustrate the
tendency for first person we to co-occur with modal verbs.
(48) This cycle of suspicion and discord [[must]] end. I have come here to seek a
new beginning.
(49) There [[must]] be a sustained effort to listen to each other.

(50) Our problems [[must]] be dealt with through partnership.
(51) So whatever we think of the past, we [[must]] not be prisoners of it.
(52) Progress [[must]] be shared.
(53) We [[must]] face these tensions squarely.
(54) We [[must]] finally confront together.
Obama shows a great degree of commitment in demanding all the parties’ involve-
ment by using the modal must to appeal to each of the countries that are addressed:
(55) Palestinians [[must]] abandon violence.
(56) Israel [[must]] also live up to its obligations to ensure that Palestinians can
live, and work, and develop their society.
(57) Finally, the Arab States [[must]] recognize that the Arab Peace Initiative was
an important beginning.
The modal can, of course part of the logo of the presidential campaign, ‘Yes we
can’, is used extensively by Obama to point out how peace can be achieved in
partnership,
(58) That is the world we seek. But we [[can]] only achieve it together
(59) I will host a Summit on Entrepreneurship this year to identify how we [[can]]
deepen ties between business leaders, foundations and social entrepreneurs
in the United States and Muslim communities around the world.
(60) The people of the world [[can]] live together in peace.
Can is also used to refer to the doubt or skepticism shown by certain communities
with regard to the process of peace, as in examples (61) and (62):
(61) Muslim and non-Muslim – who question whether we [[can]] forge this new
beginning.
(62) Many more are simply skeptical that real change [[can]] occur.
5.1.4 Objective Explicit and Implicit Stance
The results show that markers of stance in these categories are virtually not used by
either Bush or Obama (2 and 3 markers by each politician as shown in the results
section above). This may be interpreted as a preference in the two political speeches
for Subjective stance, which allows for the display of a subjective identity and posi-
tioning which may have a stronger persuasive effect. This is clear in the co-
occurrence of personal pronouns with mental verbs and modal verbs in the two
speeches.
5.2 Negation
The last section of the Discussion addresses the role of negation in the two speeches.
As observed in previous sections, negation is significantly more frequent in Obama’s
speech than in Bush’s. The use of negation in Bush’s speech is illustrated in exam-
ples (63) to (69) below:
(63) They hate your government because it does [[not]] share their dark vision.
(64) We will [[not]] abandon you to terrorists or extremists.
(65) History teaches us that the road to freedom is [[not]] always even, and
democracy does not come overnight.
(66) No terrorist or tyrant [[can]] take that away.
(67) The road to freedom is not always even, and democracy does [[not]] come
overnight.
(68) Most people do [[not]] want war and bloodshed and violence.
(69) They say that the Arab people are [[not]] “ready” for democracy.
The examples above show a use of negation to introduce general statements (the
road to freedom is not always even, and democracy does not come overnight, most
people do not want bloodshed and violence) or to present the dark vision of the rela-
tion between the US and the Middle East, where loaded lexical items such as hate
and terrorist stand out (examples (63) and (64)).
Negation in Obama’s speech is used extensively to correct previous assumptions
on the relations between the US and the Middle East, as well as to correct assump-
tions the US may have about Arabic countries and Arabic countries may have on
the US:
5.2.1 C
orrecting Assumptions About Relations Between the US
and Arabic Countries
(70) America and Islam are [[not]] exclusive, and need not be in competition.
(71) You must maintain your power through consent, [[not]] coercion.
(72) Just as Muslims do [[not]] fit a crude stereotype, America is [[not]] the crude
stereotype of a self-interested empire.
(73) America and Islam must be based on what Islam is, [[not]] what it isn’t.
(74) So whatever we think of the past, we must [[not]] be prisoners of it.
(75) But we should choose the right path, [[not]] just the easy path.
Numerous uses of negation consist of two-part structures in which an assumption is

negated and a new idea is introduced, as in ‘we should choose the right path, [[not]]
just the easy path’.
5.2.2 C
orrecting Assumptions Arab Countries Are Thought to Hold
About the US
Negation is used to correct assumptions held by Arab countries on the US, as in the
following examples:
(76) Those are [[not]] just American ideas, they are human rights, and that is why
we will support them everywhere.
(77) These are not opinions to be debated; these are facts to be dealt with. Make
no mistake: we do [[not]] want to keep our troops in Afghanistan. We seek
no military bases there.
(78) America does [[not]] presume to know what is best for everyone, just as we
would not presume to pick the outcome of a peaceful election.
(79) In Ankara, I made clear that America is [[not]] – and never will be – at war
with Islam.
5.2.3 C
orrecting Assumptions the US are Thought to have of Arab
Countries
Negation is used to correct assumptions held by the US on countries in the Middle

East, especially Arab countries.
(80) The attacks of September 11th, 2001 and the continued efforts of these
extremists to engage in violence against civilians has led some in my
country to view Islam as inevitably hostile [[not]] only to America and
Western countries, but also to human rights.
(81) Islam is [[not]] part of the problem in combating violent extremism – it is an

important part of promoting peace.
5.2.4 R
einforcement of His Personal Commitment and Personal Story
as Exemplary and Positioning of the US Government
Finally, Obama makes use of negation to reinforce his personal commitment to his
new policy and to make reference to his personal experience as a guarantee of his
position towards the Arab World.
(82) No system of government can or should be imposed upon one nation by any
other. That does [[not]] lessen my commitment, however, to governments
that reflect the will of the people.
(83) Much has been made of the fact that an African-American with the name
Barack Hussein Obama could be elected President. But my personal story
is [[not]] so unique.
(84) You must maintain your power through consent, [[not]] coercion; you must
respect the rights of minorities, and participate with a spirit of tolerance and
compromise.
(85) Just as Israel’s right to exist cannot be denied, neither can Palestine’s.
The United States does [[not]] accept the legitimacy of continued Israeli
settlements.
6 Conclusions
The present study has addressed the differences between the stance styles of Bush’s
and Obama’s addresses to the Arab World in 2008 and 2009 respectively. In the
quantitative results and the qualitative discussion, it has been argued that the higher
frequency of subjective stance markers (mental verbs and modal verbs) and their
co-occurrence with first person pronouns and negation in Obama’s discourse can be
interpreted as an attempt to recontextualize and ‘re-imagine’ the position of the US
policy towards the Middle East. The combination of stance markers and negation in
Obama’s speech evokes commonly held assumptions about the relations between
the US and the Middle East in order to correct these assumptions and propose ‘A
new beginning’ based on cooperation and partnership. Obama’s preference for the
first person pronoun I and his more frequent use of epistemic and deontic modality
show a more personal involvement both with the topic addressed and with his audi-
ence, revealing an attempt to engage actively and personally in a change in the rela-
tions between the US and the Middle East through cooperation and partnership.
Bush’s speech, with a significantly lower frequency of stance markers and nega-
tion and a preference for second person pronouns instead of first person pronouns,
shows a more conventional discourse. The low frequency of negation seems to indi-
cate that there is no need to deconstruct previous assumptions about the status quo
in US international affairs. With regard to the low frequency of modal markers, and
in particular of deontic modals, together with a higher frequency of second person
pronouns, Bush’s speech shows a preference for unmodalized assertions, and con-
sequently, a more authoritative stance. This authoritative stance is reinforced by the
use of the impersonal you and the first person plural we referring to the US govern-
ment, which tend to indicate an avoidance of responsibility on the part of the
speaker.
In brief, while Bush’s stance in his Abu Dhabi speech seems to maintain a previ-
ously accepted status quo in the relations between the US and the Middle East,
Obama’s stance in his Cairo speech seems to be staged as a deconstruction of previ-
ously held assumptions. Obama’s speech can be interpreted as a riskier speech
which relies heavily on the identity and personal projection of the speaker, sup-
ported by the high frequency of first person pronouns referring to his personal iden-
tity, personal I, presidential I, we as the US government and we to refer to all the
parties concerned in the proposed change of international policy. The high fre-
quency of negation suggests that the topic is perceived as controversial and a strong
stance is required. Hence, the politician’s voice is positioned against previously held
assumptions which need to be corrected, though it shows that his frequent use of
markers of modality shows that he is open to other possible alternatives. Indeed,
seen in the light of recent changes in the US political scenario, the present study is
also open to alternative interpretations.
Acknowledgements This study has been carried out as part of the research work of two research
projects, the first one funded by the Ministerio de Ciencia e Innovación (FFI-2008-01471FILO),
and the second one funded by the Ministerio de Economía y Competitividad (FFI-201-30790) to
whom we are grateful.
References
Bakhtin, M. M. (1935 [1981]). The dialogic imagination: Four essays (M. Holquist, Ed. and
C. Emerson & M. Holquist, Trans.). Austin: University of Texas Press.
Biber, D., Johansson, S., Leech, G., Conrad, S., & Finegan, E. (1999). The Longman grammar of
spoken and written English. London: Longman.
Biber, D., & Finegan, E. (1988). Adverbial stance types in English. Discourse Processes, 11, 1–34.
Boyd, W. (2014a). (New) participatory framework on YouTube? Commenter interaction in US

political speeches. Journal of Pragmatics, 72, 46–58.
Boyd, W. (2014b). Participation and recontextualisation in New Media Political Discourse Analysis
and YouTube. In B. Kaal, I. Maks, & A. van Elfrinkhof (Eds.), From text to political positions
text analysis across disciplines (pp. 245–268). John Benjamins: Amsterdam.
Bush, G. Jr. (2008). Address in Abu Dhabi on ‘Freedom and Extremism in the Middle East’.
Accessed at http://www.americanrhetoric.com/speeches/gwbushabudhabi.htm
Charteris-Black, J. (2004). Corpus approaches to critical metaphor analysis. Palgrave/MacMillan:
Basingstoke.
Charteris-Black, J. (2005/2011). Politicians and rhetoric. The persuasive power of metaphor.
Palgrave/MacMillan: Basingstoke.
Chilton, P. (2004). Analysing political discourse: Theory and practice. London: Routledge.
Du Bois, J. W. (2007). The stance triangle. In R. Englebretson (Ed.), Stancetaking in discourse
(pp. 139–182). Amsterdam: John Benjamins.
Englebretson, R. (Ed.). (2007). Stancetaking in discourse. Amsterdam: John Benjamins.
Evans, V., & Chilton, P. (Eds.). (2010). Language, cognition and space: The state of the art and
new directions. London: Equinox Publishing.
Fairclough, N. (1989). Language and power. London: Longman.
Fairclough, N. (1992). Discourse and social change. Cambridge: Polity Press.
Fairclough, N. (2010). Critical discourse analysis: The critical study of language. London:
Longman.
Fetzer, A. (Ed.). (2013). The pragmatics of political discourse. Amsterdam: John Benjamins.
Givón, T. (1993). English grammar: A function-based introduction, I & II. Amsterdam: John
Benjamins.
Halliday, M. A. K. (1994). An introduction to functional grammar (2nd ed.). London: Arnold.
Halliday, M. A. K., & Matthiessen, C. M. I. M. (2004). An introduction to functional grammar (3rd
ed.). London: Arnold.
Hidalgo Downing, L. (2014). The role of modal-negative synergies in Charles Darwin’s The
Origin of Species. In G. Thompson & L. Alba-Juez (Eds.), Evaluation in context (pp. 259–
279). Amsterdam: John Benjamins.
Hidalgo Downing, L., & Núñez-Perucha, B. (2013). Modality and personal pronouns as indexical
markers of stance: Intersubjective positioning and construction of public identity in media
interviews. In J. I. Marín-Arrese, M. Carretero, J. Arús Hita, & J. van der Auwera (Eds.),
English modality: Core, periphery and evidentiality (pp. 379–410). Berlin: Mouton de Gruyter.
Hunston, S., & Thompson, G. (Eds.). (2000). Evaluation in text: Authorial stance and the con-
struction of discourse. Oxford University Press: Oxford.
Jordan, M. (1998). The power of negation: Text, context and relevance. Journal of Pragmatics,
29(6), 705–752.
Linell, P. (2009). Rethinking language, mind, and world dialogically: Interactional and contextual
theories of human sense-making. Charlotte: Information Age Publishing.
Marín-Arrese, J. I. (2015). Epistemicity and stance: A cross-linguistic study of epistemic stance
strategies in journalistic discourse in English and Spanish. Discourse Studies, 17(2),
210–225.
Marín-Arrese, J. I., Carretero, M., Arús Hita, J., & van der Auwera, J. (Eds.). (2013). English
modality: Core, periphery and evidentiality. Mouton de Gruyter: Berlin.
Martin, J., & White, P. (2005). The language of evaluation. Appraisal in English. New York:
Palgrave.
Obama, B. (2009). Address in Cairo on ‘A new beginning’. Accessed at http://www.nytimes.
com/2009/06/04/us/politics/04obama.text.html?pagewanted=all&_r=0
Rayson, P. (2008). WMatrix: A web-based corpus processing environment. Lancaster: Computing
Department, Lancaster University. http://ucrel.lancs.ac.uk/matrix
Shakespeare, W. (1991). Julius Caesar. Dover drift editions.
Semino, E., Deignan, A., & Littlemore, J. (2013). Metaphor, genre and recontextualization.
Metaphor and Symbol, 28(1), 41–59.
Simon-Vandenbergen, A. M., & Aijmer, K. (2007). The semantic field of modal certainty. A
corpus-based study of English adverbs. Berlin: Mouton de Gruyter.
Thompson, G. (2004). Introducing functional grammar (2nd ed.). London: Hodder Education.
Thompson, G., & Alba-Juez, L. (Eds.). (2014). Evaluation in context. Amsterdam: John Benjamins.
Wilson, J. (1990). Politically speaking. Oxford/Cambridge MA: Basil Blackwell.
The Role of Metadiscourse in Genre Analysis:
Engagement Markers in Undergraduate
Textbooks and Research Articles
Tereza Guziurová
Abstract Metadiscourse was probably first introduced into applied linguistics in

the 1980s and it has attracted continuous interest ever since, despite the fact that
some researchers point out its theoretical and methodological shortcomings.
Drawing on the model introduced by Hyland (Metadiscourse, Continuum, London/
New York, 2005) which presents metadiscourse as one of the significant attempts to
conceptualize the interpersonal aspects of language, this study aims to compare two
academic genres, undergraduate textbook and research article, within one disci-
pline – linguistics. It specifically focuses on one category, engagement markers, and
compares their occurrence and use in the two genres. The results have shown that
the most frequent engagement marker is inclusive we, which however plays a differ-
ent role in both genres. The study also assesses the potential advantages and draw-
backs of the integrative approach to metadiscourse.
Keywords Academic writing • Metadiscourse • Genre analysis • Research article •

Undergraduate textbook • Engagement markers
1 Introduction
In the last 30 years or so, metadiscourse has been used quite frequently as a tool for
characterizing various genres, especially the genres of academic discourse. It might
be surprising that its popularity endures despite numerous definitions of the concept
that vary significantly. It is probably not possible to say any more that metadis-
course is “undertheorized”, as Hyland and Tse suggested in 2004; however, the
differences between its conceptualizations are not negligible, but have important
methodological and practical implications for research. The common aspect of a
number of earlier definitions was their stress on the non-propositionality of meta-
discourse. Crismore et al., for example, defined it as “linguistic material, written or
T. Guziurová (*)
Faculty of Arts, Centre for the Research of Professional Language, University of Ostrava,
Ostrava, Czech Republic
e-mail: tereza.guziurova@osu.cz

DOI 10.1007/978-3-319-54556-1_10
212 T. Guziurová
spoken, which does not add anything to the propositional content” (1993, p. 40).
However, this criterion has not proved to be satisfactory, so writers have instead
looked for a definition of metadiscourse in Halliday’s theory of metafunctions.
Halliday’s ideational metafunction was connected with propositional content; meta-
discourse, on the other hand, was believed to convey either interpersonal or textual
meanings (Vande Kopple 1985, pp. 84–85).
Basically, it is possible to identify two approaches to metadiscourse1 – integra-
tive and non-integrative – depending on whether it includes only text-organizing
elements and elements referring to the text itself, or also the writer’s epistemic and
affective attitude to the text and interaction with the reader. The integrative
(“broad”) approach investigates linguistic elements revealing how the text is orga-
nized, but it also focuses on the writer’s presence in the discourse, the ways in which
he or she comments on the text or expresses his or her attitudes towards it. The
broad approach includes ‘stance’ as a metadiscursive category, usually under head-
ings such as validity markers (hedges, emphatics) and attitude markers. This strand
has been applied in many studies, and it was probably taken furthest by Hyland,
who suggests that “all metadiscourse is interpersonal in that it takes account of the
reader’s knowledge, textual experiences and processing needs” (Hyland 2005,
p. 41).
The non-integrative (“narrow”) approach primarily investigates aspects of
text organization and elements referring to the text itself. This conception of meta-
discourse does not include the writer’s presence in the text in general; rather, meta-
discourse is an explicit expression of a writer’s awareness of the current text.
Therefore, it is defined as “the writer’s explicit commentary on his or her own ongo-
ing text” (Mauranen 1993, p. 154). The second important feature of the non-
integrative approach is the ‘current text’, meaning that references to other texts are
not included within this approach. The narrow approach has been applied (with
some modifications) for example by Mauranen (1993), Schiffrin (1980) and Bunton
(1999).
Drawing on the integrative approach to metadiscourse, this study focuses on one
category, engagement markers, and aims to compare their occurrence and use in two
academic genres, the research article and the undergraduate textbook, within one
discipline – linguistics. Since the analysis is part of a larger research project aiming
at the description of metadiscursive features in the undergraduate textbook, the
chapter also briefly discusses the integrative approach to metadiscourse, represented
especially by Hyland (2005). Despite its popularity among researchers (e.g.
Crismore and Farnsworth 1990; Luukka 1994; Hyland 2005; Bondi 2001; Boggel
2009; Kuhi 2012), the integrative approach has recently been criticized for covering
disparate language phenomena that cannot easily be put under one umbrella term
(text-organizing elements, the expression of stance, writer-reader interaction).
1
The two approaches were probably first distinguished by Mauranen (1993). For a detailed discus-
sion, see, for example, Ädel (2006).
The Role of Metadiscourse in Genre Analysis: Engagement Markers in Undergraduate… 213
Therefore, the study also assesses the potential advantages and drawbacks of this
model.
2 Data and Methodology
The investigation is carried out on the basis of a corpus consisting of seven under-
graduate textbooks and eight research articles written by native speakers of English.
They are all from one discipline, linguistics, in order to avoid the disciplinary varia-
tion in metadiscourse shown by previous studies. The textbooks were published
between 1997 and 2010 and they are all “introductory” textbooks in the sense that
they are regarded as introductions to linguistics or relevant linguistic disciplines
(e.g. phonology, morphology). One or two chapters from each textbook were ana-
lyzed so that the total number of words would be approximately the same (see
Table 1 below). The resulting material consists of over 51,500 words altogether. The
research articles were taken from two well-established journals, English for Specific
Purposes and Journal of Pragmatics, published between 2002 and 2008. Eight
complete articles have been analysed, totalling 53,145 words (see Table 1). All the
articles were selected randomly, but they needed to comply with two criteria: to be
written by native speakers of English, and to be single-authored. The second crite-
rion is particularly important for the present analysis of engagement markers
because one of the aims was to find out how the pronoun we functions in both
genres. Since all the textbooks were single-authored, I tried to compile a parallel
corpus of research articles.
As mentioned above, this study is part of larger research focusing on metadis-
course in the genres of undergraduate textbook and research article. Table 2 below
shows the overall distribution of metadiscourse categories in my corpus. The indi-
vidual categories overlap with Hyland’s (2005), with the exception of transitions in
which I followed Mauranen’s approach (1993) and considered only inter-sentential
connectors. The present study thus applies the integrative approach to metadis-
Table 1 The composition of the corpus

Number of Number of
Textbooks words Research articles words
TB1 Meyer 2009 8003 RA1 Banks 2005 3614
TB2 Knowles 1997 7384 RA2 Moore 2002 6766
TB3 Yule 2010 7458 RA3 Flowerdew 2003 5732
TB4 Lyons 2002 6881 RA4 Charles 2006 6930
TB5 Roach 2006 7258 RA5 Crossley 2007 7045
TB6 Penhallurick 2003 7247 RA6 Spencer-Oatey 2007 7528
TB7 Lieber 2010 7291 RA7 House 2006 8271
RA8 Lumsden 2008 7259
Total 51,522 53,145
214 T. Guziurová
Table 2 The distribution of metadiscourse in textbooks and RAs compared

Textbooks Research articles
Total Items per Total Items per
Category number 1000 words SD number 1000 words SD
Transitions 170 3.3 1.49 181 3.4 0.79
Code glosses 507 9.8 2.52 276 5.2 1.81
Endophorics 242 4.7 3.21 311 5.9 2.15
Frame markers 116 2.3 1.26 121 2.3 0.85
Evidentials 232 4.5 3.31 389 7.3 1.91
Hedges 412 8.0 1.96 586 11.0 2.53
Boosters 156 3.0 1.16 156 2.9 0.85
Attitude markers 196 3.8 1.78 311 5.9 1.38
Self mention 37 0.7 0.80 164 3.1 3.20
Engagement 752 14.6 10.85 271 5.1 4.60
markers
Total 2820 54.7 2766 52.1
Note: SD standard deviation
course, which covers both textual and interpersonal features. However, it does not
aim to discuss all the categories; rather, it focuses on engagement markers which, as
Table 2 indicates, account for the largest proportion of metadiscourse elements in
the textbooks (14.6 items per 1000 words) and they also present the greatest differ-
ence between the genres, with only 5.1 items per 1000 words in the research articles.
The quantitative results are a little higher than the data from Hyland’s research car-
ried out on larger corpora – 8.4 in textbooks and 2.5 in RAs (2005, p. 144; 162), but
the ratio is similar.
As for the individual categories, Hyland (2005, pp. 50–54) characterizes them as
follows:
Transition markers are mainly connectives and adverbial phrases which help read-
ers interpret connections between steps in an argument (e.g. furthermore, in
addition, but).
Frame markers signal text boundaries (e.g. first, next, finally). Items included here
also announce discourse goals (my purpose is) and explicitly label text stages (to
summarize).
Endophoric markers are expressions that refer to other parts of the text (e.g. noted
above, in chapter 1).
Evidentials express the intertextual character of academic writing and are defined
as representations of an idea from another source.
Code glosses supply additional information by rephrasing or explaining what has
been said (e.g. in other words, for example).
Hedges allow writers to withhold complete commitment to a proposition (e.g.
might, perhaps).
Boosters allow writers to express certainty in what they say (e.g. clearly,
obviously).
Attitude markers express the writer’s affective, rather than epistemic, attitude to
propositions (e.g. I agree, unfortunately).
Self mention concerns the explicit authorial presence in the text expressed by first-
person pronouns and possessive adjectives (I, exclusive we, me, our etc.).
Engagement markers explicitly address readers in order to focus their attention or
include them as discourse participants (e.g. inclusive we).
It should be noted that the quantitative approach to metadiscourse has some limi-
tations. Characterizing genres by the amount of metadiscourse elements can be
problematic because while the quantitative differences may be significant, it is also
the qualitative differences that seem to be genre-specific. For example, the number
of hedges was expected to be significantly higher in RAs than in textbooks because
the former is more argumentative in nature; it has been described as “the arena for
conflicting views” (Myers 1992, p. 6). While this has been proved, the analysis has
shown that textbooks also include a considerable number of hedges, but of different
type. The most frequent hedging devices in the RAs proved to be epistemic lexical
verbs, whereas the textbooks showed the highest incidence of adverbs, specifically
approximators (see Guziurová 2014). Therefore, the quantitative analysis will only
be a starting point, and I will focus on the different functions metadiscourse ele-
ments fulfil in the two genres. The next section deals with engagement markers and
their use in the corpus; specifically it discusses the use of pronoun you, imperatives,
questions and it focuses on the pronoun we, which proved to be the most frequent
engagement marker.
3 Engagement Markers
Engagement markers are defined as devices that “explicitly address readers, either
to focus their attention or include them as discourse participants” (Hyland 2005,
p. 53). Two main purposes are distinguished: firstly, to meet readers’ expectations
of inclusion and disciplinary solidarity, addressing them as participants in the dis-
course (mainly by the pronouns you, your and inclusive we); secondly, to engage
readers in the discourse at critical points, predicting possible objections and guiding
them to particular interpretations (by questions and directives such as note, con-
sider) (Hyland 2005). Table 3 shows an overall distribution of engagement markers
in the undergraduate textbooks and research articles in the corpus.
The most common device functioning as an engagement marker is the pronoun
we (inclusive we, but see below), accounting for approximately 70% of engagement
markers in both genres, and it will thus be the main focus of this study. The other
engagement markers included the second person pronouns, questions, imperatives
and various comments whose function is to address readers as participants; they will
be discussed shortly in the following passage. Hyland’s list of engagement markers
also includes obligation modals; however, these were mostly regarded as attitude
markers in the analysis since they often seem to express the writer’s attitude to
216 T. Guziurová
Table 3 Engagement markers in undergraduate textbooks and research articles

Textbooks Research articles
Total Items per 1000 Total Items per 1000
Category number words % number words %
WE (+ our, us) 525 10.19 69.8 194 3.65 71.6
YOU (+ your) 127 2.46 16.9 19 0.36 7.0
Imperatives 55 1.07 7.3 18 0.34 6.6
Questions 36 0.70 4.8 39 0.73 14.4
Others 9 0.17 1.2 1 0.02 0.4
Total 752 14.59 100 271 5.10 100
Table 4 Distribution of engagement markers (EM) in undergraduate textbooks (raw number/

number per 1000 words)
Textbooks TB1 TB2 TB3 TB4 TB5 TB6 TB7
EM 16/2.0 10/1.3 183/24.5 48/7.0 190/26.2 150/20.7 155/21.3
Table 5 Distribution of engagement markers (EM) in research articles (raw number/number per
1000 words)
RAs RA1 RA2 RA3 RA4 RA5 RA6 RA7 RA8
EM 9/2.5 15/2.2 4/0.7 11/1.6 0/0 64/8.5 92/11.1 76/10.5
propositions, as in the following sentence: “Clearly, these findings need to be veri-

fied with much larger corpora.” (RA4) It is true that certain structures with obliga-
tion modals, such as “It should be noted that”, are aimed at readers, signalling that
an important point is coming, but since they do not address readers explicitly, they
were not regarded as engagement markers but rather as attitude markers, expressing
emphasis.
As for the distribution of engagement markers, the individual textbooks and RAs
differed in their use (see Tables 4 and 5 ). Two textbook authors used them occasion-
ally while four of them frequently engaged readers in their texts, mainly by means
of personal pronouns. It is interesting to note that RAs also differed in their use;
three articles published in the Journal of Pragmatics (RA6, RA7, RA8; see Table 5
below) included an overwhelming majority of engagement markers – 232 out of the
total 271, which makes 86%. It could be argued that not only disciplinary conven-
tions, but also the house style of the journal affects the use of metadiscourse. It is
possible that the papers in the Journal of Pragmatics, which addresses “a number of
questions that are essential to our understanding of how language works in com-
municative and social interaction”,2 are more likely to contain engagement markers,
especially inclusive we. The corpus is, however, too small to draw any general
conclusions.
http://www.journals.elsevier.com/journal-of-pragmatics/. Accessed on 6th September 2014.

2
3.1 Pronoun You, Imperatives, Questions
The pronoun you accounts for almost 17% of engagement markers in the text-
books, but only 7% in the RAs. The low frequency especially in the RAs is not
surprising since generic you is considered rather informal (Quirk et al. 1985) and
you with a specific reference refers directly to the addressee but excludes the
speaker. As Kuo (1999) suggests, “from the perspective of the reader-writer rela-
tionship in a journal article, you could sound offensive or detached since it separates
readers as a different group from the writer”. Since the writer appeals to the readers’
approval of the claims he makes, it is not desirable to use exclusive you – it is his
peers in the same disciplinary community he addresses. Second-person you and
your thus occurred only in a single article from the Journal of Pragmatics, in which
the author wanted to illustrate the cooperative principle and opted for an example
showing the alternation of I and you:
(1) In a situation where you wish to borrow my car, ask how it is running and
I say, “I have just had it thoroughly checked,” you would take me to be
implicating that it is in good order. That depends on you supposing I am
being cooperative […].3 (RA8: 1898)
The other occurrences of you were used in similar hypothetical examples.

In the textbooks, the pronoun you was typically used in longer examples, illus-
trating a point in the exposition. For instance, in explaining the concept of presup-
position, the textbook author used the second person pronoun, addressing students
directly and making it more personal:
(2) If someone tells you Your brother is waiting outside, there is an obvious
presupposition that you have a brother. If you are asked Why did you arrive
late?, there is a presupposition that you did arrive late. (TB3: 133)
In addition, the author can address his or her readers (students) directly if they
can try and practice certain things by themselves, as in the description of articulators
in the chapter on phonetics or in the production of certain consonants (example 3).
(3) The hard palate is often called the “roof of the mouth”. You can feel its
smooth curved surface with your tongue. (TB5: 9)
However, the pronoun you might also reflect the unequal relationship between
participants. In example 4, the writer fulfils the role of an expert who gives training
In all cases, emphasis in the examples is mine.

3
218 T. Guziurová
to novices (addressing them directly by a personal pronoun), who might however

have difficulties in understanding it.
(4) If you have not thought about such things before, you may find some difficulty
in understanding the ideas that you have just read about. (TB5: 40)
Moreover, the distribution of power is clearly visible in phrases such as you

should remember.
Imperatives can be considered “risky face strategies” (Hyland 2000, p. 126)
since they direct the audience to some action, but in textbooks such a display of
authority is probably presupposed. The use of imperatives was higher in textbooks
(1.07 items per 1000 words) than in RAs (0.34 items per 1000 words); moreover,
half of the instances in the articles took the form “let us (consider)”, making it more
into a suggestion. The most common imperative in the textbooks was consider,
which is regarded as a cognitive act by Hyland (2002), initiating readers into a new
argument. See, which was also quite common in both genres, is regarded as an endo-
phoric marker in this study rather than an engagement marker since it directs read-
ers to another part of the text or to another text, which is the primary function of
endophoric markers (5):
(5) This was not the role in the fourteenth century of the church (see section
5.1), nor of a French-speaking court. (TB2: 53)
Generally, directives (as Hyland calls them) can be used to guide readers through
the discourse, but more importantly to guide readers’ reasoning, making sure they
understand a point in a certain way (e.g. note, think of, compare).
Finally, questions are regarded as a good strategy to engage readers in a dis-
course. Textbook authors used questions mainly for instructional purposes; they
asked a question and immediately provided an answer, as in example 6. However, in
some cases the questions were not so simple to answer, but their main purpose was
to promote the readers’ interest and show them that the topic is complex. The ques-
tions in example 7 also served as an introduction to a new topic, as they were situ-
ated at the beginning of a chapter. Interestingly, questions appeared several times in
clusters, adding emphasis to the writer’s claims and stressing that despite being an
expert he can only speculate about certain things (example 8). Together with inclu-
sive we, they might evoke an atmosphere of solidarity.
(6) How do we usually know which meaning is intended in a particular sentence?

We usually do so on the basis of linguistic context. (TB3: 129)
(7) But what is it that makes us doubt the realness of the reality we experience?
Could it be language? (TB6: 70)
(8) If we want to test the realness of this reality, how can we? By stepping outside
language? Is that possible? (TB6: 72)
Surprisingly, the research articles in the corpus contained more questions than
textbooks. However, these questions did not function as typical engagement mark-
ers but it seems more appropriate to say that they presented the issues which the
authors aimed to explore.
(9) Surprisingly, however, there has been very little explicit consideration of the
interrelationship between the two concepts. For example, to what extent are
identity and face similar or different? How may theories of identity inform
our understanding of face, and how may they aid our analyses of face?
This paper takes up the challenge of exploring these questions. (RA6: 639)
The authors presented the questions that were addressed in their articles or, alter-
natively, the questions that should be addressed in further research.
3.2 Pronoun We
Much attention has been given to the use of pronoun we in academic writing (e.g.
Kuo 1999; Myers 1989; Harwood 2005; Fløttum et al. 2006). The pronoun we has
been traditionally studied in terms of the different discourse functions it fulfils in
journal articles (e.g. Kuo 1999; Harwood 2005), or in other genres of academic
discourse, even in the context of mathematics classrooms (Rounds 1987). Two
aspects are regularly investigated in connection with we: semantic reference (who
the pronoun refers to) and discourse functions or rhetorical motivations for its use.
This subchapter is going to focus on the semantic reference and the use of pronoun
we mostly in the genre of undergraduate textbook because it was used rather fre-
quently in that genre in the corpus, accounting for 70% of all engagement
markers.
The pronoun we is traditionally perceived as having two semantic functions:
inclusive, in which the addressee is included (I + you), and exclusive, in which the
addressee is excluded (I + they) (Rounds 1987). In addition, special uses of we are
sometimes distinguished, for example ‘inclusive authorial we’ and ‘editorial we’
(Quirk et al. 1985, p. 350). Fløttum et al. (2006) acknowledge the fundamental ref-
erential vagueness of we; it includes the author(s), but there is variation in terms of
whether the reader is included and whether others are, as well as who these potential
others might be (p. 95). Furthermore, the pronoun can be used figuratively, referring
to a single author or even to readers. Rounds (1987) speaks about “semantic remap-
pings” (traditional inclusive and exclusive we are within the range of semantic map-
ping for we) and she recognizes three such cases: (1) we used about the speaker (we
220 T. Guziurová
Table 6 Semantic references Semantic references of WE in the textbooks %

of WE in undergraduate
People (language users) 104 19.8
textbooks
Writer + readers 303 57.7
Writer 41 7.8
Writer + other linguists 72 13.7
Students 5 1.0
Total 525 100
for I), (2) we used about the readers (we for you), and (3) we whose actual referent
is anyone who does the action.
There were 525 instances of the pronoun we (our, us) in the corpus which were
considered explicit or implicit engagement markers. However, their semantic refer-
ents varied and the inclusive/exclusive distinction was not sufficient to describe all
of them, especially due to the fact that all the textbooks were single-authored.
The range of semantic reference of we is given in Table 6; however, the division
is only tentative since the pronoun can be ambiguous, which might be one of the
reasons why it is so popular in academic writing.
The majority of instances referred to the writer and his readers. Writers orga-
nized their texts, stating what was done or what is going to follow, as in the follow-
ing examples:
(10) In the previous chapter, we focused on conceptual meaning. (TB3: 127)
(11) In the next chapter, we will investigate what these normal stages are.
(TB3: 166)
These structures thus function as endophoric markers, referring to other parts of

the text and guiding readers through the discussion, which should facilitate compre-
hension. Writers also remind their readers of the salient parts of the text, often sum-
marizing the previous passage (example 12). Moreover, the inclusive phrase as we
shall see is supposed to “enhance the reader-friendliness of the text and construct
positive politeness by treating the readership as equals” (Harwood 2005, p. 362).
(12) As we have seen, the Neogrammarians took the view that linguistics, in so
far as it is scientific and explanatory, must necessarily be historical.
(TB4: 218)
Another expression of positive politeness is showing solidarity with readers,

which also mitigates the asymmetrical relationship between the writer and the stu-
dents. Example 13, which indicates that “slips of the tongue” are commonly expe-
rienced both by the writer and by the readers showing that teachers are human as
well (contrary to what some students may think), is followed by the humorous com-
ment with the aim of relating the writer to students even more. Even though this we
is more general in that it does not include only the writer and the readers, but also
other people, the interactive character is foregrounded (I + we) since it appeals to

our common experience. Thus it is perhaps less general than the use of we described
below (we as human beings) when authors make general statements about language,
using present tense (see examples 16 and 17). It corroborates Wales’ assumption
that the distinction between specific and generalized we is not clear-cut but rather
forms a scale (1996, p. 59; see below).
(13) We have all experienced difficulty, on some occasion(s), in getting brain and
speech production to work together smoothly. (Some days are worse than
others, of course.) (TB3:160)
Inclusive pronouns also cluster around longer examples or pictures that are sup-
posed to illustrate a point in the exposition. The chapter on pragmatics starts with a
brief definition and then continues with two pictures which should illustrate that
often more is being communicated than is said (example 14). The passage ends with
a more generalised statement that we (human beings) are actively involved in the
interpretation of what we read and hear.
(14) In the other picture, […] we can recognize an advertisement for a sale of
clothes for those babies and toddlers. The word clothes doesn’t appear in
the message, but we can bring that idea to our interpretation of the message
as we work out what the advertiser intended us to understand. We are
actively involved in creating an interpretation of what we read and hear.
(TB3: 129)
Another strategy is what Kuo (1999) calls “assuming shared knowledge, goals,
beliefs, etc.” (p. 131). He analysed 36 scientific journal articles and found that this
discourse function was the most frequent one for inclusive we in his corpus.
Textbook authors also used this strategy in order to mark that readers’ background
knowledge can be presupposed as well as their ability to follow the arguments.
(15) When he [Caxton] transferred his printing press to England, he set it up in

Westminster, which was not only close to the Chancery, but also the ideal
place for him to contact his aristocratic customers. With the benefit of
hindsight we know he was successful. (TB2: 60)
In approximately a fifth of all the cases, we referred to the shared experience of

us as people, or rather as language users (16). Occasionally, this was stated explic-
itly (17).
(16) All the sounds we make when we speak are the result of muscles contracting.
(TB5: 8)
(17) Some researchers have noted that, as language-users, we all experience

occasional difficulty in getting the brain and speech production to work
together smoothly. (TB3: 165)
222 T. Guziurová
We thus includes the writer, his or her readers and all the other people using lan-
guage. Wales (1996) distinguishes between specific exophoric reference of we,
functioning within the immediate context of situation, and generalised/homophoric
we, functioning within the context of culture (p. 59). The distinction between them
is not clear-cut, but rather resembles a continuum. Wales also points out that “even
generalised reference has a strong inter-personal base, speaker- or addressee-
oriented, reflecting we and you’s origins” (ibid.). The use of generalised we thus
enables us to identify writers and readers as part of a group of human beings (in
linguistics textbooks, language users), which serves an educational aim. Relating
the topic to students, showing how we as human beings use language, helps to make
the exposition more interesting, relevant and approachable.
Another use of we seems to be referring solely to the writer, and thus it would
seemingly be exclusive. This includes instances such as we have emphasized, we
have described, we can say that etc., and since in each textbook there was only one
author, it could be regarded as the so-called ‘authorial we’. However, as Rounds
(1987) has pointed out, these instances could potentially be interpreted by the
addressee as inclusive signs, since it is a common teacher’s strategy to talk about
our discussion, even though it is the teacher who makes the exposition. The teacher
wants to make students part of a potential dialogue, as he or she would do in a
classroom.
Furthermore, the pronoun we in the textbooks referred to the whole community
of ‘linguists’ or ‘experts’. Here writers were usually discussing concepts or speak-
ing as members of a disciplinary community (18).
(18) We use the term speech act to describe actions such as “requesting,”
“commanding,” “questioning” or “informing.” We can define a speech act
as the action performed by a speaker with an utterance. If you say, I’ll be
there at six, you are not just speaking, you seem to be performing the
speech act of “promising.” (TB3: 133)
Rounds (1987) interestingly observes that since we is actually ambiguous in

terms of reference, it can contribute to the equal relationships between participants.
On one hand, the teacher can identify him- or herself as a member of the group of
students and express cooperation; on the other hand, he or she could identify him- or
herself as a member of a discourse community, an expert that can perform defining
or naming. Therefore, a student may potentially feel as if s/he is a member of that
community, albeit a junior one (see Fig. 1). The writer then continues his or her
exposition with an example, addressing the students directly.
LINGUISTS + T + STUDENTS
T = teacher
Fig. 1 Semantic remappings (Rounds 1987)

The writers could have opted for a different means of expression – the previous
examples could have been formulated impersonally. Example (18) could have been
rephrased as The term speech act is used […]. It can be defined as […]. The writer
nevertheless decided to use inclusive we. He invites his readers to be part of a scien-
tific discourse, even though they do not belong to an expert disciplinary community
yet. Also, if the writers used passive voice, they would avoid expressing the agent;
however, by using a personal subject, they admit that it is the scientific community
that defines the notion in this way in this particular scientific paradigm, which is
important for students to realize.
Finally, in 5 cases the semantic reference of we seemed to be to students alone
(example 19). It is in fact the students who learned how to write the rules, not the
writer, but to meet the expectations of the learning process as cooperation, the writer
used the first person plural. The roles of the participants seem to be blurred to a
certain extent. This use of we can even be regarded as a little arrogant since it is only
superficial. For example, Quirk et al. (1985, p. 350) point out that in doctor/patient
communication the use of inclusive pronouns (How are we feeling today?) may be
understood as condescending since it actually refers to the patient alone and it is
only cosmetic.
(19) We have actually looked at some such restrictions in chapter 3 (section 3.2),
when we learned how to write lexeme formation rules. We learned that
there could be different sorts of restrictions on what sorts of base an affix
might attach to […]. (TB7: 64)
To summarize, the pronoun we accounts for 70% of all engagement markers in

the undergraduate textbooks. Their semantic referents varied and the inclusive/
exclusive distinction was not sufficient to describe all of them. The pronoun occurred
in several discourse functions4:
• discourse guide (example 10)
• showing solidarity with readers (example 13)
• illustrating arguments by examples or pictures (example 14)
• assuming shared knowledge, goals, beliefs etc. (example 15)
• giving exposition (we as language users) (example 16)
• naming, defining, classifying etc. (we as linguists/experts) (example 18).
Returning now to research articles, the corpus yielded 222 examples of we (our,
us). They were tentatively divided into inclusive and exclusive since, according to
Hyland’s methodology, only inclusive we is treated as an engagement marker, while
exclusive we belongs to the category of self mention (Hyland 2005, p. 53). This
division is far from ideal for the reasons mentioned above, as well as the fact that
inclusive we obviously includes the writer and can also be regarded as self mention.
4
These discourse functions are not connected with the pronoun we alone, but rather result from the
structure in which it occurs, i.e. the semantics of the verb phrase and the context (see Dontcheva-
Navrátilová 2013).
224 T. Guziurová
However, I decided to comply with the traditional division exclusive/inclusive and

thus 28 cases out of 222 were considered exclusive, so that only 194 instances of we
were counted as engagement markers. Example 20 contains an exclusive we, which
could be described as “proposing a theory, approach etc.” in Kuo’s classification of
discourse functions (1999, p. 130).
(20) If, therefore, we argue that face is always interactionally constituted, it will
be necessary to interpret the concept ‘interaction’ very broadly, so that it
includes not only synchronous, face-to-face interaction, but also
asynchronous communication and general public awareness. (RA6: 653)
The number of inclusive we in the corpus is surprisingly high, in comparison e.g.

with Zapletalová’s (2009) results based on the articles from the Journal of
Linguistics, in which inclusive we accounted for 8.2% only (p. 160). As mentioned
above, this could be partly explained by the character of the corpus; three articles
from the Journal of Pragmatics contained a high number of pronouns, including
inclusive we.
Inclusive we in academic writing has often been interpreted in terms of positive
and negative politeness. It can be seen as a manifestation of positive politeness in
cases where it stresses the solidarity with readers, acknowledging them as disciplin-
ary equals. It can build a bond with readers by indicating that “the argument of the
text is being built up by a collaborative writer/reader effort” (Harwood 2005, p. 346).
On the other hand, pronouns can be seen as negative politeness devices. Harwood
(2005) points out that the effect of inclusive we can be “to diminish writer responsi-
bility for an imperfect state of affairs” (p. 348). At least, it allows the writer to share
responsibility with his or her colleagues if he or she speaks on behalf of the whole
disciplinary community:
We have not fully understood the medical implications of snuff-taking. (Harwood 2005,
p. 348)
Another way of mitigating face-threatening acts (FTAs) is by using inclusive

pronouns when the writer is making a criticism. He or she tries to minimize the FTA
in order to ensure that his or her claims will be considered (or rather accredited)
(ibid.).
The corpus yielded 194 instances of we that could be regarded as engagement
markers. The semantic reference (see Table 7) ranges from the writer and his or her
readers, the discipline as a whole, to a generic we referring to language users (simi-
Table 7 Semantic references Semantic references of WE in RAs %

of WE in RAs
People (language users) 66 34
Writer + readers 96 49.5
Writer + other linguists 32 16.5
(discipline as a whole)
Total 194 100
larly to the textbooks). Nevertheless, the division is only tentative since there were
ambiguous cases resulting, for example, from the fact that the audience primarily
addressed are linguists, that is a part of a disciplinary community. It was sometimes
difficult to decide if the reference is to readers only or to the disciplinary members/
linguists as a whole. Again, there were differences between individual articles, with
the majority of instances occurring in the Journal of Pragmatics, while one article
from the ESP Journal did not contain any first person pronoun at all.
Inclusive we was again used as a discourse guide, referring forward to announce
what is going to come, or backward to remind readers of salient points, which also
allows the writer to make a summary or provide new relations between what was
said earlier in the text and the present point (21).
(21) By contrast, the proper subject for sociology – the social – would appear to
be almost limitless. (As an example germane to this paper, sociology feels
itself qualified to comment on the scientific – as we have seen in the sub-
discipline of ‘sociology of scientific knowledge’ – whereas for physics the
social is quite outside its ambit.) (RA2: 358)
Inclusive we referring to “the writer and readers” also serves the function of
assuming shared knowledge, goals, beliefs etc. (as identified by Kuo 1999). In (22),
the writer presupposes that readers understand the example similarly, engaging
them in the research process at the same time.
(22) In the example above, we understand that all members of the field know that
uniformity along the length of the superconducting wire is one of the key
factors and that this knowledge is shared by, but not limited to, the thesis
writer him/herself. (RA4: 315)
The authors of the articles from the Journal of Pragmatics also used inclusive we
in the examples the function of which is to illustrate their points. Again, it is prob-
ably the topic that enables them to engage readers – the analysis of ‘face’ in (23),
and intonation patterns in (24) which can be read aloud.
(23) Suppose, for example, a friend tries to force us to do something or ignores

our request for help (infringing, respectively, our sense of personal
entitlement to freedom and to association), we may simply feel irritated or
annoyed. (RA6: 652)
(24) In (2a–c), we can choose to place our main emphasis on any of the three
words, to reflect their relative importance in the situation in which we make
the utterance. (RA7: 1544)
Also, the pragmatics articles included we referring generally to people as lan-

guage users (25). This use was in a way similar to that in the undergraduate text-
226 T. Guziurová
books but it did not occur so often (there was not a single occurrence in the ESP
Journal) since the writers probably preferred impersonal expressions.
(25) Whatever language we speak, we are most unlikely to do so on a monotone.

(RA7: 1542)
Furthermore, the pronoun we in the articles referred to the discourse community

as a whole, describing the practices and beliefs of the members. Unlike in the text-
books, the pronoun does not serve the function of defining, classifying, or giving
labels since it would be unacceptable to say “we define a speech act as …” meaning
that it is generally defined as such. The authors could have stated, of course, that
they are using the term in this particular sense in the paper, but they resorted to
impersonal structures instead (“An integral citation is defined as a research report
[…]”). The pronoun we referring to the whole discourse community is rather used
as a negative politeness device since it occurs in cases where the author stresses
what we should do or need to do as linguists, including him- or herself in the obliga-
tion (26) and (27). Or if a disciplinary community is reduced to a homogeneous
entity whose practices are criticized, the writer makes sure he is seen as part of that
community as well (28), mitigating the imposition by an additional hedge (we may
draw).
(26) Among other things, it suggests to us, as applied linguists and teachers,
that we need always to keep an analytical eye not only on our texts, but
also on those who would engage with them […]. (RA2: 362)
(27) Clearly, we need to be open to a range of cases displaying variation in the

form and nature of the cooperation. (RA8: 1902)
(28) It is important to know more about these matters; for if we rely too much on
an exclusively textual approach, we may draw conclusions too readily
about the educational value of such texts, and also about the ways in
which disciplinary cultures are revealed to our students. (RA2: 362)
3.3 Discussion
This section has focused on the use of engagement markers in undergraduate text-
books and RAs. The most common device in the two genres is the pronoun we,
accounting for approximately 70% of engagement markers in both genres. The high
number of engagement markers in linguistics textbooks indicated their interactive
character. Textbook authors addressed readers directly (the pronoun you, direc-
tives), or engaged them in the discourse using inclusive we. The asymmetrical rela-
tionship between the writer as an expert and the readers as novices in the discipline
can be seen in the use of pronoun you, which might imply a certain distance between
participants, and the use of directives. On the other hand, inclusive we can mitigate
the distribution of power by drawing students into the shared world of disciplinary
understanding. Another reason for employing engagement markers is educational.
Relating the topic to students, showing how we as human beings use language, helps
to make the exposition more interesting, relevant and approachable.
Generally, the main difference in the use of pronoun we between undergraduate
textbooks and RAs seems to result from the character of the genres themselves.
According to Martin (1997), genres are social processes so they reflect the relation-
ship between the subjects that participate in them. The same forms of engagement
markers have different functions in the two genres. A high frequency of the pronoun
we in textbooks suggests their interactive character, engaging readers in the dis-
course, addressing them as discourse participants. Various uses of the pronoun may
suggest the atmosphere in the classroom, which enables the teacher to use we even
for the actions that he does himself (e.g. we emphasized that). The main aim seems
to be to engage and motivate readers/students since it complies with the main com-
municative purpose of the genre, which is educational.
On the other hand, the pronoun we seems to fulfil other functions besides engage-
ment in RAs. The writer engages the reader but with an additional aim of disguising
himself as the real agent. If the writer says we argue that (as in the example 20
above), when [s]he is the one making the claim, it seems much harder for the reader
to disagree. Similarly, in (29) it is the author who found 19 parallel examples, but
using we enables him to seek agreement.
(29) It is seen that all of the nominalizations of Motte’s translation were also
nominalized processes in Newton’s original Latin, where we find the 19
parallel examples for the 168 word Latin text. (RA1: 352)
As Mühlhäusler and Harré (1990) point out, “the use of ‘we’ instead of ‘I’ also
diminishes the responsibilities of the speaker, since he or she is portrayed as col-
laborating with the hearer” (p. 175). That is also probably why a number of cases of
the ‘authorial we’ function as a hedge at the same time:
(30) We could say that linguistic cooperation can expand like an accordion to
encompass what has been described as ‘extra-linguistic cooperation’ […].
(RA8: 1902)
(31) We may explain these differences by reference to differences in the ideology

and epistemology of the two disciplines. (RA4: 313)
Even though this strategy appears in textbooks as well, it is not so prominent and
mostly neutralized by other uses of the pronoun.
Different functions of the pronoun we also account for the fact that I decided not
to treat the examples of exclusive authorial we as engagement markers in RAs, but
I did so in textbooks. The division exclusive/inclusive seems too simplified to be
228 T. Guziurová
able to explain all the complex relations, and it can be argued that even these uses
of we for I may be interpreted by the addressee as inclusive signs, especially in the
textbooks (see Rounds 1987 above). On the other hand, they seem to fulfil a differ-
ent function in RAs since they are used rather to disguise the writer than to engage
readers in the discourse, which is why they have not been classified as engagement
markers. Generally, the use of pronoun we may diminish writer responsibility,
which can be shared with his colleagues, readers etc.
It remains to be said that there are other motivations for the use of pronouns in
academic writing which, however, are not the subject of this study. In studying lan-
guage as a meaning potential (Halliday 1978) we ask why certain expressions are
preferred over others in particular contexts. These choices might result not only
from the pragmatic interpretations of functions outlined above, but, more generally,
from the functioning of language itself. For example, the use of active/passive forms
may be conditioned by the information structure of the sentence, specifically by the
principles of end-focus and end-weight. Studies have shown that although the pas-
sive voice plays an important role in academic writing, it is the active voice that
prevails; Dušková, for example, found out that the passive voice accounted for
20.68% of finite verb forms in her corpus of scientific writing (1999, p. 140). She
also points out that “the use of we + the active voice of a transitive verb makes pos-
sible late placement of the rhematic element, the object, which in the passive would
come as the subject first” (ibid.). Similar cases can undoubtedly be found in my
corpus (in both genres), for example:
(32) In the list here, we also find the word acidity, plus four other derivatives of it
(subacidity, nonacidity, hypoacidity, hyperacidity). (TB7: 65)
Another tendency influencing the choice of active/passive voice is connected

with thematic progression and the tendency to maintain the same subject in certain
passages.
Finally, academic writing has its own conventions which also concern the use of
pronouns and have to be taken into account in the analysis. While authorial we is
still quite frequent in Czech academic discourse, it seems to be rejected in English
style manuals and certain scientific journals.5
4 The Integrative Approach to Metadiscourse Revisited
This chapter is going to end where it started – with the discussion of metadiscourse.
Applying the ‘integrative approach’ represented by Hyland’s model to two aca-
demic genres uncovered several important points. First, the integrative approach
enables the researcher to characterize the genre from different viewpoints since it is
5
See e.g. Wales (1996, p. 65) who discusses different views of the so-called ‘authorial we’.
According to Henry David Thoreau “it should only be used by royalty, editors, pregnant women
and people who ate worms” (ibid.).
not limited to text organizing elements or elements referring only to the text. Genres
can thus be studied from all the aspects that do not add to the propositional content
of the text (even though the term propositional content is controversial itself).
Metadiscursive categories are clearly interrelated and the actual expressions in the
text are multifunctional (e.g. it is now important to reflect on), signalling both tex-
tual and interpersonal meanings. Furthermore, the integrative approach covering
different interpersonal aspects seems suitable for the analysis of academic genres
since the expressions of interpersonality are not so frequent and expected in com-
parison with other genres.
On the other hand, the integrative approach has its drawbacks. As already men-
tioned at the end of Sect. 1, it has been criticized for covering different language
phenomena that cannot be easily put under one umbrella term. Hyland defined
metadiscourse as “the cover term for the self-reflective6 expressions used to negoti-
ate interactional meanings in a text, assisting the writer (or speaker) to express a
viewpoint and engage with readers as members of a particular community” (Hyland
2005, p. 37). Furthermore, he considers non-propositionality to be one of the key
principles characterizing metadiscourse, stating that “metadiscourse is distinct from
propositional aspects of discourse” (Hyland 2005, p. 38). However, both of these
criteria can be challenged.
If we look at all the categories in Hyland’s model, it might be possible to classify
them in the following way (see Table 8): endophoric markers, frame markers, code
glosses and transitions can be considered self-reflective in a narrow sense in that
they refer to the current text. However, they differ in the degree of explicitness of
Table 8 Hyland’s model of metadiscourse revisited

Self-reflective metadiscourse Endophoric markers (refer to other High explicitness
(textual reflexivity) parts of the text)
Frame markers (signal text boundaries)
Code glosses (elaborate propositional ↓
meanings)
Transitions (express relations between Low explicitness
clauses)
References to other texts Evidentials (refer to information from other text)
Stance (epistemic and Hedges (withhold complete commitment to a proposition)
attitudinal) Boosters (allow writers to express certainty in what they say,
close down alternatives)
Attitude markers (indicate the writer’s attitude to
propositions)
Writer-reader interaction Self mention (the explicit authorial presence in the text
expressed by first-person pronouns)
Engagement markers (explicitly address readers)
The emphasis is mine.

6
230 T. Guziurová
text reflexivity, with endophoric markers being probably the most explicit and tran-
sitions the least.7 Evidentials are not self-reflective but refer to other textual sources,
covering citations, paraphrases etc. Hedges, boosters and attitude markers can be
considered pragmatic categories commenting on the content of the propositions –
they express epistemic and affective stance, thus being self-reflective in a different
sense. It is the writer’s evaluation of the state of affairs expressed in the proposi-
tions. Finally, self mention and engagement markers are primarily addressing
writer-reader interaction.
The non-propositionality as another criterion of metadiscourse also seems prob-
lematic. ‘Proposition’ is a semantic term, which originated in logic, and as such it is
not easily transferable to discourse analysis. The traditional truth-conditional crite-
ria do not apply here because a number of metadiscursive statements can be
described as true or false. However, even if we loosen the criteria and distinguish
“things in the world and things in the discourse, propositions and metadiscourse”,
as Hyland proposed (Hyland 2005, p. 38), there are still many cases which remain
problematic. Considering attitude markers, for example, one of Hyland’s examples
reads: “The basis of the enormous productivity and affluence of modern industrial
societies is their fantastic store of technological information” (Hyland 2005, p. 164).
It is questionable whether “enormous” is an expression of the writer’s attitude.
While Hyland interpreted it as an attitude marker, i.e. as non-propositional (thus
qualifying as metadiscourse), it could equally be argued that it contributes to the
proposition expressed by the text. Similarly, certain hedges are believed to affect the
propositional content, e.g. approximators (somewhat, sort of, approximately).
What seems to be a common denominator in all of Hyland’s categories is the
writer’s explicit presence in a discourse. Although it is undoubtedly a matter of
degree, it can be argued that it is the writer who comments on the form of the text,
he or she also expresses stance towards its content and interacts with the reader.
Considering the Jakobsonian model of language functions, Ädel (2006) regards the
metalinguistic function as the indispensable one in her reflexive model. In Hyland’s
approach (and generally all broad approaches to metadiscourse) it would be the
expressive function which is crucial (although in a more general sense than formu-
lated by Jakobson [1980, p. 82], i.e. “a direct expression of the speaker’s attitude
toward what he is speaking about”).
Generally, it seems that the integrative approaches to metadiscourse have moved
away from text reflexivity as the capacity of a language to refer to or describe itself,
and instead foreground the interpersonal meanings. Whether this approach is justi-
fied today (when the interpersonal aspects of language are conceptualized under the
headings of stance, evaluation or positioning) is difficult to say. However, at least in
academic writing the concept seems to have been useful in showing how writers
7
Degrees of explicitness in text reflexivity are discussed in Mauranen (1993), who also considers
internal connectors to be of low explicitness.
project themselves into their discourses in order to structure them, negotiate mean-
ings and engage readers as discourse participants.
Acknowledgements The study was supported by the research project CZ.1.07/2.3.00/20.0222,

‘Posílení rozvoje Centra výzkumu odborného jazyka angličtiny a němčiny na FF OU’ [Centre for
the Research of Professional Language], funded by the Ministry of Education, Youth and Sports of
the Czech Republic.
References
Ädel, A. (2006). Metadiscourse in L1 and L2 English. Amsterdam/Philadelphia: John Benjamins

Publishing Company.
Boggel, S. (2009). Metadiscourse in middle English and early modern English religious texts. A
Corpus-based Study. Frankfurt am Main: Peter Lang.
Bondi, M. (2001). Small corpora and language variation. In M. Ghadessy, A. Henry, & R. L.
Roseberry (Eds.), Small corpus studies and ELT: Theory and practice (pp. 135–174).
Bunton, D. (1999). The use of higher level metatext in PhD theses. English for Specific Purposes,
18, 41–56.
Crismore, A., & Farnsworth, R. (1990). Metadiscourse in popular and professional science dis-
course. In W. Nash (Ed.), The writing scholar: Studies in academic discourse (pp. 118–136).
Newbury Park: Sage.
Crismore, A., Markkanen, R., & Steffensen, M. (1993). Metadiscourse in persuasive writing.
Written Communication, 10(1), 39–71.
Dontcheva-Navrátilová, O. (2013). Authorial presence in academic discourse: Function of author-
reference pronouns. Linguistica Pragensia, 23(1), 9–30.
Dušková, L. (1999). On some functional and stylistic aspects of the passive in the present-day
English. In Studies in English language. Part 1 (pp. 113–148). Prague: Karolinum.
Fløttum, K., Dahl, T., & Kinn, T. (2006). Academic voices: Across languages and disciplines.
Amsterdam: Benjamins.
Guziurová, T. (2014). Metadiscourse in undergraduate textbooks and research articles in applied
linguistics (Unpublished doctoral dissertation). Ostrava: University of Ostrava.
Halliday, M. A. K. (1978). Language as social semiotic. London: Edward Arnold.
Harwood, N. (2005). ‘We do not seem to have a theory… the theory I present here attempts to fill
this gap’: Inclusive and exclusive pronouns in academic writing. Applied Linguistics, 26(3),
343–375.
Hyland, K. (2000). Disciplinary discourses: Social interactions in academic writing. Ann Arbour:
The University of Michigan Press.
Hyland, K. (2002). Options of identity in academic writing. ELT Journal, 56(4), 351–358.
Hyland, K. (2005). Metadiscourse. London/New York: Continuum.
Jakobson, R. (1980). The framework of language. Ann Arbour: The University of Michigan Press.
Kuhi, D. (2012). Interpersonal resources in academic discourse: Research genres vs. instructional
genres. Saarbrücken: Lambert Academic Publishing.
Kuo, C.-H. (1999). The use of personal pronouns: Role relationships in scientific journal articles.
English for Specific Purposes, 18(2), 121–138.
Luukka, M. R. (1994). Metadiscourse in academic texts. In B. I. Gunnarsson & B. Nordberg
(Eds.), Text and talk in professional context. Uppsala: ASLA.
Martin, J. R. (1997). Analysing genre: Functional parameters. In F. Christie, & J. R. Martin, Genre
and institutions. Social processes in the workplace and school (pp. 3–39). London/New York:
Continuum.
232 T. Guziurová
Mauranen, A. (1993). Cultural differences in academic rhetoric: A textlinguistic study. Frankfurt

am Main: Peter Lang.
Mühlhäusler, P., & Harré, R. (1990). Pronouns and people. The linguistic construction of social
and personal identity. Oxford: Blackwell.
Myers, G. (1989). The pragmatics of politeness in scientific texts. Applied Linguistics, 10(4), 1–35.
Myers, G. (1992). Textbooks and the sociology of scientific knowledge. English for Specific
Purposes, 11(1), 3–17.
Rounds, P. L. (1987). Multifunctional personal pronouns use in an educational setting. English for
Specific Purposes, 6(1), 13–29.
Schiffrin, D. (1980). Metatalk: Organisational and evaluative brackets in discourse. Sociological
Inquiry: Language and Social Interaction, 50, 199–236.
Vande Kopple, W. (1985). Some exploratory discourse on metadiscourse. College Composition
and Communication, 36(1), 82–93.
Wales, K. (1996). Personal pronouns in present-day English. Cambridge: Cambridge University
Press.
Zapletalová, G. (2009). Academic discourse and the genre of research article. Banská Bystrica:
Matej Bel University and University of Ostrava.
Corpus
Textbook Chapters
(TB1) Meyer, C. F. (2009). Introducing English linguistics (pp. 2–15). Cambridge: Cambridge
University Press.
(TB2) Knowles, G. (1997). A cultural history of the English language (pp. 46–62). London:
Arnold.
(TB3) Yule, G. (2010). The study of language (4th ed., pp. 127–136; 156–166). Cambridge:
(TB4) Lyons, J. (2002). Language and linguistics (pp. 216–235). Cambridge: Cambridge

University Press. (Original work published 1981).
(TB5) Roach, P. (2006). English phonetics and phonology (3rd ed., pp. 8–11; 38–47). Cambridge:
(TB6) Penhallurick, R. (2003). Studying the english language (pp. 70–84; 58–60). Hampshire/
New York: Palgrave MacMillan.
(TB7) Lieber, R. (2010). Introducing morphology (pp. 59–73; 75–85). Cambridge: Cambridge
University Press.
Research Articles
(RA1) Banks, D. (2005). On the historical origins of nominalized process in scientific text. English
for Specific Purposes, 24(3), 347–357.
(RA2) Moore, T. (2002). Knowledge and agency: A study of ‘metaphenomenal discourse’ in text-
books from three disciplines. English for Specific Purposes, 21(4), 347–366.
(RA3) Flowerdew, J. (2003). Signalling nouns in discourse. English for Specific Purposes, 22(4),
329–346.
(RA4) Charles, M. (2006). Phraseological patterns in reporting clauses used in citation: A corpus-
based study of theses in two disciplines. English for Specific Purposes, 25(3), 310–331.
(RA5) Crossley, S. (2007). A chronotopic approach to genre analysis: An exploratory study.
English for Specific Purposes, 26(1), 4–24.
(RA6) Spencer-Oatey, H. (2007). Theories of identity and the analysis of face. Journal of

Pragmatics, 39, 639–656.
(RA7) House, J. (2006). Constructing a context with intonation. Journal of Pragmatics, 38,
1542–1558.
(RA8) Lumsden, D. (2008). Kinds of conversational cooperation. Journal of Pragmatics, 40,
1896–1908.

Contrastive Analysis of Discourse

Caricato da

Informazioni sul documento

Copyright

Formati disponibili

Condividi questo documento

Condividi o incorpora il documento

Opzioni di condivisione

Hai trovato utile questo documento?

Questo contenuto è inappropriato?

Copyright:

Formati disponibili

Contrastive Analysis of Discourse

Caricato da

Copyright:

Formati disponibili

Yearbook of Corpus Linguistics and Pragmatics

Advisory Editorial Board

ISSN 2213-6819 ISSN 2213-6827 (electronic)

Library of Congress Control Number: 2017936967

© Springer International Publishing AG 2017

Printed on acid-free paper

This Springer imprint is published by Springer Nature

Part I Contrastive Analysis with Parallel Corpora

Part II Contrastive Analysis with Comparable Corpora

Part III Contrastive Analysis Across Genres of English

Karin Aijmer University of Gothenburg, Gothenburg, Sweden

Abstract The aim of this issue of the Yearbook of Corpus Linguistics and

Keywords Contrastive linguistics • Parallel corpora • Comparable corpora • Genre

1 The ‘New’ Contrastive Analysis

© Springer International Publishing AG 2017 1

to focus on particular differences in the structural features of two languages with a

2 The Notion of Genre

Language needs to be studied in relation to aspects of the communication situation

3 Parallel Corpora and Comparable Corpora

monolingual corpora in the compared languages. Depending on the research aims,

4 An Overview of the Volume

5 Contrastive Analysis with Parallel Corpora

6 Contrastive Analysis with Comparable Corpora

languages. The best predictors of adverbial clause placement were shown to be

7 Contrastive Analysis Across Genres of English

The final section consists of three studies of particular English genres.

Keywords Obligation • Genre • Parallel corpus • must/måste

© Springer International Publishing AG 2017 13

has drawn attention to the specific distribution of obligation markers in ­political

It would be hard to give a dictionary description of the semantic notion of obligation

4 The Marking of Obligation in English and Swedish

4.1 English Obligation Markers in a Translation Perspective

Table 1 Epistemic and Fiction Non-fiction

Table 2 Epistemic and Fiction Non-fiction

Obligation can be expressed in many different (grammatical and lexical) ways

4.2 Swedish Obligation Markers in a Translation Perspective

The Swedish modal auxiliaries meaning obligation in my material are måste, få

Table 4 Correspondences of English must in the ESPC (EO-> ST)

15 examples were negated

5 Obligation Markers in English and Swedish Fiction

5.1 English Obligation Markers in Fiction

(2) Galleries are frightening places, places of evaluation, of judgment.

Obligation expressions have been associated with ‘performativity’ and situations

(7) “It ‘s electronic,” Annette said weakly.

(11) “She did n’t need to sit. (PDJ1)

5.2 Swedish Obligation Markers in Fiction

(14) Nu kom Torsten ut i Johans synfält.

The constraint imposed can be associated with something negative. ‘Going to

(16) - Gärna, svarade MacDuff på min inbjudan.

(19) - Jag behöver ta med mig kassetten, sa han. (HM2)

Behöver was found as a translation equivalent of have to but not of must.

Ska(ll) is also used in specific contexts of usage. As a deontic modal auxiliary

(21) Du ska hem och äta! (ARP1)

The ‘manufacturer’ tells the employee:

(22) - Du ska alltid ha en lista över personliga tillhörigheter i väsklocket. (RJ1)

must deontic imposition

evaluation evaluation strong weak

must should ’ll have to

Fig. 1 The meanings of modal obligation markers in fiction

6 Obligation Markers in English and Swedish Non-fiction

6.1 English Obligation Markers in Non-fiction

Not surprisingly it is often difficult to decide whether an obligation marker is

We need to is especially appropriate to express that the action imposed represents

(26) We need to see these plans implemented as quickly as possible. (EBOW1)

1 The ‘New’ Contrastive Analysis

2 The Notion of Genre

3 Parallel Corpora and Comparable Corpora

4 An Overview of the Volume

5 Contrastive Analysis with Parallel Corpora

6 Contrastive Analysis with Comparable Corpora

7 Contrastive Analysis Across Genres of English

has drawn attention to the specific distribution of obligation markers in political

4 The Marking of Obligation in English and Swedish

4.1 English Obligation Markers in a Translation Perspective

4.2 Swedish Obligation Markers in a Translation Perspective

5 Obligation Markers in English and Swedish Fiction

5.1 English Obligation Markers in Fiction

5.2 Swedish Obligation Markers in Fiction

6 Obligation Markers in English and Swedish Non-fiction

6.1 English Obligation Markers in Non-fiction

6.2 Swedish Obligation Markers in Non-fiction

3 Functional Translation Correspondence of Dus and So

3.1 Mark a Result

3.2 Mark Inferential Relations

3.3 Draw a Conclusion on a Textual Level

3.4 Marking Boundaries Between Discourse Sections

3.5 Start a New Sequence

4.3 Text Types

3 Downloading and Sorting the Data

3.1 Source (Original Speaker) Unknown

3.2 Source (Original Speaker) Known

4 Prý Across the Registers

4.1 Prý in the Fiction Subcorpus